MuhammadFaizanBadar-1809 asked GiftA-MSFT answered

Sending audio to Direct Line Speech from ASP.NET application?

We're building a custom integration for a telephony provider that connects to an Azure Cognitive Services bot via the Direct Line channel.

The REST relay endpoint, which will essentially act as the client for the bot, receives the audio as an in-memory variable (base64-encoded).
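For illustration only, here is a minimal sketch of what the relay holds once a request arrives. It assumes the payload is raw 16 kHz, 16-bit mono PCM (the thread does not state the actual format), and the helper names `decode_audio` and `iter_chunks` are hypothetical. It decodes the base64 string into bytes and shows how that single buffer could be cut into roughly 100 ms frames, which is the shape a streaming client would normally send. It does not call any Speech SDK or Direct Line API.

```python
import base64
from typing import Iterator


def decode_audio(payload_b64: str) -> bytes:
    """Decode the base64 audio string received by the relay into raw bytes."""
    # validate=True raises binascii.Error on characters outside the
    # base64 alphabet instead of silently discarding them.
    return base64.b64decode(payload_b64, validate=True)


def iter_chunks(audio: bytes, chunk_size: int = 3200) -> Iterator[bytes]:
    """Yield fixed-size slices of the buffer; the last slice may be shorter.

    Assumption: 3200 bytes is about 100 ms of 16 kHz, 16-bit mono PCM
    (16000 samples/s * 2 bytes * 0.1 s). Adjust for the real format.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(audio), chunk_size):
        yield audio[offset:offset + chunk_size]
```

The point of the sketch is the contrast: the relay has the whole utterance at once as one buffer, whereas a streaming channel consumes audio incrementally in small frames.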

How can this audio be sent to the bot through the Direct Line Speech channel, and how can the bot's response be retrieved as audio?

The catch is that the relay has to stay REST-based: it queries the bot and sends the response back. We've concluded that the WebSocket connection will not persist throughout the interaction.

azure-cognitive-services azure-bot-service

Hi, thanks for reaching out. The following tutorial goes into detail on how to voice-enable your bot using the Speech SDK. Let us know if you have any further questions.



Hi!

Thanks for getting back to me

What we're looking for is confirmation from the ACS dev team that this exchange can happen as a simple audio transfer in a single request and response (bundling STT, NLU and TTS).

The Speech SDK is based on a persistent WebSocket connection, which the telephony architecture does not support.


Architecture Requirement.png



1 Answer

GiftA-MSFT answered

Hi, quick follow-up. Your scenario may not be fully supported given how our systems are currently architected. Direct Line Speech doesn't have provisions for non-streamed audio. However, if you're looking for telephony, consider using our existing public preview telephony channel (which already uses ACS and Speech Services). Hope this helps.

