CharlesWalker-0781 asked ramr-msft answered

Using preexisting audio in Bot Framework Text to Speech

We currently have a separate API that serves text-to-speech responses to our users. However, we are looking at alternative approaches and noted that Direct Line has an audio option that we could use.

Looking into it, it appears to always use the Azure text-to-speech resource. In our current version we manage this by storing the audio response, so it doesn't need to be regenerated each time and the cost is lower.

Is there a way of using Direct Line Speech with a custom method to try to pull existing audio, before trying to generate a new audio file?

azure-bot-service

1 Answer

ramr-msft answered

@CharlesWalker-0781 Thanks for the question. Can you please add more details about the sample that you are trying? We have forwarded this to the product team to check.
Known issues in Direct Line Speech: https://github.com/microsoft/BotFramework-WebChat/blob/main/docs/DIRECT_LINE_SPEECH.md#known-issues
Please follow the documentation for Direct Line Speech and the samples using the SDK (Voice Enable bot with Speech SDK). We would recommend raising an issue in the BotFramework-WebChat repository.




Hi, we're not using a sample at the moment. We're mostly trying to find somewhere to look.
We have production code using the C# Bot Framework that works fine with text, but we would also like to return audio.
We know Microsoft supports returning audio, but it uses the text-to-speech API service, which we would ideally only call if the audio file does not already exist in our store.
What we would like to do is modify the Bot Framework process so that we can return an existing audio file, rather than having text-to-speech generate the audio from SSML and returning the generated output.
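
To make the flow we have in mind concrete, here is a minimal sketch of the "check the store first, synthesize only on a miss" lookup. It is written in Python for brevity and is not tied to Bot Framework or Direct Line Speech. Everything in it is an assumption for illustration: `AudioCache`, `key_for`, and `fake_synthesize` are hypothetical names, the in-memory dictionary stands in for whatever blob store or database holds the audio, and the `synthesize` callable stands in for the real text-to-speech call.

```python
import hashlib
from typing import Callable, Dict


class AudioCache:
    """Cache-aside lookup: return stored audio, synthesize only on a miss."""

    def __init__(self, synthesize: Callable[[str, str], bytes]) -> None:
        # Hypothetical wrapper around the real text-to-speech call.
        self._synthesize = synthesize
        # Stand-in for a persistent store (blob storage, database, ...).
        self._store: Dict[str, bytes] = {}
        # Counts how many times the synthesizer was actually invoked.
        self.misses = 0

    @staticmethod
    def key_for(text: str, voice: str) -> str:
        # Collapse whitespace so trivially different strings share one entry.
        normalized = " ".join(text.split())
        return hashlib.sha256(f"{voice}\n{normalized}".encode("utf-8")).hexdigest()

    def get_audio(self, text: str, voice: str) -> bytes:
        key = self.key_for(text, voice)
        audio = self._store.get(key)
        if audio is None:
            # Miss: pay for synthesis once, then keep the result.
            self.misses += 1
            audio = self._synthesize(text, voice)
            self._store[key] = audio
        return audio


def fake_synthesize(text: str, voice: str) -> bytes:
    # Placeholder that returns bytes derived from its input, not real audio.
    return f"{voice}:{text}".encode("utf-8")


cache = AudioCache(fake_synthesize)
first = cache.get_audio("Hello  world", "voice-a")
second = cache.get_audio("Hello world", "voice-a")  # served from the store
```

The key includes the voice name so the same text in a different voice is stored separately. The open question is still where such a lookup could hook into the Direct Line Speech pipeline so that the stored audio is returned in place of a new synthesis request.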
