AhmedBravo-2347 asked

Microsoft.CognitiveServices.Speech Text to Speech save to Azure Blob or Convert to Wav Byte Array

I am trying to convert Microsoft.CognitiveServices.Speech Text to Speech output to a WAV file byte array, or to save a WAV file to Azure Blob storage.

I have read the documentation, and the only available methods are to save a WAV file on the local machine or to get a byte array that is in PCM format rather than WAV file format.

Any direction on converting PCM to WAV, or on saving the file directly to Azure Blob storage, would be helpful.

azure-cognitive-services azure-speech

@AhmedBravo-2347 Thanks for the question. Can you please add more details about the Text to Speech SDK or API that you are using?
If you are using the API samples, the X-Microsoft-OutputFormat value used there is riff-24khz-16bit-mono-pcm. X-Microsoft-OutputFormat is a request header that defines the type of audio returned by the API, so it should match the audio type that you want to save and play back.

Doc for audio outputs: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/rest-text-to-speech#audio-outputs


Please follow the following document for sample Text to Speech.



@ramr-msft thanks for your reply.


I am using the Azure Cognitive Services Speech SDK for .NET Core, and I am trying a few different approaches here.

The issue I am having is that when I try to save result.AudioData as a WAV file, it sounds like noise.

I used SaveToWaveFileAsync and the file saves perfectly, but I want to save the file to Azure Blob storage. The only thing I can think of is writing the byte array to a MemoryStream and saving that stream to Azure Blob storage, but when I do this the file is not playable.

AhmedBravo-2347 answered

I found my answer here. Using the REST API, I was able to receive a stream and convert the stream to a file saved in Azure Blob storage:

https://stackoverflow.com/questions/57915170/how-to-use-azures-text-to-speech-to-create-an-audio-file-instead-of-live-text-t
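
For readers who want to see the shape of the REST approach without following the link, here is a minimal sketch of the request only. It is not the code from the linked answer. The class name TtsRestRequest, the method name Build, and the User-Agent value are hypothetical, and it assumes the regional endpoint pattern and the Ocp-Apim-Subscription-Key header described in the REST documentation linked earlier in this thread.

```csharp
using System.Net.Http;
using System.Text;

public static class TtsRestRequest
{
    // Builds the POST request for the Text to Speech REST endpoint.
    // The caller supplies the region (for example "westus"), the key, and the SSML body.
    public static HttpRequestMessage Build(string region, string subscriptionKey, string ssml)
    {
        var request = new HttpRequestMessage(
            HttpMethod.Post,
            $"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1");

        request.Headers.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
        // A riff-* format asks the service for audio that already carries a WAV header.
        request.Headers.Add("X-Microsoft-OutputFormat", "riff-24khz-16bit-mono-pcm");
        // Hypothetical application name; replace it with your own.
        request.Headers.Add("User-Agent", "tts-to-blob-sample");

        request.Content = new StringContent(ssml, Encoding.UTF8, "application/ssml+xml");
        return request;
    }
}
```

Sending this with HttpClient.SendAsync and reading response.Content.ReadAsStreamAsync() yields a stream of WAV data. That stream can then be handed to whichever blob upload call you already use, for example BlobClient.UploadAsync in the Azure.Storage.Blobs package, assuming that is the storage client in your project.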

AhmedBravo-2347 answered

@ramr-msft thanks for your reply.

    // Requires: using System; using System.Collections.Generic; using System.Threading.Tasks;
    //           using Microsoft.CognitiveServices.Speech; using Microsoft.Extensions.Logging;
    public async Task<byte[]> AzureSynthesisToBytesAsync(string audioText, string Id, ILogger<AzureSynthesisController> _logger)
    {
        var config = SpeechConfig.FromSubscription("key", "westus");
        config.SpeechSynthesisVoiceName = "en-US-AriaNeural";
        // Raw* output formats return headerless PCM, so result.AudioData carries no WAV (RIFF) header.
        config.SetSpeechSynthesisOutputFormat(SpeechSynthesisOutputFormat.Raw8Khz16BitMonoPcm);
        // Passing null as the AudioConfig keeps the audio in memory instead of playing it on the default speaker.
        using (var synthesizer = new SpeechSynthesizer(config, null))
        using (SpeechSynthesisResult result = await synthesizer.SpeakSsmlAsync(audioText))
        {
            if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            {
                // Saving through AudioDataStream produces a playable WAV file on local disk.
                using (AudioDataStream stream = AudioDataStream.FromResult(result))
                {
                    await stream.SaveToWaveFileAsync("c://temp/TestAudio_" + DateTime.Now.Hour.ToString() + "_" + DateTime.Now.Minute.ToString() + ".wav");
                }

                //_logger.LogInformation("AzureSynthesisController_AzureSynthesisToBytesAsync", new Dictionary<string, string> { { "Id", Id }, { "ResultReasonMessage", "SynthesizingAudioCompleted" }, { "OriginalText", audioText } });

                // These are the bytes that sound like noise when saved as a .wav file.
                return result.AudioData;
            }
            else if (result.Reason == ResultReason.Canceled)
            {
                var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                _logger.LogInformation("AzureSynthesisController_AzureSynthesisToBytesAsync", new Dictionary<string, string> { { "Id", Id }, { "message", $"CANCELED: Reason={cancellation.Reason}" }, { "result.Reason", "ResultReason.Canceled" } });

                if (cancellation.Reason == CancellationReason.Error)
                {
                    _logger.LogError("ResultReason.Canceled", new Dictionary<string, string>
                    {
                        { "Id", Id },
                        { "ErrorMethod", "AzureSynthesisController_AzureSynthesisToBytesAsync" },
                        { "message", $"CANCELED: ErrorCode={cancellation.ErrorCode}" },
                        { "messageDetails", $"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]" },
                        { "messageDetails2", "CANCELED: Did you update the subscription info?" },
                        { "result.Reason", "CancellationReason.Error" }
                    });
                }
            }

            // Synthesis did not complete; return an empty array instead of retrying forever.
            return Array.Empty<byte>();
        }
    }

I am using the Azure Cognitive Services Speech SDK for .NET Core, and I am trying a few different approaches here.

The issue I am having is that when I try to save result.AudioData as a WAV file, it sounds like noise.

I used SaveToWaveFileAsync and the file saves perfectly, but I want to save the file to Azure Blob storage. The only thing I can think of is writing the byte array to a MemoryStream and saving that stream to Azure Blob storage, but when I do this the file is not playable.
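
A likely explanation for the noise, offered as a sketch rather than a confirmed fix: the method above requests Raw8Khz16BitMonoPcm, and raw formats contain only PCM samples with no audio header, so the bytes in result.AudioData are not a WAV file on their own. Choosing a Riff* output format such as SpeechSynthesisOutputFormat.Riff24Khz16BitMonoPcm should sidestep this inside the SDK. If the raw format has to stay, a 44-byte RIFF/WAVE header can be prepended by hand. The names WavWrapper and AddWavHeader below are hypothetical, and the default parameters assume 8 kHz, 16-bit, mono audio to match the format used in the method above.

```csharp
using System;
using System.IO;
using System.Text;

public static class WavWrapper
{
    // Prepends a canonical 44-byte RIFF/WAVE header to headerless PCM samples.
    // The sample rate, bit depth, and channel count must match the PCM data.
    public static byte[] AddWavHeader(byte[] pcm, int sampleRate = 8000,
                                      short bitsPerSample = 16, short channels = 1)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        using (var ms = new MemoryStream(44 + pcm.Length))
        using (var writer = new BinaryWriter(ms, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF"));
            writer.Write(36 + pcm.Length);   // RIFF chunk size: total size minus 8 bytes
            writer.Write(Encoding.ASCII.GetBytes("WAVE"));
            writer.Write(Encoding.ASCII.GetBytes("fmt "));
            writer.Write(16);                // fmt chunk size for plain PCM
            writer.Write((short)1);          // audio format 1 = uncompressed PCM
            writer.Write(channels);
            writer.Write(sampleRate);
            writer.Write(byteRate);
            writer.Write(blockAlign);
            writer.Write(bitsPerSample);
            writer.Write(Encoding.ASCII.GetBytes("data"));
            writer.Write(pcm.Length);        // data chunk size in bytes
            writer.Write(pcm);               // the PCM samples themselves
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

With this in place, WavWrapper.AddWavHeader(result.AudioData) returns a byte array that can be wrapped in a MemoryStream and uploaded to blob storage in place of the raw bytes.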
