Azure Speech-to-Text REST API

Question

I'm using the Azure Speech-to-Text REST API in Python to transcribe audio to text. I set the output format to "detailed" and expected to get multiple results for one input audio, but I only get a single text. Here is the code I ran:

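from urllib.parse import urlencode

# location, language and profanity are defined earlier in the script
url = "https://" + location + ".stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?{}".format(
    urlencode({
        "language": language,
        "format": "detailed",
        "profanity": profanity
    }))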

Could you let me know how I can get multiple text results for one input audio using the Azure Speech-to-Text REST API? Thanks.

Answer

@Lynn Thanks for the question. Could you please share the audio input you are using? With the parameter "format=detailed", the alternative results are returned as a list of JSON objects under the "NBest" field of the response.
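As a rough illustration of where the alternatives live in the response, here is a minimal sketch using only the Python standard library. The helper names (build_url, extract_candidates, recognize) are hypothetical and not part of any SDK. The sketch assumes a 16 kHz PCM WAV file, a subscription key sent in the Ocp-Apim-Subscription-Key header, and a "detailed" response with an "NBest" list whose entries carry "Confidence" and "Display" fields. Note that "NBest" can contain just one entry, in which case only one text is available for that audio.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_url(location, language="en-US", profanity="masked"):
    """Build the short-audio recognition URL with format=detailed."""
    query = urlencode({
        "language": language,
        "format": "detailed",
        "profanity": profanity,
    })
    return ("https://" + location + ".stt.speech.microsoft.com"
            "/speech/recognition/conversation/cognitiveservices/v1?" + query)


def extract_candidates(response_json):
    """Return (confidence, display text) pairs from NBest, highest confidence first."""
    candidates = response_json.get("NBest", [])
    ranked = sorted(candidates, key=lambda c: c.get("Confidence", 0.0), reverse=True)
    return [(c.get("Confidence"), c.get("Display")) for c in ranked]


def recognize(location, key, wav_path, language="en-US"):
    """POST a WAV file to the endpoint and return the parsed JSON response.

    Assumption: the file is 16 kHz PCM WAV; adjust Content-Type otherwise.
    """
    with open(wav_path, "rb") as f:
        audio = f.read()
    request = Request(
        build_url(location, language),
        data=audio,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urlopen(request) as resp:
        return json.load(resp)
```

Calling extract_candidates(recognize(location, key, "sample.wav")) would then list every candidate the service returned, so you can check whether the response really holds only one entry or whether your code is reading just the top-level "DisplayText".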

For batch transcription, please follow our documentation: Speech service - Azure Cognitive Services | Microsoft Learn
