JunbumKwon-2990 asked YinheWei-3673 answered

20-second video limit for Pronunciation Assessment

Hello,

Today we tried to analyze the speech in a video using Pronunciation Assessment, but we could only analyze 20 seconds of video at a time.

How can we analyze the entire video (about 3 minutes)?

azure-speech

Hello,

Is there a particular code sample you are using for your project? We have a video that explains the sample code here: https://www.youtube.com/watch?v=zFlwm7N4Awc .

Hope this helps.

Regards,
Yutong

YutongTie-MSFT answered

Hello,

I tried again on my side and it works well for me. As the screenshot below shows, it succeeded with my 30-second audio. This is the sample code repo I am using; the samples in it are all very convenient: https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/PronunciationAssessment/CSharp/Console

120549-image.png

Please let me know if anything is blocking you, and share the code sample you are using.

Regards,
Yutong



YinheWei-3673 answered

Hi, @JunbumKwon-2990

To handle long speech with pronunciation assessment, you can refer to the sample code below:
https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/python/console/speech_sample.py#L643

It is based on continuous recognition, so it has no limit on audio length.
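
For reference, the continuous-recognition approach can be sketched roughly as follows. This is a minimal sketch and not the linked sample itself. It assumes the audio track of the video has already been extracted to a WAV file and that the `azure-cognitiveservices-speech` package is installed. The names `assess_long_audio` and `average_scores` are hypothetical helpers made up for this illustration, and the plain per-utterance average is a simplification; the linked sample may combine scores differently.

```python
import threading


def average_scores(segments):
    """Average each score across segments; returns {} when there are none."""
    if not segments:
        return {}
    return {
        name: sum(seg[name] for seg in segments) / len(segments)
        for name in segments[0]
    }


def assess_long_audio(wav_path, reference_text, key, region, language="en-US"):
    """Assess a whole file by collecting one result per recognized utterance."""
    # Imported here so average_scores can be used without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    pron_config = speechsdk.PronunciationAssessmentConfig(
        reference_text=reference_text,
        grading_system=speechsdk.PronunciationAssessmentGradingSystem.HundredMark,
        granularity=speechsdk.PronunciationAssessmentGranularity.Phoneme,
        enable_miscue=True,
    )
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, language=language, audio_config=audio_config
    )
    pron_config.apply_to(recognizer)

    segments = []
    done = threading.Event()

    def on_recognized(evt):
        # Fires once per recognized utterance, so a long file yields many events.
        if evt.result.reason != speechsdk.ResultReason.RecognizedSpeech:
            return
        result = speechsdk.PronunciationAssessmentResult(evt.result)
        segments.append({
            "accuracy": result.accuracy_score,
            "fluency": result.fluency_score,
            "completeness": result.completeness_score,
            "pronunciation": result.pronunciation_score,
        })

    recognizer.recognized.connect(on_recognized)
    # Stop waiting when the file ends or the session is canceled.
    recognizer.session_stopped.connect(lambda evt: done.set())
    recognizer.canceled.connect(lambda evt: done.set())

    recognizer.start_continuous_recognition()
    done.wait()
    recognizer.stop_continuous_recognition()
    return average_scores(segments)
```

You would call it with your own values, for example `assess_long_audio("speech.wav", "the script the speaker read", "<your key>", "<your region>")`, where the file name and reference text are placeholders.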

Regards,
Yinhe
