DjafariNainiKaweh-8025 asked YutongTie-MSFT edited

Speech Studio - (Speech to Text for Dutch) - New baseline models show low performance and only support Text + Pronunciation. How can I access older versions of the models?

The new baseline models perform poorly on my data.
The previous baseline model also supported Audio.
That model (20201015), which supported Audio, Text, and Pronunciation, worked best for our use cases.

Is there a way to access older baseline models for training?

Or can we expect upcoming models to support Audio?



azure-cognitive-services azure-speech

Hello @DjafariNainiKaweh-8025

Thanks for reaching out to us. Could you please share more details or an example of the low performance so that we can look into it? Every time we update the model, we aim for better performance. I am sorry for the experience you have had.

Regards,
Yutong


Hi Yutong,

I am working on a use case for postal codes:

196167-image.png



I have collected audio files spoken by many different users, fine-tuned models on different baselines with them, and compared the performance of the models fine-tuned on the new and old baselines.
The models fine-tuned on the old baseline recognise 15%-20% more cases than those fine-tuned on the new baseline. I can't compare the old and new baselines directly, since the old one has been removed.
However, my models fine-tuned on the old baseline still work.
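
A comparison like the one described above could be scored roughly as follows. This is only a minimal sketch, assuming each model's transcripts are available as a list alongside the reference postal codes. The function names (`normalize`, `recognition_rate`) and the sample strings are hypothetical placeholders, not data from this test, and exact-match scoring is an assumed metric rather than the one Speech Studio reports.

```python
import re


def normalize(text: str) -> str:
    """Uppercase the text and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", text.upper())


def recognition_rate(references, hypotheses):
    """Fraction of utterances whose normalized transcript equals the reference."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    if not references:
        return 0.0
    hits = sum(normalize(r) == normalize(h) for r, h in zip(references, hypotheses))
    return hits / len(references)


# Hypothetical placeholder data, for illustration only.
references = ["1234 AB", "5678 CD", "9012 EF", "3456 GH"]
old_model = ["1234 ab", "5678 CD", "9012 EF", "3456 GA"]
new_model = ["1234 AB", "5678 CT", "9012 EF", "3456 GA"]

old_rate = recognition_rate(references, old_model)
new_rate = recognition_rate(references, new_model)
print(f"old baseline: {old_rate:.0%}, new baseline: {new_rate:.0%}")
```

Exact match after normalization is a strict criterion that suits short codes, where a single wrong character makes the whole result unusable. Word error rate would be the more usual choice for longer utterances.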

Can you explain the difference between (audio, text, pronunciation) and (text, pronunciation)?

There is no longer an (audio, text, pronunciation) model for Dutch. I also noticed similar issues for German, where the (audio, text, pronunciation) model performed best.

Best regards,
Kaweh




0 Answers