How can I make my own TTS (Text to Speech)? What should we do?

Gates 1 Reputation point
2020-12-10T09:54:39.203+00:00

I am an announcer, and I want to make my own DIY Text to Speech voice with Azure. How many steps are involved?
I also have a lot of friends who are voice actors or announcers in languages such as Uyghur and Chahar Mongolian. Is there any way to order our own TTS?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
1,383 questions
Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.
352 questions

2 answers

  1. romungi-MSFT 41,861 Reputation points Microsoft Employee
    2020-12-10T14:02:33.647+00:00

    @Gates The Azure Speech service offers a demo page for testing the Audio Content Creation tool, which is part of Speech Studio. With it, users can create their own audio content in any supported language, with control over rate, pitch, volume, intonation, and pronunciation.
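
As a rough illustration of what the rate, pitch, and volume controls correspond to, the sketch below builds a small SSML document with a `prosody` element using only the Python standard library. The helper name `build_ssml`, the voice name `en-US-JennyNeural`, and the specific percentage values are assumptions chosen for the example, not settings taken from the tool itself.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, voice, lang, rate="+0%", pitch="+0%", volume="+0%"):
    """Return an SSML string that wraps `text` in voice and prosody elements.

    `build_ssml` is a hypothetical helper for illustration only.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>"
        f"<prosody rate={quoteattr(rate)} pitch={quoteattr(pitch)} "
        f"volume={quoteattr(volume)}>"
        f"{escape(text)}"  # escape &, <, > so the result stays well-formed XML
        "</prosody></voice></speak>"
    )


# Example: slightly slower, slightly higher, and louder than the default.
ssml = build_ssml(
    "Welcome to the evening news.",
    voice="en-US-JennyNeural",  # assumed example voice name
    lang="en-US",
    rate="-10%",
    pitch="+5%",
    volume="+20%",
)
ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed
print(ssml)
```

The same kind of markup can be adjusted per sentence, which is roughly what the tuning controls do for each selected span of text.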

    If the demo is what you are looking for, you need to sign up for Azure, create a Speech resource from the Azure portal, and then open Speech Studio to create your audio content. The steps are described in this document.
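
Once a Speech resource exists, its key and region can also be used outside Speech Studio. The sketch below only constructs a request for the text-to-speech REST endpoint and does not send it. The helper name `build_tts_request`, the placeholder key, the region `westus`, and the `User-Agent` value are assumptions for illustration, so check the endpoint and header details against the current REST documentation before relying on them.

```python
import urllib.request


def build_tts_request(ssml, region, key,
                      output_format="riff-24khz-16bit-mono-pcm"):
    """Build (but do not send) a POST request that carries SSML to the service.

    `build_tts_request` is a hypothetical helper for illustration only.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,          # key of the Speech resource
        "Content-Type": "application/ssml+xml",    # body is an SSML document
        "X-Microsoft-OutputFormat": output_format, # requested audio format
        "User-Agent": "tts-sketch",                # assumed application name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )


example_ssml = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
    'xml:lang="en-US"><voice name="en-US-JennyNeural">Hello.</voice></speak>'
)
req = build_tts_request(example_ssml, region="westus", key="YOUR_KEY_HERE")
print(req.get_method(), req.full_url)

# To actually synthesize audio, replace the placeholder key and send it:
# with urllib.request.urlopen(req) as resp:
#     open("hello.wav", "wb").write(resp.read())
```

Usage sent this way is billed against the same Speech resource as the content created in Speech Studio.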

    The full version of the Audio Content Creation feature in Speech Studio is simple and easy to use, and its usage is billed against your Speech resource. An Azure account created for the first time comes with a free credit of $200 for 30 days, which can be used toward your Speech resource for this service. You can subsequently upgrade your Speech resource to the standard tier and pay as you go, according to the pricing for Speech resource usage.


  2. romungi-MSFT 41,861 Reputation points Microsoft Employee
    2020-12-11T10:13:20.817+00:00

    @Gates For custom voice scenarios, you would need to record your voice, upload the audio files together with their transcripts, train a custom voice model, and then deploy it as an endpoint. This feature of the Speech service is a gated technology, so you need to request access before you can start using it. It is ideal if you would like to use your own experience to create new voices. The languages supported by this feature are listed here.
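
Before uploading, it can help to confirm that every recording has a transcript line and vice versa. The sketch below is a minimal check under two assumptions that are not stated above: the transcript is a text file with one `<id><TAB><text>` entry per line, and each recording is named `<id>.wav`. The function names are hypothetical.

```python
from pathlib import Path


def parse_transcript(lines):
    """Parse '<id><TAB><text>' lines into a dict of id -> text."""
    entries = {}
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        utt_id, sep, text = line.partition("\t")
        if not sep or not utt_id.strip() or not text.strip():
            raise ValueError(f"line {number}: expected '<id><TAB><text>'")
        entries[utt_id.strip()] = text.strip()
    return entries


def check_pairs(transcript, audio_names):
    """Return (ids without audio, audio ids without a transcript line)."""
    audio_ids = {
        Path(name).stem for name in audio_names if name.lower().endswith(".wav")
    }
    missing_audio = sorted(set(transcript) - audio_ids)
    missing_text = sorted(audio_ids - set(transcript))
    return missing_audio, missing_text


transcript = parse_transcript(["0001\tHello there.", "0002\tGood morning."])
print(check_pairs(transcript, ["0001.wav", "0003.wav"]))
# prints (['0002'], ['0003'])
```

Here `0002` has text but no recording, and `0003.wav` has no transcript line, so both would need fixing before training.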

    For the Audio Content Creation feature, you only need to upload text files or type in text. The audio is then generated in the voices listed for the language you have selected.
