Text-to-speech REST API
The Speech service allows you to convert text into synthesized speech and get a list of supported voices for a region by using a REST API. In this article, you'll learn about authorization options, query options, how to structure a request, and how to interpret a response.
The text-to-speech REST API supports neural text-to-speech voices, which support specific languages and dialects that are identified by locale. Each available endpoint is associated with a region. A subscription key for the endpoint or region that you plan to use is required. Here are links to more information:
- For a complete list of voices, see Language and voice support for the Speech service.
- For information about regional availability, see Speech service supported regions.
- For Azure Government and Azure China endpoints, see this article about sovereign clouds.
Important
Costs vary for prebuilt neural voices (called Neural on the pricing page) and custom neural voices (called Custom Neural on the pricing page). For more information, see Speech service pricing.
Before you use the text-to-speech REST API, understand that you need to complete a token exchange as part of authentication to access the service.
Authentication
Each request requires an authorization header. This table illustrates which headers are supported for each feature:
| Supported authorization header | Speech-to-text | Text-to-speech |
|---|---|---|
| Ocp-Apim-Subscription-Key | Yes | Yes |
| Authorization: Bearer | Yes | Yes |
When you're using the Ocp-Apim-Subscription-Key header, you're only required to provide your subscription key. For example:
'Ocp-Apim-Subscription-Key': 'YOUR_SUBSCRIPTION_KEY'
When you're using the Authorization: Bearer header, you're required to make a request to the issueToken endpoint. In this request, you exchange your subscription key for an access token that's valid for 10 minutes.
How to get an access token
To get an access token, you need to make a request to the issueToken endpoint by using Ocp-Apim-Subscription-Key and your subscription key.
The issueToken endpoint has this format:
https://<REGION_IDENTIFIER>.api.cognitive.microsoft.com/sts/v1.0/issueToken
Replace <REGION_IDENTIFIER> with the identifier that matches the region of your subscription from this table:
| Geography | Region | Region identifier |
|---|---|---|
| Africa | South Africa North | southafricanorth |
| Asia Pacific | East Asia | eastasia |
| Asia Pacific | Southeast Asia | southeastasia 1 |
| Asia Pacific | Australia East | australiaeast 1 |
| Asia Pacific | Central India | centralindia 1 |
| Asia Pacific | Japan East | japaneast |
| Asia Pacific | Japan West | japanwest |
| Asia Pacific | Korea Central | koreacentral |
| Canada | Canada Central | canadacentral 1 |
| Europe | North Europe | northeurope 1 |
| Europe | West Europe | westeurope 1 |
| Europe | France Central | francecentral |
| Europe | Germany West Central | germanywestcentral |
| Europe | Norway East | norwayeast |
| Europe | Switzerland North | switzerlandnorth |
| Europe | Switzerland West | switzerlandwest |
| Europe | UK South | uksouth 1 |
| Middle East | UAE North | uaenorth |
| South America | Brazil South | brazilsouth |
| US | Central US | centralus |
| US | East US | eastus 1 |
| US | East US 2 | eastus2 1 |
| US | North Central US | northcentralus 1 |
| US | South Central US | southcentralus 1 |
| US | US Gov Arizona | usgovarizona 1 |
| US | US Gov Virginia | usgovvirginia 1 |
| US | West Central US | westcentralus |
| US | West US | westus |
| US | West US 2 | westus2 1 |
| US | West US 3 | westus3 |
1 The region has dedicated hardware for Custom Speech training. In regions with dedicated hardware for Custom Speech training, the Speech service will use up to 20 hours of your audio training data, and can process about 10 hours of data per day. In other regions, the Speech service uses up to 8 hours of your audio data, and can process about 1 hour of data per day.
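As a minimal sketch of filling in the endpoint format above, the helper below substitutes a region identifier into the issueToken URL. The names `issue_token_url` and `KNOWN_REGIONS` are illustrative assumptions, not part of the Speech service, and the set holds only a few identifiers copied from the table.

```python
# Hypothetical helper: fills in the issueToken URL format shown above.
ISSUE_TOKEN_TEMPLATE = "https://{region}.api.cognitive.microsoft.com/sts/v1.0/issueToken"

# A few region identifiers copied from the table; extend this set as needed.
KNOWN_REGIONS = {"westus", "eastus", "westeurope", "southeastasia"}


def issue_token_url(region: str) -> str:
    """Return the issueToken URL for a region identifier from the table."""
    region = region.strip().lower()
    if region not in KNOWN_REGIONS:
        raise ValueError(f"Unknown region identifier: {region!r}")
    return ISSUE_TOKEN_TEMPLATE.format(region=region)


print(issue_token_url("westus"))
```

Rejecting unknown identifiers early is a design choice of this sketch; it surfaces a mistyped region before any network call is made.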
Use the following samples to create your access token request.
HTTP sample
This example is a simple HTTP request to get a token. Replace YOUR_SUBSCRIPTION_KEY with your subscription key for the Speech service. If your subscription isn't in the West US region, replace the Host header with your region's host name.
POST /sts/v1.0/issueToken HTTP/1.1
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY
Host: westus.api.cognitive.microsoft.com
Content-type: application/x-www-form-urlencoded
Content-Length: 0
The body of the response contains the access token in JSON Web Token (JWT) format.
PowerShell sample
This example is a simple PowerShell script to get an access token. Replace YOUR_SUBSCRIPTION_KEY with your subscription key for the Speech service. Make sure to use the correct endpoint for the region that matches your subscription. This example is currently set to West US.
$FetchTokenHeader = @{
  'Content-type'='application/x-www-form-urlencoded';
  'Content-Length'= '0';
  'Ocp-Apim-Subscription-Key' = 'YOUR_SUBSCRIPTION_KEY'
}

# The trailing backtick continues the command on the next line.
$OAuthToken = Invoke-RestMethod -Method POST -Uri https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken `
  -Headers $FetchTokenHeader

# show the token received
$OAuthToken
cURL sample
cURL is a command-line tool available in Linux (and in the Windows Subsystem for Linux). This cURL command illustrates how to get an access token. Replace YOUR_SUBSCRIPTION_KEY with your subscription key for the Speech service. Make sure to use the correct endpoint for the region that matches your subscription. This example is currently set to West US.
curl -v -X POST \
"https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken" \
-H "Content-type: application/x-www-form-urlencoded" \
-H "Content-Length: 0" \
-H "Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY"
C# sample
This C# class illustrates how to get an access token. Pass your subscription key for the Speech service when you instantiate the class. If your subscription isn't in the West US region, change the value of FetchTokenUri to match the region for your subscription.
using System;
using System.Net.Http;
using System.Threading.Tasks;

public class Authentication
{
    public static readonly string FetchTokenUri =
        "https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken";
    private string subscriptionKey;
    private string token;

    public Authentication(string subscriptionKey)
    {
        this.subscriptionKey = subscriptionKey;
        // Blocks until the token request completes.
        this.token = FetchTokenAsync(FetchTokenUri, subscriptionKey).Result;
    }

    public string GetAccessToken()
    {
        return this.token;
    }

    private async Task<string> FetchTokenAsync(string fetchUri, string subscriptionKey)
    {
        using (var client = new HttpClient())
        {
            client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", subscriptionKey);
            UriBuilder uriBuilder = new UriBuilder(fetchUri);

            // POST with an empty body; the response body is the access token.
            var result = await client.PostAsync(uriBuilder.Uri.AbsoluteUri, null);
            Console.WriteLine("Token Uri: {0}", uriBuilder.Uri.AbsoluteUri);
            return await result.Content.ReadAsStringAsync();
        }
    }
}
Python sample
# The requests module must be installed.
# Run pip install requests if necessary.
import requests

subscription_key = 'REPLACE_WITH_YOUR_KEY'


def get_token(subscription_key):
    fetch_token_url = 'https://westus.api.cognitive.microsoft.com/sts/v1.0/issueToken'
    headers = {
        'Ocp-Apim-Subscription-Key': subscription_key
    }
    response = requests.post(fetch_token_url, headers=headers)
    # The response body is the access token itself.
    access_token = str(response.text)
    print(access_token)
    return access_token
How to use an access token
The access token should be sent to the service as the Authorization: Bearer <TOKEN> header. Each access token is valid for 10 minutes. You can get a new token at any time, but to minimize network traffic and latency, we recommend using the same token for nine minutes.
Here's a sample HTTP request to the text-to-speech REST API:
POST /cognitiveservices/v1 HTTP/1.1
Authorization: Bearer YOUR_ACCESS_TOKEN
Host: westus.tts.speech.microsoft.com
Content-type: application/ssml+xml
Content-Length: 199
Connection: Keep-Alive
// Message body here...
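The recommendation to reuse a token for nine minutes can be sketched as a small cache. This is an illustration under stated assumptions, not an official client: `TokenCache` and its `fetch_token` callable are hypothetical names, and `fetch_token` stands in for whichever issueToken request you use.

```python
import time
from typing import Callable, Dict, Optional


class TokenCache:
    """Reuses an access token for a fixed period before fetching a new one."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        reuse_seconds: float = 9 * 60,  # reuse for nine minutes, as recommended above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_token = fetch_token
        self._reuse_seconds = reuse_seconds
        self._clock = clock
        self._token: Optional[str] = None
        self._fetched_at = 0.0

    def get(self) -> str:
        """Return the cached token, fetching a new one once the reuse period has passed."""
        now = self._clock()
        if self._token is None or now - self._fetched_at >= self._reuse_seconds:
            self._token = self._fetch_token()
            self._fetched_at = now
        return self._token

    def authorization_header(self) -> Dict[str, str]:
        """Build the Authorization: Bearer header for a request."""
        return {"Authorization": f"Bearer {self.get()}"}


# Usage with a placeholder fetcher; replace the lambda with a real issueToken call.
cache = TokenCache(lambda: "YOUR_ACCESS_TOKEN")
print(cache.authorization_header())
```

Passing the clock in as a parameter keeps the expiry logic testable without waiting nine minutes.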
Get a list of voices
You can use the voices/list endpoint to get a full list of voices for a specific region or endpoint:
| Region | Endpoint |
|---|---|
| Australia East | https://australiaeast.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Brazil South | https://brazilsouth.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Canada Central | https://canadacentral.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Central US | https://centralus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| China East 2 | https://chinaeast2.tts.speech.azure.cn/cognitiveservices/voices/list |
| China North 2 | https://chinanorth2.tts.speech.azure.cn/cognitiveservices/voices/list |
| East Asia | https://eastasia.tts.speech.microsoft.com/cognitiveservices/voices/list |
| East US | https://eastus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| East US 2 | https://eastus2.tts.speech.microsoft.com/cognitiveservices/voices/list |
| France Central | https://francecentral.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Germany West Central | https://germanywestcentral.tts.speech.microsoft.com/cognitiveservices/voices/list |
| India Central | https://centralindia.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Japan East | https://japaneast.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Japan West | https://japanwest.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Jio India West | https://jioindiawest.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Korea Central | https://koreacentral.tts.speech.microsoft.com/cognitiveservices/voices/list |
| North Central US | https://northcentralus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| North Europe | https://northeurope.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Norway East | https://norwayeast.tts.speech.microsoft.com/cognitiveservices/voices/list |
| South Central US | https://southcentralus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Southeast Asia | https://southeastasia.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Switzerland North | https://switzerlandnorth.tts.speech.microsoft.com/cognitiveservices/voices/list |
| Switzerland West | https://switzerlandwest.tts.speech.microsoft.com/cognitiveservices/voices/list |
| US Gov Arizona | https://usgovarizona.tts.speech.azure.us/cognitiveservices/voices/list |
| US Gov Virginia | https://usgovvirginia.tts.speech.azure.us/cognitiveservices/voices/list |
| UK South | https://uksouth.tts.speech.microsoft.com/cognitiveservices/voices/list |
| West Central US | https://westcentralus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| West Europe | https://westeurope.tts.speech.microsoft.com/cognitiveservices/voices/list |
| West US | https://westus.tts.speech.microsoft.com/cognitiveservices/voices/list |
| West US 2 | https://westus2.tts.speech.microsoft.com/cognitiveservices/voices/list |
| West US 3 | https://westus3.tts.speech.microsoft.com/cognitiveservices/voices/list |
Tip
Voices in preview are available in only these three regions: East US, West Europe, and Southeast Asia.
Request headers
This table lists required and optional headers for text-to-speech requests:
| Header | Description | Required or optional |
|---|---|---|
| Ocp-Apim-Subscription-Key | Your subscription key for the Speech service. | Either this header or Authorization is required. |
| Authorization | An authorization token preceded by the word Bearer. For more information, see Authentication. | Either this header or Ocp-Apim-Subscription-Key is required. |
Request body
A body isn't required for GET requests to this endpoint.
Sample request
This request requires only an authorization header:
GET /cognitiveservices/voices/list HTTP/1.1
Host: westus.tts.speech.microsoft.com
Ocp-Apim-Subscription-Key: YOUR_SUBSCRIPTION_KEY
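The same request could be issued from Python roughly as follows. This is a sketch that uses only the standard library; the function names are hypothetical, and the URL pattern covers only the `microsoft.com` endpoints in the table above (the Azure China and US Gov endpoints use different domains).

```python
import json
import urllib.request


def voices_list_request(region: str, subscription_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the voices/list endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    return urllib.request.Request(
        url,
        headers={"Ocp-Apim-Subscription-Key": subscription_key},
        method="GET",
    )


def fetch_voices(region: str, subscription_key: str, timeout: float = 10.0) -> list:
    """Send the request and parse the JSON array of voices."""
    request = voices_list_request(region, subscription_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network.
request = voices_list_request("westus", "YOUR_SUBSCRIPTION_KEY")
print(request.get_method(), request.full_url)
```

Calling `fetch_voices("westus", key)` with a valid key would perform the actual request.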
Sample response
This response has been truncated to illustrate the structure of a response.
Note
Voice availability varies by region or endpoint.
[
  {
    "Name": "Microsoft Server Speech Text to Speech Voice (en-US, JennyNeural)",
    "DisplayName": "Jenny",
    "LocalName": "Jenny",
    "ShortName": "en-US-JennyNeural",
    "Gender": "Female",
    "Locale": "en-US",
    "StyleList": [
      "chat",
      "customerservice",
      "newscast-casual",
      "assistant"
    ],
    "SampleRateHertz": "24000",
    "VoiceType": "Neural",
    "Status": "GA"
  },
  ...
  {
    "Name": "Microsoft Server Speech Text to Speech Voice (en-US, JennyMultilingualNeural)",
    "ShortName": "en-US-JennyMultilingualNeural",
    "DisplayName": "Jenny Multilingual",
    "LocalName": "Jenny Multilingual",
    "Gender": "Female",
    "Locale": "en-US",
    "SampleRateHertz": "24000",
    "VoiceType": "Neural",
    "SecondaryLocaleList": [
      "de-DE",
      "en-AU",
      "en-CA",
      "en-GB",
      "es-ES",
      "es-MX",
      "fr-CA",
      "fr-FR",
      "it-IT",
      "ja-JP",
      "ko-KR",
      "pt-BR",
      "zh-CN"
    ],
    "Status": "Preview"
  },
  ...
  {
    "Name": "Microsoft Server Speech Text to Speech Voice (ga-IE, OrlaNeural)",
    "DisplayName": "Orla",
    "LocalName": "Orla",
    "ShortName": "ga-IE-OrlaNeural",
    "Gender": "Female",
    "Locale": "ga-IE",
    "SampleRateHertz": "24000",
    "VoiceType": "Neural",
    "Status": "GA"
  },
  ...
  {
    "Name": "Microsoft Server Speech Text to Speech Voice (zh-CN, YunxiNeural)",
    "DisplayName": "Yunxi",
    "LocalName": "云希",
    "ShortName": "zh-CN-YunxiNeural",
    "Gender": "Male",
    "Locale": "zh-CN",
    "StyleList": [
      "Calm",
      "Fearful",
      "Cheerful",
      "Disgruntled",
      "Serious",
      "Angry",
      "Sad",
      "Depressed",
      "Embarrassed"
    ],
    "SampleRateHertz": "24000",
    "VoiceType": "Neural",
    "Status": "GA"
  },
  ...
]
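Once the response is parsed, the fields shown above can be used to pick a voice. The sketch below filters a list of voice objects by locale, style, and status. `filter_voices` is a hypothetical helper, `SAMPLE_VOICES` is a trimmed copy of the sample response, and treating `SecondaryLocaleList` as additional matching locales is an assumption of this example.

```python
from typing import Iterable, List, Optional

# Trimmed copy of the sample response above (only the fields used here).
SAMPLE_VOICES = [
    {"ShortName": "en-US-JennyNeural", "Locale": "en-US",
     "StyleList": ["chat", "customerservice", "newscast-casual", "assistant"],
     "Status": "GA"},
    {"ShortName": "en-US-JennyMultilingualNeural", "Locale": "en-US",
     "SecondaryLocaleList": ["de-DE", "fr-FR", "zh-CN"], "Status": "Preview"},
    {"ShortName": "ga-IE-OrlaNeural", "Locale": "ga-IE", "Status": "GA"},
    {"ShortName": "zh-CN-YunxiNeural", "Locale": "zh-CN",
     "StyleList": ["Calm", "Cheerful", "Sad"], "Status": "GA"},
]


def filter_voices(
    voices: Iterable[dict],
    locale: Optional[str] = None,
    style: Optional[str] = None,
    status: Optional[str] = None,
) -> List[str]:
    """Return the ShortName of every voice that matches all given criteria."""
    matches = []
    for voice in voices:
        if locale is not None:
            # Assumption: secondary locales count as a match for the locale filter.
            locales = [voice.get("Locale")] + voice.get("SecondaryLocaleList", [])
            if locale not in locales:
                continue
        if style is not None and style not in voice.get("StyleList", []):
            continue
        if status is not None and voice.get("Status") != status:
            continue
        matches.append(voice["ShortName"])
    return matches


print(filter_voices(SAMPLE_VOICES, locale="en-US", status="GA"))
print(filter_voices(SAMPLE_VOICES, style="chat"))
```

Using `.get()` with defaults matters because `StyleList` and `SecondaryLocaleList` are absent from some entries in the sample response.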
HTTP status codes
The HTTP status code for each response indicates success or common errors.
| HTTP status code | Description | Possible reason |
|---|---|---|
| 200 | OK | The request was successful. |
| 400 | Bad request | A required parameter is missing, empty, or null. Or, the value passed to either a required or optional parameter is invalid. A common reason is a header that's too long. |
| 401 | Unauthorized | The request is not authorized. Make sure your subscription key or token is valid and in the correct region. |
| 429 | Too many requests | You have exceeded the quota or rate of requests allowed for your subscription. |
| 502 | Bad gateway | There's a network or server-side problem. This status might also indicate invalid headers. |
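A client might map these status codes to a handling decision along the following lines. The descriptions are copied from the table; the choice to treat 429 and 502 as retryable and the exponential backoff schedule are assumptions of this sketch, not behavior documented for the service.

```python
from typing import List

# Descriptions copied from the status code table above.
STATUS_DESCRIPTIONS = {
    200: "OK",
    400: "Bad request",
    401: "Unauthorized",
    429: "Too many requests",
    502: "Bad gateway",
}

# Assumption: only rate limiting and gateway errors are worth retrying.
RETRYABLE_STATUSES = frozenset({429, 502})


def describe_status(code: int) -> str:
    """Return the table's description for a status code."""
    return STATUS_DESCRIPTIONS.get(code, "Not listed in the table")


def should_retry(code: int) -> bool:
    """Return True if this sketch considers the status worth retrying."""
    return code in RETRYABLE_STATUSES


def backoff_delays(attempts: int, base_seconds: float = 1.0) -> List[float]:
    """Return exponentially growing wait times: base, 2*base, 4*base, ..."""
    return [base_seconds * (2 ** i) for i in range(attempts)]


for code in (200, 401, 429):
    print(code, describe_status(code), "retry" if should_retry(code) else "no retry")
```

A 400 or 401 is deliberately not retried here, because the table attributes those to the request itself rather than to a transient condition.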
Convert text to speech
The v1 endpoint allows you to convert text to speech by using Speech Synthesis Markup Language (SSML).
Regions and endpoints
These regions are supported for text-to-speech through the REST API. Be sure to select the endpoint that matches your subscription region.
Prebuilt neural voices
Use this table to determine availability of neural voices by region or endpoint:
| Region | Endpoint |
|---|---|
| Australia East | https://australiaeast.tts.speech.microsoft.com/cognitiveservices/v1 |
| Brazil South | https://brazilsouth.tts.speech.microsoft.com/cognitiveservices/v1 |
| Canada Central | https://canadacentral.tts.speech.microsoft.com/cognitiveservices/v1 |
| Central US | https://centralus.tts.speech.microsoft.com/cognitiveservices/v1 |
| China East 2 | https://chinaeast2.tts.speech.azure.cn/cognitiveservices/v1 |
| China North 2 | https://chinanorth2.tts.speech.azure.cn/cognitiveservices/v1 |
| East Asia | https://eastasia.tts.speech.microsoft.com/cognitiveservices/v1 |
| East US | https://eastus.tts.speech.microsoft.com/cognitiveservices/v1 |
| East US 2 | https://eastus2.tts.speech.microsoft.com/cognitiveservices/v1 |
| France Central | https://francecentral.tts.speech.microsoft.com/cognitiveservices/v1 |
| Germany West Central | https://germanywestcentral.tts.speech.microsoft.com/cognitiveservices/v1 |
| India Central | https://centralindia.tts.speech.microsoft.com/cognitiveservices/v1 |
| Japan East | https://japaneast.tts.speech.microsoft.com/cognitiveservices/v1 |
| Japan West | https://japanwest.tts.speech.microsoft.com/cognitiveservices/v1 |
| Jio India West | https://jioindiawest.tts.speech.microsoft.com/cognitiveservices/v1 |
| Korea Central | https://koreacentral.tts.speech.microsoft.com/cognitiveservices/v1 |
| North Central US | https://northcentralus.tts.speech.microsoft.com/cognitiveservices/v1 |
| North Europe | https://northeurope.tts.speech.microsoft.com/cognitiveservices/v1 |
| Norway East | https://norwayeast.tts.speech.microsoft.com/cognitiveservices/v1 |
| South Central US | https://southcentralus.tts.speech.microsoft.com/cognitiveservices/v1 |
| Southeast Asia | https://southeastasia.tts.speech.microsoft.com/cognitiveservices/v1 |
| Sweden Central | https://swedencentral.tts.speech.microsoft.com/cognitiveservices/v1 |
| Switzerland North | https://switzerlandnorth.tts.speech.microsoft.com/cognitiveservices/v1 |
| Switzerland West | https://switzerlandwest.tts.speech.microsoft.com/cognitiveservices/v1 |
| UAE North | https://uaenorth.tts.speech.microsoft.com/cognitiveservices/v1 |
| US Gov Arizona | https://usgovarizona.tts.speech.azure.us/cognitiveservices/v1 |
| US Gov Virginia | https://usgovvirginia.tts.speech.azure.us/cognitiveservices/v1 |
| UK South | https://uksouth.tts.speech.microsoft.com/cognitiveservices/v1 |
| West Central US | https://westcentralus.tts.speech.microsoft.com/cognitiveservices/v1 |
| West Europe | https://westeurope.tts.speech.microsoft.com/cognitiveservices/v1 |
| West US | https://westus.tts.speech.microsoft.com/cognitiveservices/v1 |
| West US 2 | https://westus2.tts.speech.microsoft.com/cognitiveservices/v1 |
| West US 3 | https://westus3.tts.speech.microsoft.com/cognitiveservices/v1 |
Tip
Voices in preview are available in only these three regions: East US, West Europe, and Southeast Asia.
Custom neural voices
If you've created a custom neural voice font, use the endpoint that you've created. You can also use the following endpoints. Replace {deploymentId} with the deployment ID for your neural voice model.
| Region | Training | Deployment | Endpoint |
|---|---|---|---|
| Australia East | Yes | Yes | https://australiaeast.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Brazil South | No | Yes | https://brazilsouth.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Canada Central | No | Yes | https://canadacentral.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Central US | No | Yes | https://centralus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| East Asia | No | Yes | https://eastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| East US | Yes | Yes | https://eastus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| East US 2 | Yes | Yes | https://eastus2.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| France Central | No | Yes | https://francecentral.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Germany West Central | No | Yes | https://germanywestcentral.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| India Central | Yes | Yes | https://centralindia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Japan East | Yes | Yes | https://japaneast.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Japan West | No | Yes | https://japanwest.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Jio India West | No | Yes | https://jioindiawest.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Korea Central | Yes | Yes | https://koreacentral.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| North Central US | No | Yes | https://northcentralus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| North Europe | Yes | Yes | https://northeurope.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Norway East | No | Yes | https://norwayeast.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| South Africa North | No | Yes | https://southafricanorth.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| South Central US | Yes | Yes | https://southcentralus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Southeast Asia | Yes | Yes | https://southeastasia.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Switzerland North | No | Yes | https://switzerlandnorth.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| Switzerland West | No | Yes | https://switzerlandwest.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| UAE North | No | Yes | https://uaenorth.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| UK South | Yes | Yes | https://uksouth.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| West Central US | No | Yes | https://westcentralus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| West Europe | Yes | Yes | https://westeurope.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| West US | Yes | Yes | https://westus.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| West US 2 | Yes | Yes | https://westus2.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
| West US 3 | No | Yes | https://westus3.voice.speech.microsoft.com/cognitiveservices/v1?deploymentId={deploymentId} |
Note
The preceding regions are available for neural voice model hosting and real-time synthesis. Custom neural voice training is available only in the regions marked Yes in the Training column. However, you can copy a neural voice model from those regions to any other region in the preceding list.
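Substituting `{deploymentId}` into the endpoints in the preceding table can be sketched as below. `custom_voice_endpoint` is a hypothetical helper, and the region and deployment ID in the usage line are placeholders.

```python
from urllib.parse import urlencode


def custom_voice_endpoint(region: str, deployment_id: str) -> str:
    """Return the custom neural voice endpoint for a region and deployment ID."""
    if not deployment_id:
        raise ValueError("deployment_id must not be empty")
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"deploymentId": deployment_id})
    return f"https://{region}.voice.speech.microsoft.com/cognitiveservices/v1?{query}"


print(custom_voice_endpoint("westus", "YOUR_DEPLOYMENT_ID"))
```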
Long Audio API
The Long Audio API is available in multiple regions with unique endpoints:
| Region | Endpoint |
|---|---|
| Australia East | https://australiaeast.customvoice.api.speech.microsoft.com |
| East US | https://eastus.customvoice.api.speech.microsoft.com |
| India Central | https://centralindia.customvoice.api.speech.microsoft.com |
| South Central US | https://southcentralus.customvoice.api.speech.microsoft.com |
| Southeast Asia | https://southeastasia.customvoice.api.speech.microsoft.com |
| UK South | https://uksouth.customvoice.api.speech.microsoft.com |
| West Europe | https://westeurope.customvoice.api.speech.microsoft.com |
Request headers
This table lists required and optional headers for text-to-speech requests:
| Header | Description | Required or optional |
|---|---|---|
| Authorization | An authorization token preceded by the word Bearer. For more information, see Authentication. | Required |
| Content-Type | Specifies the content type for the provided text. Accepted value: application/ssml+xml. | Required |
| X-Microsoft-OutputFormat | Specifies the audio output format. For a complete list of accepted values, see Audio outputs. | Required |
| User-Agent | The application name. The provided value must be fewer than 255 characters. | Required |
Audio outputs
The following audio formats are supported. Send one of them in each request as the value of the X-Microsoft-OutputFormat header. Each format incorporates a bit rate and encoding type. The Speech service supports 48-kHz, 24-kHz, 16-kHz, and 8-kHz audio outputs.
raw-16khz-16bit-mono-pcm riff-16khz-16bit-mono-pcm
raw-24khz-16bit-mono-pcm riff-24khz-16bit-mono-pcm
raw-48khz-16bit-mono-pcm riff-48khz-16bit-mono-pcm
raw-8khz-8bit-mono-mulaw riff-8khz-8bit-mono-mulaw
raw-8khz-8bit-mono-alaw riff-8khz-8bit-mono-alaw
audio-16khz-32kbitrate-mono-mp3 audio-16khz-64kbitrate-mono-mp3
audio-16khz-128kbitrate-mono-mp3 audio-24khz-48kbitrate-mono-mp3
audio-24khz-96kbitrate-mono-mp3 audio-24khz-160kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3 audio-48khz-192kbitrate-mono-mp3
raw-16khz-16bit-mono-truesilk raw-24khz-16bit-mono-truesilk
webm-16khz-16bit-mono-opus webm-24khz-16bit-mono-opus
ogg-16khz-16bit-mono-opus ogg-24khz-16bit-mono-opus
ogg-48khz-16bit-mono-opus
Note
If your selected voice and output format have different bit rates, the audio is resampled as necessary. You can decode the ogg-24khz-16bit-mono-opus format by using the Opus codec.
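Before sending a request, a client could check the chosen value against the list above and read the sampling rate from its name. This is a minimal sketch: `sample_rate_hz` is a hypothetical helper, and reading the rate from the second hyphen-separated field is an inference from the naming pattern of the listed formats, not a documented rule.

```python
# Output formats copied from the list above.
SUPPORTED_FORMATS = frozenset("""
raw-16khz-16bit-mono-pcm riff-16khz-16bit-mono-pcm
raw-24khz-16bit-mono-pcm riff-24khz-16bit-mono-pcm
raw-48khz-16bit-mono-pcm riff-48khz-16bit-mono-pcm
raw-8khz-8bit-mono-mulaw riff-8khz-8bit-mono-mulaw
raw-8khz-8bit-mono-alaw riff-8khz-8bit-mono-alaw
audio-16khz-32kbitrate-mono-mp3 audio-16khz-64kbitrate-mono-mp3
audio-16khz-128kbitrate-mono-mp3 audio-24khz-48kbitrate-mono-mp3
audio-24khz-96kbitrate-mono-mp3 audio-24khz-160kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3 audio-48khz-192kbitrate-mono-mp3
raw-16khz-16bit-mono-truesilk raw-24khz-16bit-mono-truesilk
webm-16khz-16bit-mono-opus webm-24khz-16bit-mono-opus
ogg-16khz-16bit-mono-opus ogg-24khz-16bit-mono-opus
ogg-48khz-16bit-mono-opus
""".split())


def sample_rate_hz(output_format: str) -> int:
    """Validate an output format and return its sampling rate in hertz."""
    if output_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported output format: {output_format!r}")
    khz_field = output_format.split("-")[1]  # for example "24khz"
    return int(khz_field[: -len("khz")]) * 1000


print(sample_rate_hz("riff-24khz-16bit-mono-pcm"))
```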
Request body
If you're using a custom neural voice, the body of a request can be sent as plain text (ASCII or UTF-8). Otherwise, the body of each POST request is sent as SSML. SSML allows you to choose the voice and language of the synthesized speech that the text-to-speech feature returns. For a complete list of supported voices, see Language and voice support for the Speech service.
Sample request
This HTTP request uses SSML to specify the voice and language. The audio length can't exceed 10 minutes: if the body is long and the resulting audio would exceed 10 minutes, the audio is truncated to 10 minutes.
POST /cognitiveservices/v1 HTTP/1.1
X-Microsoft-OutputFormat: riff-24khz-16bit-mono-pcm
Content-Type: application/ssml+xml
Host: westus.tts.speech.microsoft.com
Content-Length: <Length>
Authorization: Bearer [Base64 access_token]
User-Agent: <Your application name>
<speak version='1.0' xml:lang='en-US'><voice xml:lang='en-US' xml:gender='Male'
name='en-US-ChristopherNeural'>
Microsoft Speech Service Text-to-Speech API
</voice></speak>
For Content-Length, use the length of your own request body. In most cases, this value is calculated automatically.
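The headers and SSML body of the request above can be assembled in Python roughly as follows. `build_ssml` and `build_headers` are hypothetical helper names, the default `user_agent` value is a placeholder, and the sketch omits the optional `xml:gender` attribute shown in the sample. Escaping the text is needed because characters such as `&` and `<` would otherwise produce invalid XML.

```python
from typing import Dict
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice_name: str = "en-US-ChristopherNeural",
               lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document for one voice."""
    return (
        f'<speak version="1.0" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


def build_headers(access_token: str,
                  output_format: str = "riff-24khz-16bit-mono-pcm",
                  user_agent: str = "my-tts-app") -> Dict[str, str]:
    """Return the required headers from the request headers table."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": user_agent,
    }


body = build_ssml("Microsoft Speech Service Text-to-Speech API").encode("utf-8")
headers = build_headers("YOUR_ACCESS_TOKEN")
print(len(body), "bytes of SSML")
print(headers["Content-Type"])
```

Most HTTP clients compute Content-Length from the encoded body, which is why it is not set explicitly here.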
HTTP status codes
The HTTP status code for each response indicates success or common errors:
| HTTP status code | Description | Possible reason |
|---|---|---|
| 200 | OK | The request was successful. The response body is an audio file. |
| 400 | Bad request | A required parameter is missing, empty, or null. Or, the value passed to either a required or optional parameter is invalid. A common reason is a header that's too long. |
| 401 | Unauthorized | The request is not authorized. Make sure your subscription key or token is valid and in the correct region. |
| 415 | Unsupported media type | It's possible that the wrong Content-Type value was provided. Content-Type should be set to application/ssml+xml. |
| 429 | Too many requests | You have exceeded the quota or rate of requests allowed for your subscription. |
| 502 | Bad gateway | There's a network or server-side problem. This status might also indicate invalid headers. |
If the HTTP status is 200 OK, the body of the response contains an audio file in the requested format. This file can be played as it's transferred, saved to a buffer, or saved to a file.
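Saving the audio to a file as it's transferred can be sketched with a chunked copy. `save_audio` is a hypothetical helper that accepts any object with a `read(size)` method, which includes the response object returned by `urllib.request.urlopen`. The demonstration uses an in-memory stand-in rather than a real response.

```python
import io
import os
import tempfile
from typing import BinaryIO


def save_audio(stream: BinaryIO, path: str, chunk_size: int = 8192) -> int:
    """Copy audio bytes from a readable stream to a file; return bytes written."""
    written = 0
    with open(path, "wb") as audio_file:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # an empty read means the stream is exhausted
                break
            audio_file.write(chunk)
            written += len(chunk)
    return written


# Stand-in for a 200 OK response body; a real body would contain audio data.
fake_body = io.BytesIO(b"\x00" * 20000)
target = os.path.join(tempfile.mkdtemp(), "speech.wav")
print(save_audio(fake_body, target), "bytes written to", target)
```

Reading in fixed-size chunks keeps memory use bounded regardless of how long the synthesized audio is.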