Quickstart: Synthesize speech into an audio file

In this quickstart, you will use the Speech SDK to convert text to synthesized speech in an audio file. The text-to-speech service provides numerous options for synthesized voices, under text-to-speech language support. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .wav file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the returned SpeechSynthesisResult for errors.

You can view or download all Speech SDK C# Samples on GitHub.

Prerequisites

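Before you get started, make sure that you have a Speech service subscription key and region, and a C# project with the Speech SDK package installed.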

Open your project in Visual Studio

The first step is to make sure that you have your project open in Visual Studio.

  1. Launch Visual Studio 2019.
  2. Load your project and open Program.cs.

Start with some boilerplate code

Let's add some code that works as a skeleton for the project. Note that it defines an async method called SynthesisToAudioFileAsync().


using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

namespace helloworld
{
    class Program
    {
        public static async Task SynthesisToAudioFileAsync()
        {
        }

        static void Main()
        {
            SynthesisToAudioFileAsync().Wait();
        }
    }
}

Create a Speech configuration

Before you can initialize a SpeechSynthesizer object, you need to create a configuration that uses your subscription key and subscription region. Insert this code in the SynthesisToAudioFileAsync() method.

// Replace with your own subscription key and region identifier from here: https://aka.ms/speech/sdkregion
// The default language is "en-us".
var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

Create an Audio configuration

Now, you need to create an AudioConfig object that points to your audio file. This object is created inside of a using statement to ensure the proper release of unmanaged resources. Insert this code in the SynthesisToAudioFileAsync() method, right below your Speech configuration.

var fileName = "helloworld.wav";
using (var fileOutput = AudioConfig.FromWavFileOutput(fileName))
{
}

Initialize a SpeechSynthesizer

Now, let's create the SpeechSynthesizer object using the SpeechConfig and AudioConfig objects created earlier. This object is also created inside of a using statement to ensure the proper release of unmanaged resources. Insert this code in the SynthesisToAudioFileAsync() method, inside the using statement that wraps your AudioConfig object.

using (var synthesizer = new SpeechSynthesizer(config, fileOutput))
{
}

Synthesize text using SpeakTextAsync

From the SpeechSynthesizer object, you're going to call the SpeakTextAsync() method. This method sends your text to the Speech service, which converts it to audio. The SpeechSynthesizer uses the default voice if config.SpeechSynthesisVoiceName isn't explicitly specified.

Inside the using statement, add this code:

var text = "Hello world!";
var result = await synthesizer.SpeakTextAsync(text);

Check for errors

When the synthesis result is returned by the Speech service, you should check to make sure your text was successfully synthesized.

Inside the using statement, below SpeakTextAsync(), add this code:

if (result.Reason == ResultReason.SynthesizingAudioCompleted)
{
    Console.WriteLine($"Speech synthesized to [{fileName}] for text [{text}]");
}
else if (result.Reason == ResultReason.Canceled)
{
    var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

    if (cancellation.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
        Console.WriteLine($"CANCELED: Did you update the subscription info?");
    }
}

Check your code

At this point, your code should look like this:

//
// Copyright (c) Microsoft. All rights reserved.
// Licensed under the MIT license. See LICENSE.md file in the project root for full license information.
//

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

namespace helloworld
{
    class Program
    {
        public static async Task SynthesisToAudioFileAsync()
        {
            var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

            var fileName = "helloworld.wav";
            using (var fileOutput = AudioConfig.FromWavFileOutput(fileName))
            {
                using (var synthesizer = new SpeechSynthesizer(config, fileOutput))
                {
                    var text = "Hello world!";
                    var result = await synthesizer.SpeakTextAsync(text);

                    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
                    {
                        Console.WriteLine($"Speech synthesized to [{fileName}] for text [{text}]");
                    }
                    else if (result.Reason == ResultReason.Canceled)
                    {
                        var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                        Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                        if (cancellation.Reason == CancellationReason.Error)
                        {
                            Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                            Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                            Console.WriteLine($"CANCELED: Did you update the subscription info?");
                        }
                    }
                }
            }
        }

        static void Main()
        {
            SynthesisToAudioFileAsync().Wait();
        }
    }
}

Build and run your app

Now you're ready to build your app and test speech synthesis using the Speech service.

  1. Compile the code - From the menu bar of Visual Studio, choose Build > Build Solution.

  2. Start your app - From the menu bar, choose Debug > Start Debugging or press F5.

  3. Start synthesis - Your text is converted to speech and saved in the audio file you specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.


In this quickstart, you will use the Speech SDK to convert text to synthesized speech in an audio file. The text-to-speech service provides numerous options for synthesized voices, under text-to-speech language support. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .wav file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the returned SpeechSynthesisResult for errors.

You can view or download all Speech SDK C++ Samples on GitHub.

Prerequisites

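Before you get started, make sure that you have a Speech service subscription key and region, and a C++ project with the Speech SDK installed.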

Add sample code

  1. Open the source file helloworld.cpp.

  2. Replace all the code with the following snippet:

    
     #include <iostream>
     #include <speechapi_cxx.h>

     using namespace std;
     using namespace Microsoft::CognitiveServices::Speech;
     using namespace Microsoft::CognitiveServices::Speech::Audio;

     void synthesizeSpeech()
     {
         // Creates an instance of a speech config with specified subscription key and service region.
         // Replace with your own subscription key and region identifier from here: https://aka.ms/speech/sdkregion
         auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");

         // Creates a speech synthesizer using file as audio output.
         // Replace with your own audio file name.
         auto fileName = "helloworld.wav";
         auto fileOutput = AudioConfig::FromWavFileOutput(fileName);
         auto synthesizer = SpeechSynthesizer::FromConfig(config, fileOutput);

         // Converts the specified text to speech, saving the audio data in the file specified above.
         // Replace with your own text.
         auto text = "Hello world!";
         auto result = synthesizer->SpeakTextAsync(text).get();

         // Checks result for successful completion or errors.
         if (result->Reason == ResultReason::SynthesizingAudioCompleted)
         {
             cout << "Speech synthesized to [" << fileName << "] for text [" << text << "]" << endl;
         }
         else if (result->Reason == ResultReason::Canceled)
         {
             auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
             cout << "CANCELED: Reason=" << (int)cancellation->Reason << endl;

             if (cancellation->Reason == CancellationReason::Error)
             {
                 cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << endl;
                 cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << endl;
                 cout << "CANCELED: Did you update the subscription info?" << endl;
             }
         }
     }

     int main()
     {
         synthesizeSpeech();
         return 0;
     }
    
    
  3. In the same file, replace the string YourSubscriptionKey with your subscription key.

  4. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  5. Replace the string helloworld.wav with your own filename.

  6. From the menu bar, choose File > Save All.

Build and run the application

  1. From the menu bar, select Build > Build Solution to build the application. The code should compile without errors now.

  2. Choose Debug > Start Debugging (or press F5) to start the helloworld application.

  3. Your text is converted to speech and saved in the audio file you specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.


In this quickstart, you will use the Speech SDK to convert text to synthesized speech in an audio file. The text-to-speech service provides numerous options for synthesized voices, under text-to-speech language support. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .wav file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the returned SpeechSynthesisResult for errors.

You can view or download all Speech SDK Java Samples on GitHub.

Prerequisites

Add sample code

  1. To add a new empty class to your Java project, select File > New > Class.

  2. In the New Java Class window, enter speechsdk.quickstart into the Package field, and Main into the Name field.

  3. Replace all code in Main.java with the following snippet:

    package speechsdk.quickstart;
    
    import java.util.concurrent.Future;
    import com.microsoft.cognitiveservices.speech.*;
    import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
    
    /**
     * Quickstart: synthesize speech using the Speech SDK for Java.
     */
    public class Main {
    
        /**
         * @param args Arguments are ignored in this sample.
         */
        public static void main(String[] args) {
            try {
                // Replace below with your own subscription key
                String speechSubscriptionKey = "YourSubscriptionKey";
    
                // Replace below with your own region identifier from here: https://aka.ms/speech/sdkregion
                String serviceRegion = "YourServiceRegion";
    
                // Replace below with your own filename.
                String audioFileName = "helloworld.wav";
    
                // Replace below with your own text.
                String text = "Hello world!";
    
                int exitCode = 1;
                SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
                assert(config != null);
    
                AudioConfig audioOutput = AudioConfig.fromWavFileOutput(audioFileName);
                assert(audioOutput != null);
    
                SpeechSynthesizer synth = new SpeechSynthesizer(config, audioOutput);
                assert(synth != null);
    
                Future<SpeechSynthesisResult> task = synth.SpeakTextAsync(text);
                assert(task != null);
    
                SpeechSynthesisResult result = task.get();
                assert(result != null);
    
                if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                    System.out.println("Speech synthesized to [" + audioFileName + "] for text [" + text + "]");
                    exitCode = 0;
                }
                else if (result.getReason() == ResultReason.Canceled) {
                    SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(result);
                    System.out.println("CANCELED: Reason=" + cancellation.getReason());
    
                    if (cancellation.getReason() == CancellationReason.Error) {
                        System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                        System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                        System.out.println("CANCELED: Did you update the subscription info?");
                    }
                }
    
                result.close();
                synth.close();
    
                System.exit(exitCode);
            } catch (Exception ex) {
                System.out.println("Unexpected exception: " + ex.getMessage());
    
                assert(false);
                System.exit(1);
            }
        }
    }
    
  4. Replace the string YourSubscriptionKey with your subscription key.

  5. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  6. Replace the string helloworld.wav with your own filename.

  7. Replace the string Hello world! with your own text.

  8. Save changes to the project.

Build and run the app

Press F11, or select Run > Debug. Your text is converted to speech and saved in the audio file you specified.

Speech synthesized to [helloworld.wav] for text [Hello world!]

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.


In this quickstart, you will use the Speech SDK to convert text to synthesized speech in an audio file. The text-to-speech service provides numerous options for synthesized voices, under text-to-speech language support. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .wav file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the returned SpeechSynthesisResult for errors.

You can view or download all Speech SDK Python Samples on GitHub.

Prerequisites

  • An Azure subscription key for the Speech service. Get one for free.
  • Python 3.5 to 3.8.
  • The Python Speech SDK package is available for these operating systems:
    • Windows: x64 and x86.
    • Mac: macOS X version 10.12 or later.
    • Linux: Ubuntu 16.04/18.04, Debian 9, RHEL 7/8, CentOS 7/8 on x64.
  • On Ubuntu or Debian, run these commands to install the required packages:
sudo apt-get update
sudo apt-get install build-essential libssl1.0.0 libasound2
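
To confirm that an interpreter falls within the range listed above before installing the SDK, a small check along these lines may help. It is only a sketch: the function name is_supported_python is made up for this example, and the 3.5 to 3.8 bounds are simply the ones stated in this quickstart, so newer SDK releases may support a different range.

```python
import platform
import sys

# Bounds taken from the prerequisites in this quickstart; newer SDK
# releases may support a different range.
MIN_VERSION = (3, 5)
MAX_VERSION = (3, 8)


def is_supported_python(version_info=sys.version_info):
    """Return True if major.minor lies within the range listed above."""
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    print("Python", platform.python_version(), "on", platform.machine())
    print("Within the listed range:", is_supported_python())
```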

Install the Speech SDK

Important

By downloading any of the Azure Cognitive Services Speech SDKs, you acknowledge its license.

This command installs the Python package from PyPI for the Speech SDK:

pip install azure-cognitiveservices-speech

Support and updates

Updates to the Speech SDK Python package are distributed via PyPI and announced in the Release notes. If a new version is available, you can update to it with the command pip install --upgrade azure-cognitiveservices-speech. Check which version is currently installed by inspecting the azure.cognitiveservices.speech.__version__ variable.
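
As an alternative to inspecting the __version__ variable, the installed version can also be read from the package metadata without importing the SDK. The sketch below assumes Python 3.8 or later, where importlib.metadata is part of the standard library; the helper name installed_version is hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8 and later


def installed_version(package_name="azure-cognitiveservices-speech"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("Speech SDK package not found; install it with pip first.")
    else:
        print("Speech SDK package version:", version)
```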

If you have a problem, or you're missing a feature, see Support and help options.

Create a Python application that uses the Speech SDK

Run the sample

You can copy the sample code from this quickstart to a source file quickstart.py and run it in your IDE or in the console:

python quickstart.py

Or you can download this quickstart tutorial as a Jupyter notebook from the Speech SDK sample repository and run it as a notebook.

Sample code

import azure.cognitiveservices.speech as speechsdk

# Replace with your own subscription key and region identifier from here: https://aka.ms/speech/sdkregion
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Creates an audio configuration that points to an audio file.
# Replace with your own audio filename.
audio_filename = "helloworld.wav"
audio_output = speechsdk.audio.AudioOutputConfig(filename=audio_filename)

# Creates a synthesizer with the given settings
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output)

# Synthesizes the text to speech.
# Replace with your own text.
text = "Hello world!"
result = speech_synthesizer.speak_text_async(text).get()

# Checks result.
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized to [{}] for text [{}]".format(audio_filename, text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
        print("Did you update the subscription info?")
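
Rather than editing the key and region into the source file, you might prefer to read them from the environment. The following is a minimal sketch under stated assumptions: the variable names SPEECH_KEY and SPEECH_REGION and the helper load_speech_settings are hypothetical choices for this example, not something the Speech SDK reads on its own.

```python
import os


def load_speech_settings(env=os.environ):
    """Read the subscription key and region from environment variables.

    SPEECH_KEY and SPEECH_REGION are hypothetical names chosen for this
    sketch; the Speech SDK does not look them up by itself.
    """
    key = env.get("SPEECH_KEY", "").strip()
    region = env.get("SPEECH_REGION", "").strip()
    missing = [name for name, value in
               (("SPEECH_KEY", key), ("SPEECH_REGION", region)) if not value]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return key, region
```

The returned pair can stand in for the hard-coded assignment in the sample, for example speech_key, service_region = load_speech_settings().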

Install and use the Speech SDK with Visual Studio Code

  1. Download and install a 64-bit version of Python, 3.5 to 3.8, on your computer.

  2. Download and install Visual Studio Code.

  3. Open Visual Studio Code and install the Python extension. Select File > Preferences > Extensions from the menu. Search for Python.

  4. Create a folder to store the project in, for example by using Windows Explorer.

  5. In Visual Studio Code, select the File icon. Then open the folder you created.

  6. Create a new Python source file, quickstart.py, by selecting the new file icon.

  7. Copy, paste, and save the Python code to the newly created file.

  8. Insert your Speech service subscription information.

  9. If selected, a Python interpreter displays on the left side of the status bar at the bottom of the window. Otherwise, bring up a list of available Python interpreters. Open the command palette (Ctrl+Shift+P) and enter Python: Select Interpreter. Choose an appropriate one.

  10. You can install the Speech SDK Python package from within Visual Studio Code. Do that if it's not installed yet for the Python interpreter you selected. To install the Speech SDK package, open a terminal. Bring up the command palette again (Ctrl+Shift+P) and enter Terminal: Create New Integrated Terminal. In the terminal that opens, enter the command python -m pip install azure-cognitiveservices-speech or the appropriate command for your system.

  11. To run the sample code, right-click somewhere inside the editor. Select Run Python File in Terminal. Your text is converted to speech and saved in the audio file you specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    

If you have issues following these instructions, refer to the more extensive Visual Studio Code Python tutorial.
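
Once the sample reports success, you may want to confirm that the output file is a readable WAV file. The sketch below uses only the standard-library wave module and assumes the file contains uncompressed PCM audio, which is the only encoding that module can read. The helper name describe_wav is hypothetical.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file using the standard library."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": frames / rate if rate else 0.0,
        }
```

For example, print(describe_wav("helloworld.wav")) shows the channel count, sample rate, sample width, and duration of the file written by the sample.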

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.

In this quickstart, you will use the Speech SDK to convert text to synthesized speech in an audio file. The text-to-speech service provides numerous options for synthesized voices, under text-to-speech language support. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .wav file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the returned SpeechSynthesisResult for errors.

You can view or download all Speech SDK JavaScript Samples on GitHub.

Prerequisites

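Before you get started, make sure that you have a Speech service subscription key and region, and a Node.js project with the microsoft-cognitiveservices-speech-sdk package installed.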

Start with some boilerplate code

Let's add some code that works as a skeleton for our project. Create an index.js file and add this code.

Be sure to fill in your values for subscriptionKey, serviceRegion, and filename.

(function() {
  "use strict";
  
  // pull in the required packages.
  var sdk = require("microsoft-cognitiveservices-speech-sdk");
  var fs = require("fs");
  
  // replace with your own subscription key,
  // service region (e.g., "westus"), and
  // the name of the file to save the
  // synthesized speech to.
  var subscriptionKey = "YourSubscriptionKey";
  var serviceRegion = "YourServiceRegion"; // e.g., "westus"
  var filename = "YourAudioFile.wav";
 
}());
  

Create a PullAudioOutputStream

This sample uses a PullAudioOutputStream as the audio destination for the synthesizer. Create the stream now; the audio file itself is written in a later step, when the file name is passed to speakTextAsync().

  // create the pull stream we need for the speech sdk.
  var pullStream = sdk.AudioOutputStream.createPullStream();

Create a Speech configuration

Before you can initialize a SpeechSynthesizer object, you need to create a configuration that uses your subscription key and subscription region. Insert this code next.

Note

The Speech SDK synthesizes speech in en-US by default. To use a different language, set the speechSynthesisLanguage property on the SpeechConfig object.

  // now create the audio-config pointing to our stream and
  // the speech config specifying the language.
  var speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);
  
  // setting the synthesis language to English.
  speechConfig.speechSynthesisLanguage = "en-US";
 

Create an Audio configuration

Now, you need to create an AudioConfig object that points to your PullAudioOutputStream. Insert this code right below your Speech configuration.

  var audioConfig = sdk.AudioConfig.fromStreamOutput(pullStream);

Initialize a SpeechSynthesizer

Now, let's create the SpeechSynthesizer object using the SpeechConfig and AudioConfig objects created earlier.

  // create the speech synthesizer.
  var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
  

Synthesize text and display results

From the SpeechSynthesizer object, you're going to call the speakTextAsync() method. This method sends your text to the Speech service, which converts it to audio. The file name passed as the last argument is where the audio is saved.

We'll also write the returned result, or any errors, to the console and finally close the synthesizer.

  // we are done with the setup
  var text = "Hello World";
  console.log("Now sending text '" + text + "' to: " + filename);
  
  // start the synthesizer and wait for a result.
  synthesizer.speakTextAsync(
    text,
    function (result) {
      console.log(result);
  
      synthesizer.close();
      synthesizer = undefined;
    },
    function (err) {
      console.trace("err - " + err);
  
      synthesizer.close();
      synthesizer = undefined;
    },
    filename);

Check your code

(function() {
  "use strict";
  
  // pull in the required packages.
  var sdk = require("microsoft-cognitiveservices-speech-sdk");
  var fs = require("fs");
  
  // replace with your own subscription key,
  // service region (e.g., "westus"), and
  // the name of the file to save the
  // synthesized speech to.
  var subscriptionKey = "YourSubscriptionKey";
  var serviceRegion = "YourServiceRegion"; // e.g., "westus"
  var filename = "YourAudioFile.wav";
  
  // create the pull stream we need for the speech sdk.
  var pullStream = sdk.AudioOutputStream.createPullStream();
  
  // now create the audio-config pointing to our stream and
  // the speech config specifying the language.
  var speechConfig = sdk.SpeechConfig.fromSubscription(subscriptionKey, serviceRegion);
  
  // setting the synthesis language to English.
  speechConfig.speechSynthesisLanguage = "en-US";
  
  var audioConfig = sdk.AudioConfig.fromStreamOutput(pullStream);
  
  // create the speech synthesizer.
  var synthesizer = new sdk.SpeechSynthesizer(speechConfig, audioConfig);
  
  // we are done with the setup
  var text = "Hello World";
  console.log("Now sending text '" + text + "' to: " + filename);
  
  // start the synthesizer and wait for a result.
  synthesizer.speakTextAsync(
    text,
    function (result) {
      console.log(result);
  
      synthesizer.close();
      synthesizer = undefined;
    },
    function (err) {
      console.trace("err - " + err);
  
      synthesizer.close();
      synthesizer = undefined;
    },
    filename);

}());

Run the sample locally

Run the code with Node.js:

node index.js

Next steps

With this base knowledge of speech synthesis, continue exploring the basics to learn about common functionality and tasks within the Speech SDK.

In this quickstart, you use the Speech CLI from the command line to convert text to speech stored in an audio file. The text-to-speech service provides many options for synthesized voices, under text-to-speech language support. After a one-time configuration, the Speech CLI lets you synthesize speech from text using commands from the command line.

Prerequisites

The only prerequisite is an Azure Speech subscription. See the guide on creating a new subscription if you don't already have one.

Download and install

Follow these steps to install the Speech CLI on Windows:

  1. Install either .NET Framework 4.7 or .NET Core 3.0
  2. Download the Speech CLI zip archive, then extract it.
  3. Go to the root directory spx-zips that you extracted from the download, and extract the subdirectory that you need (spx-net471 for .NET Framework 4.7, or spx-netcore-win-x64 for .NET Core 3.0 on an x64 CPU).

In the command prompt, change directory to this location, and then type spx to see help for the Speech CLI.

Note

On Windows, the Speech CLI can only show fonts available to the command prompt on the local computer. Windows Terminal supports all fonts produced interactively by the Speech CLI. If you output to a file, a text editor like Notepad or a web browser like Microsoft Edge can also show all fonts.

Note

PowerShell does not check the local directory when looking for a command. In PowerShell, change directory to the location of spx and call the tool by entering .\spx. If you add this directory to your path, PowerShell and the Windows command prompt will find spx from any directory without including the .\ prefix.

Create subscription config

To start using the Speech CLI, you first need to enter your Speech subscription key and region information. See the region support page to find your region identifier. Once you have your subscription key and region identifier (for example, eastus or westus), run the following commands.

spx config @key --set YOUR-SUBSCRIPTION-KEY
spx config @region --set YOUR-REGION-ID

Your subscription authentication is now stored for future SPX requests. If you need to remove either of these stored values, run spx config @region --clear or spx config @key --clear.

Run the Speech CLI

Now you're ready to run the Speech CLI to synthesize speech from text into a new audio file.

From the command line, change to the directory that contains the Speech CLI binary file, and type:

spx synthesize --text "The speech synthesizer greets you!" --audio output greetings.wav

The Speech CLI synthesizes the text as natural-sounding English speech and saves it to the greetings.wav audio file. In Windows, you can play the audio file by entering start greetings.wav.
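
If you need to synthesize several phrases, the same command can be assembled from a script. The sketch below only builds and prints the command lines; it mirrors the arguments shown above. The helper name build_spx_command and the second phrase and file name are hypothetical. It assumes Python 3.8 or later for shlex.join, which quotes in POSIX shell style; on Windows, passing the argument list directly to subprocess.run avoids shell quoting differences.

```python
import shlex


def build_spx_command(text, output_file):
    """Build the argument list for the spx synthesize command shown above."""
    return ["spx", "synthesize", "--text", text, "--audio", "output", output_file]


if __name__ == "__main__":
    # The second entry is a made-up example phrase and file name.
    phrases = {
        "greetings.wav": "The speech synthesizer greets you!",
        "farewell.wav": "The speech synthesizer says goodbye!",
    }
    for filename, phrase in phrases.items():
        # Print each command; pass the list to subprocess.run(...) to execute
        # it on a machine where spx is installed and configured.
        print(shlex.join(build_spx_command(phrase, filename)))
```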

Next steps

Continue exploring the basics to learn about other features of the Speech CLI.

View or download all Speech SDK Samples on GitHub.

Additional language and platform support

If you didn't see a quickstart in your preferred programming language, additional quickstart materials and code samples are available on GitHub. Use the table to find the right sample for your programming language and platform/OS combination.

Language      Additional quickstarts             Code samples
C#            To a speaker                       .NET Framework, .NET Core, UWP, Unity, Xamarin
C++           To a speaker                       Windows, Linux, macOS
Java          To a speaker                       Android, JRE
JavaScript    Node.js to an audio file           Windows, Linux, macOS
Objective-C   iOS to speaker, macOS to speaker   iOS, macOS
Python        To a speaker                       Windows, Linux, macOS
Swift         iOS to speaker, macOS to speaker   iOS, macOS