Quickstart: Synthesize speech into an audio file

In this quickstart you will use the Speech SDK to convert text to synthesized speech in an audio file. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .WAV file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the SpeechSynthesisResult returned for errors.

If you prefer to jump right in, view or download all Speech SDK C# Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure to:

Open your project in Visual Studio

The first step is to make sure that you have your project open in Visual Studio.

  1. Launch Visual Studio 2019.
  2. Load your project and open Program.cs.

Start with some boilerplate code

Let's add some code that works as a skeleton for our project. Note that it defines an async method called SynthesisToAudioFileAsync().


using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio; // AudioConfig lives in this namespace

namespace helloworld
{
    class Program
    {
        public static async Task SynthesisToAudioFileAsync()
        {
        }

        static void Main()
        {
            SynthesisToAudioFileAsync().Wait();
        }
    }
}

Create a Speech configuration

Before you can initialize a SpeechSynthesizer object, you need to create a configuration that uses your subscription key and subscription region. Insert this code in the SynthesisToAudioFileAsync() method.

var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

Create an Audio configuration

Now, you need to create an AudioConfig object that points to your audio file. This object is created inside of a using statement to ensure the proper release of unmanaged resources. Insert this code in the SynthesisToAudioFileAsync() method, right below your Speech configuration.

var fileName = "helloworld.wav";
using (var fileOutput = AudioConfig.FromWavFileOutput(fileName))
{
}

Initialize a SpeechSynthesizer

Now, let's create the SpeechSynthesizer object using the SpeechConfig and AudioConfig objects created earlier. This object is also created inside of a using statement to ensure the proper release of unmanaged resources. Insert this code in the SynthesisToAudioFileAsync() method, inside the using statement that wraps your AudioConfig object.

using (var synthesizer = new SpeechSynthesizer(config, fileOutput))
{
}

Synthesize text using SpeakTextAsync

From the SpeechSynthesizer object, you're going to call the SpeakTextAsync() method. This method sends your text to the Speech service, which converts it to audio. The SpeechSynthesizer uses the default voice if config.SpeechSynthesisVoiceName isn't explicitly set.

Inside the using statement, add this code:

var text = "Hello world!";
var result = await synthesizer.SpeakTextAsync(text);

Check for errors

When the synthesis result is returned by the Speech service, you should check to make sure your text was successfully synthesized.

Inside the using statement, below SpeakTextAsync(), add this code:

if (result.Reason == ResultReason.SynthesizingAudioCompleted)
{
    Console.WriteLine($"Speech synthesized to [{fileName}] for text [{text}]");
}
else if (result.Reason == ResultReason.Canceled)
{
    var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

    if (cancellation.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
        Console.WriteLine($"CANCELED: Did you update the subscription info?");
    }
}

Check your code

At this point, your code should look like this:

//
// Copyright (c) Microsoft. All rights reserved.
// Licensed under the MIT license. See LICENSE.md file in the project root for full license information.
//

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio; // AudioConfig lives in this namespace

namespace helloworld
{
    class Program
    {
        public static async Task SynthesisToAudioFileAsync()
        {
            var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

            var fileName = "helloworld.wav";
            using (var fileOutput = AudioConfig.FromWavFileOutput(fileName))
            {
                using (var synthesizer = new SpeechSynthesizer(config, fileOutput))
                {
                    var text = "Hello world!";
                    var result = await synthesizer.SpeakTextAsync(text);

                    if (result.Reason == ResultReason.SynthesizingAudioCompleted)
                    {
                        Console.WriteLine($"Speech synthesized to [{fileName}] for text [{text}]");
                    }
                    else if (result.Reason == ResultReason.Canceled)
                    {
                        var cancellation = SpeechSynthesisCancellationDetails.FromResult(result);
                        Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                        if (cancellation.Reason == CancellationReason.Error)
                        {
                            Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                            Console.WriteLine($"CANCELED: ErrorDetails=[{cancellation.ErrorDetails}]");
                            Console.WriteLine($"CANCELED: Did you update the subscription info?");
                        }
                    }
                }
            }
        }

        static void Main()
        {
            SynthesisToAudioFileAsync().Wait();
        }
    }
}

Build and run your app

Now you're ready to build your app and test our speech synthesis using the Speech service.

  1. Compile the code - From the menu bar of Visual Studio, choose Build > Build Solution.

  2. Start your app - From the menu bar, choose Debug > Start Debugging or press F5.

  3. Start synthesis - Your text is converted to speech and saved in the audio file specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    



In this quickstart you will use the Speech SDK to convert text to synthesized speech in an audio file. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .WAV file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the SpeechSynthesisResult returned for errors.

If you prefer to jump right in, view or download all Speech SDK C++ Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure to:

Add sample code

  1. Open the source file helloworld.cpp.

  2. Replace all the code with the following snippet:

    
     #include <iostream>
     #include <speechapi_cxx.h>

     using namespace std;
     using namespace Microsoft::CognitiveServices::Speech;
     using namespace Microsoft::CognitiveServices::Speech::Audio;

     void synthesizeSpeech()
     {
         // Creates an instance of a speech config with specified subscription key and service region.
         // Replace with your own subscription key and service region (e.g., "westus").
         auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");

         // Creates a speech synthesizer using file as audio output.
         // Replace with your own audio file name.
         auto fileName = "helloworld.wav";
         auto fileOutput = AudioConfig::FromWavFileOutput(fileName);
         auto synthesizer = SpeechSynthesizer::FromConfig(config, fileOutput);

         // Converts the specified text to speech, saving the audio data in the file specified above.
         // Replace with your own text.
         auto text = "Hello world!";
         auto result = synthesizer->SpeakTextAsync(text).get();

         // Checks result for successful completion or errors.
         if (result->Reason == ResultReason::SynthesizingAudioCompleted)
         {
             cout << "Speech synthesized to [" << fileName << "] for text [" << text << "]" << endl;
         }
         else if (result->Reason == ResultReason::Canceled)
         {
             auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
             cout << "CANCELED: Reason=" << (int)cancellation->Reason << endl;

             if (cancellation->Reason == CancellationReason::Error)
             {
                 cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << endl;
                 cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << endl;
                 cout << "CANCELED: Did you update the subscription info?" << endl;
             }
         }
     }

     int main()
     {
         synthesizeSpeech();
         return 0;
     }
    
    
  3. In the same file, replace the string YourSubscriptionKey with your subscription key.

  4. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  5. Replace the string helloworld.wav with your own filename.

  6. From the menu bar, choose File > Save All.

Build and run the application

  1. From the menu bar, select Build > Build Solution to build the application. The code should compile without errors now.

  2. Choose Debug > Start Debugging (or press F5) to start the helloworld application.

  3. Your text is converted to speech and saved in the audio file specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    



In this quickstart you will use the Speech SDK to convert text to synthesized speech in an audio file. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .WAV file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the SpeechSynthesisResult returned for errors.

If you prefer to jump right in, view or download all Speech SDK Java Samples on GitHub. Otherwise, let's get started.

Prerequisites

Add sample code

  1. To add a new empty class to your Java project, select File > New > Class.

  2. In the New Java Class window, enter speechsdk.quickstart into the Package field, and Main into the Name field.

  3. Replace all code in Main.java with the following snippet:

    package speechsdk.quickstart;
    
    import java.util.concurrent.Future;
    import com.microsoft.cognitiveservices.speech.*;
    import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
    
    /**
     * Quickstart: synthesize speech into an audio file using the Speech SDK for Java.
     */
    public class Main {
    
        /**
         * @param args Arguments are ignored in this sample.
         */
        public static void main(String[] args) {
            try {
                // Replace below with your own subscription key
                String speechSubscriptionKey = "YourSubscriptionKey";
                // Replace below with your own service region (e.g., "westus").
                String serviceRegion = "YourServiceRegion";
                // Replace below with your own filename.
                String audioFileName = "helloworld.wav";
                // Replace below with your own text.
                String text = "Hello world!";
    
                int exitCode = 1;
                SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
                assert(config != null);
    
                AudioConfig audioOutput = AudioConfig.fromWavFileOutput(audioFileName);
                assert(audioOutput != null);
    
                SpeechSynthesizer synth = new SpeechSynthesizer(config, audioOutput);
                assert(synth != null);
    
                Future<SpeechSynthesisResult> task = synth.SpeakTextAsync(text);
                assert(task != null);
    
                SpeechSynthesisResult result = task.get();
                assert(result != null);
    
                if (result.getReason() == ResultReason.SynthesizingAudioCompleted) {
                    System.out.println("Speech synthesized to [" + audioFileName + "] for text [" + text + "]");
                    exitCode = 0;
                }
                else if (result.getReason() == ResultReason.Canceled) {
                    SpeechSynthesisCancellationDetails cancellation = SpeechSynthesisCancellationDetails.fromResult(result);
                    System.out.println("CANCELED: Reason=" + cancellation.getReason());
    
                    if (cancellation.getReason() == CancellationReason.Error) {
                        System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                        System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                        System.out.println("CANCELED: Did you update the subscription info?");
                    }
                }
    
                result.close();
                synth.close();
    
                System.exit(exitCode);
            } catch (Exception ex) {
                System.out.println("Unexpected exception: " + ex.getMessage());
    
                assert(false);
                System.exit(1);
            }
        }
    }
    
  4. Replace the string YourSubscriptionKey with your subscription key.

  5. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  6. Replace the string helloworld.wav with your own filename.

  7. Replace the string Hello world! with your own text.

  8. Save changes to the project.

Build and run the app

Press F11, or select Run > Debug. Your text is converted to speech and saved in the audio file specified.

Speech synthesized to [helloworld.wav] for text [Hello world!]



In this quickstart you will use the Speech SDK to convert text to synthesized speech in an audio file. After satisfying a few prerequisites, synthesizing speech into a file only takes five steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create an AudioConfig object that specifies the .WAV file name.
  • Create a SpeechSynthesizer object using the configuration objects from above.
  • Using the SpeechSynthesizer object, convert your text into synthesized speech, saving it into the audio file specified.
  • Inspect the SpeechSynthesisResult returned for errors.

If you prefer to jump right in, view or download all Speech SDK Python Samples on GitHub. Otherwise, let's get started.

Prerequisites

  • An Azure subscription key for the Speech Services. Get one for free.

  • Python 3.5 or later.

  • The Python Speech SDK package is available for these operating systems:

    • Windows: x64 and x86.
    • Mac: macOS X version 10.12 or later.
    • Linux: Ubuntu 16.04, Ubuntu 18.04, Debian 9 on x64.
  • On Linux, run these commands to install the required packages:

    • On Ubuntu:

      sudo apt-get update
      sudo apt-get install build-essential libssl1.0.0 libasound2
      
    • On Debian 9:

      sudo apt-get update
      sudo apt-get install build-essential libssl1.0.2 libasound2
      
  • On Windows, you need the Microsoft Visual C++ Redistributable for Visual Studio 2019 for your platform.

Install the Speech SDK

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

This command installs the Python package from PyPI for the Speech SDK:

pip install azure-cognitiveservices-speech

Support and updates

Updates to the Speech SDK Python package are distributed via PyPI and announced in the Release notes. If a new version is available, you can update to it with the command pip install --upgrade azure-cognitiveservices-speech. Check which version is currently installed by inspecting the azure.cognitiveservices.speech.__version__ variable.
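
As a minimal sketch of the version check described above, the snippet below reads the installed version from the package metadata instead of importing the SDK, so it also reports cleanly when the package is missing. The helper name installed_speech_sdk_version is hypothetical, and importlib.metadata assumes Python 3.8 or later.

```python
from importlib import metadata


def installed_speech_sdk_version(package="azure-cognitiveservices-speech"):
    """Return the installed version string of the package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_speech_sdk_version()
    if version is None:
        print("Speech SDK not installed; run: pip install azure-cognitiveservices-speech")
    else:
        print("Speech SDK version:", version)
```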

If you have a problem, or you're missing a feature, see Support and help options.

Create a Python application that uses the Speech SDK

Run the sample

You can copy the sample code from this quickstart to a source file quickstart.py and run it in your IDE or in the console:

python quickstart.py

Or you can download this quickstart tutorial as a Jupyter notebook from the Speech SDK sample repository and run it as a notebook.

Sample code


import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Creates an audio configuration that points to an audio file.
# Replace with your own audio filename.
audio_filename = "helloworld.wav"
audio_output = speechsdk.audio.AudioOutputConfig(filename=audio_filename)

# Creates a synthesizer with the given settings
speech_synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config, audio_config=audio_output)

# Synthesizes the text to speech.
# Replace with your own text.
text = "Hello world!"
result = speech_synthesizer.speak_text_async(text).get()

# Checks result.
if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Speech synthesized to [{}] for text [{}]".format(audio_filename, text))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech synthesis canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        if cancellation_details.error_details:
            print("Error details: {}".format(cancellation_details.error_details))
    print("Did you update the subscription info?")
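
The sample above hard-codes placeholder strings. One way to avoid committing a real key is to read both values from environment variables and fail early if a placeholder is still present. This is a sketch only: the variable names SPEECH_KEY and SPEECH_REGION and the helper load_speech_settings are assumptions made for illustration, not names defined by the Speech SDK.

```python
import os

# The placeholder strings used in the quickstart sample.
PLACEHOLDERS = {"YourSubscriptionKey", "YourServiceRegion"}


def load_speech_settings(env=None):
    """Return (key, region) from the environment, rejecting empty or placeholder values."""
    env = os.environ if env is None else env
    key = env.get("SPEECH_KEY", "").strip()        # hypothetical variable name
    region = env.get("SPEECH_REGION", "").strip()  # hypothetical variable name
    for name, value in (("SPEECH_KEY", key), ("SPEECH_REGION", region)):
        if not value or value in PLACEHOLDERS:
            raise ValueError("{} is missing or still a placeholder".format(name))
    return key, region
```

The returned pair could then be passed to speechsdk.SpeechConfig(subscription=key, region=region) in place of the string literals.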

Install and use the Speech SDK with Visual Studio Code

  1. Download and install a 64-bit version of Python, 3.5 or later, on your computer.

  2. Download and install Visual Studio Code.

  3. Open Visual Studio Code and install the Python extension. Select File > Preferences > Extensions from the menu. Search for Python.

  4. Create a folder to store the project in, for example by using Windows Explorer.

  5. In Visual Studio Code, select the File icon. Then open the folder you created.

  6. Create a new Python source file, speechsdk.py, by selecting the new file icon.

  7. Copy, paste, and save the Python code to the newly created file.

  8. Insert your Speech Services subscription information.

  9. If selected, a Python interpreter displays on the left side of the status bar at the bottom of the window. Otherwise, bring up a list of available Python interpreters. Open the command palette (Ctrl+Shift+P) and enter Python: Select Interpreter. Choose an appropriate one.

  10. You can install the Speech SDK Python package from within Visual Studio Code. Do that if it's not installed yet for the Python interpreter you selected. To install the Speech SDK package, open a terminal. Bring up the command palette again (Ctrl+Shift+P) and enter Terminal: Create New Integrated Terminal. In the terminal that opens, enter the command python -m pip install azure-cognitiveservices-speech or the appropriate command for your system.

  11. To run the sample code, right-click somewhere inside the editor. Select Run Python File in Terminal. Your text is converted to speech and saved in the audio file specified.

    Speech synthesized to [helloworld.wav] for text [Hello world!]
    

If you have issues following these instructions, refer to the more extensive Visual Studio Code Python tutorial.
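
To confirm that a run actually produced a playable file, you can inspect the output with Python's standard wave module. The sketch below assumes the synthesizer wrote an uncompressed PCM WAV file, which is the only kind the wave module reads; the helper name describe_wav is hypothetical.

```python
import wave


def describe_wav(path):
    """Return channels, sample rate, bit depth, and duration of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hz": rate,
            "bits_per_sample": wav_file.getsampwidth() * 8,
            "duration_s": frames / float(rate) if rate else 0.0,
        }
```

For example, you might call describe_wav("helloworld.wav") after the sample reports success. A read error or a zero duration would suggest that no audio was written, in which case the cancellation details printed by the sample are the place to look.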

See also

View or download all Speech SDK Samples on GitHub.

Additional language and platform support

If you've clicked this tab, you probably didn't see a quickstart in your favorite programming language. Don't worry, we have additional quickstart materials and code samples available on GitHub. Use the table to find the right sample for your programming language and platform/OS combination.

Language Additional Quickstarts Code samples
C++ Windows, Linux, macOS
C# .NET Framework, .NET Core, UWP, Unity, Xamarin
Java Android, JRE
JavaScript Browser
Node.js Windows, Linux, macOS
Objective-C macOS, iOS iOS, macOS
Python Windows, Linux, macOS
Swift macOS, iOS iOS, macOS