Quickstart: Synthesize speech in C++ on Windows by using the Speech SDK

Quickstarts are also available for speech recognition and speech translation.

In this article, you create a C++ console application for Windows. You use the Cognitive Services Speech SDK to synthesize speech from text in real time and play the speech on your PC's speaker. The application is built with the Speech SDK NuGet package and Microsoft Visual Studio 2019 (any edition).

For a complete list of languages/voices available for speech synthesis, see language support.

Prerequisites

You need a Speech Services subscription key to complete this quickstart. You can get one for free. See Try the Speech Services for free for details.

Create a Visual Studio project

To create a Visual Studio project for C++ desktop development, you need to set up Visual Studio development options, create the project, select the target architecture, and install the Speech SDK.

Set up Visual Studio development options

To start, make sure you're set up correctly in Visual Studio for C++ desktop development:

  1. Open Visual Studio 2019 to display the Start window.

    Start window - Visual Studio

  2. Select Continue without code to go to the Visual Studio IDE.

  3. From the Visual Studio menu bar, select Tools > Get Tools and Features to open Visual Studio Installer and view the Modifying dialog box.

    Workloads tab, Modifying dialog box, Visual Studio Installer

  4. In the Workloads tab, under Windows, find the Desktop development with C++ workload. If the check box next to that workload isn't already selected, select it.

  5. In the Individual components tab, find the NuGet package manager check box. If the check box isn't already selected, select it.

  6. Select the button in the corner labeled either Close or Modify. (The button name varies depending on whether you selected any features for installation.) If you select Modify, installation begins, which may take a while.

  7. Close Visual Studio Installer.

Create the project and select the target architecture

Next, create your project:

  1. In the Visual Studio menu bar, choose File > New > Project to display the Create a new project window.

    Create a new project, C++ - Visual Studio

  2. Find and select Console App. Make sure that you select the C++ version of this project type (as opposed to C# or Visual Basic).

  3. Select Next to display the Configure your new project screen.

    Configure your new project, C++ - Visual Studio

  4. In Project name, enter helloworld.

  5. In Location, navigate to and select or create the folder to save your project in.

Now select your target platform architecture. In the Visual Studio toolbar, find the Solution Platforms drop-down box. (If you don't see it, choose View > Toolbars > Standard to display the toolbar containing Solution Platforms.) If you're running 64-bit Windows, choose x64 in the drop-down box. 64-bit Windows can also run 32-bit applications, so you can choose x86 if you prefer.
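
If you want to confirm which platform a build actually targets, a pointer-size check is one simple option. The following is a minimal sketch, not part of the Speech SDK; the helper names targetPointerBits and printTargetArchitecture are hypothetical and chosen only for illustration.

```cpp
#include <iostream>

// Hypothetical helper (not part of the Speech SDK).
// Returns the pointer width, in bits, of the platform this binary was built for:
// 64 for an x64 build, 32 for an x86 build.
constexpr int targetPointerBits()
{
    return static_cast<int>(sizeof(void*) * 8);
}

// Hypothetical helper: prints the pointer width of the current build to the console.
void printTargetArchitecture()
{
    std::cout << "Built as a " << targetPointerBits() << "-bit application" << std::endl;
}
```

Calling printTargetArchitecture() at the start of your program's entry point prints the pointer width for the configuration selected in Solution Platforms.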

Install the Speech SDK

Finally, install the Speech SDK NuGet package, and reference the Speech SDK in your project:

  1. In Solution Explorer, right-click your solution, and choose Manage NuGet Packages for Solution to go to the NuGet - Solution window.

  2. Select Browse.

    NuGet - Solution tab, Visual Studio

  3. In Package source, choose nuget.org.

  4. In the Search box, enter Microsoft.CognitiveServices.Speech, and then choose that package after it appears in the search results.

    Microsoft.CognitiveServices.Speech C++ package install - Visual Studio

  5. In the package status pane next to the search results, select your helloworld project.

  6. Select Install.

  7. In the Preview Changes dialog box, select OK.

  8. In the License Acceptance dialog box, view the license, and then select I Accept. The package installation begins, and when installation is complete, the Output pane displays a message similar to the following text: Successfully installed 'Microsoft.CognitiveServices.Speech 1.6.0' to helloworld.

Add sample code

  1. Open the source file helloworld.cpp.

  2. Replace all the code with the following snippet:

    #include <iostream>
    #include <string>
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void synthesizeSpeech()
    {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Creates a speech synthesizer using the default speaker as audio output. The default spoken language is "en-us".
        auto synthesizer = SpeechSynthesizer::FromConfig(config);
    
        // Reads a line of text from console input and synthesizes it to the speaker.
        cout << "Type some text that you want to speak..." << std::endl;
        cout << "> ";
        std::string text;
        getline(cin, text);
    
        auto result = synthesizer->SpeakTextAsync(text).get();
    
        // Checks result.
        if (result->Reason == ResultReason::SynthesizingAudioCompleted)
        {
            cout << "Speech synthesized to speaker for text [" << text << "]" << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error)
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    
        // This is to give some time for the speaker to finish playing back the audio
        cout << "Press enter to exit..." << std::endl;
        cin.get();
    }
    
    int wmain()
    {
        synthesizeSpeech();
        return 0;
    }
    
  3. In the same file, replace the string YourSubscriptionKey with your subscription key.

  4. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  5. From the menu bar, choose File > Save All.
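
Steps 3 and 4 place the key and region directly in the source file. If you would rather keep them out of the code, one possible approach is to read them from environment variables. The sketch below assumes two variable names, SPEECH_KEY and SPEECH_REGION, and a helper called envOrDefault. All three are hypothetical and are not defined by the Speech SDK.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the Speech SDK).
// Returns the value of the environment variable `name`,
// or `fallback` if the variable is unset or empty.
std::string envOrDefault(const char* name, const std::string& fallback)
{
#ifdef _MSC_VER
    // MSVC reports std::getenv as unsafe (warning C4996), so use _dupenv_s there.
    char* buffer = nullptr;
    size_t length = 0;
    std::string value;
    if (_dupenv_s(&buffer, &length, name) == 0 && buffer != nullptr)
    {
        value = buffer;
        std::free(buffer);  // _dupenv_s allocates the buffer; the caller frees it
    }
#else
    const char* raw = std::getenv(name);
    std::string value = raw ? raw : "";
#endif
    return value.empty() ? fallback : value;
}

// Usage sketch inside synthesizeSpeech() (variable names are assumptions):
//   auto key    = envOrDefault("SPEECH_KEY", "YourSubscriptionKey");
//   auto region = envOrDefault("SPEECH_REGION", "YourServiceRegion");
//   auto config = SpeechConfig::FromSubscription(key, region);
```

With this in place, the placeholder strings act only as fallbacks, so the same source file can be shared without exposing a real key.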

Build and run the application

  1. From the menu bar, select Build > Build Solution to build the application. The code should now compile without errors.

  2. Choose Debug > Start Debugging (or press F5) to start the helloworld application.

  3. Type an English phrase or sentence. The application sends your text to the Speech Services, which return synthesized speech that the application plays on your speaker.

    Console output after successful speech synthesis
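
The sample passes whatever line you type straight to SpeakTextAsync. As an optional refinement, you could trim the input and skip the request when nothing is left. The helper below is a minimal sketch under that assumption. The name trimWhitespace is hypothetical and not part of the Speech SDK.

```cpp
#include <string>

// Hypothetical helper (not part of the Speech SDK).
// Returns `text` without leading and trailing spaces, tabs, CR, or LF.
std::string trimWhitespace(const std::string& text)
{
    const char* whitespace = " \t\r\n";
    const auto first = text.find_first_not_of(whitespace);
    if (first == std::string::npos)
    {
        return "";  // input was empty or whitespace only
    }
    const auto last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}

// Usage sketch inside synthesizeSpeech(), after getline(cin, text):
//   text = trimWhitespace(text);
//   if (text.empty()) { cout << "Nothing to synthesize." << endl; return; }
```

Whether to reject empty input or simply pass it through is a design choice. The quickstart itself does no validation.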

Next steps

Additional samples, such as how to save speech to an audio file, are available on GitHub.
