Quickstart: Synthesize speech in C++ on Windows by using the Speech SDK

Quickstarts are also available for speech recognition and speech translation.

In this article, you create a C++ console application for Windows. You use the Cognitive Services Speech SDK to synthesize speech from text in real time and play the speech on your PC's speaker. The application is built with the Speech SDK NuGet package and Microsoft Visual Studio 2017 (any edition).

The feature described in this article is available starting with Speech SDK 1.5.0.

For a complete list of languages/voices available for speech synthesis, see language support.

Prerequisites

You need a Speech Services subscription key to complete this quickstart. You can get one for free; see Try the Speech Services for free for details.

Create a Visual Studio project

  1. Start Visual Studio 2017.

  2. Make sure the Desktop development with C++ workload is available. Choose Tools > Get Tools and Features from the Visual Studio menu bar to open the Visual Studio installer. If this workload is already enabled, skip to the next step.

    Screenshot of Visual Studio Workloads tab

    Otherwise, check the box next to Desktop development with C++.

  3. Make sure the NuGet package manager component is available. Switch to the Individual components tab of the Visual Studio installer dialog box, and select NuGet package manager if it is not already enabled.

    Screenshot of Visual Studio Individual components tab

  4. If you needed to enable either the C++ workload or NuGet, select Modify (at the lower right corner of the dialog box). Installation of the new features takes a moment. If both features were already enabled, close the dialog box instead.

  5. Create a new Visual C++ Windows Console Application project. First, choose File > New > Project from the menu. In the New Project dialog box, expand Installed > Visual C++ > Windows Desktop in the left pane. Then select Windows Console Application. For the project name, enter helloworld.

    Screenshot of New Project dialog box

  6. If you're running 64-bit Windows, you may switch your build platform to x64 by using the drop-down menu in the Visual Studio toolbar. (64-bit versions of Windows can run 32-bit applications, so this is not a requirement.)

    Screenshot of Visual Studio toolbar, with x64 option highlighted

  7. In Solution Explorer, right-click the solution and choose Manage NuGet Packages for Solution.

    Screenshot of Solution Explorer, with Manage NuGet Packages for Solution option highlighted

  8. In the upper-right corner, in the Package Source field, select nuget.org. Search for the Microsoft.CognitiveServices.Speech package, and install it into the helloworld project.

    Screenshot of Manage Packages for Solution dialog box

    Note

    The current version of the Cognitive Services Speech SDK is 1.5.0.

  9. Accept the displayed license to begin installation of the NuGet package.

    Screenshot of License Acceptance dialog box

After the package is installed, a confirmation appears in the Package Manager console.

Add sample code

  1. Open the source file helloworld.cpp. Replace all the code below the initial include statement (#include "stdafx.h" or #include "pch.h") with the following:

    #include <iostream>
    #include <string>
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void synthesizeSpeech()
    {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Creates a speech synthesizer using the default speaker as audio output. The default spoken language is "en-us".
        auto synthesizer = SpeechSynthesizer::FromConfig(config);
    
        // Reads a line of text from the console and synthesizes it to the speaker.
        cout << "Type some text that you want to speak..." << std::endl;
        cout << "> ";
        std::string text;
        getline(cin, text);
    
        auto result = synthesizer->SpeakTextAsync(text).get();
    
        // Checks result.
        if (result->Reason == ResultReason::SynthesizingAudioCompleted)
        {
            cout << "Speech synthesized to speaker for text [" << text << "]" << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error)
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    
        // This is to give some time for the speaker to finish playing back the audio
        cout << "Press enter to exit..." << std::endl;
        cin.get();
    }
    
    int wmain()
    {
        synthesizeSpeech();
        return 0;
    }
    
  2. In the same file, replace the string YourSubscriptionKey with your subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  4. Save changes to the project.
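Hard-coding the key is fine for a quick test. As one possible alternative, the sketch below reads the key and region from environment variables and falls back to the placeholder strings when they are not set. The helper envOrDefault and the variable names SPEECH_KEY and SPEECH_REGION are assumptions made for illustration; they are not part of the Speech SDK or of the sample above.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper (not part of the Speech SDK): returns the value of the
// environment variable `name`, or `fallback` when it is unset or empty.
std::string envOrDefault(const char* name, const std::string& fallback)
{
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}

// Possible use in synthesizeSpeech() (variable names are illustrative):
//   auto config = SpeechConfig::FromSubscription(
//       envOrDefault("SPEECH_KEY", "YourSubscriptionKey"),
//       envOrDefault("SPEECH_REGION", "YourServiceRegion"));
```

With Visual C++, std::getenv may raise warning C4996, which is treated as an error when SDL checks are enabled; _dupenv_s is the Windows-specific alternative.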

Build and run the app

  1. Build the application. From the menu bar, choose Build > Build Solution. The code should compile without errors.

    Screenshot of Visual Studio application, with Build Solution option highlighted

  2. Start the application. From the menu bar, choose Debug > Start Debugging, or press F5.

    Screenshot of Visual Studio application, with Start Debugging option highlighted

  3. A console window appears, prompting you to type some text. Type a few words or a sentence. The text that you typed is transmitted to the Speech Services and synthesized to speech, which plays on your speaker.

    Screenshot of console output after successful synthesis

Next steps

Additional samples, such as how to save speech to an audio file, are available on GitHub.

See also