Quickstart: Synthesize speech in C++ on Linux by using the Speech SDK

Quickstarts are also available for speech recognition.

In this article, you create a C++ console application for Linux (Ubuntu 16.04, Ubuntu 18.04, Debian 9). You use the Cognitive Services Speech SDK to synthesize speech from text in real time and play the speech on your PC's speaker. The application is built with the Speech SDK for Linux and your Linux distribution's C++ compiler (for example, g++).

Prerequisites

You need a Speech Services subscription key to complete this quickstart. You can get one for free. See "Try the Speech Services for free" for details.

Install Speech SDK

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

The current version of the Cognitive Services Speech SDK is 1.5.1.

The Speech SDK for Linux can be used to build both 64-bit and 32-bit applications. The required libraries and header files can be downloaded as a tar file from https://aka.ms/csspeech/linuxbinary.

Download and install the SDK as follows:

  1. Make sure the SDK's dependencies are installed.

    • On Ubuntu:

      sudo apt-get update
      sudo apt-get install build-essential libssl1.0.0 libasound2 wget
      
    • On Debian 9:

      sudo apt-get update
      sudo apt-get install build-essential libssl1.0.2 libasound2 wget
      
  2. Choose a directory to which the Speech SDK files should be extracted, and set the SPEECHSDK_ROOT environment variable to point to that directory. This variable makes it easy to refer to the directory in future commands. For example, if you want to use the directory speechsdk in your home directory, use a command like the following:

    export SPEECHSDK_ROOT="$HOME/speechsdk"
    
  3. Create the directory if it doesn't exist yet.

    mkdir -p "$SPEECHSDK_ROOT"
    
  4. Download and extract the .tar.gz archive containing the Speech SDK binaries:

    wget -O SpeechSDK-Linux.tar.gz https://aka.ms/csspeech/linuxbinary
    tar --strip 1 -xzf SpeechSDK-Linux.tar.gz -C "$SPEECHSDK_ROOT"
    
  5. Validate the contents of the top-level directory of the extracted package:

    ls -l "$SPEECHSDK_ROOT"
    

    The directory listing should contain the third-party notice and license files, as well as an include directory containing header (.h) files and a lib directory containing libraries.

    Path                  Description
    license.md            License.
    ThirdPartyNotices.md  Third-party notices.
    REDIST.txt            Redistribution notice.
    include               The required header files for C and C++.
    lib/x64               Native library for x64, required to link your application.
    lib/x86               Native library for x86, required to link your application.
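
The directory check in step 5 can also be done programmatically. The following is a minimal sketch and not part of the Speech SDK: the function names pathExists and missingSdkEntries are hypothetical, and the list of expected entries is taken from the listing above. It uses the POSIX stat call so that it builds with the same C++14 compiler settings used later in this article.

```cpp
#include <string>
#include <sys/stat.h>
#include <vector>

// Hypothetical helper: returns true if the path exists (file or directory).
bool pathExists(const std::string& path)
{
    struct stat info;
    return stat(path.c_str(), &info) == 0;
}

// Hypothetical helper: returns the expected top-level entries that are
// missing under the given SDK root directory.
std::vector<std::string> missingSdkEntries(const std::string& sdkRoot)
{
    const std::vector<std::string> expected = {
        "license.md", "ThirdPartyNotices.md", "REDIST.txt", "include", "lib"};

    std::vector<std::string> missing;
    for (const auto& entry : expected)
    {
        if (!pathExists(sdkRoot + "/" + entry))
        {
            missing.push_back(entry);
        }
    }
    return missing;
}
```

When called with the value of SPEECHSDK_ROOT (for example, read with std::getenv), missingSdkEntries should return an empty list after a successful extraction.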

Add sample code

  1. Create a C++ source file named helloworld.cpp, and paste the following code into it.

    #include <clocale>  // setlocale
    #include <iostream> // cin, cout
    #include <string>   // string, getline
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void synthesizeSpeech()
    {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Creates a speech synthesizer using the default speaker as audio output. The default spoken language is "en-us".
        auto synthesizer = SpeechSynthesizer::FromConfig(config);
    
        // Reads text from console input and synthesizes it to the speaker.
        cout << "Type some text that you want to speak..." << std::endl;
        cout << "> ";
        std::string text;
        getline(cin, text);
    
        auto result = synthesizer->SpeakTextAsync(text).get();
    
        // Checks result.
        if (result->Reason == ResultReason::SynthesizingAudioCompleted)
        {
            cout << "Speech synthesized to speaker for text [" << text << "]" << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = SpeechSynthesisCancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error)
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=[" << cancellation->ErrorDetails << "]" << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    
        // This is to give some time for the speaker to finish playing back the audio
        cout << "Press enter to exit..." << std::endl;
        cin.get();
    }
    
    int main()
    {
        setlocale(LC_ALL, "");
        synthesizeSpeech();
        return 0;
    }
    
  2. In this new file, replace the string YourSubscriptionKey with your Speech Services subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).
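
As an alternative to editing the source file, the key and region could be read from environment variables at run time, which keeps the subscription key out of the code. The sketch below is an assumption and not part of the original sample: the helper name envOrDefault and the variable names SPEECH_KEY and SPEECH_REGION are hypothetical.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the value of the environment variable `name`,
// or `fallback` if the variable is unset or empty.
std::string envOrDefault(const char* name, const std::string& fallback)
{
    const char* value = std::getenv(name);
    if (value != nullptr && *value != '\0')
    {
        return std::string(value);
    }
    return fallback;
}
```

In helloworld.cpp, the two string literals passed to SpeechConfig::FromSubscription could then be replaced with envOrDefault("SPEECH_KEY", "YourSubscriptionKey") and envOrDefault("SPEECH_REGION", "YourServiceRegion"), after setting both variables with export in the shell.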

Build the app

Note

Enter each of the commands below as a single command line. The easiest way to do that is to copy the whole command and paste it at your shell prompt.

  • On an x64 (64-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x64" -l:libasound.so.2
    
  • On an x86 (32-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x86" -l:libasound.so.2
    

Run the app

  1. Configure the loader's library path to point to the Speech SDK library.

    • On an x64 (64-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x64"
      
    • On an x86 (32-bit) system, enter this command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x86"
      
  2. Run the application.

    ./helloworld
    
  3. The application prompts you to type some text. Type a few words or a sentence. The text is sent to the Speech Services and synthesized to speech, which plays on your speaker.

    Type some text that you want to speak...
    > hello
    Speech synthesized to speaker for text [hello]
    Press enter to exit...
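
If the application fails to start because the Speech SDK library cannot be loaded, the usual cause is that step 1 was skipped in the current shell. LD_LIBRARY_PATH is a colon-separated list of directories, and the check can be sketched as below. This is a hypothetical helper (pathListContains is not an SDK function) and would belong in a separate small diagnostic program, because helloworld itself cannot run until the library is found.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `directory` appears as one of the
// colon-separated entries in `pathList` (exact string match, no normalization).
bool pathListContains(const std::string& pathList, const std::string& directory)
{
    std::istringstream entries(pathList);
    std::string entry;
    while (std::getline(entries, entry, ':'))
    {
        if (entry == directory)
        {
            return true;
        }
    }
    return false;
}
```

A caller would pass the value of LD_LIBRARY_PATH (read with std::getenv and checked for a null pointer first) together with the lib/x64 or lib/x86 directory under SPEECHSDK_ROOT.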
    
