Quickstart: Translate speech with the Speech SDK for C++

Quickstarts are also available for speech recognition and text-to-speech.

In this quickstart, you'll create a simple C++ application that captures user speech from your computer's microphone, translates the speech, and transcribes the translated text to the command line in real time. This application is designed to run on 64-bit Windows, and is built with the Speech SDK NuGet package and Microsoft Visual Studio 2017.

For a complete list of languages available for speech translation, see language support.

Prerequisites

This quickstart requires a subscription key for the Speech service and Visual Studio 2017.

Create a Visual Studio project

  1. Start Visual Studio 2017.

  2. Make sure the Desktop development with C++ workload is available. Choose Tools > Get Tools and Features from the Visual Studio menu bar to open the Visual Studio installer. If this workload is already enabled, skip to the next step.

    Screenshot of Visual Studio Workloads tab

    Otherwise, check the box next to Desktop development with C++.

  3. Make sure the NuGet package manager component is available. Switch to the Individual components tab of the Visual Studio installer dialog box, and select NuGet package manager if it is not already enabled.

    Screenshot of Visual Studio Individual components tab

  4. If you needed to enable either the C++ workload or NuGet, select Modify (at the lower right corner of the dialog box). Installation of the new features takes a moment. If both features were already enabled, close the dialog box instead.

  5. Create a new Visual C++ Windows Console Application project. First, choose File > New > Project from the menu. In the New Project dialog box, expand Installed > Visual C++ > Windows Desktop in the left pane. Then select Windows Console Application. For the project name, enter helloworld.

    Screenshot of New Project dialog box

  6. If you're running 64-bit Windows, you may switch your build platform to x64 by using the drop-down menu in the Visual Studio toolbar. (64-bit versions of Windows can run 32-bit applications, so this is not a requirement.)

    Screenshot of Visual Studio toolbar, with x64 option highlighted

  7. In Solution Explorer, right-click the solution and choose Manage NuGet Packages for Solution.

    Screenshot of Solution Explorer, with Manage NuGet Packages for Solution option highlighted

  8. In the upper-right corner, in the Package Source field, select nuget.org. Search for the Microsoft.CognitiveServices.Speech package, and install it into the helloworld project.

    Screenshot of Manage Packages for Solution dialog box

    Note

    The current version of the Cognitive Services Speech SDK is 1.5.0.

  9. Accept the displayed license to begin installation of the NuGet package.

    Screenshot of License Acceptance dialog box

After the package is installed, a confirmation appears in the Package Manager console.

Add sample code

  1. Open the source file helloworld.cpp. Replace all of its code with the following. If your project's precompiled header is stdafx.h rather than pch.h, change the first include statement to #include "stdafx.h".

    #include "pch.h"
    #include <iostream>
    #include <vector>
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    using namespace Microsoft::CognitiveServices::Speech::Translation;
    
    // Translation with microphone input.
    void TranslationWithMicrophone()
    {
        // Creates an instance of a speech translation config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechTranslationConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Sets source and target languages
        // Replace with the languages of your choice.
        auto fromLanguage = "en-US";
        config->SetSpeechRecognitionLanguage(fromLanguage);
        config->AddTargetLanguage("de");
        config->AddTargetLanguage("fr");
    
        // Creates a translation recognizer using microphone as audio input.
        auto recognizer = TranslationRecognizer::FromConfig(config);
        cout << "Say something...\n";
    
        // Starts translation, and returns after a single utterance is recognized. The end of a
        // single utterance is determined by listening for silence at the end or until a maximum of 15
        // seconds of audio is processed. The task returns the recognized text as well as the translation.
        // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
        // shot recognition like command or query.
        // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
        auto result = recognizer->RecognizeOnceAsync().get();
    
        // Checks result.
        if (result->Reason == ResultReason::TranslatedSpeech)
        {
            cout << "RECOGNIZED: Text=" << result->Text << std::endl
            << "    Language=" << fromLanguage << std::endl;
    
            for (const auto& it : result->Translations)
            {
                cout << "TRANSLATED into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
            }
        }
        else if (result->Reason == ResultReason::RecognizedSpeech)
        {
            cout << "RECOGNIZED: Text=" << result->Text << " (text could not be translated)" << std::endl;
        }
        else if (result->Reason == ResultReason::NoMatch)
        {
            cout << "NOMATCH: Speech could not be recognized." << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = CancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error)
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    }
    
    int wmain()
    {
        TranslationWithMicrophone();
        cout << "Please press a key to continue.\n";
        cin.get();
        return 0;
    }
    
  2. In the same file, replace the string YourSubscriptionKey with your subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  4. Save changes to the project.
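As an alternative to editing the two placeholder strings in the source file, the key and region could be read from environment variables at startup. The following is a minimal sketch, not part of the quickstart sample: the helper names GetEnvOrDefault and IsPlaceholder and the variable names SPEECH_KEY and SPEECH_REGION are assumptions made for illustration, and only the C++ standard library is used.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical variable names chosen for this sketch; they are not defined by the quickstart.
const char* const kKeyVariable = "SPEECH_KEY";
const char* const kRegionVariable = "SPEECH_REGION";

// Returns the value of the environment variable `name`, or `fallback`
// when the variable is unset or empty.
// Note: MSVC may report std::getenv as deprecated (C4996); _dupenv_s is the
// MSVC-specific alternative.
std::string GetEnvOrDefault(const char* name, const std::string& fallback)
{
    const char* value = std::getenv(name);
    return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}

// Returns true when `value` is still one of the quickstart's placeholder strings.
bool IsPlaceholder(const std::string& value)
{
    return value == "YourSubscriptionKey" || value == "YourServiceRegion";
}
```

Inside TranslationWithMicrophone, the results of GetEnvOrDefault(kKeyVariable, "YourSubscriptionKey") and GetEnvOrDefault(kRegionVariable, "YourServiceRegion") could then be passed to SpeechTranslationConfig::FromSubscription in place of the string literals. Checking IsPlaceholder first lets the application stop with a clear message instead of waiting for the service to cancel the request.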

Build and run the app

  1. Build the application. From the menu bar, choose Build > Build Solution. The code should compile without errors.

    Screenshot of Visual Studio application, with Build Solution option highlighted

  2. Start the application. From the menu bar, choose Debug > Start Debugging, or press F5.

    Screenshot of Visual Studio application, with Start Debugging option highlighted

  3. A console window appears, prompting you to say something. Speak an English phrase or sentence. Your speech is sent to the Speech service, recognized as text, and translated into German and French. The recognized text and both translations appear in the same window.

    Screenshot of console output after successful translation
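The translation lines in the console come from the loop over result->Translations in the sample, which yields one pair of target language code and translated text per target language. The sketch below reproduces that loop on a plain std::map so the output format can be examined without a microphone or a subscription. It assumes that Translations behaves like a std::map from language code to text; the function name FormatTranslations is hypothetical.

```cpp
#include <map>
#include <sstream>
#include <string>

// Builds one "TRANSLATED into '<language>': <text>" line per map entry,
// mirroring the output loop in the quickstart sample.
// Assumption: the translations are held in a std::map keyed by language code,
// so entries are visited in alphabetical key order.
std::string FormatTranslations(const std::map<std::string, std::string>& translations)
{
    std::ostringstream out;
    for (const auto& entry : translations)
    {
        out << "TRANSLATED into '" << entry.first << "': " << entry.second << '\n';
    }
    return out.str();
}
```

Because a std::map iterates in key order, the German ("de") line is produced before the French ("fr") line regardless of the order in which the target languages were added.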

Next steps

Additional samples, such as how to read speech from an audio file and how to output translated text as synthesized speech, are available on GitHub.
