Quickstart: Recognize speech in C++ on Windows by using the Speech SDK

Quickstarts are also available for text-to-speech and speech-translation.


In this article, you create a C++ console application for Windows. You use the Cognitive Services Speech SDK to transcribe speech to text in real time from your PC's microphone. The application is built with the Speech SDK NuGet package and Microsoft Visual Studio 2017 (any edition).

Prerequisites

You need a Speech Services subscription key to complete this quickstart. You can get one for free; see "Try the Speech Services for free" for details.

Create a Visual Studio project

  1. Start Visual Studio 2017.

  2. Make sure the Desktop development with C++ workload is available. Choose Tools > Get Tools and Features from the Visual Studio menu bar to open the Visual Studio installer. If this workload is already enabled, skip to the next step.

    Screenshot of Visual Studio Workloads tab

    Otherwise, check the box next to Desktop development with C++.

  3. Make sure the NuGet package manager component is available. Switch to the Individual components tab of the Visual Studio installer dialog box, and select NuGet package manager if it is not already enabled.

    Screenshot of Visual Studio Individual components tab

  4. If you needed to enable either the C++ workload or NuGet, select Modify (in the lower-right corner of the dialog box). Installation of the new features takes a moment. If both features were already enabled, close the dialog box instead.

  5. Create a new Visual C++ Windows Console Application project. First, choose File > New > Project from the menu. In the New Project dialog box, expand Installed > Visual C++ > Windows Desktop in the left pane. Then select Windows Console Application. For the project name, enter helloworld.

    Screenshot of New Project dialog box

  6. If you're running 64-bit Windows, you may switch your build platform to x64 by using the drop-down menu in the Visual Studio toolbar. (64-bit versions of Windows can run 32-bit applications, so this is not a requirement.)

    Screenshot of Visual Studio toolbar, with x64 option highlighted

  7. In Solution Explorer, right-click the solution and choose Manage NuGet Packages for Solution.

    Screenshot of Solution Explorer, with Manage NuGet Packages for Solution option highlighted

  8. In the upper-right corner, in the Package Source field, select nuget.org. Search for the Microsoft.CognitiveServices.Speech package, and install it into the helloworld project.

    Screenshot of Manage Packages for Solution dialog box

    Note

    The current version of the Cognitive Services Speech SDK is 1.5.0.

  9. Accept the displayed license to begin installation of the NuGet package.

    Screenshot of License Acceptance dialog box

After the package is installed, a confirmation appears in the Package Manager console.
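
If you switched the build platform in step 6 and want to confirm which platform a given binary was built for, a small check like the following can help. This is a minimal sketch, not part of the Speech SDK: the helper names pointerBits and buildPlatformName are hypothetical, and the sketch assumes the project targets only x86 or x64.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Pointer width of the current build in bits:
// 32 for an x86 build, 64 for an x64 build.
constexpr std::size_t pointerBits()
{
    return sizeof(void*) * CHAR_BIT;
}

// Hypothetical helper (not a Speech SDK function): names the build platform
// the way the Visual Studio toolbar drop-down does. Assumes only x86 and x64
// targets; other 64-bit targets would also be reported as "x64".
std::string buildPlatformName()
{
    return pointerBits() == 64 ? "x64" : "x86";
}
```

Printing buildPlatformName() at the start of wmain() would show which platform configuration produced the running executable.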

Add sample code

  1. Open the source file helloworld.cpp. Replace all the code below the initial include statement (#include "stdafx.h" or #include "pch.h") with the following:

    #include <iostream>
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void recognizeSpeech()
    {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Creates a speech recognizer.
        auto recognizer = SpeechRecognizer::FromConfig(config);
        cout << "Say something...\n";
    
        // Starts speech recognition, and returns after a single utterance is recognized. The end of a
        // single utterance is determined by listening for silence at the end or until a maximum of 15
        // seconds of audio is processed.  The task returns the recognition text as result. 
        // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
        // shot recognition like command or query. 
        // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
        auto result = recognizer->RecognizeOnceAsync().get();
    
        // Checks result.
        if (result->Reason == ResultReason::RecognizedSpeech)
        {
            cout << "We recognized: " << result->Text << std::endl;
        }
        else if (result->Reason == ResultReason::NoMatch)
        {
            cout << "NOMATCH: Speech could not be recognized." << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled)
        {
            auto cancellation = CancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error) 
            {
                cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    }
    
    int wmain()
    {
        recognizeSpeech();
        cout << "Please press a key to continue.\n";
        cin.get();
        return 0;
    }
    
  2. In the same file, replace the string YourSubscriptionKey with your subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  4. Save changes to the project.
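
As an alternative to editing the two strings in steps 2 and 3, you could read the values from environment variables so that the key does not end up in source control. The following is a minimal sketch under stated assumptions: the helper names envOrDefault and isPlaceholder and the variable names SPEECH_KEY and SPEECH_REGION are hypothetical and are not defined by the Speech SDK.

```cpp
#include <cstdlib>
#include <string>

// Returns the value of the environment variable `name`, or `fallback` when
// the variable is unset or empty. Hypothetical helper, not an SDK function.
std::string envOrDefault(const char* name, const std::string& fallback)
{
    const char* value = std::getenv(name);
    return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}

// True when `value` is still one of this quickstart's placeholder strings,
// which usually means the subscription info was never filled in.
bool isPlaceholder(const std::string& value)
{
    return value == "YourSubscriptionKey" || value == "YourServiceRegion";
}
```

In recognizeSpeech(), you might then call envOrDefault("SPEECH_KEY", "YourSubscriptionKey") and envOrDefault("SPEECH_REGION", "YourServiceRegion"), pass the two results to SpeechConfig::FromSubscription, and print a hint when isPlaceholder returns true for either one. Note that the Microsoft compiler may flag std::getenv with warning C4996; depending on your project settings, you may need to define _CRT_SECURE_NO_WARNINGS or use _dupenv_s instead.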

Build and run the app

  1. Build the application. From the menu bar, choose Build > Build Solution. The code should compile without errors.

    Screenshot of Visual Studio application, with Build Solution option highlighted

  2. Start the application. From the menu bar, choose Debug > Start Debugging, or press F5.

    Screenshot of Visual Studio application, with Start Debugging option highlighted

  3. A console window appears, prompting you to say something. Speak an English phrase or sentence. Your speech is transmitted to the Speech Services and transcribed to text, which appears in the same window.

    Screenshot of console output after successful recognition

Next steps

Additional samples, such as how to read speech from an audio file, are available on GitHub.
