Quickstart: Recognize speech from a microphone

In this quickstart, you use the Speech SDK to interactively recognize speech from audio data captured from a microphone. After satisfying a few prerequisites, recognizing speech from a microphone takes only four steps:

  • Create a SpeechConfig object from your subscription key and region.
  • Create a SpeechRecognizer object using the SpeechConfig object from above.
  • Using the SpeechRecognizer object, start the recognition process for a single utterance.
  • Inspect the SpeechRecognitionResult returned.

If you prefer to jump right in, view or download all Speech SDK C# Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure you have a Speech Services subscription key and that the Speech SDK is installed and referenced in your project.

Open your project in Visual Studio

The first step is to make sure that you have your project open in Visual Studio.

  1. Launch Visual Studio 2019.
  2. Load your project and open Program.cs.

Start with some boilerplate code

Let's add some code that works as a skeleton for our project. Note that it defines an async method called RecognizeSpeechAsync().


using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

namespace helloworld
{
    class Program
    {
        public static async Task RecognizeSpeechAsync()
        {
        }

        static void Main()
        {
            RecognizeSpeechAsync().Wait();
            Console.WriteLine("Please press <Return> to continue.");
            Console.ReadLine();
        }
    }
}

Create a Speech configuration

Before you can initialize a SpeechRecognizer object, you need to create a configuration that uses your subscription key and subscription region. Insert this code in the RecognizeSpeechAsync() method.

Note

This sample uses the FromSubscription() method to build the SpeechConfig. For a full list of available methods, see SpeechConfig Class.

var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
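
This quickstart uses FromSubscription(), but the configuration can be built in other ways as well. For example, if your app obtains a short-lived authorization token from your own token service, a hedged sketch of that variant might look like the following (the token value is a placeholder; how you obtain it is up to you):

// Alternative (optional): build the config from an authorization token
// instead of embedding a subscription key in the app.
// "YourAuthorizationToken" is a placeholder for a token issued for your resource.
var config = SpeechConfig.FromAuthorizationToken("YourAuthorizationToken", "YourServiceRegion");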

Initialize a SpeechRecognizer

Now, let's create a SpeechRecognizer. This object is created inside a using statement to ensure the proper release of unmanaged resources. Insert this code in the RecognizeSpeechAsync() method, right below your Speech configuration.


using (var recognizer = new SpeechRecognizer(config))
{
}
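
By default, the recognizer above listens to your system's default microphone. If you want to make the audio source explicit, or later swap in a WAV file, you can pass an AudioConfig when constructing the recognizer. A minimal sketch, assuming the Microsoft.CognitiveServices.Speech.Audio namespace is imported (optional for this quickstart):

// Optional: specify the audio input explicitly.
// Requires: using Microsoft.CognitiveServices.Speech.Audio;
using (var audioInput = AudioConfig.FromDefaultMicrophoneInput())
using (var recognizer = new SpeechRecognizer(config, audioInput))
{
    // AudioConfig.FromWavFileInput("yourfile.wav") would read from a file instead.
}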

Recognize a phrase

From the SpeechRecognizer object, you're going to call the RecognizeOnceAsync() method. This method lets the Speech service know that you're sending a single phrase for recognition, and that it should stop recognizing speech once the phrase is identified.

Inside the using statement, add this code:

var result = await recognizer.RecognizeOnceAsync();
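
RecognizeOnceAsync() stops after the first utterance (or after a maximum of about 15 seconds of audio), so it's best suited to single commands or queries. If you later need to transcribe longer, multi-utterance audio, the SDK also supports continuous recognition driven by events. The following is a rough sketch of that pattern only, not something you need for this quickstart:

// Optional sketch: continuous recognition (not part of this quickstart).
// Recognized fires once per recognized utterance until you stop recognition explicitly.
recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"We recognized: {e.Result.Text}");
    }
};

await recognizer.StartContinuousRecognitionAsync();
Console.WriteLine("Listening... press <Return> to stop.");
Console.ReadLine();
await recognizer.StopContinuousRecognitionAsync();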

Display the recognition results (or errors)

When the recognition result is returned by the Speech service, you'll want to do something with it. We're going to keep it simple and print the result to the console.

Inside the using statement, below RecognizeOnceAsync(), add this code:

if (result.Reason == ResultReason.RecognizedSpeech)
{
    Console.WriteLine($"We recognized: {result.Text}");
}
else if (result.Reason == ResultReason.NoMatch)
{
    Console.WriteLine($"NOMATCH: Speech could not be recognized.");
}
else if (result.Reason == ResultReason.Canceled)
{
    var cancellation = CancellationDetails.FromResult(result);
    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

    if (cancellation.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
        Console.WriteLine($"CANCELED: Did you update the subscription info?");
    }
}
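
If you often see the NOMATCH output, it can help to know whether the service detected silence or simply couldn't make out the audio. The SDK exposes this through NoMatchDetails; as an optional, hedged addition, you could extend the NoMatch branch above along these lines:

// Optional: inside the ResultReason.NoMatch branch above, inspect why
// nothing was recognized. NoMatchReason distinguishes initial-silence
// timeouts from speech the service couldn't understand.
var noMatchDetails = NoMatchDetails.FromResult(result);
Console.WriteLine($"NOMATCH: Reason={noMatchDetails.Reason}");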

Check your code

At this point, your code should look like this:

//
// Copyright (c) Microsoft. All rights reserved.
// Licensed under the MIT license. See LICENSE.md file in the project root for full license information.
//

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

namespace helloworld
{
    class Program
    {
        public static async Task RecognizeSpeechAsync()
        {
            var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

            using (var recognizer = new SpeechRecognizer(config))
            {
                var result = await recognizer.RecognizeOnceAsync();

                if (result.Reason == ResultReason.RecognizedSpeech)
                {
                    Console.WriteLine($"We recognized: {result.Text}");
                }
                else if (result.Reason == ResultReason.NoMatch)
                {
                    Console.WriteLine($"NOMATCH: Speech could not be recognized.");
                }
                else if (result.Reason == ResultReason.Canceled)
                {
                    var cancellation = CancellationDetails.FromResult(result);
                    Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

                    if (cancellation.Reason == CancellationReason.Error)
                    {
                        Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
                        Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
                        Console.WriteLine($"CANCELED: Did you update the subscription info?");
                    }
                }
            }
        }

        static void Main()
        {
            RecognizeSpeechAsync().Wait();
            Console.WriteLine("Please press <Return> to continue.");
            Console.ReadLine();
        }
    }
}

Build and run your app

Now you're ready to build your app and test speech recognition using the Speech service.

  1. Compile the code - From the menu bar of Visual Studio, choose Build > Build Solution.
  2. Start your app - From the menu bar, choose Debug > Start Debugging or press F5.
  3. Start recognition - You'll be prompted to speak a phrase in English. Your speech is sent to the Speech service, transcribed as text, and rendered in the console.

The same quickstart is also available in C++. If you prefer to jump right in, view or download all Speech SDK C++ Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure you have a Speech Services subscription key, that the Speech SDK is installed, and that the SPEECHSDK_ROOT environment variable points to the directory where you extracted it.

Add sample code

  1. Create a C++ source file named helloworld.cpp, and paste the following code into it.

    #include <iostream> // cin, cout
    #include <speechapi_cxx.h>
    
    using namespace std;
    using namespace Microsoft::CognitiveServices::Speech;
    
    void recognizeSpeech() {
        // Creates an instance of a speech config with specified subscription key and service region.
        // Replace with your own subscription key and service region (e.g., "westus").
        auto config = SpeechConfig::FromSubscription("YourSubscriptionKey", "YourServiceRegion");
    
        // Creates a speech recognizer
        auto recognizer = SpeechRecognizer::FromConfig(config);
        cout << "Say something...\n";
    
        // Starts speech recognition, and returns after a single utterance is recognized. The end of a
        // single utterance is determined by listening for silence at the end or until a maximum of 15
        // seconds of audio is processed.  The task returns the recognition text as result. 
        // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
        // shot recognition like command or query. 
        // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
        auto result = recognizer->RecognizeOnceAsync().get();
    
        // Checks result.
        if (result->Reason == ResultReason::RecognizedSpeech) {
            cout << "We recognized: " << result->Text << std::endl;
        }
        else if (result->Reason == ResultReason::NoMatch) {
            cout << "NOMATCH: Speech could not be recognized." << std::endl;
        }
        else if (result->Reason == ResultReason::Canceled) {
            auto cancellation = CancellationDetails::FromResult(result);
            cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
    
            if (cancellation->Reason == CancellationReason::Error) {
                cout << "CANCELED: ErrorCode= " << (int)cancellation->ErrorCode << std::endl;
                cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails << std::endl;
                cout << "CANCELED: Did you update the subscription info?" << std::endl;
            }
        }
    }
    
    int main(int argc, char **argv) {
        setlocale(LC_ALL, "");
        recognizeSpeech();
        return 0;
    }
    
  2. In this new file, replace the string YourSubscriptionKey with your Speech Services subscription key.

  3. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

Build the app

Note

Make sure to enter the commands below as a single command line. The easiest way to do that is to copy the command by using the Copy button next to each command, and then paste it at your shell prompt.

  • On an x64 (64-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x64" -l:libasound.so.2
    
  • On an x86 (32-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/x86" -l:libasound.so.2
    
  • On an ARM64 (64-bit) system, run the following command to build the application.

    g++ helloworld.cpp -o helloworld -I "$SPEECHSDK_ROOT/include/cxx_api" -I "$SPEECHSDK_ROOT/include/c_api" --std=c++14 -lpthread -lMicrosoft.CognitiveServices.Speech.core -L "$SPEECHSDK_ROOT/lib/arm64" -l:libasound.so.2
    

Run the app

  1. Configure the loader's library path to point to the Speech SDK library.

    • On an x64 (64-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x64"
      
    • On an x86 (32-bit) system, enter this command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/x86"
      
    • On an ARM64 (64-bit) system, enter the following command.

      export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:$SPEECHSDK_ROOT/lib/arm64"
      
  2. Run the application.

    ./helloworld
    
  3. In the console window, a prompt appears, requesting that you say something. Speak an English phrase or sentence. Your speech is transmitted to the Speech Services and transcribed to text, which appears in the same window.

    Say something...
    We recognized: What's the weather like?
    

The same quickstart is also available in Java. If you prefer to jump right in, view or download all Speech SDK Java Samples on GitHub. Otherwise, let's get started.

Prerequisites

Add sample code

  1. To add a new empty class to your Java project, select File > New > Class.

  2. In the New Java Class window, enter speechsdk.quickstart into the Package field, and Main into the Name field.

    Screenshot of New Java Class window

  3. Replace all code in Main.java with the following snippet:

    package speechsdk.quickstart;
    
    import java.util.concurrent.Future;
    import com.microsoft.cognitiveservices.speech.*;
    
    /**
     * Quickstart: recognize speech using the Speech SDK for Java.
     */
    public class Main {
    
        /**
         * @param args Arguments are ignored in this sample.
         */
        public static void main(String[] args) {
            try {
                // Replace below with your own subscription key
                String speechSubscriptionKey = "YourSubscriptionKey";
                // Replace below with your own service region (e.g., "westus").
                String serviceRegion = "YourServiceRegion";
    
                int exitCode = 1;
                SpeechConfig config = SpeechConfig.fromSubscription(speechSubscriptionKey, serviceRegion);
                assert(config != null);
    
                SpeechRecognizer reco = new SpeechRecognizer(config);
                assert(reco != null);
    
                System.out.println("Say something...");
    
                Future<SpeechRecognitionResult> task = reco.recognizeOnceAsync();
                assert(task != null);
    
                SpeechRecognitionResult result = task.get();
                assert(result != null);
    
                if (result.getReason() == ResultReason.RecognizedSpeech) {
                    System.out.println("We recognized: " + result.getText());
                    exitCode = 0;
                }
                else if (result.getReason() == ResultReason.NoMatch) {
                    System.out.println("NOMATCH: Speech could not be recognized.");
                }
                else if (result.getReason() == ResultReason.Canceled) {
                    CancellationDetails cancellation = CancellationDetails.fromResult(result);
                    System.out.println("CANCELED: Reason=" + cancellation.getReason());
    
                    if (cancellation.getReason() == CancellationReason.Error) {
                        System.out.println("CANCELED: ErrorCode=" + cancellation.getErrorCode());
                        System.out.println("CANCELED: ErrorDetails=" + cancellation.getErrorDetails());
                        System.out.println("CANCELED: Did you update the subscription info?");
                    }
                }
    
                reco.close();
    
                System.exit(exitCode);
            } catch (Exception ex) {
                System.out.println("Unexpected exception: " + ex.getMessage());
    
                assert(false);
                System.exit(1);
            }
        }
    }
    
  4. Replace the string YourSubscriptionKey with your subscription key.

  5. Replace the string YourServiceRegion with the region associated with your subscription (for example, westus for the free trial subscription).

  6. Save changes to the project.

Build and run the app

Press F11, or select Run > Debug. The next 15 seconds of speech input from your microphone will be recognized and logged in the console window.

Screenshot of console output after successful recognition

The same quickstart is also available in Python. If you prefer to jump right in, view or download all Speech SDK Python Samples on GitHub. Otherwise, let's get started.

Prerequisites

Before you get started, make sure you have a Speech Services subscription key and that the Speech SDK Python package (azure-cognitiveservices-speech) is installed.

Support and updates

Updates to the Speech SDK Python package are distributed via PyPI and announced in the Release notes. If a new version is available, you can update to it with the command pip install --upgrade azure-cognitiveservices-speech. Check which version is currently installed by inspecting the azure.cognitiveservices.speech.__version__ variable.

If you have a problem, or you're missing a feature, see Support and help options.

Create a Python application that uses the Speech SDK

Run the sample

You can copy the sample code from this quickstart to a source file quickstart.py and run it in your IDE or in the console:

python quickstart.py

Or you can download this quickstart tutorial as a Jupyter notebook from the Speech SDK sample repository and run it as a notebook.

Sample code

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Creates a recognizer with the given settings
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

print("Say something...")


# Starts speech recognition, and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed.  The task returns the recognition text as result. 
# Note: Since recognize_once() returns only a single utterance, it is suitable only for single
# shot recognition like command or query. 
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.
result = speech_recognizer.recognize_once()

# Checks result.
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))

Install and use the Speech SDK with Visual Studio Code

  1. Download and install a 64-bit version of Python, 3.5 or later, on your computer.

  2. Download and install Visual Studio Code.

  3. Open Visual Studio Code and install the Python extension. Select File > Preferences > Extensions from the menu. Search for Python.

    Install the Python extension

  4. Create a folder to store the project in, for example, by using Windows Explorer.

  5. In Visual Studio Code, select the File icon. Then open the folder you created.

    Open a folder

  6. Create a new Python source file, speechsdk.py, by selecting the new file icon.

    Create a file

  7. Copy, paste, and save the Python code to the newly created file.

  8. Insert your Speech Services subscription information.

  9. If a Python interpreter is already selected, its name displays on the left side of the status bar at the bottom of the window. Otherwise, bring up a list of available Python interpreters: open the command palette (Ctrl+Shift+P) and enter Python: Select Interpreter. Choose an appropriate one.

  10. You can install the Speech SDK Python package from within Visual Studio Code if it isn't already installed for the Python interpreter you selected. To install it, open a terminal: bring up the command palette again (Ctrl+Shift+P) and enter Terminal: Create New Integrated Terminal. In the terminal that opens, enter the command python -m pip install azure-cognitiveservices-speech (or the appropriate command for your system).

  11. To run the sample code, right-click somewhere inside the editor. Select Run Python File in Terminal. Speak a few words when you're prompted. The transcribed text displays shortly afterward.

    Run a sample

If you have issues following these instructions, refer to the more extensive Visual Studio Code Python tutorial.

Next steps

View or download all Speech SDK Samples on GitHub.

Additional language and platform support

If you've clicked this tab, you probably didn't see a quickstart in your favorite programming language. Don't worry, we have additional quickstart materials and code samples available on GitHub. Use the table to find the right sample for your programming language and platform/OS combination.

Language       Additional Quickstarts    Code samples
C++                                      Windows, Linux, macOS
C#                                       .NET Framework, .NET Core, UWP, Unity, Xamarin
Java                                     Android, JRE
JavaScript                               Browser
Node.js                                  Windows, Linux, macOS
Objective-C    macOS, iOS                iOS, macOS
Python                                   Windows, Linux, macOS
Swift          macOS, iOS                iOS, macOS