Quickstart: Recognize speech with the Speech SDK for Python

This article shows how to use the Speech Services through the Speech SDK for Python. It illustrates how to recognize speech from microphone input.

Prerequisites

  • An Azure subscription key for the Speech Services. Get one for free.

  • Python 3.5 or later.

  • The Python Speech SDK package is available for these operating systems:

    • Windows: x64 and x86.
    • Mac: macOS version 10.12 or later.
    • Linux: Ubuntu 16.04 or 18.04 on x64.
  • On Ubuntu, run these commands to install the required packages:

    sudo apt-get update
    sudo apt-get install build-essential libssl1.0.0 libasound2 wget
    
  • On Windows, you also need the Microsoft Visual C++ Redistributable for Visual Studio 2017 for your platform.
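
The Python version and operating system requirements above can be checked from Python itself. The following is a minimal sketch, not part of the Speech SDK: the function name check_prerequisites and the constants are hypothetical, and the check covers only the Python version and the operating system family, not the OS version, the CPU architecture, or the Ubuntu packages.

```python
import platform
import sys

# Minimum Python version stated in the prerequisites above.
MIN_PYTHON = (3, 5)

# Operating system families from the list above, keyed by what platform.system() reports.
SUPPORTED_SYSTEMS = {"Windows": "Windows", "Darwin": "macOS", "Linux": "Linux"}


def check_prerequisites(version_info=None, system=None):
    """Return a list of human-readable problems; an empty list means none were found."""
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python {}.{} is older than 3.5".format(version_info[0], version_info[1])
        )
    if system not in SUPPORTED_SYSTEMS:
        problems.append("Operating system {!r} is not in the supported list".format(system))
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Problem:", problem)
```

Both arguments are optional and exist only so that the check can be exercised with values other than those of the current machine.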

Install the Speech SDK

Important

By downloading any of the Speech SDK for Azure Cognitive Services components on this page, you acknowledge its license. See the Microsoft Software License Terms for the Speech SDK.

This command installs the Python package from PyPI for the Speech SDK:

pip install azure-cognitiveservices-speech

Support and updates

Updates to the Speech SDK Python package are distributed via PyPI and announced in the Release notes. If a new version is available, you can update to it with the command pip install --upgrade azure-cognitiveservices-speech. Check which version is currently installed by inspecting the azure.cognitiveservices.speech.__version__ variable.
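
As a small sketch of the version check described above, the following reads the __version__ variable and falls back to a hint when the package can't be imported. The helper name installed_speech_sdk_version is made up for this example.

```python
import importlib


def installed_speech_sdk_version():
    """Return the installed Speech SDK version string, or None if the package is missing."""
    try:
        speechsdk = importlib.import_module("azure.cognitiveservices.speech")
    except ImportError:
        return None
    return speechsdk.__version__


version = installed_speech_sdk_version()
if version is None:
    print("Speech SDK not installed; run: pip install azure-cognitiveservices-speech")
else:
    print("Speech SDK version:", version)
```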

If you have a problem, or you're missing a feature, see Support and help options.

Create a Python application that uses the Speech SDK

Run the sample

You can copy the sample code from this quickstart to a source file quickstart.py and run it in your IDE or in the console:

python quickstart.py

Or you can download this quickstart tutorial as a Jupyter notebook from the Speech SDK sample repository and run it as a notebook.

Sample code

import azure.cognitiveservices.speech as speechsdk

# Creates an instance of a speech config with specified subscription key and service region.
# Replace with your own subscription key and service region (e.g., "westus").
speech_key, service_region = "YourSubscriptionKey", "YourServiceRegion"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

# Creates a recognizer with the given settings
speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

print("Say something...")


# Starts speech recognition and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed. The call returns the recognition result.
# Note: Since recognize_once() returns only a single utterance, it is suitable only for
# single-shot recognition, such as a command or a query.
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.
result = speech_recognizer.recognize_once()

# Checks result.
if result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    cancellation_details = result.cancellation_details
    print("Speech Recognition canceled: {}".format(cancellation_details.reason))
    if cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(cancellation_details.error_details))
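
The comments in the sample point to start_continuous_recognition() for long-running, multi-utterance recognition. The following is a rough sketch of how that might look, under a few assumptions: the helper names make_transcript_collector and transcribe_continuously and the seconds parameter are invented for this example, and the polling loop is just one simple way to wait for the session to end.

```python
import time


def make_transcript_collector():
    """Return (utterances, handler); the handler appends each recognized text to the list."""
    utterances = []

    def on_recognized(evt):
        # evt.result.text holds the final text of one utterance; it is empty on no match.
        if evt.result.text:
            utterances.append(evt.result.text)

    return utterances, on_recognized


def transcribe_continuously(speech_key, service_region, seconds=30):
    """Listen on the default microphone for up to `seconds` and return the utterances."""
    # Imported here so the collector above can be tried without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)

    utterances, on_recognized = make_transcript_collector()
    done = []  # Becomes non-empty once the session stops or is canceled.

    speech_recognizer.recognized.connect(on_recognized)
    speech_recognizer.session_stopped.connect(lambda evt: done.append(evt))
    speech_recognizer.canceled.connect(lambda evt: done.append(evt))

    speech_recognizer.start_continuous_recognition()
    deadline = time.time() + seconds
    while not done and time.time() < deadline:
        time.sleep(0.5)
    speech_recognizer.stop_continuous_recognition()
    return utterances
```

Calling transcribe_continuously("YourSubscriptionKey", "YourServiceRegion") would then return a list with one string per recognized utterance.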

Install and use the Speech SDK with Visual Studio Code

  1. Download and install a 64-bit version of Python, 3.5 or later, on your computer.

  2. Download and install Visual Studio Code.

  3. Open Visual Studio Code and install the Python extension. Select File > Preferences > Extensions from the menu. Search for Python.


  4. Create a folder to store the project in, for example by using Windows Explorer.

  5. In Visual Studio Code, select the File icon. Then open the folder you created.


  6. Create a new Python source file, speechsdk.py, by selecting the new file icon.


  7. Copy, paste, and save the Python code to the newly created file.

  8. Insert your Speech Services subscription information.

  9. If a Python interpreter is already selected, it's displayed on the left side of the status bar at the bottom of the window. Otherwise, open the command palette (Ctrl+Shift+P), enter Python: Select Interpreter, and choose an appropriate interpreter from the list.

  10. If the Speech SDK Python package isn't installed yet for the Python interpreter you selected, you can install it from within Visual Studio Code. Open the command palette again (Ctrl+Shift+P) and enter Terminal: Create New Integrated Terminal. In the terminal that opens, enter the command python -m pip install azure-cognitiveservices-speech or the appropriate command for your system.

  11. To run the sample code, right-click somewhere inside the editor. Select Run Python File in Terminal. Speak a few words when you're prompted. The transcribed text displays shortly afterward.


If you have issues following these instructions, refer to the more extensive Visual Studio Code Python tutorial.
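
Step 8 asks you to insert your subscription information into the source file. One alternative is to read it from environment variables so that the key isn't saved in the file. This is only a sketch: the variable names SPEECH_KEY and SPEECH_REGION and the helper load_speech_settings are chosen for this example and are not something the sample in this quickstart defines.

```python
import os


def load_speech_settings(environ=None):
    """Return (speech_key, service_region) read from environment variables.

    SPEECH_KEY and SPEECH_REGION are names chosen for this sketch.
    """
    environ = os.environ if environ is None else environ
    speech_key = environ.get("SPEECH_KEY")
    service_region = environ.get("SPEECH_REGION")
    missing = [
        name
        for name, value in (("SPEECH_KEY", speech_key), ("SPEECH_REGION", service_region))
        if not value
    ]
    if missing:
        raise RuntimeError("Set these environment variables first: " + ", ".join(missing))
    return speech_key, service_region
```

In the sample code, the line that assigns speech_key and service_region could then become speech_key, service_region = load_speech_settings().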

Next steps