Tutorial: Recognize intents from speech using the Speech SDK for C#

The Cognitive Services Speech SDK integrates with the Language Understanding service (LUIS) to provide intent recognition. An intent is something the user wants to do: book a flight, check the weather, or make a call. The user can use whatever terms feel natural. Using machine learning, LUIS maps user requests to the intents you have defined.

Note

A LUIS application defines the intents and entities you want to recognize. It's separate from the C# application that uses the Speech service. In this article, "app" means the LUIS app, while "application" means the C# code.

In this tutorial, you use the Speech SDK to develop a C# console application that derives intents from user utterances through your device's microphone. You'll learn how to:

  • Create a Visual Studio project referencing the Speech SDK NuGet package
  • Create a speech config and get an intent recognizer
  • Get the model for your LUIS app and add the intents you need
  • Specify the language for speech recognition
  • Recognize speech from a file
  • Use asynchronous, event-driven continuous recognition

Prerequisites

Be sure you have the following before you begin this tutorial.

  • A LUIS account. You can get one for free through the LUIS portal.
  • Visual Studio 2019 (any edition).

LUIS and speech

LUIS integrates with the Speech Services to recognize intents from speech. You don't need a Speech Services subscription, just LUIS.

LUIS uses two kinds of keys:

  • Authoring key: lets you create and modify LUIS apps programmatically.
  • Endpoint key: authorizes access to a particular LUIS app.

The endpoint key is the LUIS key needed for this tutorial. This tutorial uses the example Home Automation LUIS app, which you can create by following Use prebuilt Home automation app. If you have created a LUIS app of your own, you can use it instead.

When you create a LUIS app, a starter key is automatically generated so you can test the app using text queries. This key does not enable the Speech Services integration and won't work with this tutorial. You must create a LUIS resource in the Azure dashboard and assign it to the LUIS app. You can use the free subscription tier for this tutorial.

After creating the LUIS resource in the Azure dashboard, log into the LUIS portal, choose your application on the My Apps page, then switch to the app's Manage page. Finally, click Keys and Endpoint settings in the sidebar.

LUIS portal keys and endpoint settings

On the Keys and Endpoint settings page:

  1. Scroll down to the Resources and Keys section and click Assign resource.

  2. In the Assign a key to your app dialog, choose the following:

    • Choose Microsoft as the Tenant.
    • Under Subscription Name, choose the Azure subscription that contains the LUIS resource you want to use.
    • Under Key, choose the LUIS resource that you want to use with the app.

In a moment, the new subscription appears in the table at the bottom of the page. Click the icon next to a key to copy it to the clipboard. (You may use either key.)

LUIS app subscription keys

Create a speech project in Visual Studio

  1. Open Visual Studio 2019.

  2. In the Start window, select Create a new project.

  3. Select Console App (.NET Framework), and then select Next.

  4. In Project name, enter helloworld, and then select Create.

  5. From the menu bar in Visual Studio, select Tools > Get Tools and Features, and check whether the .NET desktop development workload is available. If the workload hasn't been installed, mark the checkbox, then select Modify to start the installation. It may take a few minutes to download and install.

    If the checkbox next to .NET desktop development is selected, you can close the dialog box now.

    Enable .NET desktop development

The next step is to install the Speech SDK NuGet package, so you can reference it in the code.

  1. In the Solution Explorer, right-click helloworld, and then select Manage NuGet Packages to show the NuGet Package Manager.

    NuGet Package Manager

  2. In the upper-right corner, find the Package Source drop-down box, and make sure that nuget.org is selected.

  3. In the upper-left corner, select Browse.

  4. In the search box, type Microsoft.CognitiveServices.Speech and press Enter.

  5. Select Microsoft.CognitiveServices.Speech, and then select Install to install the latest stable version.

    Install Microsoft.CognitiveServices.Speech NuGet package

  6. Accept all agreements and licenses to start the installation.

    After the package is installed, a confirmation appears in the Package Manager Console window.

Now, to build and run the console application, create a platform configuration matching your computer's architecture.

  1. From the menu bar, select Build > Configuration Manager. The Configuration Manager dialog box appears.

    Configuration Manager dialog box

  2. In the Active solution platform drop-down box, select New. The New Solution Platform dialog box appears.

  3. In the Type or select the new platform drop-down box:

    • If you're running 64-bit Windows, select x64.
    • If you're running 32-bit Windows, select x86.
  4. Select OK and then Close.

Add the code

Open the file Program.cs in the Visual Studio project and replace the block of using statements at the beginning of the file with the following declarations.

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Intent;

Inside the provided Main() method, add the following code.

RecognizeIntentAsync().Wait();
Console.WriteLine("Please press Enter to continue.");
Console.ReadLine();
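
For reference, the complete Main() method then looks like the following. This is a minimal sketch; the signature generated by your project template may differ slightly, for example by taking a string[] args parameter.

static void Main()
{
    // Run the async recognition and block until it completes.
    RecognizeIntentAsync().Wait();
    Console.WriteLine("Please press Enter to continue.");
    Console.ReadLine();
}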

Create an empty asynchronous method RecognizeIntentAsync(), as shown here.

static async Task RecognizeIntentAsync()
{
}

In the body of this new method, add this code.

// Creates an instance of a speech config with specified subscription key
// and service region. Note that in contrast to other services supported by
// the Cognitive Services Speech SDK, the Language Understanding service
// requires a specific subscription key from https://www.luis.ai/.
// The Language Understanding service calls the required key 'endpoint key'.
// Once you've obtained it, replace the values below with your own Language Understanding subscription key
// and service region (e.g., "westus").
// The default language is "en-us".
var config = SpeechConfig.FromSubscription("YourLanguageUnderstandingSubscriptionKey", "YourLanguageUnderstandingServiceRegion");

// Creates an intent recognizer using microphone as audio input.
using (var recognizer = new IntentRecognizer(config))
{
    // Creates a Language Understanding model using the app id, and adds specific intents from your model
    var model = LanguageUnderstandingModel.FromAppId("YourLanguageUnderstandingAppId");
    recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName1", "id1");
    recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName2", "id2");
    recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName3", "any-IntentId-here");

    // Starts recognizing.
    Console.WriteLine("Say something...");

    // Starts intent recognition, and returns after a single utterance is recognized. The end of a
    // single utterance is determined by listening for silence at the end, or until a maximum of 15
    // seconds of audio is processed. The task returns the recognized text as the result.
    // Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for
    // single-shot recognition, like a command or query.
    // For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
    var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

    // Checks result.
    if (result.Reason == ResultReason.RecognizedIntent)
    {
        Console.WriteLine($"RECOGNIZED: Text={result.Text}");
        Console.WriteLine($"    Intent Id: {result.IntentId}.");
        Console.WriteLine($"    Language Understanding JSON: {result.Properties.GetProperty(PropertyId.LanguageUnderstandingServiceResponse_JsonResult)}.");
    }
    else if (result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"RECOGNIZED: Text={result.Text}");
        Console.WriteLine($"    Intent not recognized.");
    }
    else if (result.Reason == ResultReason.NoMatch)
    {
        Console.WriteLine($"NOMATCH: Speech could not be recognized.");
    }
    else if (result.Reason == ResultReason.Canceled)
    {
        var cancellation = CancellationDetails.FromResult(result);
        Console.WriteLine($"CANCELED: Reason={cancellation.Reason}");

        if (cancellation.Reason == CancellationReason.Error)
        {
            Console.WriteLine($"CANCELED: ErrorCode={cancellation.ErrorCode}");
            Console.WriteLine($"CANCELED: ErrorDetails={cancellation.ErrorDetails}");
            Console.WriteLine($"CANCELED: Did you update the subscription info?");
        }
    }
}

Replace the placeholders in this method with your LUIS subscription key, region, and app ID as follows.

  • YourLanguageUnderstandingSubscriptionKey: Your LUIS endpoint key. As previously noted, this must be a key obtained from your Azure dashboard, not a "starter key." You can find it on your app's Keys and Endpoint settings page (under Manage) in the LUIS portal.
  • YourLanguageUnderstandingServiceRegion: The short identifier for the region your LUIS subscription is in, such as westus for West US. See Regions.
  • YourLanguageUnderstandingAppId: The LUIS app ID. You can find it on your app's Settings page in the LUIS portal.
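
For illustration, here is how the first two calls look with values filled in. The key, region, and app ID below are made up; substitute your own.

// Hypothetical example values; replace them with your own.
var config = SpeechConfig.FromSubscription(
    "0123456789abcdef0123456789abcdef",       // LUIS endpoint key from the Azure dashboard
    "westus");                                // region of your LUIS resource
var model = LanguageUnderstandingModel.FromAppId(
    "11111111-2222-3333-4444-555555555555");  // LUIS app ID from the app's Settings page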

With these changes made, you can build (Ctrl+Shift+B) and run (F5) the tutorial application. When prompted, try saying "Turn off the lights" into your PC's microphone. The result is displayed in the console window.

The following sections include a discussion of the code.

Create an intent recognizer

The first step in recognizing intents in speech is to create a speech config from your LUIS endpoint key and region. Speech configs can be used to create recognizers for the various capabilities of the Speech SDK. The speech config has multiple ways to specify the subscription you want to use; here, we use FromSubscription, which takes the subscription key and region.

Note

Use the key and region of your LUIS subscription, not of a Speech Services subscription.

Next, create an intent recognizer using new IntentRecognizer(config). Since the configuration already knows which subscription to use, there's no need to specify the subscription key and endpoint again when creating the recognizer.
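
FromSubscription is not the only factory method. As a sketch of the alternatives mentioned above, a SpeechConfig can also be created from an authorization token or a custom service endpoint; the token and endpoint URI below are hypothetical placeholders.

// Alternative SpeechConfig factory methods (hypothetical values shown).
// From an authorization token obtained elsewhere:
var fromToken = SpeechConfig.FromAuthorizationToken("YourAuthorizationToken", "westus");

// From a custom service endpoint URI and subscription key:
var fromEndpoint = SpeechConfig.FromEndpoint(
    new Uri("wss://westus.example.invalid/speech"), "YourEndpointKey");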

Import a LUIS model and add intents

Now import the model from the LUIS app using LanguageUnderstandingModel.FromAppId() and add the LUIS intents that you wish to recognize via the recognizer's AddIntent() method. These two steps improve the accuracy of speech recognition by indicating words that the user is likely to use in their requests. It is not necessary to add all the app's intents if you do not need to recognize them all in your application.

Adding intents requires three arguments: the LUIS model (which has been created and is named model), the intent name, and an intent ID. The difference between the ID and the name is as follows.

  • intentName: The name of the intent as defined in the LUIS app. Must match the LUIS intent name exactly.
  • intentID: An ID assigned to a recognized intent by the Speech SDK. Can be whatever you like; it does not need to correspond to the intent name as defined in the LUIS app. If multiple intents are handled by the same code, for instance, you could use the same ID for them.

The Home Automation LUIS app has two intents: one for turning on a device, and another for turning a device off. The lines below add these intents to the recognizer; replace the three AddIntent lines in the RecognizeIntentAsync() method with this code.

recognizer.AddIntent(model, "HomeAutomation.TurnOff", "off");
recognizer.AddIntent(model, "HomeAutomation.TurnOn", "on");

Instead of adding individual intents, you can also use the AddAllIntents method to add all the intents in a model to the recognizer.
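
For example, the three AddIntent() calls could be replaced with a single call. This is a sketch; AddAllIntents also has an overload that takes an intent ID to assign to every intent.

// Adds all intents from the LUIS model to the recognizer at once.
recognizer.AddAllIntents(model);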

Start recognition

With the recognizer created and the intents added, recognition can begin. The Speech SDK supports both single-shot and continuous recognition.

  • Single-shot: RecognizeOnceAsync(). Returns the recognized intent, if any, after one utterance.
  • Continuous: StartContinuousRecognitionAsync() and StopContinuousRecognitionAsync(). Recognizes multiple utterances and emits events (such as Recognizing and Recognized) when results are available.

The tutorial application uses single-shot mode and so calls RecognizeOnceAsync() to begin recognition. The result is an IntentRecognitionResult object containing information about the intent recognized. The LUIS JSON response is extracted by the following expression:

result.Properties.GetProperty(PropertyId.LanguageUnderstandingServiceResponse_JsonResult)

The tutorial application doesn't parse the JSON result; it only displays it in the console window.

LUIS recognition results
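
If your application needs the intent details, you can parse the JSON yourself. The following is a minimal sketch, assuming the Newtonsoft.Json NuGet package is installed and that the response follows the LUIS v2 shape with topScoringIntent and entities fields; the helper method name is ours, not part of the SDK.

// Add at the top of Program.cs:
using Newtonsoft.Json.Linq;

// Hypothetical helper that extracts the top intent and entities
// from the LUIS JSON response string.
static void PrintLuisResponse(string json)
{
    var response = JObject.Parse(json);

    // The highest-scoring intent, if LUIS returned one.
    var topIntent = response["topScoringIntent"];
    if (topIntent != null)
    {
        Console.WriteLine($"Intent: {topIntent["intent"]} (score: {topIntent["score"]})");
    }

    // Any entities LUIS extracted from the utterance.
    foreach (var entity in response["entities"] ?? new JArray())
    {
        Console.WriteLine($"Entity: {entity["entity"]} ({entity["type"]})");
    }
}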

Specify recognition language

By default, LUIS recognizes intents in US English (en-us). By assigning a locale code to the SpeechRecognitionLanguage property of the speech configuration, you can recognize intents in other languages. For example, add config.SpeechRecognitionLanguage = "de-de"; in our tutorial application before creating the recognizer to recognize intents in German. See Supported Languages.
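
As a minimal sketch, the German configuration would look like this:

// Set the recognition language before creating the recognizer.
var config = SpeechConfig.FromSubscription(
    "YourLanguageUnderstandingSubscriptionKey",
    "YourLanguageUnderstandingServiceRegion");
config.SpeechRecognitionLanguage = "de-de";

using (var recognizer = new IntentRecognizer(config))
{
    // ... add intents and recognize as before ...
}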

Continuous recognition from a file

The following code illustrates two additional capabilities of intent recognition using the Speech SDK. The first, previously mentioned, is continuous recognition, where the recognizer emits events when results are available. These events can then be processed by event handlers that you provide. With continuous recognition, you call the recognizer's StartContinuousRecognitionAsync() to start recognition instead of RecognizeOnceAsync().

The other capability is reading the audio containing the speech to be processed from a WAV file. This involves creating an audio configuration that can be used when creating the intent recognizer. The file must be single-channel (mono) with a sampling rate of 16 kHz.

To try out these features, replace the body of the RecognizeIntentAsync() method with the following code.

// Creates an instance of a speech config with specified subscription key
// and service region. Note that in contrast to other services supported by
// the Cognitive Services Speech SDK, the Language Understanding service
// requires a specific subscription key from https://www.luis.ai/.
// The Language Understanding service calls the required key 'endpoint key'.
// Once you've obtained it, replace the values below with your own Language Understanding subscription key
// and service region (e.g., "westus").
var config = SpeechConfig.FromSubscription("YourLanguageUnderstandingSubscriptionKey", "YourLanguageUnderstandingServiceRegion");

// Creates an intent recognizer using file as audio input.
// Replace with your own audio file name.
using (var audioInput = AudioConfig.FromWavFileInput("whatstheweatherlike.wav"))
{
    using (var recognizer = new IntentRecognizer(config, audioInput))
    {
        // The TaskCompletionSource to stop recognition.
        var stopRecognition = new TaskCompletionSource<int>();

        // Creates a Language Understanding model using the app id, and adds specific intents from your model
        var model = LanguageUnderstandingModel.FromAppId("YourLanguageUnderstandingAppId");
        recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName1", "id1");
        recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName2", "id2");
        recognizer.AddIntent(model, "YourLanguageUnderstandingIntentName3", "any-IntentId-here");

        // Subscribes to events.
        recognizer.Recognizing += (s, e) => {
            Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
        };

        recognizer.Recognized += (s, e) => {
            if (e.Result.Reason == ResultReason.RecognizedIntent)
            {
                Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                Console.WriteLine($"    Intent Id: {e.Result.IntentId}.");
                Console.WriteLine($"    Language Understanding JSON: {e.Result.Properties.GetProperty(PropertyId.LanguageUnderstandingServiceResponse_JsonResult)}.");
            }
            else if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
                Console.WriteLine($"    Intent not recognized.");
            }
            else if (e.Result.Reason == ResultReason.NoMatch)
            {
                Console.WriteLine($"NOMATCH: Speech could not be recognized.");
            }
        };

        recognizer.Canceled += (s, e) => {
            Console.WriteLine($"CANCELED: Reason={e.Reason}");

            if (e.Reason == CancellationReason.Error)
            {
                Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
                Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
                Console.WriteLine($"CANCELED: Did you update the subscription info?");
            }

            stopRecognition.TrySetResult(0);
        };

        recognizer.SessionStarted += (s, e) => {
            Console.WriteLine("\n    Session started event.");
        };

        recognizer.SessionStopped += (s, e) => {
            Console.WriteLine("\n    Session stopped event.");
            Console.WriteLine("\nStop recognition.");
            stopRecognition.TrySetResult(0);
        };


        // Starts continuous recognition. Use StopContinuousRecognitionAsync() to stop recognition.
        Console.WriteLine("Say something...");
        await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);

        // Waits for completion.
        // Use Task.WaitAny to keep the task rooted.
        Task.WaitAny(new[] { stopRecognition.Task });

        // Stops recognition.
        await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
    }
}

Revise the code to include your LUIS endpoint key, region, and app ID and to add the Home Automation intents, as before. Change whatstheweatherlike.wav to the name of your audio file. Then build and run.

Get the samples

For the latest samples, see the Cognitive Services Speech SDK sample code repository on GitHub.

Look for the code from this article in the samples/csharp/sharedcontent/console folder.

Next steps