Get started with Custom Keyword

In this quickstart, you learn the basics of working with custom keywords by using Speech Studio and the Speech SDK. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, then export a model file that you use with the Speech SDK in your applications.

Prerequisites

The steps in this article require a Speech subscription and the Speech SDK. If you don't already have a subscription, try the Speech service for free. To get the SDK, see the install guide for your platform.

Create a keyword in Speech Studio

Before you can use a custom keyword, you need to create it on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.

Important

Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.

  1. Go to Speech Studio and select Sign in or, if you don't yet have a Speech subscription, choose Create a subscription.

  2. On the Custom Keyword page, create a New project.

  3. Enter a Name and an optional Description, and select the language. You need one project per language, and support is currently limited to en-US.

    Describe your keyword project

  4. Select your project from the list.

    Select your keyword project

  5. To create a new keyword model, click Train model.

  6. Enter a Name for the model, an optional Description, and the Keyword of your choice, then click Next. See the guidelines on choosing an effective keyword.

    Enter your keyword

  7. The portal creates candidate pronunciations for your keyword. Listen to each candidate by clicking its play button, and clear the check box next to any pronunciation that is incorrect. When only correct pronunciations remain checked, click Train to begin generating the keyword model.

    Screenshot that shows where you choose the correct pronunciations.

  8. It may take up to thirty minutes for the model to be generated. The status in the keyword list changes from Processing to Succeeded when the model is complete. You can then download the file.

    Review your keyword

  9. The downloaded file is a .zip archive. Extract the archive to find a file with the .table extension. This is the file you use with the SDK in the next section, so make sure to note its path. The file name mirrors your keyword name; for example, the keyword Activate device has the file name Activate_device.table.
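The extraction in the last step can also be scripted. The following is a minimal sketch, not part of the Speech SDK: the class name `KeywordModelFiles` and both of its methods are hypothetical helpers, and the underscore naming rule is an assumption generalized from the single `Activate_device.table` example above.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical helper class; not part of the Speech SDK.
public static class KeywordModelFiles
{
    // Assumption: spaces in the keyword become underscores, generalized from
    // the one documented example ("Activate device" -> "Activate_device.table").
    public static string ExpectedTableFileName(string keyword) =>
        keyword.Trim().Replace(' ', '_') + ".table";

    // Extracts the downloaded archive and returns the full path of the first
    // .table file found, or null if the archive contains none.
    // Note: ExtractToDirectory throws if a file to be extracted already exists.
    public static string? FindTableFile(string zipPath, string extractDirectory)
    {
        ZipFile.ExtractToDirectory(zipPath, extractDirectory);

        // Search subfolders too, because the archive layout is not documented here.
        string? match = Directory
            .EnumerateFiles(extractDirectory, "*.table", SearchOption.AllDirectories)
            .FirstOrDefault();

        return match is null ? null : Path.GetFullPath(match);
    }
}
```

The path returned by `FindTableFile` is the value you would pass to the SDK when you load the model.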

Use a keyword model with the SDK

First, load your keyword model file using the static FromFile() method, which returns a KeywordRecognitionModel. Use the path to the .table file you downloaded from Speech Studio. Then create an AudioConfig that uses the default microphone, and instantiate a new KeywordRecognizer with that audio configuration.

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Load the keyword model from the .table file exported from Speech Studio.
var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");
// Capture audio from the default microphone.
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
// Create a recognizer that listens on that audio input.
using var keywordRecognizer = new KeywordRecognizer(audioConfig);

Next, run keyword recognition with a single call to RecognizeOnceAsync(), passing your model object. This starts a keyword recognition session that lasts until the keyword is recognized. For this reason, you generally use this design pattern in multi-threaded applications, or in use cases where you may wait for a wake word indefinitely.

KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);
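Because the session lasts until the keyword is heard, an application may want to limit how long it waits and to inspect the result afterward. The following is a minimal sketch of one way to do that. The 30-second limit is an illustrative assumption, not an SDK default, and the setup lines repeat the earlier snippet so that the example stands alone.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var keywordRecognizer = new KeywordRecognizer(audioConfig);

// Start listening without awaiting yet; the task completes when the keyword is heard.
Task<KeywordRecognitionResult> pending = keywordRecognizer.RecognizeOnceAsync(keywordModel);

// Illustrative 30-second limit (an assumption chosen for this sketch).
Task finished = await Task.WhenAny(pending, Task.Delay(TimeSpan.FromSeconds(30)));
if (finished != pending)
{
    // Nothing was heard in time, so end the keyword recognition session.
    await keywordRecognizer.StopRecognitionAsync();
    Console.WriteLine("Timed out waiting for the keyword.");
    return;
}

KeywordRecognitionResult result = await pending;
if (result.Reason == ResultReason.RecognizedKeyword)
{
    Console.WriteLine($"Keyword recognized: {result.Text}");
}
else
{
    Console.WriteLine($"Keyword not recognized (reason: {result.Reason}).");
}
```

Running this requires the Speech SDK package and a working microphone, so treat it as a starting point rather than a tested implementation.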

Note

The example shown here uses local keyword recognition: it doesn't require a SpeechConfig object for authentication context, and it doesn't contact the back end. However, you can run both keyword recognition and verification by using a direct back-end connection.

Continuous recognition

Other classes in the Speech SDK support continuous recognition (for both speech and intent recognition) with keyword recognition. This allows you to use the same code you would normally use for continuous recognition, with the ability to reference a .table file for your keyword model.

For speech-to-text, follow the same design pattern shown in the quickstart to set up continuous recognition. Then, replace the call to recognizer.StartContinuousRecognitionAsync() with recognizer.StartKeywordRecognitionAsync(KeywordRecognitionModel), and pass your KeywordRecognitionModel object. To stop continuous recognition with keyword recognition, use recognizer.StopKeywordRecognitionAsync() instead of recognizer.StopContinuousRecognitionAsync().

Intent recognition uses an identical pattern with the StartKeywordRecognitionAsync and StopKeywordRecognitionAsync functions.
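The speech-to-text pattern described above might look like the following sketch. The subscription key, region, and file path are placeholders, and the console-based stop condition is an assumption made only to keep the example short.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Placeholder credentials: replace with your own Speech resource key and region.
var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");

using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

// Report the keyword and any speech recognized in the session.
recognizer.Recognized += (sender, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedKeyword)
    {
        Console.WriteLine($"Keyword: {e.Result.Text}");
    }
    else if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        Console.WriteLine($"Speech: {e.Result.Text}");
    }
};

// Used in place of StartContinuousRecognitionAsync().
await recognizer.StartKeywordRecognitionAsync(keywordModel);

Console.WriteLine("Listening. Press Enter to stop.");
Console.ReadLine();

// Used in place of StopContinuousRecognitionAsync().
await recognizer.StopKeywordRecognitionAsync();
```

Unlike the local KeywordRecognizer example, this sketch needs a SpeechConfig because the speech that follows the keyword is sent to the Speech service.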

See the sample on GitHub for using your Custom Keyword model with the Python SDK.

See the sample on GitHub for using your Custom Keyword model with the Objective-C SDK.

Next steps

Test your custom keyword with the Speech Devices SDK Quickstart.