Get started with Custom Keyword
In this quickstart, you learn the basics of working with custom keywords using Speech Studio and the Speech SDK. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, then export a model file that you use with the Speech SDK in your applications.
The steps in this article require a Speech subscription and the Speech SDK. If you don't already have a subscription, try the Speech service for free. To get the SDK, see the install guide for your platform.
Create a keyword in Speech Studio
Before you can use a custom keyword, you need to create one on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.
Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.
At the Custom Keyword page, create a New project.
Enter a Name and an optional Description, then select the language. You need one project per language, and support is currently limited to the
Select your project from the list.
To create a new keyword model, click Train model.
Enter a Name for the model, an optional Description, and the Keyword of your choice, then click Next. See the guidelines on choosing an effective keyword.
The portal creates candidate pronunciations for your keyword. Listen to each candidate by clicking the play buttons and remove the checks next to any pronunciations that are incorrect. Once only good pronunciations are checked, click Train to begin generating the keyword model.
It may take up to thirty minutes for the model to be generated. The keyword list will change from Processing to Succeeded when the model is complete. You can then download the file.
The downloaded file is a .zip archive. Extract the archive, and you see a file with the .table extension. This is the file you use with the SDK in the next section, so make sure to note its path. The file name mirrors your keyword name, for example a keyword Activate device has the file name
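If you prefer to script the extraction step, the following is a minimal sketch that uses only the .NET base class library, not the Speech SDK. The class name KeywordModelFiles, the method name ExtractTableFile, and the paths in the usage note are hypothetical and chosen for illustration; the sketch also assumes the target directory does not already contain the extracted files.

```csharp
#nullable enable
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

public static class KeywordModelFiles
{
    // Extracts the downloaded archive into targetDirectory and returns the full
    // path of the first .table file found, or null if the archive has none.
    public static string? ExtractTableFile(string zipPath, string targetDirectory)
    {
        // Creates targetDirectory if needed; throws if a file to extract already exists.
        ZipFile.ExtractToDirectory(zipPath, targetDirectory);

        return Directory
            .EnumerateFiles(targetDirectory, "*.table", SearchOption.AllDirectories)
            .FirstOrDefault();
    }
}
```

For example, KeywordModelFiles.ExtractTableFile("downloads/keyword.zip", "models") would return the path you later pass to the SDK, assuming those two hypothetical locations exist on your machine.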
Use a keyword model with the SDK
First, load your keyword model file using the FromFile() static function, which returns a KeywordRecognitionModel. Use the path to the .table file you downloaded from Speech Studio. Additionally, you create an AudioConfig using the default microphone, then instantiate a new KeywordRecognizer using the audio configuration.
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var keywordRecognizer = new KeywordRecognizer(audioConfig);
Next, run keyword recognition with a single call to RecognizeOnceAsync(), passing your model object. This starts a keyword recognition session that lasts until the keyword is recognized. Because the call does not complete until then, this pattern generally suits multi-threaded applications, or use cases where you may wait for a wake word indefinitely.
KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);
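To see what the awaited call hands back, here is a minimal end-to-end sketch that inspects the result. It assumes a working default microphone and a .table file at the placeholder path; the class name KeywordDemo and the console messages are illustrative only. The check against ResultReason.RecognizedKeyword is one reasonable way to confirm that the session ended because the keyword was heard.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class KeywordDemo
{
    public static async Task Main()
    {
        // Placeholder path: point this at your own extracted .table file.
        var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");

        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var keywordRecognizer = new KeywordRecognizer(audioConfig);

        Console.WriteLine("Listening for the keyword...");

        // The task completes only once the session ends, typically when the keyword is heard.
        KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);

        if (result.Reason == ResultReason.RecognizedKeyword)
        {
            Console.WriteLine($"Keyword recognized: {result.Text}");
        }
        else
        {
            Console.WriteLine($"Recognition ended without a keyword: {result.Reason}");
        }
    }
}
```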
The example shown here uses local keyword recognition: it does not require an object for authentication context and does not contact the back end. However, you can run both keyword recognition and verification using a direct back-end connection.
Other classes in the Speech SDK support continuous recognition (for both speech and intent recognition) with keyword recognition. This allows you to use the same code you would normally use for continuous recognition, with the ability to reference a .table file for your keyword model.
For speech-to-text, follow the same design pattern shown in the quickstart to set up continuous recognition. Then, replace the call that starts continuous recognition with recognizer.StartKeywordRecognitionAsync(KeywordRecognitionModel), and pass your KeywordRecognitionModel object. To stop continuous recognition with keyword recognition, use recognizer.StopKeywordRecognitionAsync() instead of the call that stops continuous recognition.
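As an illustration of this pattern, the sketch below starts keyword-triggered continuous recognition with a SpeechRecognizer and stops it when you press Enter. The subscription key, region, and model path are placeholders, and the class name ContinuousKeywordDemo and the event handler body are assumptions made for the example rather than part of the original quickstart.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

public static class ContinuousKeywordDemo
{
    public static async Task Main()
    {
        // Placeholders: supply your own Speech resource key, region, and model path.
        var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");

        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Report keyword hits and the speech recognized after them.
        recognizer.Recognized += (sender, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedKeyword)
            {
                Console.WriteLine($"Keyword: {e.Result.Text}");
            }
            else if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                Console.WriteLine($"Speech: {e.Result.Text}");
            }
        };

        // Start listening; recognition is gated on the keyword model.
        await recognizer.StartKeywordRecognitionAsync(keywordModel);

        Console.WriteLine("Say the keyword, then a phrase. Press Enter to stop.");
        Console.ReadLine();

        await recognizer.StopKeywordRecognitionAsync();
    }
}
```

Unlike the local-only example earlier in this article, this sketch contacts the Speech service, so it needs valid credentials in place of the placeholders.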
See the sample on GitHub for using your Custom Keyword model with the Python SDK.
See the sample on GitHub for using your Custom Keyword model with the Objective-C SDK.
Test your custom keyword with the Speech Devices SDK Quickstart.