Best practices for building a language understanding app with Cognitive Services
Use the app authoring process to build your LUIS app.
- Build language model
- Add a few training example utterances (10-15 per intent)
- Test from endpoint
- Add features
Once your app is published, use the authoring cycle of add features, publish, and test from endpoint. Do not begin the next authoring cycle by adding more example utterances. Doing so prevents LUIS from learning your model from real-world user utterances.
For LUIS to learn efficiently, do not expand the utterances until the current set of both example and endpoint utterances is returning confident, high prediction scores. Improve scores using active learning, patterns, and phrase lists.
Do and Don't
The following list includes best practices for LUIS apps:
Do define distinct intents
Make sure the vocabulary for each intent is just for that intent and not overlapping with a different intent. For example, if you want to have an app that handles travel arrangements such as airline flights and hotels, you can choose to have these subject areas as separate intents or the same intent with entities for specific data inside the utterance.
If the vocabulary between two intents is the same, combine the intents and use entities.
Consider the following example utterances:
- Book a flight
- Book a hotel
"Book a flight" and "Book a hotel" share the vocabulary "book a". Because the format is the same, they should be one intent, with the differing words "flight" and "hotel" extracted as entities.
Do find sweet spot for intents
Use prediction data from LUIS to determine whether your intents overlap. Overlapping intents confuse LUIS: the top-scoring intent ends up too close to another intent. Because LUIS does not use the exact same path through the data each time it trains, an overlapping intent has a chance of being first or second after any given training run. You want the utterance's scores for the intents to be far enough apart that this flip-flop doesn't happen. Well-distinguished intents should result in the expected top intent every time.
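One way to spot overlapping intents is to look at the gap between the two highest scores for an utterance. The following is a minimal sketch, not LUIS tooling. It assumes the prediction has already been reduced to a list of dictionaries with `intent` and `score` keys, and the `min_margin` threshold of 0.15 is an arbitrary illustrative value, not a LUIS recommendation.

```python
from typing import Dict, List, Tuple


def top_two_margin(intents: List[Dict[str, object]]) -> Tuple[str, str, float]:
    """Return the top intent, the runner-up, and the gap between their scores."""
    if len(intents) < 2:
        raise ValueError("need at least two scored intents to compute a margin")
    # Sort a copy so the caller's list is left untouched.
    ranked = sorted(intents, key=lambda item: float(item["score"]), reverse=True)
    top, runner_up = ranked[0], ranked[1]
    margin = float(top["score"]) - float(runner_up["score"])
    return str(top["intent"]), str(runner_up["intent"]), margin


def is_overlapping(intents: List[Dict[str, object]], min_margin: float = 0.15) -> bool:
    """Flag a prediction whose top two scores are closer than min_margin (assumed threshold)."""
    _, _, margin = top_two_margin(intents)
    return margin < min_margin


# Hypothetical scores for the utterance "Book a room near the airport".
sample = [
    {"intent": "BookFlight", "score": 0.52},
    {"intent": "BookHotel", "score": 0.47},
    {"intent": "None", "score": 0.03},
]

if __name__ == "__main__":
    print(top_two_margin(sample), is_overlapping(sample))
```

Utterances flagged this way are candidates for review: either the two intents should be merged, or their example utterances need more distinct vocabulary.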
Do build the app iteratively
Keep a separate set of utterances that isn't used as example utterances or endpoint utterances. Keep improving the app for your test set. Adapt the test set to reflect real user utterances. Use this test set to evaluate each iteration or version of the app.
Developers should have three sets of data. The first is the example utterances for building the model. The second is for testing the model at the endpoint. The third is the blind test data used in batch testing. This last set isn't used in training the application and isn't sent to the endpoint.
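Keeping the blind test set separate is easy to get wrong as the sets grow. The sketch below checks for leakage, assuming each set is held locally as a plain list of strings. The function name and the normalization rule (lowercase, collapsed whitespace) are illustrative choices, not part of LUIS.

```python
from typing import Iterable, List


def _normalize(utterance: str) -> str:
    """Lowercase and collapse whitespace so trivial variants compare equal."""
    return " ".join(utterance.lower().split())


def leaked_utterances(
    blind_test: Iterable[str],
    examples: Iterable[str],
    endpoint_tests: Iterable[str],
) -> List[str]:
    """Return blind-test utterances that also appear in the other two sets."""
    seen = {_normalize(u) for u in examples} | {_normalize(u) for u in endpoint_tests}
    return [u for u in blind_test if _normalize(u) in seen]


if __name__ == "__main__":
    leaks = leaked_utterances(
        blind_test=["Book a  Hotel", "Cancel my trip"],
        examples=["Book a flight"],
        endpoint_tests=["book a hotel"],
    )
    print(leaks)
```

Any utterance this reports should be removed from the blind test set or replaced, so that batch test results reflect utterances the app has not seen.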
Do add phrase lists and patterns in later iterations
Phrase lists allow you to define dictionaries of words related to your app domain. Seed your phrase list with a few words then use the suggest feature so LUIS knows about more words in the vocabulary specific to your app. Don't add every word to the vocabulary since the phrase list isn't an exact match.
Real user utterances from the endpoint, very similar to each other, may reveal patterns of word choice and placement. The pattern feature takes this word choice and placement along with regular expressions to improve your prediction accuracy. A regular expression in the pattern allows for words and punctuation you intend to ignore while still matching the pattern.
Do not apply these practices before your app has received endpoint requests because that skews the confidence.
Balance your utterances across all intents
For LUIS predictions to be accurate, the quantity of example utterances in each intent (except for the None intent) must be relatively equal.
If you have an intent with 100 example utterances and an intent with 20 example utterances, the 100-utterance intent will have a higher rate of prediction.
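The imbalance described above can be checked before training. This is a rough sketch that assumes the example utterances are available locally as `(text, intent)` pairs. The `max_ratio` of 2.0 is an assumed cutoff for "relatively equal", chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def utterance_counts(examples: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count example utterances per intent, skipping the None intent."""
    return dict(Counter(intent for _, intent in examples if intent != "None"))


def imbalanced_intents(counts: Dict[str, int], max_ratio: float = 2.0) -> List[str]:
    """Return intents whose count exceeds max_ratio times the smallest intent."""
    if not counts:
        return []
    smallest = min(counts.values())
    return sorted(name for name, n in counts.items() if n > max_ratio * smallest)


if __name__ == "__main__":
    # The 100-versus-20 case from the text.
    print(imbalanced_intents({"BookFlight": 100, "CancelFlight": 20}))
```

With the counts from the example in the text, the 100-utterance intent is the one reported.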
Do add example utterances to None intent
This intent is the fallback intent, indicating everything outside your application. Add one example utterance to the None intent for every 10 example utterances in the rest of your LUIS app.
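The one-in-ten guideline is simple arithmetic. The sketch below assumes per-intent counts are held in a dictionary keyed by intent name. Rounding up when the total is not a multiple of 10 is an assumption, since the guideline does not say how to round.

```python
import math
from typing import Dict


def recommended_none_count(counts: Dict[str, int]) -> int:
    """One None utterance per 10 utterances in the other intents, rounded up (assumed)."""
    total = sum(n for intent, n in counts.items() if intent != "None")
    return math.ceil(total / 10)


def none_shortfall(counts: Dict[str, int]) -> int:
    """How many more None utterances are needed to meet the guideline."""
    return max(0, recommended_none_count(counts) - counts.get("None", 0))


if __name__ == "__main__":
    counts = {"BookFlight": 30, "CancelFlight": 15, "None": 2}
    print(recommended_none_count(counts), none_shortfall(counts))
```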
Do leverage the suggest feature for active learning
Use active learning's Review endpoint utterances on a regular basis, instead of adding more example utterances to intents. Because the app is constantly receiving endpoint utterances, this list is growing and changing.
Do monitor the performance of your app
Monitor the prediction accuracy using a batch test set.
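A batch test set makes accuracy easy to track from one version to the next. The following sketch computes top-intent accuracy over `(utterance, expected_intent)` pairs. The `predict` callable is hypothetical: in practice it would wrap a call to your published endpoint, and here a keyword stub stands in so the example runs on its own.

```python
from typing import Callable, List, Sequence, Tuple


def batch_accuracy(
    test_set: Sequence[Tuple[str, str]],
    predict: Callable[[str], str],
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return accuracy and a list of (utterance, expected, predicted) misses."""
    if not test_set:
        raise ValueError("test set is empty")
    misses = []
    for utterance, expected in test_set:
        predicted = predict(utterance)
        if predicted != expected:
            misses.append((utterance, expected, predicted))
    accuracy = (len(test_set) - len(misses)) / len(test_set)
    return accuracy, misses


def keyword_stub(utterance: str) -> str:
    """Stand-in predictor for illustration only; not how LUIS predicts."""
    return "BookFlight" if "flight" in utterance.lower() else "None"


if __name__ == "__main__":
    blind_set = [
        ("Book a flight to Paris", "BookFlight"),
        ("What's the weather", "None"),
        ("Reserve a seat to Cairo", "BookFlight"),
    ]
    accuracy, misses = batch_accuracy(blind_set, keyword_stub)
    print(f"{accuracy:.2f}", misses)
```

Recording the accuracy and the list of misses for each version gives a concrete basis for comparing iterations.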
Don't add many example utterances to intents
After the app is published, only add utterances from active learning in the iterative process. If utterances are too similar, add a pattern.
Don't use LUIS as a training platform
LUIS is specific to a language model's domain. It isn't meant to work as a general natural language training platform.
Don't add many example utterances of the same format, ignoring other formats
LUIS expects variations in an intent's utterances. The utterances can vary while having the same overall meaning. Variations can include utterance length, word choice, and word placement.
|Don't use same format|Do use varying format|
|--|--|
|Buy a ticket to Seattle|Buy 1 ticket to Seattle|
|Buy a ticket to Paris|Reserve two seats on the red eye to Paris next Monday|
|Buy a ticket to Orlando|I would like to book 3 tickets to Orlando for spring break|
The second column uses different verbs (buy, reserve, book), different quantities (1, two, 3), and different arrangements of words but all have the same intention of purchasing airline tickets for travel.
Don't mix the definition of intents and entities
Create an intent for any action your bot will take. Use entities as parameters that make that action possible.
For a chatbot that will book airline flights, create a BookFlight intent. Do not create an intent for every airline or every destination. Use those pieces of data as entities and mark them in the example utterances.
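To make the split concrete, the sketch below represents one labeled example utterance as a single intent plus entity spans. The field names (`text`, `intent`, `entities`, `startPos`, `endPos`) and the inclusive end position are assumptions made for illustration, so check them against the label format your LUIS version expects. The `Destination` and `Airline` entity names are hypothetical.

```python
from typing import Dict, List


def label_utterance(text: str, intent: str, entity_values: Dict[str, str]) -> Dict[str, object]:
    """Build one labeled example: a single intent plus entity spans found in the text."""
    entities: List[Dict[str, object]] = []
    for name, value in entity_values.items():
        start = text.find(value)
        if start == -1:
            raise ValueError(f"{value!r} does not occur in the utterance")
        # endPos is treated as inclusive here (an assumption about the label format).
        entities.append({"entity": name, "startPos": start, "endPos": start + len(value) - 1})
    return {"text": text, "intent": intent, "entities": entities}


if __name__ == "__main__":
    example = label_utterance(
        "Book a flight to Paris on Contoso Air",
        "BookFlight",
        {"Destination": "Paris", "Airline": "Contoso Air"},
    )
    print(example)
```

The action (booking a flight) is carried by the one `BookFlight` intent, while the destination and airline vary as entities inside it.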
Don't create phrase lists with all the possible values
Provide a few examples in the phrase lists but not every word. LUIS generalizes and takes context into account.
Don't add many patterns
Don't add too many patterns. LUIS is meant to learn quickly with fewer examples. Don't overload the system unnecessarily.
Don't train and publish with every single example utterance
Add 10 or 15 utterances before training and publishing. That allows you to see the impact on prediction accuracy. Adding a single utterance may not have a visible impact on the score.
Do use versions for each app iteration
Each authoring cycle should be within a new version, cloned from an existing version. LUIS has no limit for versions. A version name is used as part of the API route, so choose characters that are allowed in a URL and keep the name within the 10-character limit. Develop a version name strategy to keep your versions organized.
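A version naming strategy is easier to enforce with a small check. This sketch applies the 10-character limit stated above. The allowed character set is an assumption: it uses the unreserved characters from RFC 3986 (letters, digits, `-`, `.`, `_`, `~`) as a conservative reading of "characters allowed in a URL", and LUIS itself may accept more or fewer.

```python
import re

MAX_VERSION_LENGTH = 10  # limit stated in the guidance above

# Assumed safe set: RFC 3986 unreserved characters.
_URL_SAFE = re.compile(r"[A-Za-z0-9._~-]+")


def is_valid_version_name(name: str) -> bool:
    """Check length and URL-safe characters for a candidate version name."""
    return len(name) <= MAX_VERSION_LENGTH and _URL_SAFE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["0.1", "2.0-beta", "release-2024", "v 1"]:
        print(candidate, is_valid_version_name(candidate))
```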
- Learn how to plan your LUIS app.