Batch testing with a set of example utterances

Batch testing is a comprehensive test of your current trained model that measures its performance in LUIS. The datasets used for batch testing should not include example utterances that are already in the intents, or utterances received from the prediction runtime endpoint.

Caution

This document has not been updated with text and screenshots for the latest LUIS portal.

Import a dataset file for batch testing

  1. Select Test in the top bar, and then select the Batch testing panel.

    Batch Testing Link

  2. Select Import dataset. The Import new dataset dialog box appears. Select Choose File and locate a correctly formatted JSON file that contains no more than 1,000 utterances to test.

    Import errors are reported in a red notification bar at the top of the browser. When an import has errors, no dataset is created. For more information, see Common errors.

  3. In the Dataset Name field, enter a name for your dataset file. The dataset file contains an array of utterances, each with its labeled intent and entities. Review the example batch file for syntax.

  4. Select Done. The dataset file is added.
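Because a failed import creates no dataset, it can help to check the file before uploading it. The following is a minimal sketch of such a check, not an official LUIS tool. It assumes each utterance is an object with `text`, `intent`, and `entities` fields, and that each entity carries `startPos` and `endPos` as inclusive character offsets; confirm those field names against the example batch file. The function name `validate_batch` is hypothetical.

```python
import json

MAX_UTTERANCES = 1000  # limit stated in the import step above


def validate_batch(utterances):
    """Return a list of problems found in a parsed batch dataset (empty if none).

    Field names (text, intent, entities, startPos, endPos) are assumptions
    about the batch file format; check them against the example batch file.
    """
    if not isinstance(utterances, list):
        return ["top-level JSON value must be an array of utterances"]
    problems = []
    if len(utterances) > MAX_UTTERANCES:
        problems.append(
            f"{len(utterances)} utterances exceeds the limit of {MAX_UTTERANCES}"
        )
    for i, item in enumerate(utterances):
        if not isinstance(item, dict):
            problems.append(f"item {i}: expected an object")
            continue
        text = item.get("text")
        if not isinstance(text, str) or not text:
            problems.append(f"item {i}: missing 'text'")
            continue
        if not isinstance(item.get("intent"), str):
            problems.append(f"item {i}: missing 'intent'")
        entities = item.get("entities") or []
        if not isinstance(entities, list):
            problems.append(f"item {i}: 'entities' must be an array")
            continue
        for ent in entities:
            if not isinstance(ent, dict):
                problems.append(f"item {i}: entity must be an object")
                continue
            start, end = ent.get("startPos"), ent.get("endPos")
            in_range = (
                isinstance(start, int)
                and isinstance(end, int)
                and 0 <= start <= end < len(text)  # assumes inclusive endPos
            )
            if not in_range:
                problems.append(f"item {i}: entity span out of range")
    return problems


# A one-utterance dataset, parsed from JSON text as it would be from a file.
sample = json.loads(
    '[{"text": "switch on", "intent": "TurnAllOn", "entities": []}]'
)
print(validate_batch(sample))
```

An empty list means the sketch found nothing wrong; it does not guarantee that the portal accepts the file.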

Run, rename, export, or delete dataset

To run, rename, export, or delete the dataset, use the ellipsis (...) button at the end of the dataset row.

Dataset Actions

Run a batch test on your trained app

To run the test, select the dataset name. When the test completes, the dataset row displays the test result.

Batch Test Result

The downloadable dataset is the same file that was uploaded for batch testing.

State                                  Meaning
Successful test (green circle icon)    All utterances are successful.
Failing test (red x icon)              At least one utterance intent did not match the prediction.
Ready to test (icon)                   Test is ready to run.

View batch test results

To review the batch test results, select See results.

Batch test results

Filter chart results

To filter the chart by a specific intent or entity, select the intent or entity in the right-side filtering panel. The data points and their distribution update in the graph according to your selection.

Visualized Batch Test Result

View single-point utterance data

In the chart, hover over a data point to see the certainty score of its prediction. Select a data point to retrieve its corresponding utterance in the utterances list at the bottom of the page.

Selected utterance

View section data

In the four-section chart, select a section name, such as False Positive at the top-right of the chart. All utterances in that section are listed below the chart.

Selected utterances by section

In the preceding image, the utterance switch on is labeled with the TurnAllOn intent, but received a prediction of the None intent. This indicates that the TurnAllOn intent needs more example utterances in order to make the expected prediction.

The two sections of the chart in red contain utterances that did not match the expected prediction. These are utterances for which LUIS needs more training.

The two sections of the chart in green contain utterances that did match the expected prediction.
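The four sections can be read as a standard confusion matrix for whichever intent is selected in the filter. The sketch below illustrates that reading. It is an assumption about how utterances map to sections, not LUIS's documented implementation, and the function name `chart_section` is hypothetical.

```python
def chart_section(labeled_intent, predicted_intent, target_intent):
    """Place one utterance in a chart section relative to target_intent.

    Assumes the usual confusion-matrix definitions; the portal's exact
    rules may differ.
    """
    labeled = labeled_intent == target_intent      # dataset says it is the target
    predicted = predicted_intent == target_intent  # model says it is the target
    if labeled and predicted:
        return "True Positive"   # green: matched
    if predicted:
        return "False Positive"  # red: predicted the target, labeled otherwise
    if labeled:
        return "False Negative"  # red: labeled the target, predicted otherwise
    return "True Negative"       # green: matched (neither is the target)


# The utterance "switch on": labeled TurnAllOn, predicted None.
print(chart_section("TurnAllOn", "None", "None"))
print(chart_section("TurnAllOn", "None", "TurnAllOn"))
```

Under this reading, the same mismatched utterance appears as a False Positive when the chart is filtered by None and as a False Negative when it is filtered by TurnAllOn.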

Roles in batch testing

Caution

Entity roles are not supported in batch testing.

Next steps

If testing indicates that your LUIS app doesn't recognize the correct intents and entities, you can work to improve your LUIS app's performance by labeling more utterances or adding features.