Tutorial: Extract names with simple entity and a phrase list

In this tutorial, you extract a machine-learned job name from an utterance using the simple entity. To increase extraction accuracy, you add a phrase list of terms specific to the simple entity.

The simple entity detects a single data concept contained in words or phrases.

In this tutorial, you learn how to:

  • Import example app
  • Add simple entity
  • Add phrase list to boost signal words
  • Train
  • Publish
  • Get intents and entities from endpoint

For this article, you can use a free LUIS account to author your LUIS application.

Simple entity

This tutorial adds a new simple entity to extract the job name. The purpose of the simple entity in this LUIS app is to teach LUIS what a job name is and where it can be found in an utterance. The part of the utterance that is the job name can change from utterance to utterance based on word choice and utterance length. LUIS needs examples of job names across all intents that use job names.

The simple entity is a good fit for this type of data when:

  • Data is a single concept.
  • Data is not well-formatted enough to be matched by a regular expression.
  • Data is not common, such as the phone numbers or dates that prebuilt entities cover.
  • Data is not matched exactly to a list of known words, such as a list entity.
  • Data does not contain other data items such as a composite entity or contextual roles.

Consider the following utterances from a chat bot:

Utterance | Extractable job name
I want to apply for the new accounting job. | accounting
Submit my resume for the engineering position. | engineering
Fill out application for job 123456 | 123456

The job name is difficult to determine because a name can be a noun, verb, or a phrase of several words. For example:

Jobs
engineer
software engineer
senior software engineer
engineering team lead
air traffic controller
motor vehicle operator
ambulance driver
tender
extruder
millwright

This LUIS app has job names in several intents. By labeling these words in all the intents' utterances, LUIS learns more about what a job name is and where it is found in utterances.

Once the entities are marked in the example utterances, it is important to add a phrase list to boost the signal of the simple entity. A phrase list is not used as an exact match and does not need to contain every possible value you expect.

Import example app

  1. Download and save the app JSON file from the Intents tutorial.

  2. Import the JSON into a new app.

  3. From the Manage section, on the Versions tab, clone the version, and name it simple. Cloning is a great way to play with various LUIS features without affecting the original version. Because the version name is used as part of the URL route, the name can't contain any characters that are not valid in a URL.

Mark entities in example utterances of an intent

  1. Make sure your Human Resources app is in the Build section of LUIS. You can change to this section by selecting Build in the top-right menu bar.

  2. On the Intents page, select ApplyForJob intent.

  3. In the utterance, I want to apply for the new accounting job, select accounting, enter Job in the top field of the pop-up menu, then select Create new entity in the pop-up menu.

    Screenshot of LUIS with 'ApplyForJob' intent with create entity steps highlighted

  4. In the pop-up window, verify the entity name and type and select Done.

    Create simple entity pop-up modal dialog with name of Job and type of simple

  5. In the remaining utterances, mark the job-related words with Job entity by selecting the word or phrase, then selecting Job from the pop-up menu.

    Screenshot of LUIS labeling job entity highlighted

Add more example utterances and mark entity

Simple entities need many examples in order to have a high confidence of prediction.

  1. Add more utterances and mark the job words or phrases as Job entity.

    Utterance | Job entity
    I'm applying for the Program Manager desk in R&D | Program Manager
    Here is my line cook application. | line cook
    My resume for camp counselor is attached. | camp counselor
    This is my c.v. for administrative assistant. | administrative assistant
    I want to apply for the management job in sales. | management, sales
    This is my resume for the new accounting position. | accounting
    My application for barback is included. | barback
    I'm submitting my application for roofer and framer. | roofer, framer
    My c.v. for bus driver is here. | bus driver
    I'm a registered nurse. Here is my resume. | registered nurse
    I would like to submit my paperwork for the teaching position I saw in the paper. | teaching
    This is my c.v. for the stocker post in fruits and vegetables. | stocker
    Apply for tile work. | tile
    Attached resume for landscape architect. | landscape architect
    My curriculum vitae for professor of biology is enclosed. | professor of biology
    I would like to apply for the position in photography. | photography
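The labeling done in the portal can also be pictured as data. The following is a minimal sketch, assuming the labeled utterances are stored in the shape commonly seen in an exported app JSON file (`text`, `intent`, and an `entities` list with inclusive `startPos` and `endPos` character offsets). The helper name `label_job` is hypothetical, and the field names should be checked against your own export before relying on them.

```python
def label_job(text: str, job: str, intent: str = "ApplyForJob") -> dict:
    """Build one labeled example utterance for the Job entity.

    Assumption: offsets are 0-based character positions and endPos is
    inclusive, mirroring the endIndex values in the endpoint responses.
    """
    start = text.find(job)
    if start == -1:
        raise ValueError(f"{job!r} not found in {text!r}")
    return {
        "text": text,
        "intent": intent,
        "entities": [
            {"entity": "Job", "startPos": start, "endPos": start + len(job) - 1}
        ],
    }


# Two rows taken from the table of example utterances.
examples = [
    ("Here is my line cook application.", "line cook"),
    ("My c.v. for bus driver is here.", "bus driver"),
]
labeled = [label_job(text, job) for text, job in examples]

for item in labeled:
    print(item["text"], "->", item["entities"][0])
```

Computing the offsets from the job phrase itself, rather than typing them by hand, avoids off-by-one mistakes when the same list of jobs is labeled across several intents.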

Mark job entity in other intents

  1. Select Intents from the left menu.

  2. Select GetJobInformation from the list of intents.

  3. Label the jobs in the example utterances.

    If one intent has more example utterances than another, the intent with more examples is more likely to be the highest predicted intent.
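One way to keep an eye on the balance of example utterances is to count them per intent. The sketch below assumes an exported app JSON with a top-level `utterances` list whose items carry an `intent` field. The `sample_app` data is a tiny stand-in, the GetJobInformation utterance is made up for illustration, and the function name is hypothetical.

```python
from collections import Counter


def utterances_per_intent(app: dict) -> Counter:
    """Count labeled example utterances for each intent in an app export."""
    return Counter(u["intent"] for u in app.get("utterances", []))


# Stand-in for an exported app; only the fields used here are included.
sample_app = {
    "utterances": [
        {"text": "I want to apply for the new accounting job.", "intent": "ApplyForJob"},
        {"text": "Submit my resume for the engineering position.", "intent": "ApplyForJob"},
        {"text": "Fill out application for job 123456", "intent": "ApplyForJob"},
        # Hypothetical utterance, not from the tutorial app.
        {"text": "Is there any work in databases?", "intent": "GetJobInformation"},
    ]
}

counts = utterances_per_intent(sample_app)
ratio = max(counts.values()) / min(counts.values())

for intent, n in counts.most_common():
    print(f"{intent}: {n}")
print(f"largest-to-smallest ratio: {ratio:.1f}")
```

A large ratio is only a prompt to review the smaller intent; how much imbalance is acceptable depends on the app.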

Train the app so the changes to the intent can be tested

  1. In the top right side of the LUIS website, select the Train button.

    Train button

  2. Training is complete when you see the green status bar at the top of the website confirming success.

    Trained status bar

Publish the app so the trained model is queryable from the endpoint

In order to receive a LUIS prediction in a chat bot or other client application, you need to publish the app to the endpoint.

  1. Select Publish in the top right navigation.

    LUIS publish to endpoint button in top right menu

  2. Select the Production slot and the Publish button.

    LUIS publish to endpoint

  3. Publishing is complete when you see the green status bar at the top of the website confirming success.

    LUIS publish to endpoint

  4. Select the endpoints link in the green status bar to go to the Keys and endpoints page. The endpoint URLs are listed at the bottom.

Get intent and entity prediction from endpoint

  1. In the Manage section (top right menu), on the Keys and endpoints page (left menu), select the endpoint URL at the bottom of the page. This action opens another browser tab with the endpoint URL in the address bar.

    The endpoint URL looks like https://<region>.api.cognitive.microsoft.com/luis/v2.0/apps/<appID>?verbose=true&subscription-key=<YOUR_KEY>&<optional-name-value-pairs>&q=<user-utterance-text>.

  2. Go to the end of the URL in the address bar and enter Here is my c.v. for the engineering job. The last query string parameter is q, the utterance query. This utterance is not the same as any of the labeled utterances, so it is a good test and should return the ApplyForJob intent.

    {
      "query": "Here is my c.v. for the engineering job",
      "topScoringIntent": {
        "intent": "ApplyForJob",
        "score": 0.98052007
      },
      "intents": [
        {
          "intent": "ApplyForJob",
          "score": 0.98052007
        },
        {
          "intent": "GetJobInformation",
          "score": 0.03424581
        },
        {
          "intent": "None",
          "score": 0.0015820954
        }
      ],
      "entities": [
        {
          "entity": "engineering",
          "type": "Job",
          "startIndex": 24,
          "endIndex": 34,
          "score": 0.668959737
        }
      ]
    }
    

    LUIS found the correct intent, ApplyForJob, and extracted the correct entity, Job, with a value of engineering.
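The same query can be prepared from a client application instead of the browser. This is a minimal sketch, assuming the v2.0 URL shape shown in step 1. The region `westus`, the app ID, and the key are placeholders, and the helper names `build_endpoint_url` and `summarize` are hypothetical. The sample response is trimmed from the output above, so the sketch runs without a network call.

```python
from urllib.parse import urlencode


def build_endpoint_url(region: str, app_id: str, key: str, utterance: str) -> str:
    """Assemble a v2.0 prediction URL in the shape shown in step 1."""
    base = f"https://{region}.api.cognitive.microsoft.com/luis/v2.0/apps/{app_id}"
    # urlencode escapes the utterance; q goes last, as in the tutorial.
    query = urlencode({"verbose": "true", "subscription-key": key, "q": utterance})
    return f"{base}?{query}"


def summarize(response: dict) -> tuple:
    """Return (top intent, its score, list of Job entity values)."""
    top = response["topScoringIntent"]
    jobs = [e["entity"] for e in response.get("entities", []) if e["type"] == "Job"]
    return top["intent"], top["score"], jobs


# Placeholder region, app ID, and key: substitute your own values.
url = build_endpoint_url(
    "westus", "YOUR_APP_ID", "YOUR_KEY", "Here is my c.v. for the engineering job"
)
print(url)

# Response trimmed from the tutorial's example output.
sample = {
    "query": "Here is my c.v. for the engineering job",
    "topScoringIntent": {"intent": "ApplyForJob", "score": 0.98052007},
    "entities": [
        {"entity": "engineering", "type": "Job",
         "startIndex": 24, "endIndex": 34, "score": 0.668959737}
    ],
}
print(summarize(sample))
```

To send the request for real, the URL could be passed to `urllib.request.urlopen` and the body decoded with `json.load`; that step is left out here because it needs a published app and a valid key.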

Names are tricky

The LUIS app found the correct intent with high confidence and it extracted the job name, but names are tricky. Try the utterance This is the lead welder paperwork.

In the following JSON, LUIS responds with the correct intent, ApplyForJob, but doesn't extract the lead welder job name.

{
  "query": "This is the lead welder paperwork",
  "topScoringIntent": {
    "intent": "ApplyForJob",
    "score": 0.860295951
  },
  "intents": [
    {
      "intent": "ApplyForJob",
      "score": 0.860295951
    },
    {
      "intent": "GetJobInformation",
      "score": 0.07265678
    },
    {
      "intent": "None",
      "score": 0.00482481951
    }
  ],
  "entities": []
}

Because a name can be anything, LUIS predicts entities more accurately if it has a phrase list of words to boost the signal.

Open the jobs-phrase-list.csv from the Azure-Samples GitHub repository. The list is over 1,000 job words and phrases. Look through the list for job words that are meaningful to you. If your words or phrases are not on the list, add your own.
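If you merge your own terms into the list, it can help to normalize it before pasting it into the portal. The sketch below makes no claim about the actual layout of jobs-phrase-list.csv: it assumes the terms are either comma-separated, one per line, or a mix of both, and it produces a single comma-separated string with blanks and case-insensitive duplicates removed. The function name `phrase_values` and the sample text are hypothetical.

```python
import csv
import io


def phrase_values(csv_text: str) -> str:
    """Flatten CSV text into one comma-separated, de-duplicated value string."""
    seen = set()
    values = []
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            term = cell.strip()
            key = term.lower()  # compare case-insensitively
            if term and key not in seen:
                seen.add(key)
                values.append(term)  # keep the first spelling encountered
    return ",".join(values)


# Made-up sample mixing both layouts, with a blank line and a duplicate.
sample_csv = "engineer,software engineer\nlead welder, Engineer\n\nmillwright\n"
print(phrase_values(sample_csv))
```

The first occurrence of each term wins, so the original order of the list is preserved.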

  1. In the Build section of the LUIS app, select Phrase lists found under the Improve app performance menu.

  2. Select Create new phrase list.

  3. Name the new phrase list JobNames and copy the list from jobs-phrase-list.csv into the Values text box. Press Enter.

    Screenshot of create new phrase list dialog pop-up

    If you want more words added to the phrase list, select Recommend, then review the new Related Values and add any that are relevant.

    Make sure to keep the These values are interchangeable option checked, because these values should all be treated as synonyms for jobs. Learn more about interchangeable and noninterchangeable phrase list concepts.

  4. Select Save to activate the phrase list.

    Screenshot of create new phrase list dialog pop-up with words in phrase list values box

  5. Train and publish the app again to use phrase list.

  6. Requery at the endpoint with the same utterance: This is the lead welder paperwork.

    The JSON response includes the extracted entity:

    {
      "query": "This is the lead welder paperwork.",
      "topScoringIntent": {
        "intent": "ApplyForJob",
        "score": 0.983076453
      },
      "intents": [
        {
          "intent": "ApplyForJob",
          "score": 0.983076453
        },
        {
          "intent": "GetJobInformation",
          "score": 0.0120766377
        },
        {
          "intent": "None",
          "score": 0.00248388131
        }
      ],
      "entities": [
        {
          "entity": "lead welder",
          "type": "Job",
          "startIndex": 12,
          "endIndex": 22,
          "score": 0.8373154
        }
      ]
    }
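The startIndex and endIndex values in these responses are consistent with 0-based character offsets into the query, with an inclusive end. The following sketch checks that reading against the response above; the helper name `entity_span` is hypothetical, and the response is trimmed to the fields used.

```python
def entity_span(response: dict, entity: dict) -> str:
    """Slice the query using the entity offsets, treating endIndex as inclusive."""
    return response["query"][entity["startIndex"]: entity["endIndex"] + 1]


# Response trimmed from the tutorial's example output.
response = {
    "query": "This is the lead welder paperwork.",
    "entities": [
        {"entity": "lead welder", "type": "Job",
         "startIndex": 12, "endIndex": 22, "score": 0.8373154}
    ],
}

for entity in response["entities"]:
    span = entity_span(response, entity)
    print(f"{entity['type']}: {span!r} (score {entity['score']})")
```

Slicing the original query is useful when the client needs the user's exact wording, since the `entity` value returned by the service may not match the casing or spacing the user typed.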
    

Clean up resources

When no longer needed, delete the LUIS app. To do so, select My apps from the top left menu. Select the ellipsis (...) to the right of the app name in the app list, then select Delete. In the pop-up dialog Delete app?, select Ok.

Next steps

In this tutorial, the Human Resources app uses a machine-learned simple entity to find job names in utterances. Because job names can be such a wide variety of words or phrases, the app needed a phrase list to boost the job name words.