Named Entity Recognition cognitive skill

The Named Entity Recognition skill extracts named entities from text. The available entity types are person, location, and organization.

Important

The Named Entity Recognition skill is discontinued and has been replaced by Microsoft.Skills.Text.EntityRecognitionSkill. Support stopped on February 15, 2019, and the API was removed from the product on May 2, 2019. Follow the recommendations in Deprecated cognitive search skills to migrate to a supported skill.

Note

As you expand scope by increasing the frequency of processing, adding more documents, or adding more AI algorithms, you will need to attach a billable Cognitive Services resource. Charges accrue when calling APIs in Cognitive Services, and for image extraction as part of the document-cracking stage in Azure Cognitive Search. There are no charges for text extraction from documents.

Execution of built-in skills is charged at the existing Cognitive Services pay-as-you-go price. Image extraction pricing is described on the Azure Cognitive Search pricing page.

@odata.type

Microsoft.Skills.Text.NamedEntityRecognitionSkill

Data limits

The maximum size of a record is 50,000 characters, as measured by String.Length. If you need to break up your data before sending it to this skill, consider using the Text Split skill.
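As a rough illustration of the limit above, the following Python sketch splits a long string into pieces that each stay within 50,000 characters. It assumes that String.Length refers to the .NET property, which counts UTF-16 code units, so characters outside the Basic Multilingual Plane count as two. The helper names `utf16_length` and `split_record` are hypothetical and are not part of the service; inside an enrichment pipeline, the Text Split skill remains the documented option.

```python
def utf16_length(text: str) -> int:
    """Count UTF-16 code units (assumption: this matches .NET String.Length)."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


def split_record(text: str, limit: int = 50_000) -> list[str]:
    """Split text into consecutive chunks of at most `limit` UTF-16 code units.

    This is a plain character-based split; it does not try to break on
    sentence or word boundaries.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for ch in text:
        ch_len = 2 if ord(ch) > 0xFFFF else 1
        # Start a new chunk when adding this character would exceed the limit.
        if current and current_len + ch_len > limit:
            chunks.append("".join(current))
            current, current_len = [], 0
        current.append(ch)
        current_len += ch_len
    if current:
        chunks.append("".join(current))
    return chunks
```

Each returned chunk can then be sent as its own record. Splitting mid-sentence may hide entities that straddle a boundary, so a sentence-aware splitter is usually preferable in practice.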

Skill parameters

Parameters are case-sensitive.

Parameter name         Description
categories             Array of categories that should be extracted. Possible category types: "Person", "Location", "Organization". If no category is provided, all types are returned.
defaultLanguageCode    Language code of the input text. The following languages are supported: de, en, es, fr, it.
minimumPrecision       A number between 0 and 1. If the precision is lower than this value, the entity is not returned. The default is 0.
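Because parameters are case-sensitive, a small client-side check can catch mistakes such as "person" instead of "Person" before a skillset is submitted. The sketch below builds a skill definition as a Python dictionary and validates it against the constraints listed above. The function `build_ner_skill` and its defaults are hypothetical and shown for illustration only; the input and output mappings simply mirror the sample definition on this page.

```python
import json

# Values taken from the parameter descriptions above.
SUPPORTED_CATEGORIES = {"Person", "Location", "Organization"}
SUPPORTED_LANGUAGES = {"de", "en", "es", "fr", "it"}


def build_ner_skill(categories=None, default_language_code="en",
                    minimum_precision=0.0, source="/document/content"):
    """Return a skill definition dict, raising ValueError on invalid parameters."""
    if default_language_code not in SUPPORTED_LANGUAGES:
        raise ValueError(f"Unsupported language code: {default_language_code!r}")
    if not 0 <= minimum_precision <= 1:
        raise ValueError("minimumPrecision must be between 0 and 1")

    skill = {
        "@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
        "defaultLanguageCode": default_language_code,
        "minimumPrecision": minimum_precision,
        "inputs": [{"name": "text", "source": source}],
        "outputs": [{"name": "persons", "targetName": "people"}],
    }
    if categories:
        # Category names are case-sensitive, so "person" is rejected.
        unknown = [c for c in categories if c not in SUPPORTED_CATEGORIES]
        if unknown:
            raise ValueError(f"Unknown categories: {unknown}")
        skill["categories"] = list(categories)
    # When no categories are given, the key is omitted and all types are returned.
    return skill


if __name__ == "__main__":
    print(json.dumps(build_ner_skill(["Person"], minimum_precision=0.5), indent=2))
```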

Skill inputs

Input name      Description
languageCode    Optional. Default is "en".
text            The text to analyze.

Skill outputs

Output name      Description
persons          An array of strings where each string represents the name of a person.
locations        An array of strings where each string represents a location.
organizations    An array of strings where each string represents an organization.
entities         An array of complex types. Each complex type includes the following fields:
  • category ("person", "organization", or "location")
  • value (the actual entity name)
  • offset (the position in the text where the entity was found)
  • confidence (a value between 0 and 1 that represents the confidence that the value is an actual entity)
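To show how the entities output might be consumed, the following Python sketch groups entity values by category and drops those below a chosen confidence threshold. The function `group_entities` and the `sample_data` dictionary are hypothetical; the dictionary only imitates the shape of the "data" object described above, using values from the sample output on this page.

```python
from collections import defaultdict


def group_entities(data: dict, min_confidence: float = 0.0) -> dict:
    """Group entity values by category, keeping entities at or above min_confidence."""
    grouped = defaultdict(list)
    for entity in data.get("entities", []):
        if entity["confidence"] >= min_confidence:
            grouped[entity["category"]].append(entity["value"])
    return dict(grouped)


# Illustrative data in the shape of the skill's "data" output.
sample_data = {
    "entities": [
        {"category": "person", "value": "Joe Romero", "offset": 33, "confidence": 0.87},
        {"category": "person", "value": "Ana Smith", "offset": 124, "confidence": 0.87},
        {"category": "location", "value": "Chile", "offset": 88, "confidence": 0.99},
        {"category": "location", "value": "Australia", "offset": 112, "confidence": 0.99},
        {"category": "organization", "value": "Microsoft", "offset": 54, "confidence": 0.99},
    ]
}

if __name__ == "__main__":
    # With a 0.9 threshold, the two person entities (0.87) are filtered out.
    print(group_entities(sample_data, min_confidence=0.9))
```

Filtering on the client like this is an alternative to setting minimumPrecision in the skill definition, and it lets different consumers apply different thresholds to the same output.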

Sample definition

  {
    "@odata.type": "#Microsoft.Skills.Text.NamedEntityRecognitionSkill",
    "categories": [ "Person", "Location", "Organization"],
    "defaultLanguageCode": "en",
    "inputs": [
      {
        "name": "text",
        "source": "/document/content"
      }
    ],
    "outputs": [
      {
        "name": "persons",
        "targetName": "people"
      }
    ]
  }

Sample input

{
    "values": [
      {
        "recordId": "1",
        "data":
           {
             "text": "This is the loan application for Joe Romero, a Microsoft employee who was born in Chile and who then moved to Australia… Ana Smith is provided as a reference.",
             "languageCode": "en"
           }
      }
    ]
}

Sample output

{
  "values": [
    {
      "recordId": "1",
      "data" : 
      {
        "persons": [ "Joe Romero", "Ana Smith"],
        "locations": ["Chile", "Australia"],
        "organizations":["Microsoft"],
        "entities":  
        [
          {
            "category":"person",
            "value": "Joe Romero",
            "offset": 33,
            "confidence": 0.87
          },
          {
            "category":"person",
            "value": "Ana Smith",
            "offset": 124,
            "confidence": 0.87
          },
          {
            "category":"location",
            "value": "Chile",
            "offset": 88,
            "confidence": 0.99
          },
          {
            "category":"location",
            "value": "Australia",
            "offset": 112,
            "confidence": 0.99
          },
          {
            "category":"organization",
            "value": "Microsoft",
            "offset": 54,
            "confidence": 0.99
          }
        ]
      }
    }
  ]
}

Error cases

If the language code for the document is unsupported, an error is returned and no entities are extracted.

See also