Entity components
In Conversational Language Understanding, entities are relevant pieces of information that are extracted from your utterances. An entity can be extracted by different methods: it can be learned through context, matched from a list, or detected by a prebuilt recognized entity. Every entity in your project is composed of one or more of these methods, which are defined as your entity's components. When an entity is defined by more than one component, the components' predictions can overlap. You can determine how an entity prediction behaves when its components overlap by choosing from a fixed set of options in the Entity options.
Component types
An entity component determines one way in which the entity can be extracted. An entity can contain a single component, which is then the only method used to extract it, or multiple components, which expand the ways in which the entity is defined and extracted.
Learned component
The learned component uses the entity tags you label your utterances with to train a machine-learned model. The model learns to predict where the entity is, based on the context within the utterance. Your labels provide examples of where the entity is expected to be present in an utterance, based on the meaning of the words around it as well as the words that were labeled. This component is only defined if you add labels by tagging utterances for the entity. If you do not tag any utterances with the entity, it will not have a learned component.
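As a rough illustration of what a label captures, the following sketch represents each label as a character span inside its utterance and checks whether an entity has any labels at all. The helper names `label_span` and `has_learned_component`, and the dictionary layout, are hypothetical and chosen only for this example. They are not the service's data format.

```python
def label_span(utterance: str, labeled_text: str, category: str) -> dict:
    """Describe a label as a character span inside the utterance (hypothetical layout)."""
    offset = utterance.find(labeled_text)
    if offset == -1:
        raise ValueError(f"{labeled_text!r} does not occur in the utterance")
    return {"category": category, "offset": offset, "length": len(labeled_text)}


def has_learned_component(labeled_utterances: list[dict], category: str) -> bool:
    """An entity only has a learned component if at least one label uses it."""
    return any(
        label["category"] == category
        for utterance in labeled_utterances
        for label in utterance["entities"]
    )


text = "I want to buy Proseware OS 9"
training_data = [
    {"text": text, "entities": [label_span(text, "Proseware OS 9", "Software")]},
]

print(training_data[0]["entities"])
print(has_learned_component(training_data, "Software"))  # labeled, so True
print(has_learned_component(training_data, "Hardware"))  # never labeled, so False
```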
List component
The list component represents a fixed, closed set of related words along with their synonyms. The component performs an exact text match against the list of values you provide as synonyms. Each synonym belongs to a "list key", which serves as the normalized, standard value for the synonym and is returned in the output when the list component is matched. List keys are not used for matching.
In multilingual projects, you can specify a different set of synonyms for each language. When you use the prediction API, you can specify the language in the input request, and only the synonyms associated with that language are matched.
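The matching behavior described above can be sketched in a few lines. This is a simplified illustration, not the service's implementation: the `SOFTWARE_LIST` data, the `ListMatch` type, and the `match_list` function are hypothetical, and the sketch assumes case-sensitive substring matching for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ListMatch:
    key: str     # normalized list key returned for the match
    text: str    # synonym text that was found in the utterance
    offset: int
    length: int


# Hypothetical list definition: list key -> language -> synonyms.
SOFTWARE_LIST = {
    "Proseware OS": {
        "en": ["Proseware OS", "PW OS"],
        "es": ["SO Proseware"],
    },
}


def match_list(utterance: str, sublists: dict, language: str) -> list[ListMatch]:
    """Match only the synonyms for the requested language.

    The list key itself is never searched for; it is only returned.
    """
    matches = []
    for key, synonyms_by_language in sublists.items():
        for synonym in synonyms_by_language.get(language, []):
            start = utterance.find(synonym)
            while start != -1:
                matches.append(ListMatch(key, synonym, start, len(synonym)))
                start = utterance.find(synonym, start + 1)
    return matches


utterance = "I want to buy Proseware OS 9"
print(match_list(utterance, SOFTWARE_LIST, "en"))  # one match on "Proseware OS"
print(match_list(utterance, SOFTWARE_LIST, "es"))  # no Spanish synonym matches
```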
Prebuilt component
The prebuilt component allows you to select from a library of common types such as numbers, datetimes, and names. When added, a prebuilt component is automatically detected. You can have up to five prebuilt components per entity. See the list of supported prebuilt components for more information.
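To make the relationship between an entity and its components concrete, here is a minimal sketch of an entity definition that holds all three component types and enforces the limit of five prebuilt components. The `EntityDefinition` class, its field names, and the "Contoso" list entries are hypothetical and exist only for this illustration. They do not reflect the service's project schema.

```python
from dataclasses import dataclass, field

MAX_PREBUILTS = 5  # limit stated in the text above


@dataclass
class EntityDefinition:
    name: str
    has_labels: bool = False  # True once utterances are tagged with the entity
    list_keys: dict[str, list[str]] = field(default_factory=dict)  # key -> synonyms
    prebuilts: list[str] = field(default_factory=list)

    def components(self) -> list[str]:
        """Return the component types this entity currently has."""
        present = []
        if self.has_labels:
            present.append("learned")
        if self.list_keys:
            present.append("list")
        if self.prebuilts:
            present.append("prebuilt")
        return present

    def validate(self) -> None:
        if not self.components():
            raise ValueError(f"{self.name}: an entity needs at least one component")
        if len(self.prebuilts) > MAX_PREBUILTS:
            raise ValueError(f"{self.name}: at most {MAX_PREBUILTS} prebuilt components")


organization = EntityDefinition(
    name="Organization",
    list_keys={"Contoso": ["Contoso", "Contoso Ltd"]},  # hypothetical list values
    prebuilts=["General.Organization"],
)
organization.validate()
print(organization.components())
```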
Entity options
When multiple components are defined for an entity, their predictions may overlap. When an overlap occurs, each entity's final prediction is determined by one of the following options.
Combine components
When components overlap, combine them into one entity by taking the union of all the components.
Use this option to combine all components when they overlap. When components are combined, you also get all the extra information that’s tied to a list or prebuilt component, when one is present.
Example
Suppose you have an entity called Software that has a list component, which contains “Proseware OS” as an entry. In your utterance data, you have “I want to buy Proseware OS 9” with “Proseware OS 9” tagged as Software.
By using combine components, the entity will return with the full context as “Proseware OS 9” along with the key from the list component.
Suppose you had the same utterance but only “OS 9” was predicted by the learned component.
With combine components, the entity will still return as “Proseware OS 9” with the key from the list component.
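Both cases in this example can be reproduced with a small sketch of a union over overlapping spans. This is an illustration under stated assumptions, not the service's algorithm: the `combine_components` function and the prediction dictionaries are hypothetical, and "overlap" is taken to mean that two spans share at least one character.

```python
def combine_components(utterance: str, predictions: list[dict]) -> list[dict]:
    """Merge overlapping component predictions into one entity per span."""
    merged: list[dict] = []
    for p in sorted(predictions, key=lambda p: p["offset"]):
        start, end = p["offset"], p["offset"] + p["length"]
        if merged and start < merged[-1]["end"]:
            # Overlap: widen the span to the union and keep any list key.
            last = merged[-1]
            last["end"] = max(last["end"], end)
            last["key"] = last["key"] or p.get("key")
        else:
            merged.append({"start": start, "end": end, "key": p.get("key")})
    return [
        {
            "text": utterance[m["start"]:m["end"]],
            "offset": m["start"],
            "length": m["end"] - m["start"],
            "key": m["key"],
        }
        for m in merged
    ]


utterance = "I want to buy Proseware OS 9"
list_match = {"offset": 14, "length": 12, "key": "Proseware OS"}  # "Proseware OS"

# Case 1: the learned component predicts "Proseware OS 9".
print(combine_components(utterance, [{"offset": 14, "length": 14}, list_match]))

# Case 2: the learned component predicts only "OS 9".
print(combine_components(utterance, [{"offset": 24, "length": 4}, list_match]))
```

In both cases the sketch returns a single entity with the text “Proseware OS 9” and the list key “Proseware OS”.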
Do not combine components
Each overlapping component is returned as a separate instance of the entity. With this option, you apply your own logic after prediction.
Example
Suppose you have an entity called Software that has a list component, which contains “Proseware Desktop” as an entry. In your utterance data, you have “I want to buy Proseware Desktop Pro” with “Proseware Desktop Pro” tagged as Software.
When you do not combine components, the entity will return twice, once for each component.
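The sketch below shows what the two separate instances from this example might look like, followed by one possible piece of custom post-prediction logic: keeping only the longest instance. The dictionary fields and the `keep_longest` function are hypothetical, and the function assumes that all instances passed to it overlap one another.

```python
utterance = "I want to buy Proseware Desktop Pro"

# Simplified, hypothetical representation of the two returned instances.
instances = [
    {"category": "Software", "text": "Proseware Desktop Pro", "offset": 14, "length": 21},
    {"category": "Software", "text": "Proseware Desktop", "offset": 14, "length": 17,
     "key": "Proseware Desktop"},
]


def keep_longest(overlapping: list[dict]) -> dict:
    """Example custom rule: of several overlapping instances, keep the longest."""
    return max(overlapping, key=lambda instance: instance["length"])


# Sanity check: each instance's offset and length point at its text.
for instance in instances:
    end = instance["offset"] + instance["length"]
    assert utterance[instance["offset"]:end] == instance["text"]

print(keep_longest(instances)["text"])
```

Other rules are equally valid, for example always preferring the instance that carries a list key. The point of this option is that the choice is left to your application.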
Note
During the public preview of the service, there were four available options: Longest overlap, Exact overlap, Union overlap, and Return all separately. Longest overlap and Exact overlap are deprecated and are only supported for projects that previously had those options selected. Union overlap has been renamed to Combine components, while Return all separately has been renamed to Do not combine components.
How to use components and options
Components give you the flexibility to define your entity in more than one way. When you combine components, you make sure that each component is represented and you reduce the number of entities returned in your predictions.
A common practice is to extend a prebuilt component with a list of values that the prebuilt might not support. For example, if you have an Organization entity with a General.Organization prebuilt component added to it, the entity may not predict all the organizations specific to your domain. You can use a list component to extend the values of the Organization entity and thereby extend the prebuilt with your own organizations.
Other times, you may be interested in extracting an entity through context, such as a Product in a retail project. You would label for the learned component of the product so that the model learns where a product is based on its position within the sentence. You may also have a list of products that you already know beforehand and would like to always extract. Combining both components in one entity gives you both extraction methods for the entity.
When you do not combine components, you allow every component to act as an independent entity extractor. One way of using this option is to separate the entities extracted from a list from the ones extracted through the learned or prebuilt components, so that you can handle and treat them differently.
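As one way of handling list matches differently from other predictions, the following sketch routes instances into two groups. It is a minimal illustration with hypothetical names (`route_by_source`, the `key` field). Treating the presence of a list key as the marker for a list match is an assumption of this sketch.

```python
def route_by_source(instances: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split instances into list matches and everything else.

    Assumption: an instance produced by the list component carries a "key".
    """
    from_list, from_other = [], []
    for instance in instances:
        (from_list if "key" in instance else from_other).append(instance)
    return from_list, from_other


instances = [
    {"category": "Product", "text": "Proseware Desktop Pro"},
    {"category": "Product", "text": "Proseware Desktop", "key": "Proseware Desktop"},
]

known_products, candidate_products = route_by_source(instances)
print([p["key"] for p in known_products])        # normalized values from the list
print([p["text"] for p in candidate_products])   # context-based predictions to review
```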