Skills for extra processing during indexing (Azure AI Search)

11/15/2023

This article describes the skills provided with Azure AI Search that you can include in a skillset to access external processing.

A skill provides an atomic operation that transforms content in some way. Often, it's an operation that recognizes or extracts text, but it can also be a utility skill that reshapes enrichments that already exist. Typically, the output is either text, which can be used in full text search, or vectors, which can be used in vector search.

Skills are organized into categories:

A built-in skill wraps API calls to an Azure resource, where the inputs, outputs, and processing steps are well understood. For skills that call an Azure AI resource, the connection is made over the internal network. For skills that call Azure OpenAI, you provide the connection information that the search service uses to connect to the resource. A small quantity of processing is non-billable, but at larger volumes, processing is billable. Built-in skills are based on pretrained models from Microsoft, which means you can't train the model using your own training data.
A custom skill provides custom code that executes externally to the search service. It's accessed through a URI. Custom code is often made available through an Azure function app. To attach an open-source or third-party vectorization model, use a custom skill.
A utility skill is internal to Azure AI Search, with no dependency on external resources or outbound connections. Most utility skills are non-billable.

Azure AI resource skills

Skills that call Azure AI services are billed at the pay-as-you-go rate when you attach an AI service resource.

OData type	Description	Metered by
Microsoft.Skills.Text.CustomEntityLookupSkill	Looks for text from a custom, user-defined list of words and phrases.	Azure AI Search (pricing)
Microsoft.Skills.Text.KeyPhraseExtractionSkill	This skill uses a pretrained model to detect important phrases based on term placement, linguistic rules, proximity to other terms, and how unusual the term is within the source data.	Azure AI services (pricing)
Microsoft.Skills.Text.LanguageDetectionSkill	This skill uses a pretrained model to detect which language is used (one language ID per document). When multiple languages are used within the same text segments, the output is the LCID of the predominantly used language.	Azure AI services (pricing)
Microsoft.Skills.Text.V3.EntityLinkingSkill	This skill uses a pretrained model to generate links for recognized entities to articles in Wikipedia.	Azure AI services (pricing)
Microsoft.Skills.Text.V3.EntityRecognitionSkill	This skill uses a pretrained model to establish entities for a fixed set of categories: `"Person"`, `"Location"`, `"Organization"`, `"Quantity"`, `"DateTime"`, `"URL"`, `"Email"`, `"PersonType"`, `"Event"`, `"Product"`, `"Skill"`, `"Address"`, `"Phone Number"`, and `"IP Address"`.	Azure AI services (pricing)
Microsoft.Skills.Text.PIIDetectionSkill	This skill uses a pretrained model to extract personal information from a given text. The skill also gives various options for masking the detected personal information entities in the text.	Azure AI services (pricing)
Microsoft.Skills.Text.V3.SentimentSkill	This skill uses a pretrained model to assign sentiment labels (such as "negative", "neutral", and "positive") based on the highest confidence score found by the service at the sentence and document level, on a record-by-record basis.	Azure AI services (pricing)
Microsoft.Skills.Text.TranslationSkill	This skill uses a pretrained model to translate the input text into various languages for normalization or localization use cases.	Azure AI services (pricing)
Microsoft.Skills.Vision.ImageAnalysisSkill	This skill uses an image detection algorithm to identify the content of an image and generate a text description.	Azure AI services (pricing)
Microsoft.Skills.Vision.OcrSkill	Optical character recognition.	Azure AI services (pricing)
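
To make the table above more concrete, the following Python sketch builds a skillset body that chains two of the listed skills: language detection feeds its language code into key phrase extraction. It is a minimal illustration, not a definitive definition. The property names follow the skillset JSON shape as commonly documented for the Azure AI Search REST API, and the skillset name, the `/document/content` source path, the target names, and the key placeholder are all assumptions to replace with your own values.

```python
import json


def build_text_skillset(ai_services_key: str) -> dict:
    """Return a skillset body that chains language detection into key phrase extraction.

    All names and paths here are illustrative assumptions, not required values.
    """
    language_skill = {
        "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
        "context": "/document",
        # Assumes the indexer exposes the document text at /document/content.
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "languageCode", "targetName": "languageCode"}],
    }
    key_phrase_skill = {
        "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
        "context": "/document",
        "inputs": [
            {"name": "text", "source": "/document/content"},
            # Consumes the output node created by the language detection skill.
            {"name": "languageCode", "source": "/document/languageCode"},
        ],
        "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
    }
    return {
        "name": "example-text-skillset",  # hypothetical skillset name
        "description": "Detect language, then extract key phrases.",
        "skills": [language_skill, key_phrase_skill],
        # Attaching an AI service resource is what makes these skills billable
        # at the pay-as-you-go rate.
        "cognitiveServices": {
            "@odata.type": "#Microsoft.Azure.Search.CognitiveServicesByKey",
            "key": ai_services_key,
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_text_skillset("<your-ai-services-key>"), indent=2))
```

The resulting dictionary can be serialized and sent as the body of a create-skillset request. Verify the exact property set against the skill reference for the API version you target.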

Azure OpenAI skills

Skills that call models deployed on Azure OpenAI are billed at the pay-as-you-go rate.

OData type	Description	Metered by
Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill	Connects to a deployed embedding model on Azure OpenAI for integrated vectorization.	Azure OpenAI (pricing)
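
Because the search service needs connection information to reach Azure OpenAI, an embedding skill definition carries the resource endpoint and deployment name. The sketch below shows one plausible shape for that definition as a Python dictionary. The endpoint, deployment name, source path, and `vector` target name are hypothetical, and the property names (`resourceUri`, `deploymentId`, `apiKey`) should be checked against the skill reference for your API version.

```python
from typing import Optional


def build_embedding_skill(
    resource_uri: str,
    deployment_id: str,
    api_key: Optional[str] = None,
    source: str = "/document/content",
) -> dict:
    """Return an AzureOpenAIEmbeddingSkill definition (illustrative sketch)."""
    skill = {
        "@odata.type": "#Microsoft.Skills.Text.AzureOpenAIEmbeddingSkill",
        "context": "/document",
        "resourceUri": resource_uri,    # endpoint of your Azure OpenAI resource
        "deploymentId": deployment_id,  # name of the deployed embedding model
        "inputs": [{"name": "text", "source": source}],
        # "vector" is an assumed target name for the enrichment node.
        "outputs": [{"name": "embedding", "targetName": "vector"}],
    }
    if api_key is not None:
        # Assumption: the key is omitted when another authentication
        # mechanism is configured for the connection.
        skill["apiKey"] = api_key
    return skill


if __name__ == "__main__":
    example = build_embedding_skill(
        "https://my-openai-resource.openai.azure.com",  # hypothetical endpoint
        "my-embedding-deployment",                      # hypothetical deployment
    )
    print(example["@odata.type"])
```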

Utility skills

Utility skills execute only on Azure AI Search, iterate mostly on nodes in the enrichment cache, and are mostly non-billable.

OData type	Description	Metered by
Microsoft.Skills.Util.ConditionalSkill	Allows filtering, assigning a default value, and merging data based on a condition.	Not applicable
Microsoft.Skills.Util.DocumentExtractionSkill	Extracts content from a file within the enrichment pipeline.	Azure AI Search (pricing) for image extraction.
Microsoft.Skills.Text.MergeSkill	Consolidates text from a collection of fields into a single field.	Not applicable
Microsoft.Skills.Util.ShaperSkill	Maps output to a complex type (a multi-part data type, which might be used for a full name, a multi-line address, or a combination of last name and a personal identifier).	Not applicable
Microsoft.Skills.Text.SplitSkill	Splits text into pages so that you can enrich or augment content incrementally.	Not applicable
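
As a rough illustration of how utility skills reshape existing enrichments, the sketch below defines a split skill that breaks text into pages, followed by a shaper skill that bundles each page with a document-level field into one complex type. The page length, target names, and the `/document/metadata_storage_name` source (which assumes a blob data source) are assumptions made for the example, not values taken from this article.

```python
def build_utility_skills(max_page_length: int = 2000) -> list:
    """Return a split skill followed by a shaper skill (illustrative sketch)."""
    split_skill = {
        "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
        "context": "/document",
        "textSplitMode": "pages",
        "maximumPageLength": max_page_length,
        "inputs": [{"name": "text", "source": "/document/content"}],
        # Each chunk becomes an item under the assumed node /document/pages.
        "outputs": [{"name": "textItems", "targetName": "pages"}],
    }
    shaper_skill = {
        "@odata.type": "#Microsoft.Skills.Util.ShaperSkill",
        # Runs once per page produced by the split skill.
        "context": "/document/pages/*",
        "inputs": [
            {"name": "text", "source": "/document/pages/*"},
            # Assumption: the data source is blob storage, which exposes this field.
            {"name": "fileName", "source": "/document/metadata_storage_name"},
        ],
        # The shaper emits a single complex-type node; "pageInfo" is an assumed name.
        "outputs": [{"name": "output", "targetName": "pageInfo"}],
    }
    return [split_skill, shaper_skill]


if __name__ == "__main__":
    for skill in build_utility_skills():
        print(skill["@odata.type"], "runs at", skill["context"])
```

Setting the shaper's context to `/document/pages/*` is the design choice that makes it operate on every page rather than once per document.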

Custom skills

Custom skills wrap external code that you design, develop, and deploy to the web. You can then call the module from within a skillset as a custom skill.

OData type	Description	Metered by
Microsoft.Skills.Custom.WebApiSkill	Allows extensibility of an AI enrichment pipeline by making an HTTP call into a custom Web API.	None unless your solution uses a metered Azure service
Microsoft.Skills.Custom.AmlSkill	Allows extensibility of an AI enrichment pipeline with an Azure Machine Learning model.	None unless your solution uses a metered Azure service

For guidance on creating a custom skill, see Define a custom interface and Example: Creating a custom skill for AI enrichment.
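
To give a feel for what the external code behind a Web API skill might do, here is a minimal sketch of a request handler written as a plain Python function. It assumes the commonly documented custom skill contract: a `values` array of records, each with a `recordId` and a `data` payload, and a response that echoes each `recordId` alongside `data`, `errors`, and `warnings`. The word-count logic, the `text` and `wordCount` names, and the placeholder URI are hypothetical. Hosting the function (for example, in an Azure function app) is left out.

```python
def handle_skill_request(body: dict) -> dict:
    """Process a custom skill request and return a response in the same record order."""
    results = []
    for record in body.get("values", []):
        record_id = record.get("recordId")
        text = record.get("data", {}).get("text")
        if not isinstance(text, str):
            # Report a per-record error instead of failing the whole batch.
            results.append({
                "recordId": record_id,
                "data": {},
                "errors": [{"message": "Missing or non-string 'text' input."}],
                "warnings": [],
            })
            continue
        results.append({
            "recordId": record_id,
            "data": {"wordCount": len(text.split())},
            "errors": [],
            "warnings": [],
        })
    return {"values": results}


# A matching skill definition; the URI is a placeholder, not a real endpoint.
WEB_API_SKILL = {
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "uri": "https://example.invalid/api/word-count",
    "httpMethod": "POST",
    "timeout": "PT30S",
    "batchSize": 10,
    "context": "/document",
    "inputs": [{"name": "text", "source": "/document/content"}],
    "outputs": [{"name": "wordCount", "targetName": "wordCount"}],
}


if __name__ == "__main__":
    request = {"values": [{"recordId": "1", "data": {"text": "hello big world"}}]}
    print(handle_skill_request(request))
```

The input and output names in the skill definition have to match the keys the handler reads from and writes to each record's `data` object.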
