What is the Text Analytics API?

The Text Analytics API is a cloud-based service that provides Natural Language Processing (NLP) features for text mining and text analysis, including sentiment analysis, opinion mining, key phrase extraction, language detection, and named entity recognition.

The API is a part of Azure Cognitive Services, a collection of machine learning and AI algorithms in the cloud for your development projects. You can use these features with the REST API version 3.0 or version 3.1, or the client library.

This documentation contains the following types of articles:

  • Quickstarts are step-by-step instructions that let you make calls to the service and get results in a short period of time.
  • How-to guides contain instructions for using the service in more specific or customized ways.
  • Concepts provide in-depth explanations of the service's functionality and features.
  • Tutorials are longer guides that show you how to use this service as a component in broader business solutions.

Sentiment analysis

Use sentiment analysis (SA) to find out what people think of your brand or topic by mining text for clues about positive or negative sentiment.

The feature provides sentiment labels (such as "negative", "neutral", and "positive") based on the highest confidence score found by the service at the sentence and document level. It also returns confidence scores between 0 and 1 for positive, neutral, and negative sentiment, for each document and for each sentence within it. You can also run the service on premises using a container.
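As a minimal sketch of how a label can follow from the confidence scores described above, the snippet below picks the label with the highest score. The function name label_from_scores and the score values are hypothetical and hand-written for illustration; the service computes its own labels, and its rules may be richer than a plain maximum.

```python
def label_from_scores(scores):
    """Return the sentiment label whose confidence score is highest.

    `scores` maps a label ("positive", "neutral", "negative") to a
    confidence score between 0 and 1.
    """
    return max(scores, key=scores.get)


# Hand-written example scores for one sentence (not real service output).
sentence_scores = {"positive": 0.08, "neutral": 0.02, "negative": 0.90}

print(label_from_scores(sentence_scores))
```

Client code normally reads the label returned by the service rather than recomputing it; the sketch only shows the relationship between scores and labels.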

Starting in v3.1, opinion mining (OM) is a feature of Sentiment Analysis. Also known as Aspect-based Sentiment Analysis in Natural Language Processing (NLP), this feature provides more granular information about the opinions related to words (such as the attributes of products or services) in text.

Key phrase extraction

Use key phrase extraction (KPE) to quickly identify the main concepts in text. For example, in the text "The food was delicious and there were wonderful staff", Key Phrase Extraction will return the main talking points: "food" and "wonderful staff".

Language detection

Language detection identifies the language in which input text is written and reports a single language code for every document submitted in the request. It covers a wide range of languages, variants, dialects, and some regional/cultural languages. The language code is paired with a confidence score.

Named entity recognition

Named Entity Recognition (NER) can identify and categorize entities in your text as people, places, organizations, and quantities. Well-known entities are also recognized and linked to more information on the web.

Text summarization

Summarization produces a summary of text by extracting sentences that collectively represent the most important or relevant information within the original content. This feature condenses articles, papers, or documents down to key sentences.

Text Analytics for health

Text Analytics for health is a feature of the Text Analytics API service that extracts and labels relevant medical information from unstructured texts such as doctor's notes, discharge summaries, clinical documents, and electronic health records.

Deploy on premises using Docker containers

Use Text Analytics containers to deploy API features on premises. These Docker containers enable you to bring the service closer to your data for compliance, security, or other operational reasons. Text Analytics offers the following containers:

  • sentiment analysis
  • key phrase extraction (preview)
  • language detection (preview)
  • Text Analytics for health

Asynchronous operations

The /analyze endpoint enables you to use many features of the Text Analytics API asynchronously. Named Entity Recognition (NER), Key Phrase Extraction (KPE), Sentiment Analysis (SA), and Opinion Mining (OM) are available through the /analyze endpoint. You can combine these features in a single call and send up to 125,000 characters per document. Pricing is the same as for regular Text Analytics.
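The per-document limit mentioned above can be checked on the client before a request is sent. The following is a minimal sketch, assuming documents are dictionaries with "id" and "text" keys; the helper name oversized_documents is hypothetical. Python's len() counts Unicode code points, which matches the character-counting approach described under Unicode encoding.

```python
# Limit taken from the text above: up to 125,000 characters per document
# on the /analyze endpoint.
MAX_CHARS_PER_DOCUMENT = 125_000


def oversized_documents(documents, limit=MAX_CHARS_PER_DOCUMENT):
    """Return the IDs of documents whose text exceeds `limit` characters.

    len() on a Python str counts Unicode code points.
    """
    return [doc["id"] for doc in documents if len(doc["text"]) > limit]


documents = [
    {"id": "1", "text": "a" * 125_000},  # exactly at the limit
    {"id": "2", "text": "a" * 125_001},  # one character over the limit
]

print(oversized_documents(documents))
```

Documents that exceed the limit would need to be split or shortened before submission; how to split them depends on the content and is left out of this sketch.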

Typical workflow

The workflow is simple: you submit data for analysis and handle outputs in your code. Analyzers are consumed as-is, with no additional configuration or customization.

  1. Create an Azure resource for Text Analytics. Afterwards, get the key generated for you to authenticate your requests.

  2. Formulate a request containing your data as raw unstructured text, in JSON.

  3. Post the request to the endpoint established during sign-up, appending the desired resource: sentiment analysis, key phrase extraction, language detection, or named entity recognition.

  4. Stream or store the response locally. Depending on the request, results are a sentiment score, a collection of extracted key phrases, a language code, or a list of recognized entities.

Output is returned as a single JSON document, with results for each text document you posted, based on ID. You can subsequently analyze, visualize, or categorize the results into actionable insights.
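The workflow above can be sketched without making a network call. The helper names below (build_url, build_headers, build_request_body, results_by_id) are hypothetical. The URL path, the Ocp-Apim-Subscription-Key header, and the "documents" body layout are assumptions based on the v3.0 REST API and should be checked against the API reference. The sample response is hand-written for illustration, not real service output.

```python
import json


def build_url(endpoint, feature, version="v3.0"):
    """Append the desired feature to the resource endpoint (assumed path layout)."""
    return f"{endpoint.rstrip('/')}/text/analytics/{version}/{feature}"


def build_headers(key):
    """Headers that authenticate the request with the resource key (assumed names)."""
    return {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }


def build_request_body(texts, language="en"):
    """Wrap raw text in a JSON-serializable body, giving each document an ID."""
    return {
        "documents": [
            {"id": str(i), "language": language, "text": text}
            for i, text in enumerate(texts, start=1)
        ]
    }


def results_by_id(response):
    """Index the per-document results of a response by document ID."""
    return {doc["id"]: doc for doc in response.get("documents", [])}


# Placeholder endpoint and key; replace them with the values of your own resource.
url = build_url("https://example-resource.cognitiveservices.azure.com/", "sentiment")
headers = build_headers("<your-key>")
body = build_request_body(["The food was delicious.", "The wait was too long."])
print(url)
print(json.dumps(body, indent=2))

# Hand-written stand-in for a response, used only to show matching by ID.
sample_response = {
    "documents": [
        {"id": "1", "sentiment": "positive"},
        {"id": "2", "sentiment": "negative"},
    ],
    "errors": [],
}
for doc_id, result in results_by_id(sample_response).items():
    print(doc_id, result["sentiment"])
```

To send the request, the URL, headers, and JSON-encoded body can be passed to any HTTP client as a POST; the response body can then be parsed and given to results_by_id.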

Data is not stored in your account. Operations performed by the Text Analytics API are stateless, which means the text you provide is processed and results are returned immediately.

Text Analytics for multiple programming experience levels

You can start using the Text Analytics API in your processes, even if you don't have much experience in programming. Use these tutorials to learn how you can use the API to analyze text in different ways to fit your experience level.

Supported languages

This section has been moved to a separate article for better discoverability. Refer to Supported languages in the Text Analytics API for this content.

Data limits

All of the Text Analytics API endpoints accept raw text data. See the Data limits article for more information.

Unicode encoding

The Text Analytics API uses Unicode encoding for text representation and character count calculations. Requests can be submitted in either UTF-8 or UTF-16, with no measurable difference in the character count. Unicode code points are used as the heuristic for character length and are considered equivalent for the purposes of text analytics data limits. If you use StringInfo.LengthInTextElements to get the character count, you are using the same method we use to measure data size.
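The following sketch illustrates why the encoding does not affect a code-point-based character count: the same text takes a different number of bytes in UTF-8 and UTF-16, but decodes to the same number of code points. The function name code_point_count is hypothetical. The sketch counts code points only and does not reproduce the text-element segmentation performed by .NET's StringInfo.LengthInTextElements, which can give a different number for combining sequences.

```python
def code_point_count(data: bytes, encoding: str) -> int:
    """Decode `data` with `encoding` and count the Unicode code points."""
    return len(data.decode(encoding))


# Escapes are used so the sample contains precomposed characters:
# U+00EF (i with diaeresis), U+00E9 (e with acute), U+1F600 (an emoji).
text = "na\u00efve caf\u00e9 \U0001F600"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# Byte lengths differ between the two encodings.
print(len(utf8_bytes), len(utf16_bytes))

# Code point counts are the same after decoding.
print(code_point_count(utf8_bytes, "utf-8"))
print(code_point_count(utf16_bytes, "utf-16-le"))
```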

Next steps