Quickstart: Using the Python REST API to call the Text Analytics Cognitive Service

Use this quickstart to begin analyzing language with the Text Analytics REST API and Python. This article shows you how to detect language, analyze sentiment, extract key phrases, and identify linked entities.

Tip

For detailed API technical documentation, see the Text Analytics API reference. You can also send POST requests from the reference's built-in API test console; no setup is required, simply paste your resource key and JSON documents into the request.

Prerequisites

  • Python 3.x

  • The Python requests library

    You can install the library with this command:

    pip install --upgrade requests
    

  • A key and endpoint for a Text Analytics resource. Azure Cognitive Services are represented by Azure resources that you subscribe to. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

Create a new Python application

Create a new Python application in your favorite editor or IDE. Add the following imports to your file.

import requests
# pprint is used to format the JSON response
from pprint import pprint

Create variables for your resource's Azure endpoint and subscription key.

subscription_key = "<paste-your-text-analytics-key-here>"
endpoint = "<paste-your-text-analytics-endpoint-here>"
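
If you prefer not to hard-code credentials, you can read them from environment variables instead. This is a minimal sketch; the variable names TEXT_ANALYTICS_KEY and TEXT_ANALYTICS_ENDPOINT are assumptions for illustration, not names the service requires:

import os

# Hypothetical environment variable names; set them in your shell first
subscription_key = os.environ["TEXT_ANALYTICS_KEY"]
endpoint = os.environ["TEXT_ANALYTICS_ENDPOINT"]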

The following sections describe how to call each of the API's features.

Detect languages

Append /text/analytics/v3.0/languages to the Text Analytics base endpoint to form the language detection URL. For example: https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0/languages

language_api_url = endpoint + "/text/analytics/v3.0/languages"

The payload to the API consists of a list of documents, each a dictionary containing an id and a text attribute. The text attribute stores the text to be analyzed, and the id can be any unique string value.

documents = {"documents": [
    {"id": "1", "text": "This is a document written in English."},
    {"id": "2", "text": "Este es un document escrito en Español."},
    {"id": "3", "text": "这是一个用中文写的文件"}
]}

Use the Requests library to send the documents to the API. Add your subscription key to the Ocp-Apim-Subscription-Key header, and send the request with requests.post().

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(language_api_url, headers=headers, json=documents)
languages = response.json()
pprint(languages)
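
If the request fails (for example, because the key or endpoint is wrong), the service returns an HTTP error code and the parsed JSON contains an error object instead of results. An optional guard, using the Requests library's built-in check, is to raise on any 4xx/5xx status before parsing:

# Optional: raise requests.exceptions.HTTPError if the call failed
response.raise_for_status()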

Output

{
    "documents": [
        {
            "id": "1",
            "detectedLanguage": {
                "name": "English",
                "iso6391Name": "en",
                "confidenceScore": 1.0
            },
            "warnings": []
        },
        {
            "id": "2",
            "detectedLanguage": {
                "name": "Spanish",
                "iso6391Name": "es",
                "confidenceScore": 1.0
            },
            "warnings": []
        },
        {
            "id": "3",
            "detectedLanguage": {
                "name": "Chinese_Simplified",
                "iso6391Name": "zh_chs",
                "confidenceScore": 1.0
            },
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2019-10-01"
}
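
You can read the detected language for each document back out of this structure. A minimal sketch that walks the response shown above:

# Print each document's detected language and confidence score
for doc in languages["documents"]:
    detected = doc["detectedLanguage"]
    print(f"Document {doc['id']}: {detected['name']} ({detected['confidenceScore']})")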

Analyze sentiment

To detect the sentiment (positive, neutral, or negative) of a set of documents, append /text/analytics/v3.0/sentiment to the Text Analytics base endpoint to form the sentiment analysis URL. For example: https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0/sentiment

sentiment_url = endpoint + "/text/analytics/v3.0/sentiment"

As with the language detection example, create a dictionary with a documents key that consists of a list of documents. Each document is a dictionary consisting of the id, the text to be analyzed, and the language of the text.

documents = {"documents": [
    {"id": "1", "language": "en",
        "text": "I really enjoy the new XBox One S. It has a clean look, it has 4K/HDR resolution and it is affordable."},
    {"id": "2", "language": "es",
        "text": "Este ha sido un dia terrible, llegué tarde al trabajo debido a un accidente automobilistico."}
]}

Use the Requests library to send the documents to the API. Add your subscription key to the Ocp-Apim-Subscription-Key header, and send the request with requests.post().

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(sentiment_url, headers=headers, json=documents)
sentiments = response.json()
pprint(sentiments)

Output

The API returns a sentiment label (such as positive or negative) for each document and for each sentence within it, along with confidence scores between 0.0 and 1.0 for the positive, neutral, and negative classes. A higher score indicates greater confidence in that class.

{
    "documents": [
        {
            "id": "1",
            "sentiment": "positive",
            "confidenceScores": {
                "positive": 1.0,
                "neutral": 0.0,
                "negative": 0.0
            },
            "sentences": [
                {
                    "sentiment": "positive",
                    "confidenceScores": {
                        "positive": 1.0,
                        "neutral": 0.0,
                        "negative": 0.0
                    },
                    "offset": 0,
                    "length": 102,
                    "text": "I really enjoy the new XBox One S. It has a clean look, it has 4K/HDR resolution and it is affordable."
                }
            ],
            "warnings": []
        },
        {
            "id": "2",
            "sentiment": "negative",
            "confidenceScores": {
                "positive": 0.02,
                "neutral": 0.05,
                "negative": 0.93
            },
            "sentences": [
                {
                    "sentiment": "negative",
                    "confidenceScores": {
                        "positive": 0.02,
                        "neutral": 0.05,
                        "negative": 0.93
                    },
                    "offset": 0,
                    "length": 92,
                    "text": "Este ha sido un dia terrible, llegué tarde al trabajo debido a un accidente automobilistico."
                }
            ],
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2020-04-01"
}
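
To consume this response, you can pull the overall label and confidence scores out of each document. A minimal sketch based on the structure above:

# Print the overall sentiment label and per-class confidence scores
for doc in sentiments["documents"]:
    scores = doc["confidenceScores"]
    print(f"Document {doc['id']}: {doc['sentiment']} "
          f"(positive={scores['positive']}, negative={scores['negative']})")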

Extract key phrases

To extract the key phrases from a set of documents, append /text/analytics/v3.0/keyPhrases to the Text Analytics base endpoint to form the key phrase extraction URL. For example: https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0/keyPhrases

keyphrase_url = endpoint + "/text/analytics/v3.0/keyPhrases"

Create a collection of documents, again in a mix of English and Spanish.

documents = {"documents": [
    {"id": "1", "language": "en",
        "text": "I really enjoy the new XBox One S. It has a clean look, it has 4K/HDR resolution and it is affordable."},
    {"id": "2", "language": "es",
        "text": "Si usted quiere comunicarse con Carlos, usted debe de llamarlo a su telefono movil. Carlos es muy responsable, pero necesita recibir una notificacion si hay algun problema."},
    {"id": "3", "language": "en",
        "text": "The Grand Hotel is a new hotel in the center of Seattle. It earned 5 stars in my review, and has the classiest decor I've ever seen."}
]}

Use the Requests library to send the documents to the API. Add your subscription key to the Ocp-Apim-Subscription-Key header, and send the request with requests.post().

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(keyphrase_url, headers=headers, json=documents)
key_phrases = response.json()
pprint(key_phrases)

Output

{
    "documents": [
        {
            "id": "1",
            "keyPhrases": [
                "HDR resolution",
                "new XBox",
                "clean look"
            ],
            "warnings": []
        },
        {
            "id": "2",
            "keyPhrases": [
                "Carlos",
                "notificacion",
                "algun problema",
                "telefono movil"
            ],
            "warnings": []
        },
        {
            "id": "3",
            "keyPhrases": [
                "new hotel",
                "Grand Hotel",
                "review",
                "center of Seattle",
                "classiest decor",
                "stars"
            ],
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2019-10-01"
}
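
The keyPhrases list for each document can be read directly from the response. A minimal sketch:

# Print the key phrases found in each document
for doc in key_phrases["documents"]:
    print(f"Document {doc['id']}: {', '.join(doc['keyPhrases'])}")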

Identify entities

To identify well-known entities (people, places, and things) in text documents, append /text/analytics/v3.0/entities/recognition/general to the Text Analytics base endpoint to form the entity recognition URL. For example: https://<your-custom-subdomain>.cognitiveservices.azure.com/text/analytics/v3.0/entities/recognition/general

entities_url = endpoint + "/text/analytics/v3.0/entities/recognition/general"

Create a collection of documents, like in the previous examples.

documents = {"documents": [
    {"id": "1", "text": "Microsoft is an It company."}
]}

Use the Requests library to send the documents to the API. Add your subscription key to the Ocp-Apim-Subscription-Key header, and send the request with requests.post().

headers = {"Ocp-Apim-Subscription-Key": subscription_key}
response = requests.post(entities_url, headers=headers, json=documents)
entities = response.json()
pprint(entities)

Output

{
    "documents": [
        {
            "id": "1",
            "entities": [
                {
                    "text": "Microsoft",
                    "category": "Organization",
                    "offset": 0,
                    "length": 9,
                    "confidenceScore": 0.86
                },
                {
                    "text": "IT",
                    "category": "Skill",
                    "offset": 16,
                    "length": 2,
                    "confidenceScore": 0.8
                }
            ],
            "warnings": []
        }
    ],
    "errors": [],
    "modelVersion": "2020-04-01"
}
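
Each recognized entity carries its text, category, position in the source text, and a confidence score. A minimal sketch that walks the response above:

# Print each entity with its category and confidence score
for doc in entities["documents"]:
    for entity in doc["entities"]:
        print(f"{entity['text']} -> {entity['category']} ({entity['confidenceScore']})")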

See also

Text Analytics overview
Frequently asked questions (FAQ)