Quickstart: Use the Text Analytics client library for detecting language

Get started with the Text Analytics client library. Follow these steps to install the package and try out the example code for basic tasks.

Use the Text Analytics client library to perform:

  • Sentiment analysis
  • Language detection
  • Entity recognition
  • Key phrase extraction

Reference documentation | Library source code | Package (NuGet) | Samples

Note

The code in this article uses the synchronous methods of the Text Analytics .NET SDK for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. For example, calling SentimentBatchAsync() instead of Sentiment().
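For example, a batched asynchronous sentiment call might look like the following sketch. This is a minimal illustration rather than a step in the quickstart; it assumes the MultiLanguageInput and MultiLanguageBatchInput models and the SentimentBatchAsync() extension method from version 4.0.0 of the SDK, and the client you create later in this article.

static async Task sentimentBatchExampleAsync(ITextAnalyticsClient client)
{
    // Build a batch of documents; each document carries a language, an ID, and its text.
    var input = new MultiLanguageBatchInput(new List<MultiLanguageInput>
    {
        new MultiLanguageInput { Id = "1", Language = "en", Text = "I had the best day of my life." },
        new MultiLanguageInput { Id = "2", Language = "en", Text = "This was a waste of my time." }
    });

    // Analyze the whole batch with a single asynchronous call.
    var result = await client.SentimentBatchAsync(input);
    foreach (var document in result.Documents)
    {
        Console.WriteLine($"Document {document.Id} sentiment score: {document.Score:0.00}");
    }
}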

Prerequisites

Setting up

Create a Text Analytics Azure resource

Get a key and endpoint to authenticate your applications. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

After you get a key and endpoint from your trial subscription or resource, create two environment variables: one named TEXT_ANALYTICS_SUBSCRIPTION_KEY for your key, and one named TEXT_ANALYTICS_ENDPOINT for your endpoint.
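For example, in a Bash shell you can set both variables for the current session as follows (the values shown are placeholders for your own key and endpoint; on Windows, use setx to persist them):

export TEXT_ANALYTICS_SUBSCRIPTION_KEY="your-key"
export TEXT_ANALYTICS_ENDPOINT="your-endpoint"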

Create a new .NET Core application

In a console window (such as cmd, PowerShell, or Bash), use the dotnet new command to create a new console app named text-analytics-quickstart. This command creates a simple "Hello World" project with a single C# source file: Program.cs.

dotnet new console -n text-analytics-quickstart

Change your directory to the newly created app folder. You can build the application with:

dotnet build

The build output should contain no warnings or errors.

...
Build succeeded.
 0 Warning(s)
 0 Error(s)
...

From the project directory, open the Program.cs file and add the following using directives:

using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.CognitiveServices.Language.TextAnalytics;
using Microsoft.Azure.CognitiveServices.Language.TextAnalytics.Models;
using Microsoft.Rest;

In the application's Program class, create variables for your resource's key and endpoint from the environment variables you created earlier. If you created these environment variables after you began editing the application, you will need to close and reopen the editor, IDE, or shell you are using to access the variables.

Tip

To find your key and endpoint on the Azure portal:

  1. Navigate to your Azure resource at https://portal.azure.com/.
  2. Click on Quick start, located under Resource Management.

private const string key_var = "TEXT_ANALYTICS_SUBSCRIPTION_KEY";
private static readonly string key = Environment.GetEnvironmentVariable(key_var);

private const string endpoint_var = "TEXT_ANALYTICS_ENDPOINT";
private static readonly string endpoint = Environment.GetEnvironmentVariable(endpoint_var);

Replace the application's Main method. You will define the methods called here later.

static void Main(string[] args)
{
    var client = authenticateClient();

    sentimentAnalysisExample(client);
    languageDetectionExample(client);
    entityRecognitionExample(client);
    keyPhraseExtractionExample(client);
    Console.Write("Press any key to exit.");
    Console.ReadKey();
}

Install the client library

Within the application directory, install the Text Analytics client library for .NET with the following command:

dotnet add package Microsoft.Azure.CognitiveServices.Language.TextAnalytics --version 4.0.0

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key, and provides functions to accept text as single strings or as a batch. You can send text to the API synchronously, or asynchronously. The response object will contain the analysis information for each document you send.

Code examples

Authenticate the client

Create a new ApiKeyServiceClientCredentials class to store the credentials and add them to the client's requests. Within it, create an override for ProcessHttpRequestAsync() that adds your key to the Ocp-Apim-Subscription-Key header.

class ApiKeyServiceClientCredentials : ServiceClientCredentials
{
    private readonly string apiKey;

    public ApiKeyServiceClientCredentials(string apiKey)
    {
        this.apiKey = apiKey;
    }

    public override Task ProcessHttpRequestAsync(HttpRequestMessage request, CancellationToken cancellationToken)
    {
        if (request == null)
        {
            throw new ArgumentNullException("request");
        }
        request.Headers.Add("Ocp-Apim-Subscription-Key", this.apiKey);
        return base.ProcessHttpRequestAsync(request, cancellationToken);
    }
}

Create a method to instantiate the TextAnalyticsClient object with your endpoint and an ApiKeyServiceClientCredentials object containing your key.

static TextAnalyticsClient authenticateClient()
{
    ApiKeyServiceClientCredentials credentials = new ApiKeyServiceClientCredentials(key);
    TextAnalyticsClient client = new TextAnalyticsClient(credentials)
    {
        Endpoint = endpoint
    };
    return client;
}

In your program's Main() method, call the authentication method to instantiate the client.

Sentiment analysis

Create a new function called sentimentAnalysisExample() that takes the client that you created earlier, and call its Sentiment() function. The returned SentimentResult object will contain the sentiment Score if successful, and an errorMessage if not.

A score that's close to 0 indicates a negative sentiment, while a score that's closer to 1 indicates a positive sentiment.

static void sentimentAnalysisExample(ITextAnalyticsClient client)
{
    var result = client.Sentiment("I had the best day of my life.", "en");
    Console.WriteLine($"Sentiment Score: {result.Score:0.00}");
}

Output

Sentiment Score: 0.87

Language detection

Create a new function called languageDetectionExample() that takes the client that you created earlier, and call its DetectLanguage() function. The returned LanguageResult object will contain the list of detected languages in DetectedLanguages if successful, and an errorMessage if not. Print the first returned language.

Tip

In some cases, it may be hard to disambiguate the language based on the input. You can use the countryHint parameter to specify a 2-letter country code. By default, the API uses "US" as the countryHint; to remove this behavior, reset the parameter by setting it to an empty string: countryHint = "".
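For example, a minimal sketch that clears the hint on the single-string call, assuming the DetectLanguage() extension exposes the countryHint parameter described above:

var result = client.DetectLanguage("This is a document written in English.", countryHint: "");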

static void languageDetectionExample(ITextAnalyticsClient client)
{
    var result = client.DetectLanguage("This is a document written in English.");
    Console.WriteLine($"Language: {result.DetectedLanguages[0].Name}");
}

Output

Language: English

Entity recognition

Create a new function called entityRecognitionExample() that takes the client that you created earlier, and call its Entities() function. Iterate through the results. The returned EntitiesResult object will contain the list of detected entities in Entities if successful, and an errorMessage if not. For each detected entity, print its name, type, and sub-type (if they exist), as well as its locations in the original text.

static void entityRecognitionExample(ITextAnalyticsClient client)
{

    var result = client.Entities("Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800.");
    Console.WriteLine("Entities:");
    foreach (var entity in result.Entities)
    {
        Console.WriteLine($"\tName: {entity.Name},\tType: {entity.Type ?? "N/A"},\tSub-Type: {entity.SubType ?? "N/A"}");
        foreach (var match in entity.Matches)
        {
            Console.WriteLine($"\t\tOffset: {match.Offset},\tLength: {match.Length},\tScore: {match.EntityTypeScore:F3}");
        }
    }
}

Output

Entities:
    Name: Microsoft,        Type: Organization,     Sub-Type: N/A
        Offset: 0,      Length: 9,      Score: 1.000
    Name: Bill Gates,       Type: Person,   Sub-Type: N/A
        Offset: 25,     Length: 10,     Score: 1.000
    Name: Paul Allen,       Type: Person,   Sub-Type: N/A
        Offset: 40,     Length: 10,     Score: 0.999
    Name: April 4,  Type: Other,    Sub-Type: N/A
        Offset: 54,     Length: 7,      Score: 0.800
    Name: April 4, 1975,    Type: DateTime, Sub-Type: Date
        Offset: 54,     Length: 13,     Score: 0.800
    Name: BASIC,    Type: Other,    Sub-Type: N/A
        Offset: 89,     Length: 5,      Score: 0.800
    Name: Altair 8800,      Type: Other,    Sub-Type: N/A
        Offset: 116,    Length: 11,     Score: 0.800

Key phrase extraction

Create a new function called keyPhraseExtractionExample() that takes the client that you created earlier and call its KeyPhrases() function. The result will contain the list of detected key phrases in KeyPhrases if successful, and an errorMessage if not. Print any detected key phrases.

static void keyPhraseExtractionExample(ITextAnalyticsClient client)
{
    var result = client.KeyPhrases("My cat might need to see a veterinarian.");

    // Printing key phrases
    Console.WriteLine("Key phrases:");

    foreach (string keyphrase in result.KeyPhrases)
    {
        Console.WriteLine($"\t{keyphrase}");
    }
}

Output

Key phrases:
    cat
    veterinarian

Reference documentation | Library source code | Package (PyPI) | Samples

Prerequisites

Setting up

Create a Text Analytics Azure resource

Get a key and endpoint to authenticate your applications. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

After you get a key and endpoint from your trial subscription or resource, create two environment variables: one named TEXT_ANALYTICS_SUBSCRIPTION_KEY for your key, and one named TEXT_ANALYTICS_ENDPOINT for your endpoint.

Install the client library

After installing Python, you can install the client library with:

pip install --upgrade azure-cognitiveservices-language-textanalytics

Create a new Python application

Create a new Python file and import the following libraries.

# -*- coding: utf-8 -*-

import os
from azure.cognitiveservices.language.textanalytics import TextAnalyticsClient
from msrest.authentication import CognitiveServicesCredentials

Create variables for your resource's Azure endpoint and subscription key. Obtain these values from the environment variables TEXT_ANALYTICS_SUBSCRIPTION_KEY and TEXT_ANALYTICS_ENDPOINT. If you created these environment variables after you began editing the application, you will need to close and reopen the editor, IDE, or shell you are using to access the variables.

Tip

To find your key and endpoint on the Azure portal:

  1. Navigate to your Azure resource at https://portal.azure.com/.
  2. Click on Quick start, located under Resource Management.

key_var_name = 'TEXT_ANALYTICS_SUBSCRIPTION_KEY'
if key_var_name not in os.environ:
    raise Exception('Please set/export the environment variable: {}'.format(key_var_name))
subscription_key = os.environ[key_var_name]

endpoint_var_name = 'TEXT_ANALYTICS_ENDPOINT'
if endpoint_var_name not in os.environ:
    raise Exception('Please set/export the environment variable: {}'.format(endpoint_var_name))
endpoint = os.environ[endpoint_var_name]

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.

Code examples

These code snippets show you how to do the following with the Text Analytics client library for Python:

Authenticate the client

Create a new TextAnalyticsClient with your endpoint, and a CognitiveServicesCredentials object containing your key.

def authenticateClient():
    credentials = CognitiveServicesCredentials(subscription_key)
    text_analytics_client = TextAnalyticsClient(
        endpoint=endpoint, credentials=credentials)
    return text_analytics_client

Sentiment analysis

Authenticate a client object, and call the sentiment() function. Iterate through the results, and print each document's ID, and sentiment score. A score closer to 0 indicates a negative sentiment, while a score closer to 1 indicates a positive sentiment.

def sentiment():
    
    client = authenticateClient()

    try:
        documents = [
            {"id": "1", "language": "en", "text": "I had the best day of my life."},
            {"id": "2", "language": "en",
                "text": "This was a waste of my time. The speaker put me to sleep."},
            {"id": "3", "language": "es", "text": "No tengo dinero ni nada que dar..."},
            {"id": "4", "language": "it",
                "text": "L'hotel veneziano era meraviglioso. È un bellissimo pezzo di architettura."}
        ]

        response = client.sentiment(documents=documents)
        for document in response.documents:
            print("Document Id: ", document.id, ", Sentiment Score: ",
                  "{:.2f}".format(document.score))

    except Exception as err:
        print("Encountered exception. {}".format(err))
sentiment()

Output

Document ID: 1 , Sentiment Score: 0.87
Document ID: 2 , Sentiment Score: 0.11
Document ID: 3 , Sentiment Score: 0.44
Document ID: 4 , Sentiment Score: 1.00

Language detection

Using the client created earlier, call detect_language() and get the result. Then iterate through the results, and print each document's ID, and the first returned language.

def language_detection():
    client = authenticateClient()

    try:
        documents = [
            {'id': '1', 'text': 'This is a document written in English.'},
            {'id': '2', 'text': 'Este es un document escrito en Español.'},
            {'id': '3', 'text': '这是一个用中文写的文件'}
        ]
        response = client.detect_language(documents=documents)

        for document in response.documents:
            print("Document Id: ", document.id, ", Language: ",
                  document.detected_languages[0].name)

    except Exception as err:
        print("Encountered exception. {}".format(err))
language_detection()

Output

Document ID: 1 , Language: English
Document ID: 2 , Language: Spanish
Document ID: 3 , Language: Chinese_Simplified

Entity recognition

Using the client created earlier, call the entities() function and get the result. Then iterate through the results, and print each document's ID, and the entities contained in it.

def entity_recognition():
    
    client = authenticateClient()

    try:
        documents = [
            {"id": "1", "language": "en", "text": "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800."},
            {"id": "2", "language": "es",
                "text": "La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle."}
        ]
        response = client.entities(documents=documents)

        for document in response.documents:
            print("Document Id: ", document.id)
            print("\tKey Entities:")
            for entity in document.entities:
                print("\t\t", "NAME: ", entity.name, "\tType: ",
                      entity.type, "\tSub-type: ", entity.sub_type)
                for match in entity.matches:
                    print("\t\t\tOffset: ", match.offset, "\tLength: ", match.length, "\tScore: ",
                          "{:.2f}".format(match.entity_type_score))

    except Exception as err:
        print("Encountered exception. {}".format(err))
entity_recognition()

Output

Document ID: 1
        Name: Microsoft,        Type: Organization,     Sub-Type: N/A
        Offset: 0, Length: 9,   Score: 1.0

        Name: Bill Gates,       Type: Person,   Sub-Type: N/A
        Offset: 25, Length: 10, Score: 0.999847412109375

        Name: Paul Allen,       Type: Person,   Sub-Type: N/A
        Offset: 40, Length: 10, Score: 0.9988409876823425

        Name: April 4,  Type: Other,    Sub-Type: N/A
        Offset: 54, Length: 7,  Score: 0.8

        Name: April 4, 1975,    Type: DateTime, Sub-Type: Date
        Offset: 54, Length: 13, Score: 0.8

        Name: BASIC,    Type: Other,    Sub-Type: N/A
        Offset: 89, Length: 5,  Score: 0.8

        Name: Altair 8800,      Type: Other,    Sub-Type: N/A
        Offset: 116, Length: 11,        Score: 0.8

Document ID: 2
        Name: Microsoft,        Type: Organization,     Sub-Type: N/A
        Offset: 21, Length: 9,  Score: 0.999755859375

        Name: Redmond (Washington),     Type: Location, Sub-Type: N/A
        Offset: 60, Length: 7,  Score: 0.9911284446716309

        Name: 21 kilómetros,    Type: Quantity, Sub-Type: Dimension
        Offset: 71, Length: 13, Score: 0.8

        Name: Seattle,  Type: Location, Sub-Type: N/A
        Offset: 88, Length: 7,  Score: 0.9998779296875

Key phrase extraction

Using the client created earlier, call the key_phrases() function and get the result. Then iterate through the results, and print each document's ID, and the key phrases contained in it.

def key_phrases():
    
    client = authenticateClient()

    try:
        documents = [
            {"id": "1", "language": "ja", "text": "猫は幸せ"},
            {"id": "2", "language": "de",
                "text": "Fahrt nach Stuttgart und dann zum Hotel zu Fu."},
            {"id": "3", "language": "en",
                "text": "My cat might need to see a veterinarian."},
            {"id": "4", "language": "es", "text": "A mi me encanta el fútbol!"}
        ]

        for document in documents:
            print(
                "Asking key-phrases on '{}' (id: {})".format(document['text'], document['id']))

        response = client.key_phrases(documents=documents)

        for document in response.documents:
            print("Document Id: ", document.id)
            print("\tKey Phrases:")
            for phrase in document.key_phrases:
                print("\t\t", phrase)

    except Exception as err:
        print("Encountered exception. {}".format(err))
key_phrases()

Output

Document ID: 1
         Key phrases:
                幸せ
Document ID: 2
         Key phrases:
                Stuttgart
                Hotel
                Fahrt
                Fu
Document ID: 3
         Key phrases:
                cat
                veterinarian
Document ID: 4
         Key phrases:
                fútbol

Reference documentation | Library source code | Package (NPM) | Samples

Prerequisites

Setting up

Create a Text Analytics Azure resource

Get a key and endpoint to authenticate your applications. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

After you get a key and endpoint from your trial subscription or resource, create two environment variables: one named TEXT_ANALYTICS_SUBSCRIPTION_KEY for your key, and one named TEXT_ANALYTICS_ENDPOINT for your endpoint.

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Create a file named index.js and add the following libraries:

"use strict";

const os = require("os");
const CognitiveServicesCredentials = require("@azure/ms-rest-js");
const TextAnalyticsAPIClient = require("@azure/cognitiveservices-textanalytics");

Create variables for your resource's Azure endpoint and subscription key. Obtain these values from the environment variables TEXT_ANALYTICS_SUBSCRIPTION_KEY and TEXT_ANALYTICS_ENDPOINT. If you created these environment variables after you began editing the application, you will need to close and reopen the editor, IDE, or shell you are using to access the variables.

Tip

To find your key and endpoint on the Azure portal:

  1. Navigate to your Azure resource at https://portal.azure.com/.
  2. Click on Quick start, located under Resource Management.

const key_var = 'TEXT_ANALYTICS_SUBSCRIPTION_KEY';
if (!process.env[key_var]) {
    throw new Error('please set/export the following environment variable: ' + key_var);
}
const subscription_key = process.env[key_var];

const endpoint_var = 'TEXT_ANALYTICS_ENDPOINT';
if (!process.env[endpoint_var]) {
    throw new Error('please set/export the following environment variable: ' + endpoint_var);
}
const endpoint = process.env[endpoint_var];

Install the client library

Install the @azure/ms-rest-js and @azure/cognitiveservices-textanalytics NPM packages:

npm install @azure/cognitiveservices-textanalytics @azure/ms-rest-js

Your app's package.json file will be updated with the dependencies.

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.

Code examples

Authenticate the client

Create a new TextAnalyticsClient object with the credentials and endpoint as parameters.

const creds = new CognitiveServicesCredentials.ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': subscription_key } });
const textAnalyticsClient = new TextAnalyticsAPIClient.TextAnalyticsClient(creds, endpoint);

Sentiment analysis

Create a list of dictionary objects, containing the documents you want to analyze. Call the client's sentiment() method and get the returned SentimentBatchResult. Iterate through the list of results, and print each document's ID and sentiment score. A score closer to 0 indicates a negative sentiment, while a score closer to 1 indicates a positive sentiment.

async function sentimentAnalysis(client){

    console.log("3. This will perform sentiment analysis on the sentences.");

    const sentimentInput = {
        documents: [
            { language: "en", id: "1", text: "I had the best day of my life." },
            {
                language: "en",
                id: "2",
                text: "This was a waste of my time. The speaker put me to sleep."
            },
            {
                language: "es",
                id: "3",
                text: "No tengo dinero ni nada que dar..."
            },
            {
                language: "it",
                id: "4",
                text:
                    "L'hotel veneziano era meraviglioso. È un bellissimo pezzo di architettura."
            }
        ]
    };

    const sentimentResult = await client.sentiment({
        multiLanguageBatchInput: sentimentInput
    });
    console.log(sentimentResult.documents);
    console.log(os.EOL);
}
sentimentAnalysis(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

[ { id: '1', score: 0.87 },
  { id: '2', score: 0.11 },
  { id: '3', score: 0.44 },
  { id: '4', score: 1.00 } ]

Language detection

Create a list of dictionary objects containing your documents. Call the client's detectLanguage() method and get the returned LanguageBatchResult. Then iterate through the results, and print each document's ID, and language.

async function languageDetection(client) {

    console.log("1. This will detect the languages of the inputs.");
    const languageInput = {
        documents: [
            { id: "1", text: "This is a document written in English." },
            { id: "2", text: "Este es un document escrito en Español." },
            { id: "3", text: "这是一个用中文写的文件" }
        ]
    };

    const languageResult = await client.detectLanguage({
        languageBatchInput: languageInput
    });

    languageResult.documents.forEach(document => {
        console.log(`ID: ${document.id}`);
        document.detectedLanguages.forEach(language =>
            console.log(`\tLanguage ${language.name}`)
        );
    });
    console.log(os.EOL);
}
languageDetection(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 1
        Language English
ID: 2
        Language Spanish
ID: 3
        Language Chinese_Simplified

Entity recognition

Create a list of objects, containing your documents. Call the client's entities() method and get the EntitiesBatchResult object. Iterate through the list of results, and print each document's ID. For each detected entity, print its name, type, and sub-type (if they exist), as well as its locations in the original text.

async function entityRecognition(client){
    console.log("3. This will perform Entity recognition on the sentences.");

    const entityInputs = {
        documents: [
            {
                language: "en",
                id: "1",
                text:
                    "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800"
            },
            {
                language: "es",
                id: "2",
                text:
                    "La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle."
            }
        ]
    };

    const entityResults = await client.entities({
        multiLanguageBatchInput: entityInputs
    });

    entityResults.documents.forEach(document => {
        console.log(`Document ID: ${document.id}`);
        document.entities.forEach(e => {
            console.log(`\tName: ${e.name} Type: ${e.type} Sub Type: ${e.subType}`);
            e.matches.forEach(match =>
                console.log(
                    `\t\tOffset: ${match.offset} Length: ${match.length} Score: ${
                    match.entityTypeScore
                    }`
                )
            );
        });
    });

    console.log(os.EOL);
}
entityRecognition(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

Document ID: 1
    Name: Microsoft,        Type: Organization,     Sub-Type: N/A
    Offset: 0, Length: 9,   Score: 1.0
    Name: Bill Gates,       Type: Person,   Sub-Type: N/A
    Offset: 25, Length: 10, Score: 0.999847412109375
    Name: Paul Allen,       Type: Person,   Sub-Type: N/A
    Offset: 40, Length: 10, Score: 0.9988409876823425
    Name: April 4,  Type: Other,    Sub-Type: N/A
    Offset: 54, Length: 7,  Score: 0.8
    Name: April 4, 1975,    Type: DateTime, Sub-Type: Date
    Offset: 54, Length: 13, Score: 0.8
    Name: BASIC,    Type: Other,    Sub-Type: N/A
    Offset: 89, Length: 5,  Score: 0.8
    Name: Altair 8800,      Type: Other,    Sub-Type: N/A
    Offset: 116, Length: 11,        Score: 0.8

Document ID: 2
    Name: Microsoft,        Type: Organization,     Sub-Type: N/A
    Offset: 21, Length: 9,  Score: 0.999755859375
    Name: Redmond (Washington),     Type: Location, Sub-Type: N/A
    Offset: 60, Length: 7,  Score: 0.9911284446716309
    Name: 21 kilómetros,    Type: Quantity, Sub-Type: Dimension
    Offset: 71, Length: 13, Score: 0.8
    Name: Seattle,  Type: Location, Sub-Type: N/A
    Offset: 88, Length: 7,  Score: 0.9998779296875

Key phrase extraction

Create a list of objects, containing your documents. Call the client's keyPhrases() method and get the returned KeyPhraseBatchResult object. Iterate through the results and print each document's ID, and any detected key phrases.

async function keyPhraseExtraction(client){

    console.log("2. This will extract key phrases from the sentences.");
    const keyPhrasesInput = {
        documents: [
            { language: "ja", id: "1", text: "猫は幸せ" },
            {
                language: "de",
                id: "2",
                text: "Fahrt nach Stuttgart und dann zum Hotel zu Fu."
            },
            {
                language: "en",
                id: "3",
                text: "My cat might need to see a veterinarian."
            },
            { language: "es", id: "4", text: "A mi me encanta el fútbol!" }
        ]
    };

    const keyPhraseResult = await client.keyPhrases({
        multiLanguageBatchInput: keyPhrasesInput
    });
    console.log(keyPhraseResult.documents);
    console.log(os.EOL);
}
keyPhraseExtraction(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

[ { id: '1', keyPhrases: [ '幸せ' ] },
  { id: '2', keyPhrases: [ 'Stuttgart', 'Hotel', 'Fahrt', 'Fu' ] },
  { id: '3', keyPhrases: [ 'cat', 'veterinarian' ] },
  { id: '4', keyPhrases: [ 'fútbol' ] } ]

Run the application

Run the application with the node command on your quickstart file.

node index.js

Reference documentation | Library source code | Package (GitHub) | Samples

Prerequisites

Setting up

Create a Text Analytics Azure resource

Get a key and endpoint to authenticate your applications. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

After you get a key and endpoint from your trial subscription or resource, create two environment variables: one named TEXT_ANALYTICS_SUBSCRIPTION_KEY for your key, and one named TEXT_ANALYTICS_ENDPOINT for your endpoint.

Create a new Go project

In a console window (cmd, PowerShell, Terminal, Bash), create a new workspace for your Go project and navigate to it. Your workspace will contain three folders:

  • src - This directory contains source code and packages. Any packages installed with the go get command will reside here.
  • pkg - This directory contains the compiled Go package objects. These files all have an .a extension.
  • bin - This directory contains the binary executable files that are created when you run go install.

Tip

Learn more about the structure of a Go workspace. This guide includes information for setting $GOPATH and $GOROOT.

Create a workspace called my-app and the required sub directories for src, pkg, and bin:

$ mkdir -p my-app/{src,bin,pkg}
$ cd my-app

Install the Text Analytics client library for Go

Install the client library for Go:

$ go get -u github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/textanalytics

or if you use dep, within your repo run:

$ dep ensure -add github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/textanalytics

Create your Go application

Next, create a file named src/quickstart.go:

$ cd src
$ touch quickstart.go

Open quickstart.go in your favorite IDE or text editor. Then add the package name and import the following libraries:

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "log"
    "os"

    "github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/textanalytics"
    "github.com/Azure/go-autorest/autorest"
    "github.com/Azure/go-autorest/autorest/to"
)

Object model

The Text Analytics client is a BaseClient object that authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.

Code examples

These code snippets show you how to do the following with the Text Analytics client library for Go:

Authenticate the client

In a new function, create variables for your resource's Azure endpoint and subscription key. Obtain these values from the environment variables TEXT_ANALYTICS_SUBSCRIPTION_KEY and TEXT_ANALYTICS_ENDPOINT. If you created these environment variables after you began editing the application, you will need to close and reopen the editor, IDE, or shell you are using to access the variables.

Tip

To find your key and endpoint on the Azure portal:

  1. Navigate to your Azure resource at https://portal.azure.com/.
  2. Click on Quick start, located under Resource Management.

Create a new BaseClient object. Pass your key to the autorest.NewCognitiveServicesAuthorizer() function, and assign the result to the client's Authorizer property.

func GetTextAnalyticsClient() textanalytics.BaseClient {
    var subscriptionKeyVar string = "TEXT_ANALYTICS_SUBSCRIPTION_KEY"
    if "" == os.Getenv(subscriptionKeyVar) {
        log.Fatal("Please set/export the environment variable " + subscriptionKeyVar + ".")
    }
    var subscriptionKey string = os.Getenv(subscriptionKeyVar)
    var endpointVar string = "TEXT_ANALYTICS_ENDPOINT"
    if "" == os.Getenv(endpointVar) {
        log.Fatal("Please set/export the environment variable " + endpointVar + ".")
    }
    var endpoint string = os.Getenv(endpointVar)

    textAnalyticsClient := textanalytics.New(endpoint)
    textAnalyticsClient.Authorizer = autorest.NewCognitiveServicesAuthorizer(subscriptionKey)

    return textAnalyticsClient
}

Sentiment analysis

Create a new function called SentimentAnalysis() and create a client using the GetTextAnalyticsClient() method created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, a language, and a text attribute. The text attribute stores the text to be analyzed, language is the language of the document, and the id can be any value.

Call the client's Sentiment() function and get the result. Then iterate through the results, and print each document's ID, and sentiment score. A score closer to 0 indicates a negative sentiment, while a score closer to 1 indicates a positive sentiment.

func SentimentAnalysis() {
    textAnalyticsclient := GetTextAnalyticsClient()
    ctx := context.Background()
    inputDocuments := []textanalytics.MultiLanguageInput{
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("en"),
            ID:       to.StringPtr("0"),
            Text:     to.StringPtr("I had the best day of my life."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("en"),
            ID:       to.StringPtr("1"),
            Text:     to.StringPtr("This was a waste of my time. The speaker put me to sleep."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("es"),
            ID:       to.StringPtr("2"),
            Text:     to.StringPtr("No tengo dinero ni nada que dar..."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("it"),
            ID:       to.StringPtr("3"),
            Text:     to.StringPtr("L'hotel veneziano era meraviglioso. È un bellissimo pezzo di architettura."),
        },
    }

    batchInput := textanalytics.MultiLanguageBatchInput{Documents: &inputDocuments}
    result, _ := textAnalyticsclient.Sentiment(ctx, to.BoolPtr(false), &batchInput)
    batchResult := textanalytics.SentimentBatchResult{}
    jsonString, _ := json.Marshal(result.Value)
    json.Unmarshal(jsonString, &batchResult)

    // Printing sentiment results
    for _, document := range *batchResult.Documents {
        fmt.Printf("Document ID: %s ", *document.ID)
        fmt.Printf("Sentiment Score: %f\n", *document.Score)
    }

    // Printing document errors
    fmt.Println("Document Errors")
    for _, error := range *batchResult.Errors {
        fmt.Printf("Document ID: %s Message : %s\n", *error.ID, *error.Message)
    }
}

Call SentimentAnalysis() in your project.
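The quickstart doesn't define an entry point for you. As a minimal sketch, you could invoke SentimentAnalysis(), together with the functions created in the following sections, from a main() function in quickstart.go:

func main() {
    SentimentAnalysis()
    DetectLanguage()
    ExtractEntities()
    ExtractKeyPhrases()
}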

Output

Document ID: 0 Sentiment Score: 0.870000
Document ID: 1 Sentiment Score: 0.110000
Document ID: 2 Sentiment Score: 0.440000
Document ID: 3 Sentiment Score: 1.000000

Language detection

Create a new function called DetectLanguage() and create a client using the GetTextAnalyticsClient() method created earlier. Create a list of LanguageInput objects, containing the documents you want to analyze. Each object will contain an id and a text attribute. The text attribute stores the text to be analyzed, and the id can be any value.

Call the client's DetectLanguage() function and get the result. Then iterate through the results, and print each document's ID, and detected language.

func DetectLanguage() {
    textAnalyticsclient := GetTextAnalyticsClient()
    ctx := context.Background()
    inputDocuments := []textanalytics.LanguageInput{
        textanalytics.LanguageInput{
            ID:   to.StringPtr("0"),
            Text: to.StringPtr("This is a document written in English."),
        },
        textanalytics.LanguageInput{
            ID:   to.StringPtr("1"),
            Text: to.StringPtr("Este es un document escrito en Español."),
        },
        textanalytics.LanguageInput{
            ID:   to.StringPtr("2"),
            Text: to.StringPtr("这是一个用中文写的文件"),
        },
    }

    batchInput := textanalytics.LanguageBatchInput{Documents: &inputDocuments}
    result, _ := textAnalyticsclient.DetectLanguage(ctx, to.BoolPtr(false), &batchInput)

    // Printing language detection results
    for _, document := range *result.Documents {
        fmt.Printf("Document ID: %s ", *document.ID)
        fmt.Printf("Detected Languages with Score: ")
        for _, language := range *document.DetectedLanguages {
            fmt.Printf("%s %f,", *language.Name, *language.Score)
        }
        fmt.Println()
    }

    // Printing document errors
    fmt.Println("Document Errors")
    for _, error := range *result.Errors {
        fmt.Printf("Document ID: %s Message : %s\n", *error.ID, *error.Message)
    }
}

Call DetectLanguage() in your project.

Output

Document ID: 0 , Language: English 
Document ID: 1 , Language: Spanish
Document ID: 2 , Language: Chinese_Simplified

Entity recognition

Create a new function called ExtractEntities() and create a client using the GetTextAnalyticsClient() method created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, language, and a text attribute. The text attribute stores the text to be analyzed, language is the language of the document, and the id can be any value.

Call the client's Entities() function and get the result. Then iterate through the results, and print each document's ID and the detected entities with their scores.

func ExtractEntities() {
    textAnalyticsclient := GetTextAnalyticsClient()
    ctx := context.Background()
    inputDocuments := []textanalytics.MultiLanguageInput{
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("en"),
            ID:       to.StringPtr("0"),
            Text:     to.StringPtr("Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("es"),
            ID:       to.StringPtr("1"),
            Text:     to.StringPtr("La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle."),
        },
    }

    batchInput := textanalytics.MultiLanguageBatchInput{Documents: &inputDocuments}
    result, _ := textAnalyticsclient.Entities(ctx, to.BoolPtr(false), &batchInput)

    // Printing extracted entities results
    for _, document := range *result.Documents {
        fmt.Printf("Document ID: %s\n", *document.ID)
        fmt.Printf("\tExtracted Entities:\n")
        for _, entity := range *document.Entities {
            fmt.Printf("\t\tName: %s\tType: %s", *entity.Name, *entity.Type)
            if entity.SubType != nil {
                fmt.Printf("\tSub-Type: %s\n", *entity.SubType)
            }
            fmt.Println()
            for _, match := range *entity.Matches {
                fmt.Printf("\t\t\tOffset: %v\tLength: %v\tScore: %f\n", *match.Offset, *match.Length, *match.EntityTypeScore)
            }
        }
        fmt.Println()
    }

    // Printing document errors
    fmt.Println("Document Errors")
    for _, error := range *result.Errors {
        fmt.Printf("Document ID: %s Message : %s\n", *error.ID, *error.Message)
    }
}

Call ExtractEntities() in your project.

Output

Document ID: 0
    Name: Microsoft,        Type: Organization,     Sub-Type: N/A
    Offset: 0, Length: 9,   Score: 1.0
    Name: Bill Gates,       Type: Person,   Sub-Type: N/A
    Offset: 25, Length: 10, Score: 0.999847412109375
    Name: Paul Allen,       Type: Person,   Sub-Type: N/A
    Offset: 40, Length: 10, Score: 0.9988409876823425
    Name: April 4,  Type: Other,    Sub-Type: N/A
    Offset: 54, Length: 7,  Score: 0.8
    Name: April 4, 1975,    Type: DateTime, Sub-Type: Date
    Offset: 54, Length: 13, Score: 0.8
    Name: BASIC,    Type: Other,    Sub-Type: N/A
    Offset: 89, Length: 5,  Score: 0.8
    Name: Altair 8800,      Type: Other,    Sub-Type: N/A
    Offset: 116, Length: 11,        Score: 0.8

Document ID: 1
    Name: Microsoft,        Type: Organization,     Sub-Type: N/A
    Offset: 21, Length: 9,  Score: 0.999755859375
    Name: Redmond (Washington),     Type: Location, Sub-Type: N/A
    Offset: 60, Length: 7,  Score: 0.9911284446716309
    Name: 21 kilómetros,    Type: Quantity, Sub-Type: Dimension
    Offset: 71, Length: 13, Score: 0.8
    Name: Seattle,  Type: Location, Sub-Type: N/A
    Offset: 88, Length: 7,  Score: 0.9998779296875

Key phrase extraction

Create a new function called ExtractKeyPhrases() and create a client using the GetTextAnalyticsClient() method created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, language, and a text attribute. The text attribute stores the text to be analyzed, language is the language of the document, and the id can be any value.

Call the client's KeyPhrases() and get the result. Then iterate through the results, and print each document's ID, and extracted key phrases.

func ExtractKeyPhrases() {
    textAnalyticsclient := GetTextAnalyticsClient()
    ctx := context.Background()
    inputDocuments := []textanalytics.MultiLanguageInput{
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("ja"),
            ID:       to.StringPtr("0"),
            Text:     to.StringPtr("猫は幸せ"),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("de"),
            ID:       to.StringPtr("1"),
            Text:     to.StringPtr("Fahrt nach Stuttgart und dann zum Hotel zu Fu."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("en"),
            ID:       to.StringPtr("2"),
            Text:     to.StringPtr("My cat might need to see a veterinarian."),
        },
        textanalytics.MultiLanguageInput{
            Language: to.StringPtr("es"),
            ID:       to.StringPtr("3"),
            Text:     to.StringPtr("A mi me encanta el fútbol!"),
        },
    }

    batchInput := textanalytics.MultiLanguageBatchInput{Documents: &inputDocuments}
    result, _ := textAnalyticsclient.KeyPhrases(ctx, to.BoolPtr(false), &batchInput)

    // Printing extracted key phrases results
    for _, document := range *result.Documents {
        fmt.Printf("Document ID: %s\n", *document.ID)
        fmt.Printf("\tExtracted Key Phrases:\n")
        for _, keyPhrase := range *document.KeyPhrases {
            fmt.Printf("\t\t%s\n", keyPhrase)
        }
        fmt.Println()
    }

    // Printing document errors
    fmt.Println("Document Errors")
    for _, error := range *result.Errors {
        fmt.Printf("Document ID: %s Message : %s\n", *error.ID, *error.Message)
    }
}

Call ExtractKeyPhrases() in your project.

Output

Document ID: 0
        Extracted Key Phrases:
                幸せ

Document ID: 1
        Extracted Key Phrases:
                Stuttgart
                Hotel
                Fahrt
                Fu

Document ID: 2
        Extracted Key Phrases:
                cat
                veterinarian

Document ID: 3
        Extracted Key Phrases:
                fútbol

Reference documentation | Library source code | Package (RubyGems) | Samples

Prerequisites

Setting up

Create a Text Analytics Azure resource

Get a key and endpoint to authenticate your applications. Create a resource for Text Analytics using the Azure portal or the Azure CLI on your local machine.

After you get a key and endpoint from your trial subscription or resource, create two environment variables: one named TEXT_ANALYTICS_SUBSCRIPTION_KEY for your key, and one named TEXT_ANALYTICS_ENDPOINT for your endpoint.

Create a new Ruby application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it. Then create a file named Gemfile, and a Ruby file for your code.

mkdir myapp && cd myapp

In your Gemfile, add the following lines to add the client library as a dependency.

source 'https://rubygems.org'
gem 'azure_cognitiveservices_textanalytics', '~>0.17.3'
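Then, from your application directory, install the client library and its dependencies with bundler:

bundle install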

In your Ruby file, import the following packages.

require 'azure_cognitiveservices_textanalytics'
include Azure::CognitiveServices::TextAnalytics::V2_1::Models

Create variables for your resource's Azure endpoint and subscription key, read from the environment variables TEXT_ANALYTICS_ENDPOINT and TEXT_ANALYTICS_SUBSCRIPTION_KEY. If you created these environment variables after you launched the application, you will need to close and reopen the editor, IDE, or shell running it to access the variables.

Tip

To find your key and endpoint on the Azure portal:

  1. Navigate to your Azure resource at https://portal.azure.com/.
  2. Click on Quick start, located under Resource Management.

key_var = "TEXT_ANALYTICS_SUBSCRIPTION_KEY"
if (!ENV[key_var])
    raise "Please set/export the following environment variable: " + key_var
else
    subscription_key = ENV[key_var]
end

endpoint_var = "TEXT_ANALYTICS_ENDPOINT"
if (!ENV[endpoint_var])
    raise "Please set/export the following environment variable: " + endpoint_var
else
    endpoint = ENV[endpoint_var]
end

Object model

The Text Analytics client authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.

Code examples

These code snippets show you how to do the following with the Text Analytics client library for Ruby:

Authenticate the client

Create a class named TextAnalyticsClient.

class TextAnalyticsClient
  @textAnalyticsClient
  #...
end

In this class, create a function called initialize to authenticate the client. Use your TEXT_ANALYTICS_SUBSCRIPTION_KEY and TEXT_ANALYTICS_ENDPOINT environment variables.

def initialize(endpoint, key)
  credentials =
      MsRestAzure::CognitiveServicesCredentials.new(key)

  endpoint = String.new(endpoint)

  @textAnalyticsClient = Azure::TextAnalytics::Profiles::Latest::Client.new({
      credentials: credentials
  })
  @textAnalyticsClient.endpoint = endpoint
end

Outside of the class, use the class's new() method to instantiate the client with your endpoint and key.

client = TextAnalyticsClient.new(endpoint, subscription_key)

Sentiment analysis

In the client object, create a function called AnalyzeSentiment() that takes a list of input documents that will be created later. Call the client's sentiment() function and get the result. Then iterate through the results, and print each document's ID, and sentiment score. A score closer to 0 indicates a negative sentiment, while a score closer to 1 indicates a positive sentiment.

def AnalyzeSentiment(inputDocuments)
  result = @textAnalyticsClient.sentiment(
      multi_language_batch_input: inputDocuments
  )

  if (!result.nil? && !result.documents.nil? && result.documents.length > 0)
    puts '===== SENTIMENT ANALYSIS ====='
    result.documents.each do |document|
      puts "Document Id: #{document.id}: Sentiment Score: #{document.score}"
    end
  end
  puts ''
end

Outside of the client function, create a new function called SentimentAnalysisExample() that takes the TextAnalyticsClient object created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, Language and a text attribute. The text attribute stores the text to be analyzed, language is the language of the document, and the id can be any value. Then call the client's AnalyzeSentiment() function.

def SentimentAnalysisExample(client)
  # The documents to be analyzed. Add the language of the document. The ID can be any value.
  input_1 = MultiLanguageInput.new
  input_1.id = '1'
  input_1.language = 'en'
  input_1.text = 'I had the best day of my life.'

  input_2 = MultiLanguageInput.new
  input_2.id = '2'
  input_2.language = 'en'
  input_2.text = 'This was a waste of my time. The speaker put me to sleep.'

  input_3 = MultiLanguageInput.new
  input_3.id = '3'
  input_3.language = 'es'
  input_3.text = 'No tengo dinero ni nada que dar...'

  input_4 = MultiLanguageInput.new
  input_4.id = '4'
  input_4.language = 'it'
  input_4.text = "L'hotel veneziano era meraviglioso. È un bellissimo pezzo di architettura."

  inputDocuments =  MultiLanguageBatchInput.new
  inputDocuments.documents = [input_1, input_2, input_3, input_4]

  client.AnalyzeSentiment(inputDocuments)
end

Call the SentimentAnalysisExample() function.

SentimentAnalysisExample(client)

Output

===== SENTIMENT ANALYSIS =====
Document ID: 1 , Sentiment Score: 0.87
Document ID: 2 , Sentiment Score: 0.11
Document ID: 3 , Sentiment Score: 0.44
Document ID: 4 , Sentiment Score: 1.00

Language detection

In the client object, create a function called DetectLanguage() that takes a list of input documents that will be created later. Call the client's detect_language() function and get the result. Then iterate through the results, and print each document's ID, and detected language.

def DetectLanguage(inputDocuments)
  result = @textAnalyticsClient.detect_language(
      language_batch_input: inputDocuments
  )

  if (!result.nil? && !result.documents.nil? && result.documents.length > 0)
    puts '===== LANGUAGE DETECTION ====='
    result.documents.each do |document|
      puts "Document ID: #{document.id} , Language: #{document.detected_languages[0].name}"
    end
  else
    puts 'No results data..'
  end
  puts ''
end

Outside of the client function, create a new function called DetectLanguageExample() that takes the TextAnalyticsClient object created earlier. Create a list of LanguageInput objects, containing the documents you want to analyze. Each object will contain an id, and a text attribute. The text attribute stores the text to be analyzed, and the id can be any value. Then call the client's DetectLanguage() function.

def DetectLanguageExample(client)
 # The documents to be analyzed.
 language_input_1 = LanguageInput.new
 language_input_1.id = '1'
 language_input_1.text = 'This is a document written in English.'

 language_input_2 = LanguageInput.new
 language_input_2.id = '2'
 language_input_2.text = 'Este es un document escrito en Español..'

 language_input_3 = LanguageInput.new
 language_input_3.id = '3'
 language_input_3.text = '这是一个用中文写的文件'

 language_batch_input = LanguageBatchInput.new
 language_batch_input.documents = [language_input_1, language_input_2, language_input_3]

 client.DetectLanguage(language_batch_input)
end

Call the DetectLanguageExample() function.

DetectLanguageExample(client)

Output

===== LANGUAGE DETECTION =====
Document ID: 1 , Language: English
Document ID: 2 , Language: Spanish
Document ID: 3 , Language: Chinese_Simplified

Entity recognition

In the client object, create a function called RecognizeEntities() that takes a list of input documents that will be created later. Call the client's entities() function and get the result. Then iterate through the results, and print each document's ID, and the recognized entities.

def RecognizeEntities(inputDocuments)
  result = @textAnalyticsClient.entities(
      multi_language_batch_input: inputDocuments
  )

  if (!result.nil? && !result.documents.nil? && result.documents.length > 0)
    puts '===== ENTITY RECOGNITION ====='
    result.documents.each do |document|
      puts "Document ID: #{document.id}"
        document.entities.each do |entity|
          puts "\tName: #{entity.name}, \tType: #{entity.type == nil ? "N/A": entity.type},\tSub-Type: #{entity.sub_type == nil ? "N/A": entity.sub_type}"
          entity.matches.each do |match|
            puts "\tOffset: #{match.offset}, \Length: #{match.length},\tScore: #{match.entity_type_score}"
          end
          puts
        end
    end
  else
    puts 'No results data..'
  end
  puts ''
end

Outside of the client function, create a new function called RecognizeEntitiesExample() that takes the TextAnalyticsClient object created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, a language, and a text attribute. The text attribute stores the text to be analyzed, language is the language of the text, and the id can be any value. Then call the client's RecognizeEntities() function.

def RecognizeEntitiesExample(client)
  # The documents to be analyzed.
  input_1 = MultiLanguageInput.new
  input_1.id = '1'
  input_1.language = 'en'
  input_1.text = 'Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800.'

  input_2 = MultiLanguageInput.new
  input_2.id = '2'
  input_2.language = 'es'
  input_2.text = 'La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle.'

  multi_language_batch_input =  MultiLanguageBatchInput.new
  multi_language_batch_input.documents = [input_1, input_2]

  client.RecognizeEntities(multi_language_batch_input)
end

Call the RecognizeEntitiesExample() function.

RecognizeEntitiesExample(client)

Output

===== ENTITY RECOGNITION =====
Document ID: 1
        Name: Microsoft,        Type: Organization,     Sub-Type: N/A
        Offset: 0, Length: 9,   Score: 1.0

        Name: Bill Gates,       Type: Person,   Sub-Type: N/A
        Offset: 25, Length: 10, Score: 0.999847412109375

        Name: Paul Allen,       Type: Person,   Sub-Type: N/A
        Offset: 40, Length: 10, Score: 0.9988409876823425

        Name: April 4,  Type: Other,    Sub-Type: N/A
        Offset: 54, Length: 7,  Score: 0.8

        Name: April 4, 1975,    Type: DateTime, Sub-Type: Date
        Offset: 54, Length: 13, Score: 0.8

        Name: BASIC,    Type: Other,    Sub-Type: N/A
        Offset: 89, Length: 5,  Score: 0.8

        Name: Altair 8800,      Type: Other,    Sub-Type: N/A
        Offset: 116, Length: 11,        Score: 0.8

Document ID: 2
        Name: Microsoft,        Type: Organization,     Sub-Type: N/A
        Offset: 21, Length: 9,  Score: 0.999755859375

        Name: Redmond (Washington),     Type: Location, Sub-Type: N/A
        Offset: 60, Length: 7,  Score: 0.9911284446716309

        Name: 21 kilómetros,    Type: Quantity, Sub-Type: Dimension
        Offset: 71, Length: 13, Score: 0.8

        Name: Seattle,  Type: Location, Sub-Type: N/A
        Offset: 88, Length: 7,  Score: 0.9998779296875

Key phrase extraction

In the client object, create a function called ExtractKeyPhrases() that takes a list of input documents that will be created later. Call the client's key_phrases() function and get the result. Then iterate through the results, and print each document's ID, and the extracted key phrases.

def ExtractKeyPhrases(inputDocuments)
  result = @textAnalyticsClient.key_phrases(
      multi_language_batch_input: inputDocuments
  )

  if (!result.nil? && !result.documents.nil? && result.documents.length > 0)
    puts '===== KEY PHRASE EXTRACTION ====='
    result.documents.each do |document|
      puts "Document Id: #{document.id}"
      puts '  Key Phrases'
      document.key_phrases.each do |key_phrase|
        puts "    #{key_phrase}"
      end
    end
  else
    puts 'No results data..'
  end
  puts ''
end

Outside of the client function, create a new function called KeyPhraseExtractionExample() that takes the TextAnalyticsClient object created earlier. Create a list of MultiLanguageInput objects, containing the documents you want to analyze. Each object will contain an id, a language, and a text attribute. The text attribute stores the text to be analyzed, language is the language of the text, and the id can be any value. Then call the client's ExtractKeyPhrases() function.

def KeyPhraseExtractionExample(client)
  # The documents to be analyzed.
  input_1 = MultiLanguageInput.new
  input_1.id = '1'
  input_1.language = 'ja'
  input_1.text = '猫は幸せ'

  input_2 = MultiLanguageInput.new
  input_2.id = '2'
  input_2.language = 'de'
  input_2.text = 'Fahrt nach Stuttgart und dann zum Hotel zu Fu.'

  input_3 = MultiLanguageInput.new
  input_3.id = '3'
  input_3.language = 'en'
  input_3.text = 'My cat is stiff as a rock.'

  input_4 = MultiLanguageInput.new
  input_4.id = '4'
  input_4.language = 'es'
  input_4.text = 'A mi me encanta el fútbol!'

  input_documents =  MultiLanguageBatchInput.new
  input_documents.documents = [input_1, input_2, input_3, input_4]

  client.ExtractKeyPhrases(input_documents)
end

Call the KeyPhraseExtractionExample() function.

KeyPhraseExtractionExample(client)

Output

Document ID: 1
         Key phrases:
                幸せ
Document ID: 2
         Key phrases:
                Stuttgart
                Hotel
                Fahrt
                Fu
Document ID: 3
         Key phrases:
                cat
                rock
Document ID: 4
         Key phrases:
                fútbol

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps