
Quickstart: Use the Text Analytics client library

Get started with the Text Analytics client library. Follow these steps to install the package and try out the example code for basic tasks.

Use the Text Analytics client library to perform:

  • Sentiment analysis
  • Language detection
  • Entity recognition
  • Key phrase extraction

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and unsecured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below.

Prerequisites

  • Azure subscription - Create one for free
  • The Visual Studio IDE
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new .NET Core application

Using the Visual Studio IDE, create a new .NET Core console app. This will create a "Hello World" project with a single C# source file: program.cs.

Install the client library by right-clicking on the solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens, select Browse and search for Azure.AI.TextAnalytics. Select version 5.0.0, and then Install. You can also use the Package Manager Console.

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Open the program.cs file and add the following using directives:

using Azure;
using System;
using System.Globalization;
using Azure.AI.TextAnalytics;

In the application's Program class, create variables for your resource's key and endpoint.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to resource button under Next Steps. You can find your key and endpoint on the resource's Keys and Endpoint page, under Resource Management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

private static readonly AzureKeyCredential credentials = new AzureKeyCredential("<replace-with-your-text-analytics-key-here>");
private static readonly Uri endpoint = new Uri("<replace-with-your-text-analytics-endpoint-here>");

Replace the application's Main method. You will define the methods called here later.

static void Main(string[] args)
{
    var client = new TextAnalyticsClient(endpoint, credentials);
    // You will implement these methods later in the quickstart.
    SentimentAnalysisExample(client);
    LanguageDetectionExample(client);
    EntityRecognitionExample(client);
    EntityLinkingExample(client);
    KeyPhraseExtractionExample(client);

    Console.Write("Press any key to exit.");
    Console.ReadKey();
}

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key, and provides functions to accept text as single strings or as a batch. You can send text to the API synchronously, or asynchronously. The response object will contain the analysis information for each document you send.

If you're using version 3.0 of the service, you can use an optional TextAnalyticsClientOptions instance to initialize the client with various default settings (for example, a default language or country/region hint). You can also authenticate using an Azure Active Directory token.

Code examples

Authenticate the client

Make sure your Main method from earlier creates a new client object with your endpoint and credentials.

var client = new TextAnalyticsClient(endpoint, credentials);

Sentiment analysis

Create a new function called SentimentAnalysisExample() that takes the client that you created earlier, and call its AnalyzeSentiment() function. The returned Response&lt;DocumentSentiment&gt; object will contain the sentiment label and score of the entire input document, as well as a sentiment analysis for each sentence, if successful. If there was an error, it will throw a RequestFailedException.

static void SentimentAnalysisExample(TextAnalyticsClient client)
{
    string inputText = "I had the best day of my life. I wish you were there with me.";
    DocumentSentiment documentSentiment = client.AnalyzeSentiment(inputText);
    Console.WriteLine($"Document sentiment: {documentSentiment.Sentiment}\n");

    foreach (var sentence in documentSentiment.Sentences)
    {
        Console.WriteLine($"\tText: \"{sentence.Text}\"");
        Console.WriteLine($"\tSentence sentiment: {sentence.Sentiment}");
        Console.WriteLine($"\tPositive score: {sentence.ConfidenceScores.Positive:0.00}");
        Console.WriteLine($"\tNegative score: {sentence.ConfidenceScores.Negative:0.00}");
        Console.WriteLine($"\tNeutral score: {sentence.ConfidenceScores.Neutral:0.00}\n");
    }
}

Output

Document sentiment: Positive

        Text: "I had the best day of my life."
        Sentence sentiment: Positive
        Positive score: 1.00
        Negative score: 0.00
        Neutral score: 0.00

        Text: "I wish you were there with me."
        Sentence sentiment: Neutral
        Positive score: 0.21
        Negative score: 0.02
        Neutral score: 0.77

Language detection

Create a new function called LanguageDetectionExample() that takes the client that you created earlier, and call its DetectLanguage() function. The returned Response&lt;DetectedLanguage&gt; object will contain the detected language along with its name and ISO 639-1 code. If there was an error, it will throw a RequestFailedException.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the countryHint parameter to specify a two-letter country/region code. By default, the API uses "US" as the countryHint; to remove this behavior, reset the parameter by setting it to an empty string (countryHint = ""). To set a different default, set the TextAnalyticsClientOptions.DefaultCountryHint property and pass it during the client's initialization.

static void LanguageDetectionExample(TextAnalyticsClient client)
{
    DetectedLanguage detectedLanguage = client.DetectLanguage("Ce document est rédigé en Français.");
    Console.WriteLine("Language:");
    Console.WriteLine($"\t{detectedLanguage.Name},\tISO-6391: {detectedLanguage.Iso6391Name}\n");
}

Output

Language:
        French, ISO-6391: fr

Named Entity Recognition (NER)

Note

New in version 3.0:

  • Entity linking is now separate from entity recognition.

Create a new function called EntityRecognitionExample() that takes the client that you created earlier, call its RecognizeEntities() function, and iterate through the results. The returned Response&lt;IReadOnlyCollection&lt;CategorizedEntity&gt;&gt; object will contain the list of detected entities. If there was an error, it will throw a RequestFailedException.

static void EntityRecognitionExample(TextAnalyticsClient client)
{
    var response = client.RecognizeEntities("I had a wonderful trip to Seattle last week.");
    Console.WriteLine("Named Entities:");
    foreach (var entity in response.Value)
    {
        Console.WriteLine($"\tText: {entity.Text},\tCategory: {entity.Category},\tSub-Category: {entity.SubCategory}");
        Console.WriteLine($"\t\tScore: {entity.ConfidenceScore:F2}\n");
    }
}

Output

Named Entities:
        Text: trip,     Category: Event,        Sub-Category:
                Score: 0.61

        Text: Seattle,  Category: Location,     Sub-Category: GPE
                Score: 0.82

        Text: last week,        Category: DateTime,     Sub-Category: DateRange
                Score: 0.80

Entity linking

Create a new function called EntityLinkingExample() that takes the client that you created earlier, call its RecognizeLinkedEntities() function, and iterate through the results. The returned Response&lt;IReadOnlyCollection&lt;LinkedEntity&gt;&gt; represents the list of detected entities. If there was an error, it will throw a RequestFailedException. Since linked entities are uniquely identified, occurrences of the same entity are grouped under a LinkedEntity object as a list of LinkedEntityMatch objects.

static void EntityLinkingExample(TextAnalyticsClient client)
{
    var response = client.RecognizeLinkedEntities(
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, " +
        "to develop and sell BASIC interpreters for the Altair 8800. " +
        "During his career at Microsoft, Gates held the positions of chairman, " +
        "chief executive officer, president and chief software architect, " +
        "while also being the largest individual shareholder until May 2014.");
    Console.WriteLine("Linked Entities:");
    foreach (var entity in response.Value)
    {
        Console.WriteLine($"\tName: {entity.Name},\tID: {entity.DataSourceEntityId},\tURL: {entity.Url}\tData Source: {entity.DataSource}");
        Console.WriteLine("\tMatches:");
        foreach (var match in entity.Matches)
        {
            Console.WriteLine($"\t\tText: {match.Text}");
            Console.WriteLine($"\t\tScore: {match.ConfidenceScore:F2}\n");
        }
    }
}

Output

Linked Entities:
        Name: Altair 8800,      ID: Altair 8800,        URL: https://en.wikipedia.org/wiki/Altair_8800  Data Source: Wikipedia
        Matches:
                Text: Altair 8800
                Score: 0.88

        Name: Bill Gates,       ID: Bill Gates, URL: https://en.wikipedia.org/wiki/Bill_Gates   Data Source: Wikipedia
        Matches:
                Text: Bill Gates
                Score: 0.63

                Text: Gates
                Score: 0.63

        Name: Paul Allen,       ID: Paul Allen, URL: https://en.wikipedia.org/wiki/Paul_Allen   Data Source: Wikipedia
        Matches:
                Text: Paul Allen
                Score: 0.60

        Name: Microsoft,        ID: Microsoft,  URL: https://en.wikipedia.org/wiki/Microsoft    Data Source: Wikipedia
        Matches:
                Text: Microsoft
                Score: 0.55

                Text: Microsoft
                Score: 0.55

        Name: April 4,  ID: April 4,    URL: https://en.wikipedia.org/wiki/April_4      Data Source: Wikipedia
        Matches:
                Text: April 4
                Score: 0.32

        Name: BASIC,    ID: BASIC,      URL: https://en.wikipedia.org/wiki/BASIC        Data Source: Wikipedia
        Matches:
                Text: BASIC
                Score: 0.33

Key phrase extraction

Create a new function called KeyPhraseExtractionExample() that takes the client that you created earlier, and call its ExtractKeyPhrases() function. The returned Response&lt;IReadOnlyCollection&lt;string&gt;&gt; object will contain the list of detected key phrases. If there was an error, it will throw a RequestFailedException.

static void KeyPhraseExtractionExample(TextAnalyticsClient client)
{
    var response = client.ExtractKeyPhrases("My cat might need to see a veterinarian.");

    // Printing key phrases
    Console.WriteLine("Key phrases:");

    foreach (string keyphrase in response.Value)
    {
        Console.WriteLine($"\t{keyphrase}");
    }
}

Output

Key phrases:
    cat
    veterinarian

Important

  • The latest stable version of the Text Analytics API is 3.0.
  • The code in this article uses synchronous methods and unsecured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below.

Reference documentation | Library source code | Package | Samples

Prerequisites

  • Azure subscription - Create one for free
  • Java Development Kit (JDK) version 8 or above
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Add the client library

Create a Maven project in your preferred IDE or development environment. Then add the following dependency to your project's pom.xml file. You can find the implementation syntax for other build tools online.

<dependencies>
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-ai-textanalytics</artifactId>
        <version>5.0.0</version>
    </dependency>
</dependencies>

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a Java file named TextAnalyticsSamples.java. Open the file and add the following import statements:

import com.azure.core.credential.AzureKeyCredential;
import com.azure.ai.textanalytics.models.*;
import com.azure.ai.textanalytics.TextAnalyticsClientBuilder;
import com.azure.ai.textanalytics.TextAnalyticsClient;

In the Java file, add a new class and add your Azure resource's key and endpoint as shown below.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to resource button under Next Steps. You can find your key and endpoint on the resource's Keys and Endpoint page, under Resource Management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

public class TextAnalyticsSamples {
    private static String KEY = "<replace-with-your-text-analytics-key-here>";
    private static String ENDPOINT = "<replace-with-your-text-analytics-endpoint-here>";
}

Add the following main method to the class. You will define the methods called here later.

public static void main(String[] args) {
    // You will create these methods later in the quickstart.
    TextAnalyticsClient client = authenticateClient(KEY, ENDPOINT);

    sentimentAnalysisExample(client);
    detectLanguageExample(client);
    recognizeEntitiesExample(client);
    recognizeLinkedEntitiesExample(client);
    extractKeyPhrasesExample(client);
}

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key, and provides functions to accept text as single strings or as a batch. You can send text to the API synchronously, or asynchronously. The response object will contain the analysis information for each document you send.

Code examples

Authenticate the client

Create a method to instantiate the TextAnalyticsClient object with the key and endpoint for your Text Analytics resource.

static TextAnalyticsClient authenticateClient(String key, String endpoint) {
    return new TextAnalyticsClientBuilder()
        .credential(new AzureKeyCredential(key))
        .endpoint(endpoint)
        .buildClient();
}

In your program's main() method, call the authentication method to instantiate the client.

Sentiment analysis

Create a new function called sentimentAnalysisExample() that takes the client that you created earlier, and call its analyzeSentiment() function. The returned AnalyzeSentimentResult object will contain documentSentiment and sentenceSentiments if successful, or an errorMessage if not.

static void sentimentAnalysisExample(TextAnalyticsClient client)
{
    // The text that needs to be analyzed.
    String text = "I had the best day of my life. I wish you were there with me.";

    DocumentSentiment documentSentiment = client.analyzeSentiment(text);
    System.out.printf(
        "Recognized document sentiment: %s, positive score: %s, neutral score: %s, negative score: %s.%n",
        documentSentiment.getSentiment(),
        documentSentiment.getConfidenceScores().getPositive(),
        documentSentiment.getConfidenceScores().getNeutral(),
        documentSentiment.getConfidenceScores().getNegative());

    for (SentenceSentiment sentenceSentiment : documentSentiment.getSentences()) {
        System.out.printf(
            "Recognized sentence sentiment: %s, positive score: %s, neutral score: %s, negative score: %s.%n",
            sentenceSentiment.getSentiment(),
            sentenceSentiment.getConfidenceScores().getPositive(),
            sentenceSentiment.getConfidenceScores().getNeutral(),
            sentenceSentiment.getConfidenceScores().getNegative());
    }
}

Output

Recognized document sentiment: positive, positive score: 1.0, neutral score: 0.0, negative score: 0.0.
Recognized sentence sentiment: positive, positive score: 1.0, neutral score: 0.0, negative score: 0.0.
Recognized sentence sentiment: neutral, positive score: 0.21, neutral score: 0.77, negative score: 0.02.

Language detection

Create a new function called detectLanguageExample() that takes the client that you created earlier, and call its detectLanguage() function. The returned DetectLanguageResult object will contain the primary language detected and a list of other detected languages if successful, or an errorMessage if not.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the countryHint parameter to specify a two-letter country/region code. By default, the API uses "US" as the countryHint; to remove this behavior, reset the parameter by setting it to an empty string (countryHint = ""). To set a different default, set the TextAnalyticsClientOptions.DefaultCountryHint property and pass it during the client's initialization.

static void detectLanguageExample(TextAnalyticsClient client)
{
    // The text that needs to be analyzed.
    String text = "Ce document est rédigé en Français.";

    DetectedLanguage detectedLanguage = client.detectLanguage(text);
    System.out.printf("Detected primary language: %s, ISO 6391 name: %s, score: %.2f.%n",
        detectedLanguage.getName(),
        detectedLanguage.getIso6391Name(),
        detectedLanguage.getConfidenceScore());
}

Output

Detected primary language: French, ISO 6391 name: fr, score: 1.00.

Named Entity Recognition (NER)

Note

In version 3.0:

  • NER includes separate methods for detecting personal information.
  • Entity linking is a separate request from NER.

Create a new function called recognizeEntitiesExample() that takes the client that you created earlier, and call its recognizeEntities() function. The returned result will contain a list of CategorizedEntity if successful, or an errorMessage if not.

static void recognizeEntitiesExample(TextAnalyticsClient client)
{
    // The text that needs to be analyzed.
    String text = "I had a wonderful trip to Seattle last week.";

    for (CategorizedEntity entity : client.recognizeEntities(text)) {
        System.out.printf(
            "Recognized entity: %s, entity category: %s, entity sub-category: %s, score: %s.%n",
            entity.getText(),
            entity.getCategory(),
            entity.getSubcategory(),
            entity.getConfidenceScore());
    }
}

Output

Recognized entity: trip, entity category: Event, entity sub-category: null, score: 0.61.
Recognized entity: Seattle, entity category: Location, entity sub-category: GPE, score: 0.82.
Recognized entity: last week, entity category: DateTime, entity sub-category: DateRange, score: 0.8.

Entity linking

Create a new function called recognizeLinkedEntitiesExample() that takes the client that you created earlier, and call its recognizeLinkedEntities() function. The returned RecognizeLinkedEntitiesResult object will contain a list of LinkedEntity if successful, or an errorMessage if not. Since linked entities are uniquely identified, occurrences of the same entity are grouped under a LinkedEntity object as a list of LinkedEntityMatch objects.

static void recognizeLinkedEntitiesExample(TextAnalyticsClient client)
{
    // The text that needs to be analyzed.
    String text = "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, " +
        "to develop and sell BASIC interpreters for the Altair 8800. " +
        "During his career at Microsoft, Gates held the positions of chairman, " +
        "chief executive officer, president and chief software architect, " +
        "while also being the largest individual shareholder until May 2014.";

    System.out.printf("Linked Entities:%n");
    for (LinkedEntity linkedEntity : client.recognizeLinkedEntities(text)) {
        System.out.printf("Name: %s, ID: %s, URL: %s, Data Source: %s.%n",
            linkedEntity.getName(),
            linkedEntity.getDataSourceEntityId(),
            linkedEntity.getUrl(),
            linkedEntity.getDataSource());
        System.out.printf("Matches:%n");
        for (LinkedEntityMatch linkedEntityMatch : linkedEntity.getMatches()) {
            System.out.printf("Text: %s, Score: %.2f%n",
            linkedEntityMatch.getText(),
            linkedEntityMatch.getConfidenceScore());
        }
    }
}

Output

Linked Entities:
Name: Altair 8800, ID: Altair 8800, URL: https://en.wikipedia.org/wiki/Altair_8800, Data Source: Wikipedia.
Matches:
Text: Altair 8800, Score: 0.88
Name: Bill Gates, ID: Bill Gates, URL: https://en.wikipedia.org/wiki/Bill_Gates, Data Source: Wikipedia.
Matches:
Text: Bill Gates, Score: 0.63
Text: Gates, Score: 0.63
Name: Paul Allen, ID: Paul Allen, URL: https://en.wikipedia.org/wiki/Paul_Allen, Data Source: Wikipedia.
Matches:
Text: Paul Allen, Score: 0.60
Name: Microsoft, ID: Microsoft, URL: https://en.wikipedia.org/wiki/Microsoft, Data Source: Wikipedia.
Matches:
Text: Microsoft, Score: 0.55
Text: Microsoft, Score: 0.55
Name: April 4, ID: April 4, URL: https://en.wikipedia.org/wiki/April_4, Data Source: Wikipedia.
Matches:
Text: April 4, Score: 0.32
Name: BASIC, ID: BASIC, URL: https://en.wikipedia.org/wiki/BASIC, Data Source: Wikipedia.
Matches:
Text: BASIC, Score: 0.33

Key phrase extraction

Create a new function called extractKeyPhrasesExample() that takes the client that you created earlier, and call its extractKeyPhrases() function. The returned ExtractKeyPhraseResult object will contain a list of key phrases if successful, or an errorMessage if not.

static void extractKeyPhrasesExample(TextAnalyticsClient client)
{
    // The text that needs to be analyzed.
    String text = "My cat might need to see a veterinarian.";

    System.out.printf("Recognized phrases: %n");
    for (String keyPhrase : client.extractKeyPhrases(text)) {
        System.out.printf("%s%n", keyPhrase);
    }
}

Output

Recognized phrases: 
cat
veterinarian

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and unsecured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below.
  • You can also run this version of the Text Analytics client library in your browser.

Prerequisites

  • Azure subscription - Create one for free
  • The current version of Node.js
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp
cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Install the client library

Install the @azure/ai-text-analytics NPM package:

npm install --save @azure/ai-text-analytics@5.0.0

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Your app's package.json file will be updated with the dependencies. Create a file named index.js and add the following:

"use strict";

const { TextAnalyticsClient, AzureKeyCredential } = require("@azure/ai-text-analytics");

Create variables for your resource's Azure endpoint and key.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to resource button under Next Steps. You can find your key and endpoint on the resource's Keys and Endpoint page, under Resource Management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

const key = '<paste-your-text-analytics-key-here>';
const endpoint = '<paste-your-text-analytics-endpoint-here>';

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or as a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes, depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.
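As a minimal sketch of the document-list shape described above (the makeDocuments helper is hypothetical, not part of the SDK), a batch could be built like this:

```javascript
// Hypothetical helper (not part of the @azure/ai-text-analytics SDK) that
// builds the documents list described above: each document is an object
// with an id, the text to analyze, and a language code.
function makeDocuments(texts, language) {
    return texts.map((text, index) => ({
        id: String(index + 1), // the id can be any unique value
        text: text,            // the text to be analyzed
        language: language     // e.g. "en" or "fr"
    }));
}

const documents = makeDocuments(
    ["I had the best day of my life.", "I wish you were there with me."],
    "en"
);
console.log(documents.length); // prints: 2
```

Batched methods in this library generally accept either a plain array of strings or an array of document objects shaped like these.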

Code examples

Client authentication

Create a new TextAnalyticsClient object with your key and endpoint as parameters.

const textAnalyticsClient = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));

Sentiment analysis

Create an array of strings containing the document you want to analyze. Call the client's analyzeSentiment() method and get the returned SentimentBatchResult object. Iterate through the list of results, and print each document's ID and document-level sentiment with confidence scores. For each document, the result also contains sentence-level sentiment along with offsets, lengths, and confidence scores.

async function sentimentAnalysis(client){

    const sentimentInput = [
        "I had the best day of my life. I wish you were there with me."
    ];
    const sentimentResult = await client.analyzeSentiment(sentimentInput);

    sentimentResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tDocument Sentiment: ${document.sentiment}`);
        console.log(`\tDocument Scores:`);
        console.log(`\t\tPositive: ${document.confidenceScores.positive.toFixed(2)} \tNegative: ${document.confidenceScores.negative.toFixed(2)} \tNeutral: ${document.confidenceScores.neutral.toFixed(2)}`);
        console.log(`\tSentences Sentiment(${document.sentences.length}):`);
        document.sentences.forEach(sentence => {
            console.log(`\t\tSentence sentiment: ${sentence.sentiment}`)
            console.log(`\t\tSentences Scores:`);
            console.log(`\t\tPositive: ${sentence.confidenceScores.positive.toFixed(2)} \tNegative: ${sentence.confidenceScores.negative.toFixed(2)} \tNeutral: ${sentence.confidenceScores.neutral.toFixed(2)}`);
        });
    });
}
sentimentAnalysis(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 0
        Document Sentiment: positive
        Document Scores:
                Positive: 1.00  Negative: 0.00  Neutral: 0.00
        Sentences Sentiment(2):
                Sentence sentiment: positive
                Sentences Scores:
                Positive: 1.00  Negative: 0.00  Neutral: 0.00
                Sentence sentiment: neutral
                Sentences Scores:
                Positive: 0.21  Negative: 0.02  Neutral: 0.77

Language detection

Create an array of strings containing the document you want to analyze. Call the client's detectLanguage() method and get the returned DetectLanguageResultCollection. Then iterate through the results, and print each document's ID along with its primary language.

async function languageDetection(client) {

    const languageInputArray = [
        "Ce document est rédigé en Français."
    ];
    const languageResult = await client.detectLanguage(languageInputArray);

    languageResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tPrimary Language ${document.primaryLanguage.name}`)
    });
}
languageDetection(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 0
        Primary Language French

Named Entity Recognition (NER)

Note

In version 3.0:

  • Entity linking is a separate request from NER.

Create an array of strings containing the document you want to analyze. Call the client's recognizeEntities() method and get the RecognizeEntitiesResult object. Iterate through the list of results, and print the entity name, category, subcategory, offset, length, and score.

async function entityRecognition(client){

    const entityInputs = [
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800",
        "La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle."
    ];
    const entityResults = await client.recognizeEntities(entityInputs);

    entityResults.forEach(document => {
        console.log(`Document ID: ${document.id}`);
        document.entities.forEach(entity => {
            console.log(`\tName: ${entity.text} \tCategory: ${entity.category} \tSubcategory: ${entity.subCategory ? entity.subCategory : "N/A"}`);
            console.log(`\tScore: ${entity.confidenceScore}`);
        });
    });
}
entityRecognition(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

Document ID: 0
        Name: Microsoft         Category: Organization  Subcategory: N/A
        Score: 0.29
        Name: Bill Gates        Category: Person        Subcategory: N/A
        Score: 0.78
        Name: Paul Allen        Category: Person        Subcategory: N/A
        Score: 0.82
        Name: April 4, 1975     Category: DateTime      Subcategory: Date
        Score: 0.8
        Name: 8800      Category: Quantity      Subcategory: Number
        Score: 0.8
Document ID: 1
        Name: 21        Category: Quantity      Subcategory: Number
        Score: 0.8
        Name: Seattle   Category: Location      Subcategory: GPE
        Score: 0.25

Entity linking

Create an array of strings containing the document you want to analyze. Call the client's recognizeLinkedEntities() method and get the RecognizeLinkedEntitiesResult object. Iterate through the list of results, and print the entity name, ID, data source, URL, and matches. Every object in the matches array contains the offset, length, and score for that match.

async function linkedEntityRecognition(client){

    const linkedEntityInput = [
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014."
    ];
    const entityResults = await client.recognizeLinkedEntities(linkedEntityInput);

    entityResults.forEach(document => {
        console.log(`Document ID: ${document.id}`);
        document.entities.forEach(entity => {
            console.log(`\tName: ${entity.name} \tID: ${entity.dataSourceEntityId} \tURL: ${entity.url} \tData Source: ${entity.dataSource}`);
            console.log(`\tMatches:`)
            entity.matches.forEach(match => {
                console.log(`\t\tText: ${match.text} \tScore: ${match.confidenceScore.toFixed(2)}`);
        })
        });
    });
}
linkedEntityRecognition(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

Document ID: 0
        Name: Altair 8800       ID: Altair 8800         URL: https://en.wikipedia.org/wiki/Altair_8800  Data Source: Wikipedia
        Matches:
                Text: Altair 8800       Score: 0.88
        Name: Bill Gates        ID: Bill Gates  URL: https://en.wikipedia.org/wiki/Bill_Gates   Data Source: Wikipedia
        Matches:
                Text: Bill Gates        Score: 0.63
                Text: Gates     Score: 0.63
        Name: Paul Allen        ID: Paul Allen  URL: https://en.wikipedia.org/wiki/Paul_Allen   Data Source: Wikipedia
        Matches:
                Text: Paul Allen        Score: 0.60
        Name: Microsoft         ID: Microsoft   URL: https://en.wikipedia.org/wiki/Microsoft    Data Source: Wikipedia
        Matches:
                Text: Microsoft         Score: 0.55
                Text: Microsoft         Score: 0.55
        Name: April 4   ID: April 4     URL: https://en.wikipedia.org/wiki/April_4      Data Source: Wikipedia
        Matches:
                Text: April 4   Score: 0.32
        Name: BASIC     ID: BASIC       URL: https://en.wikipedia.org/wiki/BASIC        Data Source: Wikipedia
        Matches:
                Text: BASIC     Score: 0.33

Key phrase extraction

Create an array of strings containing the document you want to analyze. Call the client's extractKeyPhrases() method and get the returned ExtractKeyPhrasesResult object. Iterate through the results, and print each document's ID and any detected key phrases.

async function keyPhraseExtraction(client){

    const keyPhrasesInput = [
        "My cat might need to see a veterinarian.",
    ];
    const keyPhraseResult = await client.extractKeyPhrases(keyPhrasesInput);
    
    keyPhraseResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tDocument Key Phrases: ${document.keyPhrases}`);
    });
}
keyPhraseExtraction(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 0
        Document Key Phrases: cat,veterinarian

Run the application

Run the application with the node command on your quickstart file.

node index.js

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and unsecured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below.

Prerequisites

  • Azure subscription - Create one for free
  • Python 3.x
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Install the client library

After installing Python, you can install the client library with:

pip install --upgrade azure-ai-textanalytics

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new Python application

Create a new Python file and create variables for your resource's Azure endpoint and subscription key.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to resource button under Next Steps. You can find your key and endpoint on the resource's Key and endpoint page, under Resource management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

key = "<paste-your-text-analytics-key-here>"
endpoint = "<paste-your-text-analytics-endpoint-here>"

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key. The client provides several methods for analyzing text as a batch.

When batch processing, text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes, depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.
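As a data-only sketch of the document shapes described above (no service call is made here), a batch can be passed either as plain strings or as explicit dictionaries carrying the id, text, and language attributes. The exact values below are illustrative, not output from the service.

```python
# Sketch of the two accepted input shapes (data only, no service call).
# Plain strings: the service assigns each document an id based on its position.
docs_as_strings = [
    "I had the best day of my life.",
    "Ce document est rédigé en Français.",
]

# Explicit dictionaries: id, text, and (optionally) language per document.
docs_as_dicts = [
    {"id": "1", "text": "I had the best day of my life.", "language": "en"},
    {"id": "2", "text": "Ce document est rédigé en Français.", "language": "fr"},
]

# Either list could then be passed to a method such as client.analyze_sentiment(...).
for doc in docs_as_dicts:
    # Every dictionary document needs at least an id and the text to analyze.
    assert {"id", "text"} <= doc.keys()
```

The dictionary form is useful when your documents are in mixed languages, since each document carries its own language attribute.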

Code examples

These code snippets show you how to do the following tasks with the Text Analytics client library for Python:

Authenticate the client

Create a function to instantiate the TextAnalyticsClient object with the key and endpoint created above. Then create a new client.

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

def authenticate_client():
    ta_credential = AzureKeyCredential(key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=endpoint, credential=ta_credential)
    return text_analytics_client

client = authenticate_client()

Sentiment analysis

Create a new function called sentiment_analysis_example() that takes the client as an argument, then calls the analyze_sentiment() function. The returned response object will contain the sentiment label and score of the entire input document, as well as a sentiment analysis for each sentence.

def sentiment_analysis_example(client):

    documents = ["I had the best day of my life. I wish you were there with me."]
    response = client.analyze_sentiment(documents = documents)[0]
    print("Document Sentiment: {}".format(response.sentiment))
    print("Overall scores: positive={0:.2f}; neutral={1:.2f}; negative={2:.2f} \n".format(
        response.confidence_scores.positive,
        response.confidence_scores.neutral,
        response.confidence_scores.negative,
    ))
    for idx, sentence in enumerate(response.sentences):
        print("Sentence: {}".format(sentence.text))
        print("Sentence {} sentiment: {}".format(idx+1, sentence.sentiment))
        print("Sentence score:\nPositive={0:.2f}\nNeutral={1:.2f}\nNegative={2:.2f}\n".format(
            sentence.confidence_scores.positive,
            sentence.confidence_scores.neutral,
            sentence.confidence_scores.negative,
        ))
          
sentiment_analysis_example(client)

Output

Document Sentiment: positive
Overall scores: positive=1.00; neutral=0.00; negative=0.00 

Sentence: I had the best day of my life.
Sentence 1 sentiment: positive
Sentence score:
Positive=1.00
Neutral=0.00
Negative=0.00

Sentence: I wish you were there with me.
Sentence 2 sentiment: neutral
Sentence score:
Positive=0.21
Neutral=0.77
Negative=0.02

Language detection

Create a new function called language_detection_example() that takes the client as an argument, then calls the detect_language() function. The returned response object will contain the detected language in primary_language if successful, and an error if not.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the country_hint parameter to specify a 2-letter country code. By default, the API uses "US" as the country hint; to remove this behavior, reset the parameter by setting it to an empty string: country_hint="".

def language_detection_example(client):
    try:
        documents = ["Ce document est rédigé en Français."]
        response = client.detect_language(documents = documents, country_hint = 'us')[0]
        print("Language: ", response.primary_language.name)

    except Exception as err:
        print("Encountered exception. {}".format(err))
language_detection_example(client)

Output

Language:  French
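To illustrate the country_hint tip above, the hint can also be attached per document when you use the dictionary input shape. This is a hedged, data-only sketch: the commented-out call assumes an authenticated `client`, and the values are illustrative.

```python
# Sketch: clearing the default "US" country hint for one document.
# An empty string removes the hint for that document.
documents = [
    {"id": "1", "text": "Ce document est rédigé en Français.", "country_hint": ""},
]

# With an authenticated client, the call would then look like:
# response = client.detect_language(documents)[0]
# print(response.primary_language.name)
assert documents[0]["country_hint"] == ""
```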

Named Entity Recognition (NER)

Note

In version 3.0:

  • Entity linking is a separate request from NER.

Create a new function called entity_recognition_example that takes the client as an argument, then calls the recognize_entities() function and iterates through the results. The returned response object will contain the list of detected entities in entities if successful, and an error if not. For each detected entity, print its category and subcategory, if it exists.

def entity_recognition_example(client):

    try:
        documents = ["I had a wonderful trip to Seattle last week."]
        result = client.recognize_entities(documents = documents)[0]

        print("Named Entities:\n")
        for entity in result.entities:
            print("\tText: \t", entity.text, "\tCategory: \t", entity.category, "\tSubCategory: \t", entity.subcategory,
                    "\n\tConfidence Score: \t", round(entity.confidence_score, 2), "\n")

    except Exception as err:
        print("Encountered exception. {}".format(err))
entity_recognition_example(client)

Output

Named Entities:

        Text:    trip   Category:        Event  SubCategory:     None
        Confidence Score:        0.61

        Text:    Seattle        Category:        Location       SubCategory:     GPE
        Confidence Score:        0.82

        Text:    last week      Category:        DateTime       SubCategory:     DateRange
        Confidence Score:        0.8

Entity linking

Create a new function called entity_linking_example() that takes the client as an argument, then calls the recognize_linked_entities() function and iterates through the results. The returned response object will contain the list of detected entities in entities if successful, and an error if not. Since linked entities are uniquely identified, occurrences of the same entity are grouped under an entity object as a list of match objects.

def entity_linking_example(client):

    try:
        documents = ["""Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, 
        to develop and sell BASIC interpreters for the Altair 8800. 
        During his career at Microsoft, Gates held the positions of chairman,
        chief executive officer, president and chief software architect, 
        while also being the largest individual shareholder until May 2014."""]
        result = client.recognize_linked_entities(documents = documents)[0]

        print("Linked Entities:\n")
        for entity in result.entities:
            print("\tName: ", entity.name, "\tId: ", entity.data_source_entity_id, "\tUrl: ", entity.url,
            "\n\tData Source: ", entity.data_source)
            print("\tMatches:")
            for match in entity.matches:
                print("\t\tText:", match.text)
                print("\t\tConfidence Score: {0:.2f}".format(match.confidence_score))
            
    except Exception as err:
        print("Encountered exception. {}".format(err))
entity_linking_example(client)

Output

Linked Entities:

        Name:  Altair 8800      Id:  Altair 8800        Url:  https://en.wikipedia.org/wiki/Altair_8800
        Data Source:  Wikipedia
        Matches:
                Text: Altair 8800
                Confidence Score: 0.88
        Name:  Bill Gates       Id:  Bill Gates         Url:  https://en.wikipedia.org/wiki/Bill_Gates
        Data Source:  Wikipedia
        Matches:
                Text: Bill Gates
                Confidence Score: 0.63
                Text: Gates
                Confidence Score: 0.63
        Name:  Paul Allen       Id:  Paul Allen         Url:  https://en.wikipedia.org/wiki/Paul_Allen
        Data Source:  Wikipedia
        Matches:
                Text: Paul Allen
                Confidence Score: 0.60
        Name:  Microsoft        Id:  Microsoft  Url:  https://en.wikipedia.org/wiki/Microsoft
        Data Source:  Wikipedia
        Matches:
                Text: Microsoft
                Confidence Score: 0.55
                Text: Microsoft
                Confidence Score: 0.55
        Name:  April 4  Id:  April 4    Url:  https://en.wikipedia.org/wiki/April_4
        Data Source:  Wikipedia
        Matches:
                Text: April 4
                Confidence Score: 0.32
        Name:  BASIC    Id:  BASIC      Url:  https://en.wikipedia.org/wiki/BASIC
        Data Source:  Wikipedia
        Matches:
                Text: BASIC
                Confidence Score: 0.33
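The grouping shown in the output above can be pictured with plain data: one linked entity carries a list of match objects, one per occurrence in the text. The values below are taken from the Bill Gates entry in the output; the dictionary layout itself is an illustrative sketch, not the SDK's actual result type.

```python
# Illustrative shape of one linked entity with grouped matches
# (values copied from the sample output above).
linked_entity = {
    "name": "Bill Gates",
    "data_source": "Wikipedia",
    "matches": [
        {"text": "Bill Gates", "confidence_score": 0.63},
        {"text": "Gates", "confidence_score": 0.63},
    ],
}

# Both occurrences of the same person sit under a single entity object.
assert len(linked_entity["matches"]) == 2
```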

Key phrase extraction

Create a new function called key_phrase_extraction_example() that takes the client as an argument, then calls the extract_key_phrases() function. The result will contain the list of detected key phrases in key_phrases if successful, and an error if not. Print any detected key phrases.

def key_phrase_extraction_example(client):

    try:
        documents = ["My cat might need to see a veterinarian."]

        response = client.extract_key_phrases(documents = documents)[0]

        if not response.is_error:
            print("\tKey Phrases:")
            for phrase in response.key_phrases:
                print("\t\t", phrase)
        else:
            print(response.id, response.error)

    except Exception as err:
        print("Encountered exception. {}".format(err))
        
key_phrase_extraction_example(client)

Output

    Key Phrases:
         cat
         veterinarian

Additional language support

If you've clicked this tab, you probably didn't see a quickstart in your favorite programming language. Don't worry, we have additional quickstarts available. Use the table to find the right sample for your programming language.

Language    Available version
Ruby        Version 2.1
Go          Version 2.1

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps