Quickstart: Use the Text Analytics client library and REST API

Use this article to get started with the Text Analytics client library and REST API. Follow these steps to try out example code for mining text:

  • Sentiment analysis
  • Opinion mining
  • Language detection
  • Entity recognition
  • Personally Identifiable Information recognition
  • Key phrase extraction

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and un-secured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability; a brief sketch follows this list. See the reference documentation below.
  • If you want to use Text Analytics for health or asynchronous operations, see the examples on GitHub for C#, Python, or Java.
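
As a rough illustration of the batched asynchronous pattern recommended above, the following is a minimal sketch that calls AnalyzeSentimentBatchAsync() on the same 5.1.0-beta.3 .NET client used in this quickstart; treat the exact method shape as an assumption against your installed version. It also needs using directives for System.Collections.Generic and System.Threading.Tasks.

static async Task AnalyzeSentimentBatchExample(TextAnalyticsClient client)
{
    var documents = new List<string>
    {
        "I had the best day of my life.",
        "The food was terrible."
    };

    // One round trip analyzes the whole batch; results come back in input order.
    AnalyzeSentimentResultCollection results = await client.AnalyzeSentimentBatchAsync(documents);

    foreach (AnalyzeSentimentResult result in results)
    {
        // Each document can fail independently, so check before reading.
        if (result.HasError)
        {
            Console.WriteLine($"Error: {result.Error.Message}");
            continue;
        }
        Console.WriteLine($"Document sentiment: {result.DocumentSentiment.Sentiment}");
    }
}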

Prerequisites

  • Azure subscription - Create one for free
  • The Visual Studio IDE
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • To use the Analyze feature, you will need a Text Analytics resource with the standard (S) pricing tier.

Setting up

Create a new .NET Core application

Using the Visual Studio IDE, create a new .NET Core console app. This will create a "Hello World" project with a single C# source file: program.cs.

Install the client library by right-clicking on the solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens, select Browse and search for Azure.AI.TextAnalytics. Check the Include prerelease box, select version 5.1.0-beta.3, and then select Install. You can also use the Package Manager Console.
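
If you prefer the Package Manager Console, a command along these lines should install the same prerelease version (the exact syntax is an assumption; adjust the version to match the one above):

Install-Package Azure.AI.TextAnalytics -Version 5.1.0-beta.3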

Open the program.cs file and add the following using directives:

using Azure;
using System;
using System.Globalization;
using Azure.AI.TextAnalytics;

In the application's Program class, create variables for your resource's key and endpoint.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

private static readonly AzureKeyCredential credentials = new AzureKeyCredential("<replace-with-your-text-analytics-key-here>");
private static readonly Uri endpoint = new Uri("<replace-with-your-text-analytics-endpoint-here>");
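
If you'd rather not hard-code the key even temporarily, a minimal alternative is to read it from an environment variable. This is a sketch; TEXT_ANALYTICS_KEY is a variable name chosen here for illustration, not one the service requires:

// Assumes the key was exported as the TEXT_ANALYTICS_KEY environment variable.
private static readonly AzureKeyCredential credentials =
    new AzureKeyCredential(Environment.GetEnvironmentVariable("TEXT_ANALYTICS_KEY"));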

Replace the application's Main method. You will define the methods called here later.

static void Main(string[] args)
{
    var client = new TextAnalyticsClient(endpoint, credentials);
    // You will implement these methods later in the quickstart.
    SentimentAnalysisExample(client);
    SentimentAnalysisWithOpinionMiningExample(client);
    LanguageDetectionExample(client);
    EntityRecognitionExample(client);
    EntityLinkingExample(client);
    RecognizePIIExample(client);
    KeyPhraseExtractionExample(client);

    Console.Write("Press any key to exit.");
    Console.ReadKey();
}

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key, and provides functions to accept text as single strings or as a batch. You can send text to the API synchronously, or asynchronously. The response object will contain the analysis information for each document you send.

If you're using version 3.x of the service, you can use an optional TextAnalyticsClientOptions instance to initialize the client with various default settings (for example, default language or country/region hint). You can also authenticate using an Azure Active Directory token.
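
For example, a client initialized with default hints might look like the following minimal sketch, assuming the TextAnalyticsClientOptions properties in the 5.x .NET client:

var options = new TextAnalyticsClientOptions
{
    // Used when a request doesn't specify its own language or country hint.
    DefaultLanguage = "en",
    DefaultCountryHint = "us"
};
var client = new TextAnalyticsClient(endpoint, credentials, options);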

Code examples

Authenticate the client

Make sure your Main method from earlier creates a new client object with your endpoint and credentials.

var client = new TextAnalyticsClient(endpoint, credentials);

Sentiment analysis

Create a new function called SentimentAnalysisExample() that takes the client that you created earlier, and call its AnalyzeSentiment() function. The returned Response<DocumentSentiment> object will contain the sentiment label and score of the entire input document, as well as a sentiment analysis for each sentence, if successful. If there was an error, it will throw a RequestFailedException.

static void SentimentAnalysisExample(TextAnalyticsClient client)
{
    string inputText = "I had the best day of my life. I wish you were there with me.";
    DocumentSentiment documentSentiment = client.AnalyzeSentiment(inputText);
    Console.WriteLine($"Document sentiment: {documentSentiment.Sentiment}\n");

    foreach (var sentence in documentSentiment.Sentences)
    {
        Console.WriteLine($"\tText: \"{sentence.Text}\"");
        Console.WriteLine($"\tSentence sentiment: {sentence.Sentiment}");
        Console.WriteLine($"\tPositive score: {sentence.ConfidenceScores.Positive:0.00}");
        Console.WriteLine($"\tNegative score: {sentence.ConfidenceScores.Negative:0.00}");
        Console.WriteLine($"\tNeutral score: {sentence.ConfidenceScores.Neutral:0.00}\n");
    }
}

Output

Document sentiment: Positive

        Text: "I had the best day of my life."
        Sentence sentiment: Positive
        Positive score: 1.00
        Negative score: 0.00
        Neutral score: 0.00

        Text: "I wish you were there with me."
        Sentence sentiment: Neutral
        Positive score: 0.21
        Negative score: 0.02
        Neutral score: 0.77

Opinion mining

Create a new function called SentimentAnalysisWithOpinionMiningExample() that takes the client that you created earlier, and call its AnalyzeSentimentBatch() function with an AnalyzeSentimentOptions object whose IncludeOpinionMining property is set to true. The returned AnalyzeSentimentResultCollection object will contain a collection of AnalyzeSentimentResult objects, each of which represents a Response<DocumentSentiment>. The difference between SentimentAnalysis() and SentimentAnalysisWithOpinionMiningExample() is that the latter includes a MinedOpinion in each sentence, which shows an analyzed aspect and the related opinion(s). If there was an error, it will throw a RequestFailedException.

static void SentimentAnalysisWithOpinionMiningExample(TextAnalyticsClient client)
{
    var documents = new List<string>
    {
        "The food and service were unacceptable, but the concierge were nice."
    };

    AnalyzeSentimentResultCollection reviews = client.AnalyzeSentimentBatch(documents, options: new AnalyzeSentimentOptions()
    {
        IncludeOpinionMining = true
    });

    foreach (AnalyzeSentimentResult review in reviews)
    {
        Console.WriteLine($"Document sentiment: {review.DocumentSentiment.Sentiment}\n");
        Console.WriteLine($"\tPositive score: {review.DocumentSentiment.ConfidenceScores.Positive:0.00}");
        Console.WriteLine($"\tNegative score: {review.DocumentSentiment.ConfidenceScores.Negative:0.00}");
        Console.WriteLine($"\tNeutral score: {review.DocumentSentiment.ConfidenceScores.Neutral:0.00}\n");
        foreach (SentenceSentiment sentence in review.DocumentSentiment.Sentences)
        {
            Console.WriteLine($"\tText: \"{sentence.Text}\"");
            Console.WriteLine($"\tSentence sentiment: {sentence.Sentiment}");
            Console.WriteLine($"\tSentence positive score: {sentence.ConfidenceScores.Positive:0.00}");
            Console.WriteLine($"\tSentence negative score: {sentence.ConfidenceScores.Negative:0.00}");
            Console.WriteLine($"\tSentence neutral score: {sentence.ConfidenceScores.Neutral:0.00}\n");

            foreach (MinedOpinion minedOpinion in sentence.MinedOpinions)
            {
                Console.WriteLine($"\tAspect: {minedOpinion.Aspect.Text}, Value: {minedOpinion.Aspect.Sentiment}");
                Console.WriteLine($"\tAspect positive score: {minedOpinion.Aspect.ConfidenceScores.Positive:0.00}");
                Console.WriteLine($"\tAspect negative score: {minedOpinion.Aspect.ConfidenceScores.Negative:0.00}");
                foreach (OpinionSentiment opinion in minedOpinion.Opinions)
                {
                    Console.WriteLine($"\t\tRelated Opinion: {opinion.Text}, Value: {opinion.Sentiment}");
                    Console.WriteLine($"\t\tRelated Opinion positive score: {opinion.ConfidenceScores.Positive:0.00}");
                    Console.WriteLine($"\t\tRelated Opinion negative score: {opinion.ConfidenceScores.Negative:0.00}");
                }
            }
        }
        Console.WriteLine($"\n");
    }
}

Output

Document sentiment: Positive

        Positive score: 0.84
        Negative score: 0.16
        Neutral score: 0.00

        Text: "The food and service were unacceptable, but the concierge were nice."
        Sentence sentiment: Positive
        Sentence positive score: 0.84
        Sentence negative score: 0.16
        Sentence neutral score: 0.00

        Aspect: food, Value: Negative
        Aspect positive score: 0.01
        Aspect negative score: 0.99
                Related Opinion: unacceptable, Value: Negative
                Related Opinion positive score: 0.01
                Related Opinion negative score: 0.99
        Aspect: service, Value: Negative
        Aspect positive score: 0.01
        Aspect negative score: 0.99
                Related Opinion: unacceptable, Value: Negative
                Related Opinion positive score: 0.01
                Related Opinion negative score: 0.99
        Aspect: concierge, Value: Positive
        Aspect positive score: 1.00
        Aspect negative score: 0.00
                Related Opinion: nice, Value: Positive
                Related Opinion positive score: 1.00
                Related Opinion negative score: 0.00


Press any key to exit.

Language detection

Create a new function called LanguageDetectionExample() that takes the client that you created earlier, and call its DetectLanguage() function. The returned Response<DetectedLanguage> object will contain the detected language along with its name and ISO-6391 code. If there was an error, it will throw a RequestFailedException.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the countryHint parameter to specify a 2-letter country/region code. By default, the API uses "US" as the countryHint; to remove this behavior, you can reset the parameter by setting it to an empty string (countryHint = ""). To set a different default, set the TextAnalyticsClientOptions.DefaultCountryHint property and pass it during the client's initialization.
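
For example, clearing the hint for a single call might look like this minimal sketch, assuming the countryHint parameter on the DetectLanguage() overload:

// An empty country hint tells the service not to assume "US".
DetectedLanguage detectedLanguage = client.DetectLanguage("Ce document est rédigé en Français.", countryHint: "");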

static void LanguageDetectionExample(TextAnalyticsClient client)
{
    DetectedLanguage detectedLanguage = client.DetectLanguage("Ce document est rédigé en Français.");
    Console.WriteLine("Language:");
    Console.WriteLine($"\t{detectedLanguage.Name},\tISO-6391: {detectedLanguage.Iso6391Name}\n");
}

Output

Language:
        French, ISO-6391: fr

Named Entity Recognition (NER)

Create a new function called EntityRecognitionExample() that takes the client that you created earlier, call its RecognizeEntities() function, and iterate through the results. The returned Response<CategorizedEntityCollection> object will contain the collection of detected CategorizedEntity entities. If there was an error, it will throw a RequestFailedException.

static void EntityRecognitionExample(TextAnalyticsClient client)
{
    var response = client.RecognizeEntities("I had a wonderful trip to Seattle last week.");
    Console.WriteLine("Named Entities:");
    foreach (var entity in response.Value)
    {
        Console.WriteLine($"\tText: {entity.Text},\tCategory: {entity.Category},\tSub-Category: {entity.SubCategory}");
        Console.WriteLine($"\t\tScore: {entity.ConfidenceScore:F2},\tLength: {entity.Length},\tOffset: {entity.Offset}\n");
    }
}

Output

Named Entities:
        Text: trip,     Category: Event,        Sub-Category:
                Score: 0.61,    Length: 4,      Offset: 18

        Text: Seattle,  Category: Location,     Sub-Category: GPE
                Score: 0.82,    Length: 7,      Offset: 26

        Text: last week,        Category: DateTime,     Sub-Category: DateRange
                Score: 0.80,    Length: 9,      Offset: 34

Entity linking

Create a new function called EntityLinkingExample() that takes the client that you created earlier, call its RecognizeLinkedEntities() function, and iterate through the results. The returned Response<LinkedEntityCollection> object will contain the collection of detected LinkedEntity entities. If there was an error, it will throw a RequestFailedException. Since linked entities are uniquely identified, occurrences of the same entity are grouped under a LinkedEntity object as a list of LinkedEntityMatch objects.

static void EntityLinkingExample(TextAnalyticsClient client)
{
    var response = client.RecognizeLinkedEntities(
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, " +
        "to develop and sell BASIC interpreters for the Altair 8800. " +
        "During his career at Microsoft, Gates held the positions of chairman, " +
        "chief executive officer, president and chief software architect, " +
        "while also being the largest individual shareholder until May 2014.");
    Console.WriteLine("Linked Entities:");
    foreach (var entity in response.Value)
    {
        Console.WriteLine($"\tName: {entity.Name},\tID: {entity.DataSourceEntityId},\tURL: {entity.Url}\tData Source: {entity.DataSource}");
        Console.WriteLine("\tMatches:");
        foreach (var match in entity.Matches)
        {
            Console.WriteLine($"\t\tText: {match.Text}");
            Console.WriteLine($"\t\tScore: {match.ConfidenceScore:F2}");
            Console.WriteLine($"\t\tLength: {match.Length}");
            Console.WriteLine($"\t\tOffset: {match.Offset}\n");
        }
    }
}

Output

Linked Entities:
        Name: Microsoft,        ID: Microsoft,  URL: https://en.wikipedia.org/wiki/Microsoft    Data Source: Wikipedia
        Matches:
                Text: Microsoft
                Score: 0.55
                Length: 9
                Offset: 0

                Text: Microsoft
                Score: 0.55
                Length: 9
                Offset: 150

        Name: Bill Gates,       ID: Bill Gates, URL: https://en.wikipedia.org/wiki/Bill_Gates   Data Source: Wikipedia
        Matches:
                Text: Bill Gates
                Score: 0.63
                Length: 10
                Offset: 25

                Text: Gates
                Score: 0.63
                Length: 5
                Offset: 161

        Name: Paul Allen,       ID: Paul Allen, URL: https://en.wikipedia.org/wiki/Paul_Allen   Data Source: Wikipedia
        Matches:
                Text: Paul Allen
                Score: 0.60
                Length: 10
                Offset: 40

        Name: April 4,  ID: April 4,    URL: https://en.wikipedia.org/wiki/April_4      Data Source: Wikipedia
        Matches:
                Text: April 4
                Score: 0.32
                Length: 7
                Offset: 54

        Name: BASIC,    ID: BASIC,      URL: https://en.wikipedia.org/wiki/BASIC        Data Source: Wikipedia
        Matches:
                Text: BASIC
                Score: 0.33
                Length: 5
                Offset: 89

        Name: Altair 8800,      ID: Altair 8800,        URL: https://en.wikipedia.org/wiki/Altair_8800  Data Source: Wikipedia
        Matches:
                Text: Altair 8800
                Score: 0.88
                Length: 11
                Offset: 116

Personally Identifiable Information recognition

Create a new function called RecognizePIIExample() that takes the client that you created earlier, call its RecognizePiiEntities() function, and iterate through the results. The returned PiiEntityCollection represents the list of detected PII entities. If there was an error, it will throw a RequestFailedException.

static void RecognizePIIExample(TextAnalyticsClient client)
{
    string document = "A developer with SSN 859-98-0987 whose phone number is 800-102-1100 is building tools with our APIs.";

    PiiEntityCollection entities = client.RecognizePiiEntities(document).Value;

    Console.WriteLine($"Redacted Text: {entities.RedactedText}");
    if (entities.Count > 0)
    {
        Console.WriteLine($"Recognized {entities.Count} PII entit{(entities.Count > 1 ? "ies" : "y")}:");
        foreach (PiiEntity entity in entities)
        {
            Console.WriteLine($"Text: {entity.Text}, Category: {entity.Category}, SubCategory: {entity.SubCategory}, Confidence score: {entity.ConfidenceScore}");
        }
    }
    else
    {
        Console.WriteLine("No entities were found.");
    }
}

Output

Redacted Text: A developer with SSN *********** whose phone number is ************ is building tools with our APIs.
Recognized 2 PII entities:
Text: 859-98-0987, Category: U.S. Social Security Number (SSN), SubCategory: , Confidence score: 0.65
Text: 800-102-1100, Category: Phone Number, SubCategory: , Confidence score: 0.8

Key phrase extraction

Create a new function called KeyPhraseExtractionExample() that takes the client that you created earlier, and call its ExtractKeyPhrases() function. The returned Response<KeyPhraseCollection> object will contain the list of detected key phrases. If there was an error, it will throw a RequestFailedException.

static void KeyPhraseExtractionExample(TextAnalyticsClient client)
{
    var response = client.ExtractKeyPhrases("My cat might need to see a veterinarian.");

    // Printing key phrases
    Console.WriteLine("Key phrases:");

    foreach (string keyphrase in response.Value)
    {
        Console.WriteLine($"\t{keyphrase}");
    }
}

Output

Key phrases:
    cat
    veterinarian

Use the API asynchronously with the Analyze operation

Note

To use the Analyze operation, make sure your Azure resource is using the S standard pricing tier.

Create a new function called AnalyzeOperationExample() that takes the client that you created earlier, and call its StartAnalyzeOperationBatch() function. The returned AnalyzeOperation object will contain the Operation interface object for AnalyzeOperationResult. Because it is a long-running operation, await operation.WaitForCompletionAsync() for the value to be updated. Once WaitForCompletionAsync() finishes, the collection should be updated in operation.Value. If there was an error, it will throw a RequestFailedException.

static async Task AnalyzeOperationExample(TextAnalyticsClient client)
{
    string inputText = "Microsoft was founded by Bill Gates and Paul Allen.";

    var batchDocuments = new List<string> { inputText };

    AnalyzeOperationOptions operationOptions = new AnalyzeOperationOptions()
    {
        EntitiesTaskParameters = new EntitiesTaskParameters(),
        DisplayName = "Analyze Operation Quick Start Example"
    };

    AnalyzeOperation operation = client.StartAnalyzeOperationBatch(batchDocuments, operationOptions, "en");

    await operation.WaitForCompletionAsync();

    AnalyzeOperationResult resultCollection = operation.Value;

    RecognizeEntitiesResultCollection entitiesResult = resultCollection.Tasks.EntityRecognitionTasks[0].Results;

    Console.WriteLine("Analyze Operation Request Details");
    Console.WriteLine($"    Status: {resultCollection.Status}");
    Console.WriteLine($"    DisplayName: {resultCollection.DisplayName}");
    Console.WriteLine("");

    Console.WriteLine("Recognized Entities");

    foreach (RecognizeEntitiesResult result in entitiesResult)
    {
        Console.WriteLine($"    Recognized the following {result.Entities.Count} entities:");

        foreach (CategorizedEntity entity in result.Entities)
        {
            Console.WriteLine($"    Entity: {entity.Text}");
            Console.WriteLine($"    Category: {entity.Category}");
            Console.WriteLine($"    Offset: {entity.Offset}");
            Console.WriteLine($"    ConfidenceScore: {entity.ConfidenceScore}");
            Console.WriteLine($"    SubCategory: {entity.SubCategory}");
        }
        Console.WriteLine("");
    }
}

After you add this example to your application, call it in your Main() method using await.

await AnalyzeOperationExample(client).ConfigureAwait(false);
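
Because this call is awaited, the Main method that contains it must itself be asynchronous (supported since C# 7.1). A minimal sketch of the adjusted signature:

// Change Main from "static void Main" to an async entry point.
static async Task Main(string[] args)
{
    var client = new TextAnalyticsClient(endpoint, credentials);
    await AnalyzeOperationExample(client).ConfigureAwait(false);
}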

Output

Analyze Operation Request Details
    Status: succeeded
    DisplayName: Analyze Operation Quick Start Example

Recognized Entities
    Recognized the following 3 entities:
    Entity: Microsoft
    Category: Organization
    Offset: 0
    ConfidenceScore: 0.83
    SubCategory: 
    Entity: Bill Gates
    Category: Person
    Offset: 25
    ConfidenceScore: 0.85
    SubCategory: 
    Entity: Paul Allen
    Category: Person
    Offset: 40
    ConfidenceScore: 0.9
    SubCategory: 

You can also use the Analyze operation to detect PII and key phrase extraction. See the Analyze sample on GitHub.

Important

  • The latest stable version of the Text Analytics API is 3.0.
  • The code in this article uses synchronous methods and un-secured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below. If you want to use Text Analytics for health or asynchronous operations, see the examples on GitHub for C#, Python, or Java.

Prerequisites

  • Azure subscription - Create one for free
  • Java Development Kit (JDK) with version 8 or above
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • To use the Analyze feature, you will need a Text Analytics resource with the standard (S) pricing tier.

Setting up

Add the client library

Create a Maven project in your preferred IDE or development environment. Then add the following dependency to your project's pom.xml file. You can find the implementation syntax for other build tools online.

<dependencies>
     <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-ai-textanalytics</artifactId>
        <version>5.1.0-beta.3</version>
    </dependency>
</dependencies>

Create a Java file named TextAnalyticsSamples.java. Open the file and add the following import statements:

import com.azure.core.credential.AzureKeyCredential;
import com.azure.ai.textanalytics.models.*;
import com.azure.ai.textanalytics.TextAnalyticsClientBuilder;
import com.azure.ai.textanalytics.TextAnalyticsClient;

In the Java file, add a new class and add your Azure resource's key and endpoint as shown below.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

public class TextAnalyticsSamples {
    private static String KEY = "<replace-with-your-text-analytics-key-here>";
    private static String ENDPOINT = "<replace-with-your-text-analytics-endpoint-here>";
}

Add the following main method to the class. You will define the methods called here later.

public static void main(String[] args) {
    //You will create these methods later in the quickstart.
    TextAnalyticsClient client = authenticateClient(KEY, ENDPOINT);

    sentimentAnalysisExample(client);
    sentimentAnalysisWithOpinionMiningExample(client);
    detectLanguageExample(client);
    recognizeEntitiesExample(client);
    recognizeLinkedEntitiesExample(client);
    recognizePiiEntitiesExample(client);
    extractKeyPhrasesExample(client);
}

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key, and provides functions to accept text as single strings or as a batch. You can send text to the API synchronously, or asynchronously. The response object will contain the analysis information for each document you send.

Code examples

Authenticate the client

Create a method to instantiate the TextAnalyticsClient object with the key and endpoint for your Text Analytics resource. This example is the same for versions 3.0 and 3.1 of the API.

static TextAnalyticsClient authenticateClient(String key, String endpoint) {
    return new TextAnalyticsClientBuilder()
        .credential(new AzureKeyCredential(key))
        .endpoint(endpoint)
        .buildClient();
}

In your program's main() method, call the authentication method to instantiate the client.

Sentiment analysis

Note

In version 3.1:

  • Sentiment Analysis includes Opinion Mining analysis, which is an optional flag.
  • Opinion Mining contains aspect and opinion level sentiment.

Create a new function called sentimentAnalysisExample() that takes the client that you created earlier, and call its analyzeSentiment() function. The returned AnalyzeSentimentResult object will contain documentSentiment and sentenceSentiments if successful, or an errorMessage if not.

static void sentimentAnalysisExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String text = "I had the best day of my life. I wish you were there with me.";

    DocumentSentiment documentSentiment = client.analyzeSentiment(text);
    System.out.printf(
        "Recognized document sentiment: %s, positive score: %s, neutral score: %s, negative score: %s.%n",
        documentSentiment.getSentiment(),
        documentSentiment.getConfidenceScores().getPositive(),
        documentSentiment.getConfidenceScores().getNeutral(),
        documentSentiment.getConfidenceScores().getNegative());

    for (SentenceSentiment sentenceSentiment : documentSentiment.getSentences()) {
        System.out.printf(
            "Recognized sentence sentiment: %s, positive score: %s, neutral score: %s, negative score: %s.%n",
            sentenceSentiment.getSentiment(),
            sentenceSentiment.getConfidenceScores().getPositive(),
            sentenceSentiment.getConfidenceScores().getNeutral(),
            sentenceSentiment.getConfidenceScores().getNegative());
    }
}

Output

Recognized document sentiment: positive, positive score: 1.0, neutral score: 0.0, negative score: 0.0.
Recognized sentence sentiment: positive, positive score: 1.0, neutral score: 0.0, negative score: 0.0.
Recognized sentence sentiment: neutral, positive score: 0.21, neutral score: 0.77, negative score: 0.02.

Opinion mining

To perform sentiment analysis with opinion mining, create a new function called sentimentAnalysisWithOpinionMiningExample() that takes the client that you created earlier, and call its analyzeSentiment() function with an AnalyzeSentimentOptions option object. The returned AnalyzeSentimentResult object will contain documentSentiment and sentenceSentiments if successful, or an errorMessage if not.

static void sentimentAnalysisWithOpinionMiningExample(TextAnalyticsClient client)
{
    // The document that needs be analyzed.
    String document = "Bad atmosphere. Not close to plenty of restaurants, hotels, and transit! Staff are not friendly and helpful.";

    System.out.printf("Document = %s%n", document);

    AnalyzeSentimentOptions options = new AnalyzeSentimentOptions().setIncludeOpinionMining(true);
    final DocumentSentiment documentSentiment = client.analyzeSentiment(document, "en", options);
    SentimentConfidenceScores scores = documentSentiment.getConfidenceScores();
    System.out.printf(
            "Recognized document sentiment: %s, positive score: %f, neutral score: %f, negative score: %f.%n",
            documentSentiment.getSentiment(), scores.getPositive(), scores.getNeutral(), scores.getNegative());

    documentSentiment.getSentences().forEach(sentenceSentiment -> {
        SentimentConfidenceScores sentenceScores = sentenceSentiment.getConfidenceScores();
        System.out.printf("\tSentence sentiment: %s, positive score: %f, neutral score: %f, negative score: %f.%n",
                sentenceSentiment.getSentiment(), sentenceScores.getPositive(), sentenceScores.getNeutral(), sentenceScores.getNegative());
        sentenceSentiment.getMinedOpinions().forEach(minedOpinions -> {
            AspectSentiment aspectSentiment = minedOpinions.getAspect();
            System.out.printf("\t\tAspect sentiment: %s, aspect text: %s%n", aspectSentiment.getSentiment(),
                    aspectSentiment.getText());
            SentimentConfidenceScores aspectScores = aspectSentiment.getConfidenceScores();
            System.out.printf("\t\tAspect positive score: %f, negative score: %f.%n",
                    aspectScores.getPositive(), aspectScores.getNegative());
            for (OpinionSentiment opinionSentiment : minedOpinions.getOpinions()) {
                System.out.printf("\t\t\t'%s' opinion sentiment because of \"%s\". Is the opinion negated: %s.%n",
                        opinionSentiment.getSentiment(), opinionSentiment.getText(), opinionSentiment.isNegated());
                SentimentConfidenceScores opinionScores = opinionSentiment.getConfidenceScores();
                System.out.printf("\t\t\tOpinion positive score: %f, negative score: %f.%n",
                        opinionScores.getPositive(), opinionScores.getNegative());
            }
        });
    });
}

Output

Document = Bad atmosphere. Not close to plenty of restaurants, hotels, and transit! Staff are not friendly and helpful.
Recognized document sentiment: negative, positive score: 0.010000, neutral score: 0.140000, negative score: 0.850000.
    Sentence sentiment: negative, positive score: 0.000000, neutral score: 0.000000, negative score: 1.000000.
        Aspect sentiment: negative, aspect text: atmosphere
        Aspect positive score: 0.010000, negative score: 0.990000.
            'negative' opinion sentiment because of "bad". Is the opinion negated: false.
            Opinion positive score: 0.010000, negative score: 0.990000.
    Sentence sentiment: negative, positive score: 0.020000, neutral score: 0.440000, negative score: 0.540000.
    Sentence sentiment: negative, positive score: 0.000000, neutral score: 0.000000, negative score: 1.000000.
        Aspect sentiment: negative, aspect text: Staff
        Aspect positive score: 0.000000, negative score: 1.000000.
            'negative' opinion sentiment because of "friendly". Is the opinion negated: true.
            Opinion positive score: 0.000000, negative score: 1.000000.
            'negative' opinion sentiment because of "helpful". Is the opinion negated: true.
            Opinion positive score: 0.000000, negative score: 1.000000.

Process finished with exit code 0

Language detection

Create a new function called detectLanguageExample() that takes the client that you created earlier, and call its detectLanguage() function. The returned DetectLanguageResult object will contain the primary detected language and a list of other detected languages if successful, or an errorMessage if not. This example is the same for versions 3.0 and 3.1 of the API.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the countryHint parameter to specify a 2-letter country/region code. By default, the API uses "US" as the countryHint; to remove this behavior, you can reset the parameter by setting it to an empty string (countryHint = ""). To set a different default, set the TextAnalyticsClientOptions.DefaultCountryHint property and pass it during the client's initialization.

static void detectLanguageExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String text = "Ce document est rédigé en Français.";

    DetectedLanguage detectedLanguage = client.detectLanguage(text);
    System.out.printf("Detected primary language: %s, ISO 6391 name: %s, score: %.2f.%n",
        detectedLanguage.getName(),
        detectedLanguage.getIso6391Name(),
        detectedLanguage.getConfidenceScore());
}

Output

Detected primary language: French, ISO 6391 name: fr, score: 1.00.

Named Entity Recognition (NER)

Note

In version 3.1:

  • NER includes separate methods for detecting personal information.
  • Entity linking is a separate request from NER.

Create a new function called recognizeEntitiesExample() that takes the client that you created earlier, and call its recognizeEntities() function. The returned CategorizedEntityCollection object will contain a list of CategorizedEntity if successful, or an errorMessage if not.

static void recognizeEntitiesExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String text = "I had a wonderful trip to Seattle last week.";

    for (CategorizedEntity entity : client.recognizeEntities(text)) {
        System.out.printf(
            "Recognized entity: %s, entity category: %s, entity sub-category: %s, score: %s, offset: %s, length: %s.%n",
            entity.getText(),
            entity.getCategory(),
            entity.getSubcategory(),
            entity.getConfidenceScore(),
            entity.getOffset(),
            entity.getLength());
    }
}

Output

Recognized entity: trip, entity category: Event, entity sub-category: null, score: 0.61, offset: 8, length: 4.
Recognized entity: Seattle, entity category: Location, entity sub-category: GPE, score: 0.82, offset: 16, length: 7.
Recognized entity: last week, entity category: DateTime, entity sub-category: DateRange, score: 0.8, offset: 24, length: 9.

Entity linking

Create a new function called recognizeLinkedEntitiesExample() that takes the client that you created earlier, and call its recognizeLinkedEntities() function. The returned LinkedEntityCollection object will contain a list of LinkedEntity if successful, or an errorMessage if not. Since linked entities are uniquely identified, occurrences of the same entity are grouped under a LinkedEntity object as a list of LinkedEntityMatch objects.

static void recognizeLinkedEntitiesExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String text = "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, " +
        "to develop and sell BASIC interpreters for the Altair 8800. " +
        "During his career at Microsoft, Gates held the positions of chairman, " +
        "chief executive officer, president and chief software architect, " +
        "while also being the largest individual shareholder until May 2014.";

    System.out.printf("Linked Entities:%n");
    for (LinkedEntity linkedEntity : client.recognizeLinkedEntities(text)) {
        System.out.printf("Name: %s, ID: %s, URL: %s, Data Source: %s.%n",
            linkedEntity.getName(),
            linkedEntity.getDataSourceEntityId(),
            linkedEntity.getUrl(),
            linkedEntity.getDataSource());
        System.out.printf("Matches:%n");
        for (LinkedEntityMatch linkedEntityMatch : linkedEntity.getMatches()) {
            System.out.printf("Text: %s, Score: %.2f, Offset: %s, Length: %s%n",
            linkedEntityMatch.getText(),
            linkedEntityMatch.getConfidenceScore(),
            linkedEntityMatch.getOffset(),
            linkedEntityMatch.getLength());
        }
    }
}

Output

Linked Entities:
Name: Microsoft, ID: Microsoft, URL: https://en.wikipedia.org/wiki/Microsoft, Data Source: Wikipedia.
Matches:
Text: Microsoft, Score: 0.55, Offset: 0, Length: 9
Text: Microsoft, Score: 0.55, Offset: 150, Length: 9
Name: Bill Gates, ID: Bill Gates, URL: https://en.wikipedia.org/wiki/Bill_Gates, Data Source: Wikipedia.
Matches:
Text: Bill Gates, Score: 0.63, Offset: 25, Length: 10
Text: Gates, Score: 0.63, Offset: 161, Length: 5
Name: Paul Allen, ID: Paul Allen, URL: https://en.wikipedia.org/wiki/Paul_Allen, Data Source: Wikipedia.
Matches:
Text: Paul Allen, Score: 0.60, Offset: 40, Length: 10
Name: April 4, ID: April 4, URL: https://en.wikipedia.org/wiki/April_4, Data Source: Wikipedia.
Matches:
Text: April 4, Score: 0.32, Offset: 54, Length: 7
Name: BASIC, ID: BASIC, URL: https://en.wikipedia.org/wiki/BASIC, Data Source: Wikipedia.
Matches:
Text: BASIC, Score: 0.33, Offset: 89, Length: 5
Name: Altair 8800, ID: Altair 8800, URL: https://en.wikipedia.org/wiki/Altair_8800, Data Source: Wikipedia.
Matches:
Text: Altair 8800, Score: 0.88, Offset: 116, Length: 11

Personally Identifiable Information Recognition

Create a new function called recognizePiiEntitiesExample() that takes the client that you created earlier, and call its recognizePiiEntities() function. The returned PiiEntityCollection object will contain a list of PiiEntity if successful, or an errorMessage if not. It will also contain the redacted text, which consists of the input text with all identifiable entities replaced with *****.

static void recognizePiiEntitiesExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String document = "My SSN is 859-98-0987";
    PiiEntityCollection piiEntityCollection = client.recognizePiiEntities(document);
    System.out.printf("Redacted Text: %s%n", piiEntityCollection.getRedactedText());
    piiEntityCollection.forEach(entity -> System.out.printf(
        "Recognized Personally Identifiable Information entity: %s, entity category: %s, entity subcategory: %s,"
            + " confidence score: %f.%n",
        entity.getText(), entity.getCategory(), entity.getSubcategory(), entity.getConfidenceScore()));
}

Output

Redacted Text: My SSN is ***********
Recognized Personally Identifiable Information entity: 859-98-0987, entity category: U.S. Social Security Number (SSN), entity subcategory: null, confidence score: 0.650000.

Key phrase extraction

Create a new function called extractKeyPhrasesExample() that takes the client that you created earlier, and call its extractKeyPhrases() function. The returned ExtractKeyPhraseResult object will contain a list of key phrases if successful, or an errorMessage if not. This example is the same for versions 3.0 and 3.1 of the API.

static void extractKeyPhrasesExample(TextAnalyticsClient client)
{
    // The text that need be analyzed.
    String text = "My cat might need to see a veterinarian.";

    System.out.printf("Recognized phrases: %n");
    for (String keyPhrase : client.extractKeyPhrases(text)) {
        System.out.printf("%s%n", keyPhrase);
    }
}

Output

Recognized phrases: 
cat
veterinarian

Use the API asynchronously with the Analyze operation

Note

To use the Analyze operation, make sure your Azure resource is using the S standard pricing tier.

Create a new function called analyzeOperationExample(), which calls the beginAnalyzeTasks() function. The result will be a long-running operation, which will be polled for results.

static void analyzeOperationExample(TextAnalyticsClient client)
{
        List<TextDocumentInput> documents = Arrays.asList(
                        new TextDocumentInput("0", "Microsoft was founded by Bill Gates and Paul Allen.")
                        );

        SyncPoller<TextAnalyticsOperationResult, PagedIterable<AnalyzeTasksResult>> syncPoller =
                        client.beginAnalyzeTasks(documents,
                                        new AnalyzeTasksOptions().setDisplayName("{tasks_display_name}")
                                                        .setEntitiesRecognitionTasks(Arrays.asList(new EntitiesTask())),
                                        Context.NONE);

        syncPoller.waitForCompletion();
        PagedIterable<AnalyzeTasksResult> result = syncPoller.getFinalResult();

        result.forEach(analyzeJobState -> {
                System.out.printf("Job Display Name: %s, Job ID: %s.%n", analyzeJobState.getDisplayName(),
                                analyzeJobState.getJobId());
                System.out.printf("Total tasks: %s, completed: %s, failed: %s, in progress: %s.%n",
                                analyzeJobState.getTotal(), analyzeJobState.getCompleted(), analyzeJobState.getFailed(),
                                analyzeJobState.getInProgress());

                List<RecognizeEntitiesResultCollection> entityRecognitionTasks =
                                analyzeJobState.getEntityRecognitionTasks();
                if (entityRecognitionTasks != null) {
                        entityRecognitionTasks.forEach(taskResult -> {
                                // Recognized entities for each of documents from a batch of documents
                                AtomicInteger counter = new AtomicInteger();
                                for (RecognizeEntitiesResult entitiesResult : taskResult) {
                                        System.out.printf("%n%s%n", documents.get(counter.getAndIncrement()));
                                        if (entitiesResult.isError()) {
                                                // Erroneous document
                                                System.out.printf("Cannot recognize entities. Error: %s%n",
                                                                entitiesResult.getError().getMessage());
                                        } else {
                                                // Valid document
                                                entitiesResult.getEntities().forEach(entity -> System.out.printf(
                                                                "Recognized entity: %s, entity category: %s, entity subcategory: %s, "
                                                                                + "confidence score: %f.%n",
                                                                entity.getText(), entity.getCategory(), entity.getSubcategory(),
                                                                entity.getConfidenceScore()));
                                        }
                                }
                        });
                }
        });
    }

After you add this example to your application, call it in your main() method.

analyzeOperationExample(client);

Output

Job Display Name: {tasks_display_name}, Job ID: 84fd4db4-0734-47ec-b263-ac5451e83f2a_637432416000000000.
Total tasks: 1, completed: 1, failed: 0, in progress: 0.

Text = Microsoft was founded by Bill Gates and Paul Allen., Id = 0, Language = null
Recognized entity: Microsoft, entity category: Organization, entity subcategory: null, confidence score: 0.960000.
Recognized entity: Bill Gates, entity category: Person, entity subcategory: null, confidence score: 1.000000.
Recognized entity: Paul Allen, entity category: Person, entity subcategory: null, confidence score: 0.990000.

You can also use the Analyze operation to detect PII and key phrase extraction. See the Analyze sample on GitHub.

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and un-secured credential storage for simplicity. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below.
  • You can also run this version of the Text Analytics client library in your browser.

Prerequisites

  • Azure subscription - Create one for free
  • The current version of Node.js
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • To use the Analyze feature, you will need a Text Analytics resource with the standard (S) pricing tier.

Setting up

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp
cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Install the client library

Install the @azure/ai-text-analytics NPM package:

npm install --save @azure/ai-text-analytics@5.1.0-beta.3

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Your app's package.json file will be updated with the dependencies. Create a file named index.js and add the following:

"use strict";

const { TextAnalyticsClient, AzureKeyCredential } = require("@azure/ai-text-analytics");

Create variables for your resource's Azure endpoint and key.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

const key = '<paste-your-text-analytics-key-here>';
const endpoint = '<paste-your-text-analytics-endpoint-here>';

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure using your key. The client provides several methods for analyzing text, as a single string, or a batch.

Text is sent to the API as a list of documents, which are dictionary objects containing a combination of id, text, and language attributes, depending on the method used. The text attribute stores the text to be analyzed in the origin language, and the id can be any value.

The response object is a list containing the analysis information for each document.

Code examples

Client Authentication

Create a new TextAnalyticsClient object with your key and endpoint as parameters.

const textAnalyticsClient = new TextAnalyticsClient(endpoint, new AzureKeyCredential(key));

Sentiment analysis

Create an array of strings containing the document you want to analyze. Call the client's analyzeSentiment() method and get the returned SentimentBatchResult object. Iterate through the list of results, and print each document's ID and document-level sentiment with confidence scores. For each document, the result contains sentence-level sentiment along with offsets, lengths, and confidence scores.

async function sentimentAnalysis(client){

    const sentimentInput = [
        "I had the best day of my life. I wish you were there with me."
    ];
    const sentimentResult = await client.analyzeSentiment(sentimentInput);

    sentimentResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tDocument Sentiment: ${document.sentiment}`);
        console.log(`\tDocument Scores:`);
        console.log(`\t\tPositive: ${document.confidenceScores.positive.toFixed(2)} \tNegative: ${document.confidenceScores.negative.toFixed(2)} \tNeutral: ${document.confidenceScores.neutral.toFixed(2)}`);
        console.log(`\tSentences Sentiment(${document.sentences.length}):`);
        document.sentences.forEach(sentence => {
            console.log(`\t\tSentence sentiment: ${sentence.sentiment}`)
            console.log(`\t\tSentences Scores:`);
            console.log(`\t\tPositive: ${sentence.confidenceScores.positive.toFixed(2)} \tNegative: ${sentence.confidenceScores.negative.toFixed(2)} \tNeutral: ${sentence.confidenceScores.neutral.toFixed(2)}`);
        });
    });
}
sentimentAnalysis(textAnalyticsClient)

Run your code with node index.js in your console window.

Output

ID: 0
        Document Sentiment: positive
        Document Scores:
                Positive: 1.00  Negative: 0.00  Neutral: 0.00
        Sentences Sentiment(2):
                Sentence sentiment: positive
                Sentences Scores:
                Positive: 1.00  Negative: 0.00  Neutral: 0.00
                Sentence sentiment: neutral
                Sentences Scores:
                Positive: 0.21  Negative: 0.02  Neutral: 0.77

Opinion mining

To do sentiment analysis with opinion mining, create an array of strings containing the document you want to analyze. Call the client's analyzeSentiment() method with the includeOpinionMining: true option flag and get the returned SentimentBatchResult object. Iterate through the list of results, and print each document's ID and document-level sentiment with confidence scores. For each document, the result contains not only sentence-level sentiment as above, but also aspect- and opinion-level sentiment.

async function sentimentAnalysisWithOpinionMining(client){

    const sentimentInput = [
        {
            text: "The food and service were unacceptable, but the concierge were nice",
            id: "0",
            language: "en"
        }
    ];
    const sentimentResult = await client.analyzeSentiment(sentimentInput, { includeOpinionMining: true });

    sentimentResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tDocument Sentiment: ${document.sentiment}`);
        console.log(`\tDocument Scores:`);
        console.log(`\t\tPositive: ${document.confidenceScores.positive.toFixed(2)} \tNegative: ${document.confidenceScores.negative.toFixed(2)} \tNeutral: ${document.confidenceScores.neutral.toFixed(2)}`);
        console.log(`\tSentences Sentiment(${document.sentences.length}):`);
        document.sentences.forEach(sentence => {
            console.log(`\t\tSentence sentiment: ${sentence.sentiment}`)
            console.log(`\t\tSentences Scores:`);
            console.log(`\t\tPositive: ${sentence.confidenceScores.positive.toFixed(2)} \tNegative: ${sentence.confidenceScores.negative.toFixed(2)} \tNeutral: ${sentence.confidenceScores.neutral.toFixed(2)}`);
            console.log("\tMined opinions");
            for (const { aspect, opinions } of sentence.minedOpinions) {
                console.log(`\t\tAspect text: ${aspect.text}`);
                console.log(`\t\tAspect sentiment: ${aspect.sentiment}`);
                console.log(`\t\tAspect Positive: ${aspect.confidenceScores.positive.toFixed(2)} \tNegative: ${aspect.confidenceScores.negative.toFixed(2)}`);
                console.log("\t\tAspect opinions:");
                for (const { text, sentiment, confidenceScores } of opinions) {
                    console.log(`\t\tOpinion text: ${text}`);
                    console.log(`\t\tOpinion sentiment: ${sentiment}`);
                    console.log(`\t\tOpinion Positive: ${confidenceScores.positive.toFixed(2)} \tNegative: ${confidenceScores.negative.toFixed(2)}`);
                }
            }
        });
    });
}
sentimentAnalysisWithOpinionMining(textAnalyticsClient)

Run your code with node index.js in your console window.

Output

ID: 0
        Document Sentiment: positive
        Document Scores:
                Positive: 0.84  Negative: 0.16  Neutral: 0.00
        Sentences Sentiment(1):
                Sentence sentiment: positive
                Sentences Scores:
                Positive: 0.84  Negative: 0.16  Neutral: 0.00
        Mined opinions
                Aspect text: food
                Aspect sentiment: negative
                Aspect Positive: 0.01   Negative: 0.99
                Aspect opinions:
                Opinion text: unacceptable
                Opinion sentiment: negative
                Opinion Positive: 0.01  Negative: 0.99
                Aspect text: service
                Aspect sentiment: negative
                Aspect Positive: 0.01   Negative: 0.99
                Aspect opinions:
                Opinion text: unacceptable
                Opinion sentiment: negative
                Opinion Positive: 0.01  Negative: 0.99
                Aspect text: concierge
                Aspect sentiment: positive
                Aspect Positive: 1.00   Negative: 0.00
                Aspect opinions:
                Opinion text: nice
                Opinion sentiment: positive
                Opinion Positive: 1.00  Negative: 0.00

Language detection

Create an array of strings containing the document you want to analyze. Call the client's detectLanguage() method and get the returned DetectLanguageResultCollection. Then iterate through the results, and print each document's ID with its primary language.

async function languageDetection(client) {

    const languageInputArray = [
        "Ce document est rédigé en Français."
    ];
    const languageResult = await client.detectLanguage(languageInputArray);

    languageResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tPrimary Language ${document.primaryLanguage.name}`)
    });
}
languageDetection(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 0
        Primary Language French

Named Entity Recognition (NER)

Note

In version 3.1:

  • Entity linking is a separate request from NER.

Create an array of strings containing the documents you want to analyze. Call the client's recognizeEntities() method and get the RecognizeEntitiesResult object. Iterate through the list of results, and print the entity name, type, subtype, offset, length, and score.

async function entityRecognition(client){

    const entityInputs = [
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800",
        "La sede principal de Microsoft se encuentra en la ciudad de Redmond, a 21 kilómetros de Seattle."
    ];
    const entityResults = await client.recognizeEntities(entityInputs);

    entityResults.forEach(document => {
        console.log(`Document ID: ${document.id}`);
        document.entities.forEach(entity => {
            console.log(`\tName: ${entity.text} \tCategory: ${entity.category} \tSubcategory: ${entity.subCategory ? entity.subCategory : "N/A"}`);
            console.log(`\tScore: ${entity.confidenceScore}`);
        });
    });
}
entityRecognition(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

Document ID: 0
        Name: Microsoft         Category: Organization  Subcategory: N/A
        Score: 0.29
        Name: Bill Gates        Category: Person        Subcategory: N/A
        Score: 0.78
        Name: Paul Allen        Category: Person        Subcategory: N/A
        Score: 0.82
        Name: April 4, 1975     Category: DateTime      Subcategory: Date
        Score: 0.8
        Name: 8800      Category: Quantity      Subcategory: Number
        Score: 0.8
Document ID: 1
        Name: 21        Category: Quantity      Subcategory: Number
        Score: 0.8
        Name: Seattle   Category: Location      Subcategory: GPE
        Score: 0.25

Entity Linking

Create an array of strings containing the document you want to analyze. Call the client's recognizeLinkedEntities() method and get the RecognizeLinkedEntitiesResult object. Iterate through the list of results, and print the entity name, ID, data source, URL, and matches. Every object in the matches array contains the offset, length, and score for that match.

async function linkedEntityRecognition(client){

    const linkedEntityInput = [
        "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, to develop and sell BASIC interpreters for the Altair 8800. During his career at Microsoft, Gates held the positions of chairman, chief executive officer, president and chief software architect, while also being the largest individual shareholder until May 2014."
    ];
    const entityResults = await client.recognizeLinkedEntities(linkedEntityInput);

    entityResults.forEach(document => {
        console.log(`Document ID: ${document.id}`);
        document.entities.forEach(entity => {
            console.log(`\tName: ${entity.name} \tID: ${entity.dataSourceEntityId} \tURL: ${entity.url} \tData Source: ${entity.dataSource}`);
            console.log(`\tMatches:`)
            entity.matches.forEach(match => {
                console.log(`\t\tText: ${match.text} \tScore: ${match.confidenceScore.toFixed(2)}`);
        })
        });
    });
}
linkedEntityRecognition(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

Document ID: 0
        Name: Altair 8800       ID: Altair 8800         URL: https://en.wikipedia.org/wiki/Altair_8800  Data Source: Wikipedia
        Matches:
                Text: Altair 8800       Score: 0.88
        Name: Bill Gates        ID: Bill Gates  URL: https://en.wikipedia.org/wiki/Bill_Gates   Data Source: Wikipedia
        Matches:
                Text: Bill Gates        Score: 0.63
                Text: Gates     Score: 0.63
        Name: Paul Allen        ID: Paul Allen  URL: https://en.wikipedia.org/wiki/Paul_Allen   Data Source: Wikipedia
        Matches:
                Text: Paul Allen        Score: 0.60
        Name: Microsoft         ID: Microsoft   URL: https://en.wikipedia.org/wiki/Microsoft    Data Source: Wikipedia
        Matches:
                Text: Microsoft         Score: 0.55
                Text: Microsoft         Score: 0.55
        Name: April 4   ID: April 4     URL: https://en.wikipedia.org/wiki/April_4      Data Source: Wikipedia
        Matches:
                Text: April 4   Score: 0.32
        Name: BASIC     ID: BASIC       URL: https://en.wikipedia.org/wiki/BASIC        Data Source: Wikipedia
        Matches:
                Text: BASIC     Score: 0.33

Personally Identifying Information (PII) Recognition

Create an array of strings containing the document you want to analyze. Call the client's recognizePiiEntities() method and get the RecognizePIIEntitiesResult object. Iterate through the list of results, and print the entity name, type, and score.

async function piiRecognition(client) {

    const documents = [
        "The employee's phone number is (555) 555-5555."
    ];

    const results = await client.recognizePiiEntities(documents, "en");
    for (const result of results) {
        if (result.error === undefined) {
            console.log("Redacted Text: ", result.redactedText);
            console.log(" -- Recognized PII entities for input", result.id, "--");
            for (const entity of result.entities) {
                console.log(entity.text, ":", entity.category, "(Score:", entity.confidenceScore, ")");
            }
        } else {
            console.error("Encountered an error:", result.error);
        }
    }
}
piiRecognition(textAnalyticsClient)

Run your code with node index.js in your console window.

Output

Redacted Text:  The employee's phone number is **************.
 -- Recognized PII entities for input 0 --
(555) 555-5555 : Phone Number (Score: 0.8 )

Key phrase extraction

Create an array of strings containing the document you want to analyze. Call the client's extractKeyPhrases() method and get the returned ExtractKeyPhrasesResult object. Iterate through the results and print each document's ID and any detected key phrases.

async function keyPhraseExtraction(client){

    const keyPhrasesInput = [
        "My cat might need to see a veterinarian.",
    ];
    const keyPhraseResult = await client.extractKeyPhrases(keyPhrasesInput);
    
    keyPhraseResult.forEach(document => {
        console.log(`ID: ${document.id}`);
        console.log(`\tDocument Key Phrases: ${document.keyPhrases}`);
    });
}
keyPhraseExtraction(textAnalyticsClient);

Run your code with node index.js in your console window.

Output

ID: 0
        Document Key Phrases: cat,veterinarian

Use the API asynchronously with the Analyze operation

Note

To use Analyze operations, you must use a Text Analytics resource with the standard (S) pricing tier.

Create a new function called analyze_example(), which calls the beginAnalyze() function. The result will be a long-running operation that is polled for results.

const documents = [
  "Microsoft was founded by Bill Gates and Paul Allen.",
];

async function analyze_example(client) {
  console.log("== Analyze Sample ==");

  const tasks = {
    entityRecognitionTasks: [{ modelVersion: "latest" }]
  };
  const poller = await client.beginAnalyze(documents, tasks);
  const resultPages = await poller.pollUntilDone();

  for await (const page of resultPages) {
    const entitiesResults = page.entitiesRecognitionResults[0];
    for (const doc of entitiesResults) {
      console.log(`- Document ${doc.id}`);
      if (!doc.error) {
        console.log("\tEntities:");
        for (const entity of doc.entities) {
          console.log(`\t- Entity ${entity.text} of type ${entity.category}`);
        }
      } else {
        console.error("  Error:", doc.error);
      }
    }
  }
}

analyze_example(textAnalyticsClient);

Output

== Analyze Sample ==
- Document 0
        Entities:
        - Entity Microsoft of type Organization
        - Entity Bill Gates of type Person
        - Entity Paul Allen of type Person

You can also use the Analyze operation to detect PII and extract key phrases. See the Analyze samples for JavaScript and TypeScript on GitHub.

Run the application

Run the application with the node command on your quickstart file.

node index.js

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.
  • The code in this article uses synchronous methods and un-secured credentials storage for simplicity reasons. For production scenarios, we recommend using the batched asynchronous methods for performance and scalability. See the reference documentation below. If you want to use Text Analytics for health or asynchronous operations, see the examples on GitHub for C#, Python, or Java.

Prerequisites

  • Azure subscription - Create one for free
  • Python 3.x
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • To use the Analyze feature, you will need a Text Analytics resource with the standard (S) pricing tier.

Setting up

Install the client library

After installing Python, you can install the client library with:

pip install azure-ai-textanalytics --pre

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new Python application

Create a new Python file and create variables for your resource's Azure endpoint and subscription key.

Important

Go to the Azure portal. If the Text Analytics resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

Remember to remove the key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials, such as Azure Key Vault.

key = "<paste-your-text-analytics-key-here>"
endpoint = "<paste-your-text-analytics-endpoint-here>"
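As in the other sections of this quickstart, a safer pattern is to load these values from environment variables. The sketch below is illustrative rather than part of the official sample; the variable names are assumptions.

import os

# A minimal sketch: read the credentials from environment variables instead of
# hard-coding them. The variable names here are illustrative assumptions.
key = os.environ["TEXT_ANALYTICS_KEY"]
endpoint = os.environ["TEXT_ANALYTICS_ENDPOINT"]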

Object model

The Text Analytics client is a TextAnalyticsClient object that authenticates to Azure. The client provides several methods for analyzing text.

Text is sent to the API for processing as a list of documents: either a list of strings, a list of dict-like representations, or a list of TextDocumentInput/DetectLanguageInput objects. A dict-like object contains a combination of id, text, and language/country_hint. The text attribute stores the text to be analyzed in its source language, and the id can be any value.

The response object is a list containing the analysis information for each document.
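For example, the same batch can be expressed in any of the three forms described above. This is a sketch; the IDs and text are illustrative.

from azure.ai.textanalytics import TextDocumentInput

# Three equivalent ways to express the same document batch (a sketch).
docs_as_strings = ["I had the best day of my life."]
docs_as_dicts = [{"id": "1", "language": "en", "text": "I had the best day of my life."}]
docs_as_objects = [TextDocumentInput(id="1", language="en", text="I had the best day of my life.")]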

Code examples

These code snippets show you how to do the following tasks with the Text Analytics client library for Python:

Authenticate the client

Create a function to instantiate the TextAnalyticsClient object with the key and endpoint created above. Then create a new client.

from azure.ai.textanalytics import TextAnalyticsClient
from azure.core.credentials import AzureKeyCredential

def authenticate_client():
    ta_credential = AzureKeyCredential(key)
    text_analytics_client = TextAnalyticsClient(
            endpoint=endpoint, 
            credential=ta_credential)
    return text_analytics_client

client = authenticate_client()

Sentiment analysis

Create a new function called sentiment_analysis_example() that takes the client as an argument, then calls the analyze_sentiment() function. The returned response object will contain the sentiment label and score of the entire input document, as well as a sentiment analysis for each sentence.

def sentiment_analysis_example(client):

    documents = ["I had the best day of my life. I wish you were there with me."]
    response = client.analyze_sentiment(documents=documents)[0]
    print("Document Sentiment: {}".format(response.sentiment))
    print("Overall scores: positive={0:.2f}; neutral={1:.2f}; negative={2:.2f} \n".format(
        response.confidence_scores.positive,
        response.confidence_scores.neutral,
        response.confidence_scores.negative,
    ))
    for idx, sentence in enumerate(response.sentences):
        print("Sentence: {}".format(sentence.text))
        print("Sentence {} sentiment: {}".format(idx+1, sentence.sentiment))
        print("Sentence score:\nPositive={0:.2f}\nNeutral={1:.2f}\nNegative={2:.2f}\n".format(
            sentence.confidence_scores.positive,
            sentence.confidence_scores.neutral,
            sentence.confidence_scores.negative,
        ))
          
sentiment_analysis_example(client)

Output

Document Sentiment: positive
Overall scores: positive=1.00; neutral=0.00; negative=0.00 

Sentence: I had the best day of my life.
Sentence 1 sentiment: positive
Sentence score:
Positive=1.00
Neutral=0.00
Negative=0.00

Sentence: I wish you were there with me.
Sentence 2 sentiment: neutral
Sentence score:
Positive=0.21
Neutral=0.77
Negative=0.02

Opinion mining

To do sentiment analysis with opinion mining, create a new function called sentiment_analysis_with_opinion_mining_example() that takes the client as an argument, then calls the analyze_sentiment() function with the option flag show_opinion_mining=True. The returned response object will contain not only the sentiment label and score of the entire input document with sentiment analysis for each sentence, but also aspect- and opinion-level sentiment analysis.

def sentiment_analysis_with_opinion_mining_example(client):

    documents = [
        "The food and service were unacceptable, but the concierge were nice"
    ]

    result = client.analyze_sentiment(documents, show_opinion_mining=True)
    doc_result = [doc for doc in result if not doc.is_error]

    positive_reviews = [doc for doc in doc_result if doc.sentiment == "positive"]
    negative_reviews = [doc for doc in doc_result if doc.sentiment == "negative"]

    positive_mined_opinions = []
    mixed_mined_opinions = []
    negative_mined_opinions = []

    for document in doc_result:
        print("Document Sentiment: {}".format(document.sentiment))
        print("Overall scores: positive={0:.2f}; neutral={1:.2f}; negative={2:.2f} \n".format(
            document.confidence_scores.positive,
            document.confidence_scores.neutral,
            document.confidence_scores.negative,
        ))
        for sentence in document.sentences:
            print("Sentence: {}".format(sentence.text))
            print("Sentence sentiment: {}".format(sentence.sentiment))
            print("Sentence score:\nPositive={0:.2f}\nNeutral={1:.2f}\nNegative={2:.2f}\n".format(
                sentence.confidence_scores.positive,
                sentence.confidence_scores.neutral,
                sentence.confidence_scores.negative,
            ))
            for mined_opinion in sentence.mined_opinions:
                aspect = mined_opinion.aspect
                print("......'{}' aspect '{}'".format(aspect.sentiment, aspect.text))
                print("......Aspect score:\n......Positive={0:.2f}\n......Negative={1:.2f}\n".format(
                    aspect.confidence_scores.positive,
                    aspect.confidence_scores.negative,
                ))
                for opinion in mined_opinion.opinions:
                    print("......'{}' opinion '{}'".format(opinion.sentiment, opinion.text))
                    print("......Opinion score:\n......Positive={0:.2f}\n......Negative={1:.2f}\n".format(
                        opinion.confidence_scores.positive,
                        opinion.confidence_scores.negative,
                    ))
            print("\n")
        print("\n")
          
sentiment_analysis_with_opinion_mining_example(client)

Output

Document Sentiment: positive
Overall scores: positive=0.84; neutral=0.00; negative=0.16

Sentence: The food and service were unacceptable, but the concierge were nice
Sentence sentiment: positive
Sentence score:
Positive=0.84
Neutral=0.00
Negative=0.16

......'negative' aspect 'food'
......Aspect score:
......Positive=0.01
......Negative=0.99

......'negative' opinion 'unacceptable'
......Opinion score:
......Positive=0.01
......Negative=0.99

......'negative' aspect 'service'
......Aspect score:
......Positive=0.01
......Negative=0.99

......'negative' opinion 'unacceptable'
......Opinion score:
......Positive=0.01
......Negative=0.99

......'positive' aspect 'concierge'
......Aspect score:
......Positive=1.00
......Negative=0.00

......'positive' opinion 'nice'
......Opinion score:
......Positive=1.00
......Negative=0.00

Language detection

Create a new function called language_detection_example() that takes the client as an argument, then calls the detect_language() function. The returned response object will contain the detected language in primary_language if successful, and an error if not.

Tip

In some cases it may be hard to disambiguate languages based on the input. You can use the country_hint parameter to specify a 2-letter country code. By default the API uses "US" as the country hint; to remove this behavior, reset the parameter by setting it to an empty string: country_hint : "".

def language_detection_example(client):
    try:
        documents = ["Ce document est rédigé en Français."]
        response = client.detect_language(documents = documents, country_hint = 'us')[0]
        print("Language: ", response.primary_language.name)

    except Exception as err:
        print("Encountered exception. {}".format(err))
language_detection_example(client)

Output

Language:  French
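If you do not want the default "US" hint applied, you can clear it as the tip above describes. A small sketch, reusing the client created earlier:

# A sketch: disable the default country hint by passing an empty string.
documents = ["Ce document est rédigé en Français."]
response = client.detect_language(documents=documents, country_hint="")[0]
print("Language: ", response.primary_language.name)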

Named Entity Recognition (NER)

Note

In version 3.1:

  • Entity linking is a separate request from NER.

Create a new function called entity_recognition_example that takes the client as an argument, then calls the recognize_entities() function and iterates through the results. The returned response object will contain the list of detected entities if successful, and an error if not. For each detected entity, print its category and subcategory, if one exists.

def entity_recognition_example(client):

    try:
        documents = ["I had a wonderful trip to Seattle last week."]
        result = client.recognize_entities(documents = documents)[0]

        print("Named Entities:\n")
        for entity in result.entities:
            print("\tText: \t", entity.text, "\tCategory: \t", entity.category, "\tSubCategory: \t", entity.subcategory,
                    "\n\tConfidence Score: \t", round(entity.confidence_score, 2), "\tLength: \t", entity.length, "\tOffset: \t", entity.offset, "\n")

    except Exception as err:
        print("Encountered exception. {}".format(err))
entity_recognition_example(client)

Output

Named Entities:

        Text:    trip   Category:        Event  SubCategory:     None
        Confidence Score:        0.61   Length:          4      Offset:          18

        Text:    Seattle        Category:        Location       SubCategory:     GPE
        Confidence Score:        0.82   Length:          7      Offset:          26

        Text:    last week      Category:        DateTime       SubCategory:     DateRange
        Confidence Score:        0.8    Length:          9      Offset:          34

Entity Linking

Create a new function called entity_linking_example() that takes the client as an argument, then calls the recognize_linked_entities() function and iterates through the results. The returned response object will contain the list of detected entities in entities if successful, and an error if not. Since linked entities are uniquely identified, occurrences of the same entity are grouped under an entity object as a list of match objects.

def entity_linking_example(client):

    try:
        documents = ["""Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975, 
        to develop and sell BASIC interpreters for the Altair 8800. 
        During his career at Microsoft, Gates held the positions of chairman,
        chief executive officer, president and chief software architect, 
        while also being the largest individual shareholder until May 2014."""]
        result = client.recognize_linked_entities(documents = documents)[0]

        print("Linked Entities:\n")
        for entity in result.entities:
            print("\tName: ", entity.name, "\tId: ", entity.data_source_entity_id, "\tUrl: ", entity.url,
            "\n\tData Source: ", entity.data_source)
            print("\tMatches:")
            for match in entity.matches:
                print("\t\tText:", match.text)
                print("\t\tConfidence Score: {0:.2f}".format(match.confidence_score))
                print("\t\tOffset: {}".format(match.offset))
                print("\t\tLength: {}".format(match.length))
            
    except Exception as err:
        print("Encountered exception. {}".format(err))
entity_linking_example(client)

Output

Linked Entities:

        Name:  Microsoft        Id:  Microsoft  Url:  https://en.wikipedia.org/wiki/Microsoft
        Data Source:  Wikipedia
        Matches:
                Text: Microsoft
                Confidence Score: 0.55
                Offset: 0
                Length: 9
                Text: Microsoft
                Confidence Score: 0.55
                Offset: 168
                Length: 9
        Name:  Bill Gates       Id:  Bill Gates         Url:  https://en.wikipedia.org/wiki/Bill_Gates
        Data Source:  Wikipedia
        Matches:
                Text: Bill Gates
                Confidence Score: 0.63
                Offset: 25
                Length: 10
                Text: Gates
                Confidence Score: 0.63
                Offset: 179
                Length: 5
        Name:  Paul Allen       Id:  Paul Allen         Url:  https://en.wikipedia.org/wiki/Paul_Allen
        Data Source:  Wikipedia
        Matches:
                Text: Paul Allen
                Confidence Score: 0.60
                Offset: 40
                Length: 10
        Name:  April 4  Id:  April 4    Url:  https://en.wikipedia.org/wiki/April_4
        Data Source:  Wikipedia
        Matches:
                Text: April 4
                Confidence Score: 0.32
                Offset: 54
                Length: 7
        Name:  BASIC    Id:  BASIC      Url:  https://en.wikipedia.org/wiki/BASIC
        Data Source:  Wikipedia
        Matches:
                Text: BASIC
                Confidence Score: 0.33
                Offset: 98
                Length: 5
        Name:  Altair 8800      Id:  Altair 8800        Url:  https://en.wikipedia.org/wiki/Altair_8800
        Data Source:  Wikipedia
        Matches:
                Text: Altair 8800
                Confidence Score: 0.88
                Offset: 125
                Length: 11

Personally Identifiable Information recognition

Create a new function called pii_recognition_example that takes the client as an argument, then calls the recognize_pii_entities() function and iterates through the results. The returned response object will contain the list of detected entities if successful, and an error if not. For each detected entity, print its text, category, confidence score, offset, and length.

def pii_recognition_example(client):
    documents = [
        "The employee's SSN is 859-98-0987.",
        "The employee's phone number is 555-555-5555."
    ]
    response = client.recognize_pii_entities(documents, language="en")
    result = [doc for doc in response if not doc.is_error]
    for doc in result:
        print("Redacted Text: {}".format(doc.redacted_text))
        for entity in doc.entities:
            print("Entity: {}".format(entity.text))
            print("\tCategory: {}".format(entity.category))
            print("\tConfidence Score: {}".format(entity.confidence_score))
            print("\tOffset: {}".format(entity.offset))
            print("\tLength: {}".format(entity.length))
pii_recognition_example(client)

Output

Redacted Text: The employee's SSN is ***********.
Entity: 859-98-0987
        Category: U.S. Social Security Number (SSN)
        Confidence Score: 0.65
        Offset: 22
        Length: 11
Redacted Text: The employee's phone number is ************.
Entity: 555-555-5555
        Category: Phone Number
        Confidence Score: 0.8
        Offset: 31
        Length: 12

Key phrase extraction

Create a new function called key_phrase_extraction_example() that takes the client as an argument, then calls the extract_key_phrases() function. The result will contain the list of detected key phrases in key_phrases if successful, and an error if not. Print any detected key phrases.

def key_phrase_extraction_example(client):

    try:
        documents = ["My cat might need to see a veterinarian."]

        response = client.extract_key_phrases(documents = documents)[0]

        if not response.is_error:
            print("\tKey Phrases:")
            for phrase in response.key_phrases:
                print("\t\t", phrase)
        else:
            print(response.id, response.error)

    except Exception as err:
        print("Encountered exception. {}".format(err))
        
key_phrase_extraction_example(client)

Output

    Key Phrases:
         cat
         veterinarian

Use the API asynchronously with the Analyze operation

Note

To use the Analyze operation, make sure your Azure resource uses the standard (S) pricing tier.

Create a new function called analyze_example() that takes the client as an argument, then calls the begin_analyze() function. The result will be a long-running operation that is polled for results.

# EntitiesRecognitionTask ships with the 5.1.0 beta releases of azure-ai-textanalytics.
from azure.ai.textanalytics import EntitiesRecognitionTask

def analyze_example(client):
    documents = [
        "Microsoft was founded by Bill Gates and Paul Allen."
    ]

    poller = client.begin_analyze(
        documents,
        display_name="Sample Text Analysis",
        entities_recognition_tasks=[EntitiesRecognitionTask()]
    )

    result = poller.result()

    for page in result:
        for task in page.entities_recognition_results:
            print("Results of Entities Recognition task:")

            docs = [doc for doc in task.results if not doc.is_error]
            for idx, doc in enumerate(docs):
                print("\nDocument text: {}".format(documents[idx]))
                for entity in doc.entities:
                    print("Entity: {}".format(entity.text))
                    print("...Category: {}".format(entity.category))
                    print("...Confidence Score: {}".format(entity.confidence_score))
                    print("...Offset: {}".format(entity.offset))
                print("------------------------------------------")

analyze_example(client)

Output

Results of Entities Recognition task:
Document text: Microsoft was founded by Bill Gates and Paul Allen.
Entity: Microsoft
...Category: Organization
...Confidence Score: 0.83
...Offset: 0
Entity: Bill Gates
...Category: Person
...Confidence Score: 0.85
...Offset: 25
Entity: Paul Allen
...Category: Person
...Confidence Score: 0.9
...Offset: 40
------------------------------------------

You can also use the Analyze operation to detect PII and extract key phrases. See the Analyze sample on GitHub.

Important

  • The latest stable version of the Text Analytics API is 3.0.
    • Be sure to only follow the instructions for the version you are using.

Prerequisites

  • The current version of cURL.
  • Once you have your Azure subscription, create a Text Analytics resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Text Analytics API. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Note

  • The following BASH examples use the \ line continuation character. If your console or terminal uses a different line continuation character, use that character.
  • You can find language-specific samples on GitHub.
  • Go to the Azure portal and find the key and endpoint for the Text Analytics resource you created in the prerequisites. They will be located on the resource's key and endpoint page, under resource management. Then replace the strings in the code below with your key and endpoint. To call the Text Analytics API, you need the following information:
Parameter                              Description
-X POST <endpoint>                     Specifies your endpoint for accessing the API.
-H Content-Type: application/json      The content type for sending JSON data.
-H "Ocp-Apim-Subscription-Key:<key>"   Specifies the key for accessing the API.
-d <documents>                         The JSON containing the documents you want to send.

The following cURL commands are executed from a BASH shell. Edit these commands with your own resource name, resource key, and JSON values.

Sentiment Analysis

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.

Note

The example below includes a request for the Opinion Mining feature of Sentiment Analysis, using the opinionMining=true parameter, which provides granular information about the opinions related to aspects (such as the attributes of products or services) in text.

curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/sentiment?opinionMining=true \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", text: "The customer service here is really good."}]}'

JSON response

{
   "documents":[
      {
         "id":"1",
         "sentiment":"positive",
         "confidenceScores":{
            "positive":1.0,
            "neutral":0.0,
            "negative":0.0
         },
         "sentences":[
            {
               "sentiment":"positive",
               "confidenceScores":{
                  "positive":1.0,
                  "neutral":0.0,
                  "negative":0.0
               },
               "offset":0,
               "length":41,
               "text":"The customer service here is really good.",
               "aspects":[
                  {
                     "sentiment":"positive",
                     "confidenceScores":{
                        "positive":1.0,
                        "negative":0.0
                     },
                     "offset":4,
                     "length":16,
                     "text":"customer service",
                     "relations":[
                        {
                           "relationType":"opinion",
                           "ref":"#/documents/0/sentences/0/opinions/0"
                        }
                     ]
                  }
               ],
               "opinions":[
                  {
                     "sentiment":"positive",
                     "confidenceScores":{
                        "positive":1.0,
                        "negative":0.0
                     },
                     "offset":36,
                     "length":4,
                     "text":"good",
                     "isNegated":false
                  }
               ]
            }
         ],
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-04-01"
}
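The same request can be issued from any HTTP client, not just cURL. The following is a hedged sketch using Python's third-party requests package (installed with pip install requests); it is not part of the official quickstart, and the placeholders must be replaced with your key and endpoint as in the cURL example above.

import requests

# A sketch of the same sentiment request, sent with the requests package.
endpoint = "https://<your-text-analytics-endpoint-here>"
key = "<your-text-analytics-key-here>"

response = requests.post(
    endpoint + "/text/analytics/v3.1-preview.3/sentiment",
    params={"opinionMining": "true"},
    headers={"Ocp-Apim-Subscription-Key": key},  # json= below sets the JSON content type
    json={"documents": [{"id": "1", "text": "The customer service here is really good."}]},
)
print(response.json())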

Language detection

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/languages/ \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", text: "This is a document written in English."}]}'

JSON response

{
   "documents":[
      {
         "id":"1",
         "detectedLanguage":{
            "name":"English",
            "iso6391Name":"en",
            "confidenceScore":0.99
         },
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-09-01"
}

Named Entity Recognition (NER)

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/entities/recognition/general \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", language:"en", text: "I had a wonderful trip to Seattle last week."}]}'

JSON response

{
   "documents":[
      {
         "id":"1",
         "entities":[
            {
               "text":"trip",
               "category":"Event",
               "offset":18,
               "length":4,
               "confidenceScore":0.61
            },
            {
               "text":"Seattle",
               "category":"Location",
               "subcategory":"GPE",
               "offset":26,
               "length":7,
               "confidenceScore":0.82
            },
            {
               "text":"last week",
               "category":"DateTime",
               "subcategory":"DateRange",
               "offset":34,
               "length":9,
               "confidenceScore":0.8
            }
         ],
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-04-01"
}

Detecting personally identifying information

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/entities/recognition/pii \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", language:"en", text: "Insurance policy for SSN on file 123-12-1234 is here by approved."}]}'

JSON response

{
   "documents":[
      {
         "redactedText":"Insurance policy for *** on file 123-12-1234 is here by approved.",
         "id":"1",
         "entities":[
            {
               "text":"SSN",
               "category":"Organization",
               "offset":21,
               "length":3,
               "confidenceScore":0.45
            }
         ],
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-07-01"
}

Entity linking

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/entities/linking \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", language:"en", text: "Microsoft was founded by Bill Gates and Paul Allen on April 4, 1975."}]}'

JSON response

{
   "documents":[
      {
         "id":"1",
         "entities":[
            {
               "bingId":"a093e9b9-90f5-a3d5-c4b8-5855e1b01f85",
               "name":"Microsoft",
               "matches":[
                  {
                     "text":"Microsoft",
                     "offset":0,
                     "length":9,
                     "confidenceScore":0.48
                  }
               ],
               "language":"en",
               "id":"Microsoft",
               "url":"https://en.wikipedia.org/wiki/Microsoft",
               "dataSource":"Wikipedia"
            },
            {
               "bingId":"0d47c987-0042-5576-15e8-97af601614fa",
               "name":"Bill Gates",
               "matches":[
                  {
                     "text":"Bill Gates",
                     "offset":25,
                     "length":10,
                     "confidenceScore":0.52
                  }
               ],
               "language":"en",
               "id":"Bill Gates",
               "url":"https://en.wikipedia.org/wiki/Bill_Gates",
               "dataSource":"Wikipedia"
            },
            {
               "bingId":"df2c4376-9923-6a54-893f-2ee5a5badbc7",
               "name":"Paul Allen",
               "matches":[
                  {
                     "text":"Paul Allen",
                     "offset":40,
                     "length":10,
                     "confidenceScore":0.54
                  }
               ],
               "language":"en",
               "id":"Paul Allen",
               "url":"https://en.wikipedia.org/wiki/Paul_Allen",
               "dataSource":"Wikipedia"
            },
            {
               "bingId":"52535f87-235e-b513-54fe-c03e4233ac6e",
               "name":"April 4",
               "matches":[
                  {
                     "text":"April 4",
                     "offset":54,
                     "length":7,
                     "confidenceScore":0.38
                  }
               ],
               "language":"en",
               "id":"April 4",
               "url":"https://en.wikipedia.org/wiki/April_4",
               "dataSource":"Wikipedia"
            }
         ],
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-02-01"
}

Key phrase extraction

  1. Copy the command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value <your-text-analytics-key-here> with your key.
    2. Replace the first part of the request URL <your-text-analytics-endpoint-here> with your own endpoint URL.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -X POST https://<your-text-analytics-endpoint-here>/text/analytics/v3.1-preview.3/keyPhrases \
-H "Content-Type: application/json" \
-H "Ocp-Apim-Subscription-Key: <your-text-analytics-key-here>" \
-d '{ documents: [{ id: "1", language:"en", text: "I had a wonderful trip to Seattle last week."}]}'

JSON response

{
   "documents":[
      {
         "id":"1",
         "keyPhrases":[
            "wonderful trip",
            "Seattle",
            "week"
         ],
         "warnings":[
            
         ]
      }
   ],
   "errors":[
      
   ],
   "modelVersion":"2020-07-01"
}

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps