What's new in the Text Analytics API?

The Text Analytics API is updated on an ongoing basis. To help you stay up to date with recent developments, this article describes new releases and features.

February 2021

  • The 2021-01-15 model version for the PII endpoint in Named Entity Recognition v3.1-preview.x, which provides:
    • Expanded support for 9 new languages.
    • Improved AI quality of named entity categories for supported languages.
  • The S0 through S4 pricing tiers are being retired on March 8th, 2021. If you have an existing Text Analytics resource using an S0 through S4 pricing tier, you should update it to use the Standard (S) pricing tier.
  • The language detection container is now generally available.
  • v2.1 of the API is being retired.

January 2021

These model versions are currently unavailable in the East US region.

December 2020

November 2020

  • A new endpoint with Text Analytics API v3.1-preview.3 for the new asynchronous Analyze API, which supports batch processing for NER, PII, and key phrase extraction operations.

  • A new endpoint with Text Analytics API v3.1-preview.3 for the new asynchronous Text Analytics for health hosted API, with support for batch processing.

  • Both new features listed above are only available in the following regions: West US 2, East US 2, Central US, North Europe, and West Europe.

  • Portuguese (Brazil) pt-BR is now supported in Sentiment Analysis v3.x, starting with model version 2020-04-01. It adds to the existing pt-PT support for Portuguese.

  • Updated client libraries, which include the asynchronous Analyze and Text Analytics for health operations. You can find examples on GitHub.

October 2020

  • Hindi support for Sentiment Analysis v3.x, starting with model version 2020-04-01.
  • Model version 2020-09-01 for the v3 /languages endpoint, which adds expanded language detection and accuracy improvements.
  • v3 availability in Central India and UAE North.

September 2020

General API updates

  • Release of a new URL for the Text Analytics v3.1 public preview to support updates to the following Named Entity Recognition v3 endpoints:
    • The /pii endpoint now includes the new redactedText property in the response JSON, in which each character of every PII entity detected in the input text is replaced by an *.
    • The /linking endpoint now includes the bingID property in the response JSON for linked entities.
  • The following Text Analytics preview API endpoints were retired on September 4th, 2020:
    • v2.1-preview
    • v3.0-preview
    • v3.0-preview.1
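
The redaction behavior described for the /pii endpoint can be pictured with a small local sketch. This is not the service's implementation; it only illustrates the "one * per character" rule. The `redact` helper, the sample text, and the entity dictionary shape (`offset` and `length` counted in Python string indices) are assumptions made for illustration.

```python
def redact(text: str, entities: list[dict]) -> str:
    """Replace every character of each detected entity with '*'.

    Assumes each entity carries an 'offset' and a 'length' expressed in
    Python string indices (Unicode code points). This is a hypothetical
    shape chosen for the example, not a documented response schema.
    """
    chars = list(text)
    for entity in entities:
        start = entity["offset"]
        end = min(start + entity["length"], len(chars))
        for i in range(start, end):
            chars[i] = "*"
    return "".join(chars)


# Illustrative input: the phone number starts at index 11 and is 8 characters long.
text = "Call me at 555-0100."
entities = [{"offset": 11, "length": 8}]
redacted = redact(text, entities)
print(redacted)  # Call me at ********.
```

Because each character is replaced one for one, the redacted string keeps the length of the input, so offsets of other entities remain valid.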

Text Analytics for health container updates

The following updates are specific to the September release of the Text Analytics for health container only.

  • A new container image with tag 1.1.013530001-amd64-preview and the new model version 2020-09-03 has been released to the container preview repository.
  • This model version provides improvements in entity recognition, abbreviation detection, and latency.

August 2020

General API updates

  • Model version 2020-07-01 for the v3 /keyphrases, /pii, and /languages endpoints, which adds:
    • Additional government and country-specific entity categories for Named Entity Recognition.
    • Norwegian and Turkish support in Sentiment Analysis v3.
  • An HTTP 400 error is now returned for v3 API requests that exceed the published data limits.
  • Endpoints that return an offset now support the optional stringIndexType parameter, which adjusts the returned offset and length values to match a supported string index scheme.
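
To see why a string index scheme matters, the following sketch computes the offset and length of the same substring under two schemes. It runs locally and does not call the service. The scheme names `UnicodeCodePoint` and `Utf16CodeUnit` are assumed here to be accepted values of stringIndexType; check the API reference before relying on them. The helper names and sample text are hypothetical.

```python
def utf16_units(s: str) -> int:
    """Number of UTF-16 code units needed to encode s."""
    return len(s.encode("utf-16-le")) // 2


def span_in_scheme(text: str, needle: str, scheme: str) -> tuple[int, int]:
    """Return (offset, length) of the first occurrence of needle in text.

    Python strings are indexed by Unicode code point, so the code-point
    scheme maps directly; the UTF-16 scheme re-counts in code units.
    """
    start = text.index(needle)
    if scheme == "UnicodeCodePoint":
        return start, len(needle)
    if scheme == "Utf16CodeUnit":
        return utf16_units(text[:start]), utf16_units(needle)
    raise ValueError(f"unsupported scheme in this sketch: {scheme}")


# The emoji is one code point but two UTF-16 code units.
text = "😀 Seattle"
print(span_in_scheme(text, "Seattle", "UnicodeCodePoint"))  # (2, 7)
print(span_in_scheme(text, "Seattle", "Utf16CodeUnit"))     # (3, 7)
```

A client written in a language whose strings are UTF-16 based would want the second form, while a Python client would typically want the first.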

Text Analytics for health container updates

The following updates are specific to the August release of the Text Analytics for health container only.

  • New model version for Text Analytics for health: 2020-07-24
  • New URL for sending Text Analytics for health requests: http://<serverURL>:5000/text/analytics/v3.2-preview.1/entities/health (Note that you need to clear your browser cache in order to use the demo web app included in this new container image.)
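
As a rough sketch, a request for the new URL could be assembled as follows. Only the URL path comes from the note above. The `build_health_request` helper, the `localhost` host name, the sample sentence, and the `documents` body shape (`id`, `language`, `text`) are assumptions for illustration; confirm the request schema against the container documentation. The code only builds the URL and payload and does not send anything.

```python
import json


def build_health_request(server_url: str, texts: list[str], language: str = "en"):
    """Build the URL and JSON body for a Text Analytics for health container call.

    The body layout is an assumed 'documents' array; it is not taken from
    the release notes.
    """
    url = f"http://{server_url}:5000/text/analytics/v3.2-preview.1/entities/health"
    body = {
        "documents": [
            {"id": str(i), "language": language, "text": t}
            for i, t in enumerate(texts, start=1)
        ]
    }
    return url, json.dumps(body)


# 'localhost' stands in for <serverURL>; the sentence is made-up sample input.
url, payload = build_health_request(
    "localhost", ["Patient was prescribed 100mg ibuprofen."]
)
print(url)
print(payload)
```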

The following properties in the JSON response have changed:

  • type has been renamed to category
  • score has been renamed to confidenceScore
  • Entities in the category field of the JSON output are now in Pascal case. The following entities have been renamed:
    • EXAMINATION_RELATION has been renamed to RelationalOperator.
    • EXAMINATION_UNIT has been renamed to MeasurementUnit.
    • EXAMINATION_VALUE has been renamed to MeasurementValue.
    • ROUTE_OR_MODE has been renamed to MedicationRoute.
    • The relational entity ROUTE_OR_MODE_OF_MEDICATION has been renamed to RouteOfMedication.
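
Code that still compares against the old entity names could translate them with a lookup like the one below. The five explicit mappings come from the list above. The fallback that converts any other UPPER_SNAKE_CASE name to Pascal case is a guess about how unlisted names changed, and `upgrade_category` is a hypothetical helper name.

```python
# Explicit renames taken from the release notes.
RENAMED_CATEGORIES = {
    "EXAMINATION_RELATION": "RelationalOperator",
    "EXAMINATION_UNIT": "MeasurementUnit",
    "EXAMINATION_VALUE": "MeasurementValue",
    "ROUTE_OR_MODE": "MedicationRoute",
    "ROUTE_OR_MODE_OF_MEDICATION": "RouteOfMedication",
}


def upgrade_category(old_name: str) -> str:
    """Map an old entity category name to its new Pascal-case name."""
    if old_name in RENAMED_CATEGORIES:
        return RENAMED_CATEGORIES[old_name]
    # Assumption: names not listed above were only re-cased, not reworded.
    return "".join(part.capitalize() for part in old_name.split("_"))


print(upgrade_category("EXAMINATION_UNIT"))  # MeasurementUnit
print(upgrade_category("ROUTE_OR_MODE"))     # MedicationRoute
```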

The following entities have been added:

  • NER

    • AdministrativeEvent
    • CareEnvironment
    • HealthcareProfession
    • MedicationForm
  • Relation extraction

    • DirectionOfCondition
    • DirectionOfExamination
    • DirectionOfTreatment

July 2020

Text Analytics for health container - Public gated preview

The Text Analytics for health container is now in public gated preview. It lets you extract information from unstructured English-language text in clinical documents such as patient intake forms, doctor's notes, research papers, and discharge summaries. Currently, you will not be billed for Text Analytics for health container usage.

The container offers the following features:

  • Named Entity Recognition
  • Relation extraction
  • Entity linking
  • Negation

May 2020

Text Analytics API v3 General Availability

Text Analytics API v3 is now generally available with the following updates:

The following properties in the JSON response have been added:

  • SentenceText in Sentiment Analysis
  • Warnings for each document

The names of the following properties in the JSON response have been changed, where applicable:

  • score has been renamed to confidenceScore
    • confidenceScore has two decimal places of precision.
  • type has been renamed to category
  • subtype has been renamed to subcategory
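
Client code that still reads the old property names could normalize a response with a small recursive rename, sketched below. The three key mappings come from the list above. The `upgrade_keys` helper and the sample entity are made up for illustration and are not a documented response.

```python
# Old property name -> new property name, as listed in the release notes.
RENAMED_KEYS = {
    "score": "confidenceScore",
    "type": "category",
    "subtype": "subcategory",
}


def upgrade_keys(node):
    """Recursively rename old property names in a parsed JSON structure."""
    if isinstance(node, dict):
        return {RENAMED_KEYS.get(key, key): upgrade_keys(value)
                for key, value in node.items()}
    if isinstance(node, list):
        return [upgrade_keys(item) for item in node]
    return node


# Illustrative old-style entity; the values are invented sample data.
old_entity = {"text": "Seattle", "type": "Location", "subtype": "GPE", "score": 0.98}
new_entity = upgrade_keys(old_entity)
print(new_entity)
```

The rename is applied to every matching key at any depth, which is convenient but broad. If a response contains an unrelated field that happens to be called type or score, restrict the rename to the entity objects instead.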

Text Analytics API v3.1 Public Preview

  • New Sentiment Analysis feature: Opinion Mining
  • New Personal (PII) domain filter for protected health information (PHI).

February 2020

SDK support for Text Analytics API v3 Public Preview

As part of the unified Azure SDK release, the Text Analytics API v3 SDK is now available as a public preview for the following programming languages:

Named Entity Recognition v3 public preview

Additional entity types are now available in the Named Entity Recognition (NER) v3 public preview service as we expand the detection of general and personal information entities found in text. This update introduces model version 2020-02-01, which includes:

  • Recognition of the following general entity types (English only):

    • PersonType
    • Product
    • Event
    • Geopolitical Entity (GPE) as a subtype under Location
    • Skill
  • Recognition of the following personal information entity types (English only):

    • Person
    • Organization
    • Age as a subtype under Quantity
    • Date as a subtype under DateTime
    • Email
    • Phone Number (US only)
    • URL
    • IP Address

October 2019

Named Entity Recognition (NER)

  • A new endpoint for recognizing personal information entity types (English only).

  • Separate endpoints for entity recognition and entity linking.

  • Model version 2019-10-01, which includes:

    • Expanded detection and categorization of entities found in text.
    • Recognition of the following new entity types:
      • Phone number
      • IP address

Entity linking supports English and Spanish. NER language support varies by entity type.

Sentiment Analysis v3 public preview

  • A new endpoint for analyzing sentiment.

  • Model version 2019-10-01, which includes:

    • Significant improvements in the accuracy and detail of the API's text categorization and scoring.
    • Automatic labeling for different sentiments in text.
    • Sentiment analysis and output on a document and sentence level.

It supports English (en), Japanese (ja), Chinese Simplified (zh-Hans), Chinese Traditional (zh-Hant), French (fr), Italian (it), Spanish (es), Dutch (nl), Portuguese (pt), and German (de), and is available in the following regions: Australia East, Canada Central, Central US, East Asia, East US, East US 2, North Europe, Southeast Asia, South Central US, UK South, West Europe, and West US 2.

Next steps