Recognize printed and handwritten text

Computer Vision provides several services that detect and extract printed or handwritten text that appears in images. These services are useful in scenarios such as note taking, medical records, security, and banking. The following three sections describe three text recognition APIs, each optimized for different use cases.

Read API

The Read API detects text content in an image using our latest recognition models and converts the identified text into a machine-readable character stream. It is optimized for text-heavy images (such as documents that have been digitally scanned) and for images with a lot of visual noise. It determines which recognition model to use for each line of text, so it supports images that contain both printed and handwritten text. The Read API executes asynchronously because larger documents can take several minutes to return a result.
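Because the operation is asynchronous, a client typically submits the image and then polls until a result is ready. The following is a minimal sketch of such a polling loop, not the service's client library: `wait_for_result`, the `get_status` callback, and the status strings "succeeded" and "failed" are assumptions made for illustration, and the callback stands in for whatever request your client actually sends.

```python
import time
from typing import Callable


def wait_for_result(get_status: Callable[[], str],
                    interval_s: float = 1.0,
                    timeout_s: float = 300.0,
                    sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll get_status until it reports a terminal state or the timeout passes.

    get_status is a hypothetical callback that returns the current state of
    the asynchronous operation as a string. The terminal state names below
    are assumptions, not values taken from this article.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status in ("succeeded", "failed"):
            return status
        if waited >= timeout_s:
            raise TimeoutError(f"no result after {waited:.0f} s")
        sleep(interval_s)
        waited += interval_s
```

The `sleep` parameter is injectable so the loop can be exercised without real delays. For large documents, choose a timeout of several minutes.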

The Read operation maintains the original line groupings of recognized words in its output. Each line comes with bounding box coordinates, and each word within the line also has its own coordinates. If a word was recognized with low confidence, the result conveys that information as well. See the Read API reference docs to learn more.
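As an illustration of that structure, the sketch below walks a result that groups words under lines and picks out the words flagged as low confidence. The dictionary shape, the field names (`lines`, `words`, `boundingBox`, `confidence`), and the value "Low" are assumptions for this example; check the Read API reference docs for the actual response schema.

```python
from typing import Iterator, Tuple

# Hypothetical result for one page: each line and each word carries its own
# bounding box, and low-confidence words carry an extra marker.
sample_page = {
    "lines": [
        {
            "text": "Dear Sir",
            "boundingBox": [10, 10, 120, 10, 120, 30, 10, 30],
            "words": [
                {"text": "Dear",
                 "boundingBox": [10, 10, 60, 10, 60, 30, 10, 30]},
                {"text": "Sir",
                 "boundingBox": [70, 10, 120, 10, 120, 30, 70, 30],
                 "confidence": "Low"},
            ],
        },
    ],
}


def low_confidence_words(page: dict) -> Iterator[Tuple[str, str]]:
    """Yield (line text, word text) for every word marked low confidence."""
    for line in page.get("lines", []):
        for word in line.get("words", []):
            if word.get("confidence") == "Low":
                yield line["text"], word["text"]


for line_text, word_text in low_confidence_words(sample_page):
    print(f"review '{word_text}' in line '{line_text}'")
```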

Note

This feature is only available for English text.

Image requirements

The Read API works with images that meet the following requirements:

  • The image must be in JPEG, PNG, BMP, PDF, or TIFF format.
  • The dimensions of the image must be between 50 x 50 and 10000 x 10000 pixels. PDF pages must be 17 x 17 inches or smaller.
  • The file size of the image must be less than 20 megabytes (MB).
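These requirements can be checked on the client before an image is submitted. The sketch below is one way to do that; the function name `read_api_problems` is hypothetical, and it assumes that 1 MB means 1024 x 1024 bytes and that the caller already knows the file's format and dimensions.

```python
from typing import List, Tuple

READ_FORMATS = {"JPEG", "PNG", "BMP", "PDF", "TIFF"}
MAX_FILE_BYTES = 20 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def read_api_problems(fmt: str, size_bytes: int,
                      dims: Tuple[float, float]) -> List[str]:
    """Return the Read API requirements that the input fails to meet.

    dims is (width, height): pixels for raster images, inches for PDF pages.
    """
    problems = []
    fmt = fmt.upper()
    if fmt not in READ_FORMATS:
        problems.append(f"unsupported format: {fmt}")
    width, height = dims
    if fmt == "PDF":
        if width > 17 or height > 17:
            problems.append("PDF page larger than 17 x 17 inches")
    elif not (50 <= width <= 10000 and 50 <= height <= 10000):
        problems.append("dimensions outside 50 x 50 to 10000 x 10000 pixels")
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file is not smaller than 20 MB")
    return problems
```

An empty list means that the input passes all three checks.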

Limitations

If you are using a free-tier subscription, the Read API processes only the first two pages of a PDF or TIFF document. With a paid subscription, it processes up to 200 pages. Also note that the API detects a maximum of 300 lines per page.
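The arithmetic behind these caps is simple, as this small sketch shows. The helper names `pages_processed` and `lines_detected` are hypothetical; only the numeric limits come from the text above.

```python
FREE_TIER_PAGES = 2
PAID_TIER_PAGES = 200
MAX_LINES_PER_PAGE = 300


def pages_processed(total_pages: int, paid: bool) -> int:
    """Number of pages of a PDF or TIFF document that will be processed."""
    limit = PAID_TIER_PAGES if paid else FREE_TIER_PAGES
    return min(total_pages, limit)


def lines_detected(lines_on_page: int) -> int:
    """Number of lines on a single page that can be detected."""
    return min(lines_on_page, MAX_LINES_PER_PAGE)


# A 12-page scan on the free tier: only the first 2 pages are processed.
print(pages_processed(12, paid=False))
```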

OCR (optical character recognition) API

Computer Vision's optical character recognition (OCR) API is similar to the Read API, but it executes synchronously and is not optimized for large documents. It uses an earlier recognition model but works with more languages; see Language support for a full list of the supported languages.

If necessary, OCR corrects the rotation of the recognized text by returning the rotational offset in degrees about the horizontal image axis. OCR also provides the frame coordinates of each word, as seen in the following illustration.

[Figure: an image being rotated, with its text read and outlined]

See the OCR reference docs to learn more.

Image requirements

The OCR API works with images that meet the following requirements:

  • The image must be in JPEG, PNG, GIF, or BMP format.
  • The size of the input image must be between 50 x 50 and 4200 x 4200 pixels.
  • The text in the image can be rotated by any multiple of 90 degrees plus a small angle of up to 40 degrees.
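The rotation rule can be read as "the text may deviate by at most 40 degrees from the nearest multiple of 90 degrees". The sketch below encodes that reading; the function name `rotation_supported` is hypothetical, and treating the small angle as allowed in either direction is an assumption, since the requirement does not say so explicitly.

```python
def rotation_supported(angle_deg: float, tolerance_deg: float = 40.0) -> bool:
    """Check whether a text rotation is within the stated OCR tolerance.

    Assumption: the small angle may lie on either side of a multiple of 90.
    """
    offset = angle_deg % 90.0                 # position within the quarter turn
    deviation = min(offset, 90.0 - offset)    # distance to nearest multiple of 90
    return deviation <= tolerance_deg


print(rotation_supported(130))  # 90 + 40: at the edge of the tolerance
print(rotation_supported(45))   # 45 degrees away from both 0 and 90
```

Under this reading, only rotations within about 5 degrees of a diagonal (45, 135, 225, or 315 degrees) fall outside the supported range.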

Limitations

On photographs where text is dominant, false positives may come from partially recognized words. On some photographs, especially photos without any text, precision can vary depending on the type of image.

Recognize Text API

Note

The Recognize Text API is being deprecated in favor of the Read API. The Read API has similar capabilities and has been updated to handle PDF, TIFF, and multi-page files.

The Recognize Text API is similar to OCR, but it executes asynchronously and uses updated recognition models. See the Recognize Text API reference docs to learn more.

Image requirements

The Recognize Text API works with images that meet the following requirements:

  • The image must be in JPEG, PNG, or BMP format.
  • The dimensions of the image must be between 50 x 50 and 4200 x 4200 pixels.
  • The file size of the image must be less than 4 megabytes (MB).
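With the requirements for all three APIs stated, they can be placed side by side to see which APIs would accept a given image. The sketch below is a table-driven comparison; the names `LIMITS` and `apis_accepting` are hypothetical. The OCR entry has no file size cap because this article does not state one, and PDF page size is not modeled.

```python
from typing import List

# Limits as listed in the three "Image requirements" sections.
LIMITS = {
    "Read": {"formats": {"JPEG", "PNG", "BMP", "PDF", "TIFF"},
             "max_side": 10000, "max_mb": 20},
    "OCR": {"formats": {"JPEG", "PNG", "GIF", "BMP"},
            "max_side": 4200, "max_mb": None},  # no size limit stated here
    "Recognize Text": {"formats": {"JPEG", "PNG", "BMP"},
                       "max_side": 4200, "max_mb": 4},
}


def apis_accepting(fmt: str, width_px: int, height_px: int,
                   size_mb: float) -> List[str]:
    """Return the names of the APIs whose stated requirements the image meets."""
    accepted = []
    for name, lim in LIMITS.items():
        if fmt.upper() not in lim["formats"]:
            continue
        if not (50 <= width_px <= lim["max_side"]
                and 50 <= height_px <= lim["max_side"]):
            continue
        if lim["max_mb"] is not None and size_mb >= lim["max_mb"]:
            continue
        accepted.append(name)
    return accepted


print(apis_accepting("GIF", 1000, 1000, 1.0))
```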

Limitations

The accuracy of text recognition operations depends on the quality of the images. The following factors may cause an inaccurate reading:

  • Blurry images.
  • Handwritten or cursive text.
  • Artistic font styles.
  • Small text size.
  • Complex backgrounds, shadows, glare over text, or perspective distortion.
  • Oversized or missing capital letters at the beginnings of words.
  • Subscript, superscript, or strikethrough text.

Next steps

Follow the Extract printed text (OCR) quickstart to implement text recognition in a simple C# app.