What's new in Computer Vision
Learn what's new in the service. These items may be release notes, videos, blog posts, and other types of information. Bookmark this page to stay up to date with the service.
September 2021
OCR (Read) API Public Preview supports 122 languages
Computer Vision's OCR (Read) API expands supported languages to 122 with its latest preview:
- OCR support for print text in 49 new languages, including Russian, Bulgarian, and other Cyrillic languages, as well as more Latin languages.
- OCR support for handwritten text in 6 new languages in addition to English: Chinese Simplified, French, German, Italian, Portuguese, and Spanish.
- Enhancements for processing digital PDFs and Machine Readable Zone (MRZ) text in identity documents.
- General performance and AI quality improvements.
See the OCR how-to guide to learn how to use the new preview features.
August 2021
Image tagging language expansion
The latest version (v3.2) of the Image tagger now supports tags in 50 languages. See the language support page for more information.
May 2021
Spatial Analysis container update
A new version of the Spatial Analysis container has been released with a new feature set. This Docker container lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments.
Spatial Analysis operations can now be configured to detect the orientation that a person is facing.
- An orientation classifier can be enabled for the personcrossingline and personcrossingpolygon operations by configuring the enable_orientation parameter. It is set to off by default.
Spatial Analysis operations now also offer configuration to detect a person's speed while walking or running.
- Speed can be detected for the personcrossingline and personcrossingpolygon operations by turning on the enable_speed classifier, which is off by default. The output is reflected in the speed, avgSpeed, and minSpeed outputs.
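As a rough sketch of switching these classifiers on, the snippet below merges the enable_orientation and enable_speed flags into a JSON-encoded configuration string. The helper name with_flags and the gpu_index key are hypothetical, and this page does not say which configuration block each flag belongs in, so check the Spatial Analysis operations documentation before copying the layout.

```python
import json


def with_flags(config_json, **flags):
    """Return a copy of a JSON-encoded config string with extra flags set.

    Hypothetical helper: it only edits a JSON string and knows nothing
    about where the Spatial Analysis container expects each flag.
    """
    config = json.loads(config_json) if config_json else {}
    config.update(flags)
    return json.dumps(config)


# "gpu_index" is a placeholder key used purely for illustration.
operation_config = with_flags(
    '{"gpu_index": 0}',
    enable_orientation=True,  # off by default per the release note
    enable_speed=True,        # off by default per the release note
)
print(operation_config)
```

Both flags default to off, so leaving them out of the configuration should keep the earlier behavior.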
April 2021
Computer Vision v3.2 GA
The Computer Vision API v3.2 is now generally available with the following updates:
- Improved image tagging model: analyzes visual content and generates relevant tags based on objects, actions, and content displayed in the image. This model is available through the Tag Image API. See the Image Analysis how-to guide and overview to learn more.
- Updated content moderation model: detects presence of adult content and provides flags to filter images containing adult, racy, and gory visual content. This model is available through the Analyze API. See the Image Analysis how-to guide and overview to learn more.
- OCR (Read) available for 73 languages including Simplified and Traditional Chinese, Japanese, Korean, and Latin languages.
- OCR (Read) also available as a Distroless container for on-premises deployment.
March 2021
Computer Vision 3.2 Public Preview update
The Computer Vision API v3.2 public preview has been updated. The preview release has all Computer Vision features along with updated Read and Analyze APIs.
February 2021
Read API v3.2 Public Preview with OCR support for 73 languages
The Computer Vision Read API v3.2 public preview, available as a cloud service and as a Docker container, includes these updates:
- OCR for 73 languages including Simplified and Traditional Chinese, Japanese, Korean, and Latin languages.
- Natural reading order for the text line output (Latin languages only).
- Handwriting style classification for text lines along with a confidence score (Latin languages only).
- Text extraction for selected pages only in a multi-page document.
- Available as a Distroless container for on-premises deployment.
See the Read API how-to guide to learn more.
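The page selection and reading order options above are passed as query parameters on the Read request. The sketch below only builds the request URL. The path and the parameter names (language, pages, readingOrder) are assumptions based on the v3.2 REST API, preview releases use a different version segment, and build_read_url is a hypothetical helper, so verify all of these against the API reference.

```python
from urllib.parse import urlencode


def build_read_url(endpoint, version="v3.2", language=None, pages=None,
                   natural_order=False):
    """Build a Read 'analyze' URL with optional query parameters.

    Assumed parameter names: language, pages, readingOrder.
    """
    params = {}
    if language:
        params["language"] = language
    if pages:
        # e.g. "1,3-5" to extract text from pages 1, 3, 4 and 5 only
        params["pages"] = pages
    if natural_order:
        params["readingOrder"] = "natural"
    base = f"{endpoint.rstrip('/')}/vision/{version}/read/analyze"
    return f"{base}?{urlencode(params)}" if params else base


# Placeholder endpoint; substitute your own resource endpoint.
url = build_read_url("https://example.cognitiveservices.azure.com/",
                     pages="1,3-5", natural_order=True)
print(url)
```

The request itself would still need the subscription key header and the document body, which are omitted here.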
January 2021
Spatial Analysis container update
A new version of the Spatial Analysis container has been released with a new feature set. This Docker container lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments.
- Spatial Analysis operations can now be configured to detect if a person is wearing a protective face covering such as a mask.
  - A mask classifier can be enabled for the personcount, personcrossingline, and personcrossingpolygon operations by configuring the ENABLE_FACE_MASK_CLASSIFIER parameter.
  - The attributes face_mask and face_noMask will be returned as metadata with a confidence score for each person detected in the video stream.
- The personcrossingpolygon operation has been extended to allow the calculation of the dwell time a person spends in a zone. You can set the type parameter in the Zone configuration for the operation to zonedwelltime, and a new event of type personZoneDwellTimeEvent will include the durationMs field populated with the number of milliseconds that the person spent in the zone.
- Breaking change: The personZoneEvent event has been renamed to personZoneEnterExitEvent. This event is raised by the personcrossingpolygon operation when a person enters or exits the zone and provides directional info with the numbered side of the zone that was crossed.
- Video URL can be provided as "Private Parameter/obfuscated" in all operations. Obfuscation is now optional, and it will only work if KEY and IV are provided as environment variables.
- Calibration is enabled by default for all operations. Set do_calibration: false to disable it.
- Added support for auto recalibration (disabled by default) via the enable_recalibration parameter. Refer to Spatial Analysis operations for details.
- Added camera calibration parameters to the DETECTOR_NODE_CONFIG. Refer to Spatial Analysis operations for details.
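Code that consumes these events may need to handle both the renamed event type and the new dwell-time event. The following is a minimal sketch, assuming events arrive as dictionaries with a top-level type key and, for dwell-time events, a top-level durationMs key. Only the event type names and the durationMs field come from this page. The payload shape and the helper names are hypothetical.

```python
# Map of pre-rename event type names to their current names.
RENAMED_EVENT_TYPES = {"personZoneEvent": "personZoneEnterExitEvent"}


def normalize_event_type(event_type):
    """Translate a pre-rename event type to its current name."""
    return RENAMED_EVENT_TYPES.get(event_type, event_type)


def dwell_seconds(event):
    """Return the dwell time in seconds, or None for other event types.

    Assumes a flat, hypothetical payload such as
    {"type": "personZoneDwellTimeEvent", "durationMs": 2500}.
    """
    if normalize_event_type(event.get("type")) != "personZoneDwellTimeEvent":
        return None
    duration_ms = event.get("durationMs")
    return None if duration_ms is None else duration_ms / 1000.0


sample = {"type": "personZoneDwellTimeEvent", "durationMs": 2500}
print(dwell_seconds(sample))
```

Normalizing the type name in one place lets downstream handlers match only on personZoneEnterExitEvent while older recorded events still parse.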
October 2020
Computer Vision API v3.1 GA
The Computer Vision API in General Availability has been upgraded to v3.1.
September 2020
Spatial Analysis container preview
The Spatial Analysis container is now in preview. The Spatial Analysis feature of Computer Vision lets you analyze real-time streaming video to understand spatial relationships between people and their movement through physical environments. Spatial Analysis is a Docker container you can use on-premises.
Read API v3.1 Public Preview adds OCR for Japanese
The Computer Vision Read API v3.1 public preview adds these capabilities:
- OCR for the Japanese language.
- For each text line, an indication of whether the appearance is Handwriting or Print style, along with a confidence score (Latin languages only).
- For a multi-page document, text extraction for selected pages or a page range only.
This preview version of the Read API supports English, Dutch, French, German, Italian, Japanese, Portuguese, Simplified Chinese, and Spanish languages.
See the Read API how-to guide to learn more.
July 2020
Read API v3.1 Public Preview with OCR for Simplified Chinese
The Computer Vision Read API v3.1 public preview adds support for Simplified Chinese.
- This preview version of the Read API supports English, Dutch, French, German, Italian, Portuguese, Simplified Chinese, and Spanish languages.
See the Read API how-to guide to learn more.
May 2020
Computer Vision API v3.0 entered General Availability, with updates to the Read API:
- Support for English, Dutch, French, German, Italian, Portuguese, and Spanish
- Improved accuracy
- Confidence score for each extracted word
- New output format
See the OCR overview to learn more.
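One use of the per-word confidence score is to flag words that may need manual review. The sketch below walks a Read result and collects words below a threshold. The JSON nesting (analyzeResult, readResults, lines, words, with text and confidence on each word) is an assumption about the new output format, and low_confidence_words is a hypothetical helper. Confirm the field names against a real response.

```python
def low_confidence_words(read_result, threshold=0.8):
    """Collect (text, confidence) pairs for words below the threshold.

    Words without a confidence value are skipped rather than guessed at.
    """
    flagged = []
    pages = read_result.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                confidence = word.get("confidence")
                if confidence is not None and confidence < threshold:
                    flagged.append((word.get("text", ""), confidence))
    return flagged


# Hand-written sample in the assumed shape, not real service output.
sample_result = {
    "analyzeResult": {
        "readResults": [
            {"lines": [{"words": [
                {"text": "Invoice", "confidence": 0.98},
                {"text": "t0tal", "confidence": 0.41},
            ]}]}
        ]
    }
}
print(low_confidence_words(sample_result))
```

The 0.8 threshold is arbitrary. A suitable value depends on the documents being processed.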
March 2020
- TLS 1.2 is now enforced for all HTTP requests to this service. For more information, see Azure Cognitive Services security.
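Clients built on older runtimes or with custom TLS settings may need to make sure they can negotiate TLS 1.2. As a minimal sketch using Python's standard ssl module, a client can refuse anything older than TLS 1.2 on its own side. The function name make_tls12_context is illustrative.

```python
import ssl


def make_tls12_context():
    """Create a client SSL context that rejects protocols older than TLS 1.2."""
    context = ssl.create_default_context()  # certificate verification stays on
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls12_context()
print(context.minimum_version)
```

The resulting context can be passed to standard-library calls that accept one, for example urllib.request.urlopen(url, context=context).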
January 2020
Read API 3.0 Public Preview
You can now use version 3.0 of the Read API to extract printed or handwritten text from images. Compared to earlier versions, 3.0 provides:
- Improved accuracy
- New output format
- Confidence score for each extracted word
- Support for both Spanish and English languages with the language parameter
Follow an Extract text quickstart to get started using the 3.0 API.