AcharyaRakesh-5831 asked ramr-msft commented

Azure Vision (Text Extraction OCR) Issue

I'm reaching out with a request about text extraction from scanned documents using the Azure Vision API. We have built a solution that uses the API to extract data from scanned documents (architectural drawings). The requirement is to extract the Drawing Title, Drawing Number, and Revision from each drawing. For certain images (those with a file size of less than 5 KB), the API is unable to extract any data.


Can you help us get this issue resolved? I can schedule a call to discuss it further.

azure-cognitive-services azure-computer-vision

1 Answer

ramr-msft answered ramr-msft commented

@AcharyaRakesh-5831 Thanks for the question. Azure Cognitive Services provides optical character recognition (OCR) through the Read API. The Computer Vision Read API is Azure's latest OCR technology. It extracts printed text (in several languages), handwritten text (English only), digits, and currency symbols from images and multi-page PDF documents, and it is optimized for text-heavy images and multi-page PDF documents with mixed languages. If possible, could you share the sample input images and the output for the cases where data is not extracted?
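
For readers who want to try the Read API directly, here is a minimal sketch of the asynchronous REST flow: submit the image, poll the returned `Operation-Location` URL, then collect the recognized lines. It assumes the `v3.2` REST version, which may not be the version in use when this thread was written. The function names (`submit_read`, `wait_for_result`, `extract_lines`) and the polling defaults are illustrative, not part of any SDK.

```python
import json
import time
import urllib.request
from typing import List


def submit_read(endpoint: str, key: str, image_bytes: bytes) -> str:
    """POST raw image bytes to the Read API and return the polling URL."""
    request = urllib.request.Request(
        # Assumption: v3.2 of the REST API; adjust to the version you deploy.
        url=endpoint.rstrip("/") + "/vision/v3.2/read/analyze",
        data=image_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
    )
    with urllib.request.urlopen(request) as response:
        # The service answers 202 Accepted and puts the result URL in a header.
        return response.headers["Operation-Location"]


def wait_for_result(operation_url: str, key: str,
                    delay_s: float = 1.0, max_polls: int = 30) -> dict:
    """Poll the operation URL until the status is 'succeeded' or 'failed'."""
    request = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(delay_s)
    raise TimeoutError("Read operation did not finish in time")


def extract_lines(result: dict) -> List[str]:
    """Flatten the recognized text lines of every page into one list."""
    pages = result.get("analyzeResult", {}).get("readResults", [])
    return [line["text"] for page in pages for line in page.get("lines", [])]
```

A typical call sequence would be `extract_lines(wait_for_result(submit_read(endpoint, key, data), key))`, where `endpoint` and `key` come from your own Computer Vision resource. An empty list from `extract_lines` on a `succeeded` result means the service ran but found no text, which is a different case from a rejected request.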


We have also built a form recognition service that seems promising for your application. Could you try the Form Recognizer Layout API, which detects and extracts the text and layout of documents?
https://azure.microsoft.com/en-us/services/cognitive-services/form-recognizer/
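
As a rough illustration of what the Layout API adds over plain OCR, the sketch below turns the table cells in a Layout result into row-by-column grids. It assumes the `v2.1` REST version, where the request is posted to the URL built by `layout_analyze_url` and the finished result (fetched from the `Operation-Location` URL returned by that POST) lists tables under `analyzeResult.pageResults`. The helper names are hypothetical, and whether the title block of a given drawing is detected as a table is not something this thread confirms.

```python
from typing import List


def layout_analyze_url(endpoint: str) -> str:
    """Build the Layout analyze URL (assumption: Form Recognizer v2.1)."""
    return endpoint.rstrip("/") + "/formrecognizer/v2.1/layout/analyze"


def tables_as_grids(result: dict) -> List[List[List[str]]]:
    """Convert every detected table into a list of rows of cell text."""
    grids = []
    for page in result.get("analyzeResult", {}).get("pageResults", []):
        for table in page.get("tables", []):
            # Start with an empty grid; cells missing from the result stay "".
            grid = [["" for _ in range(table["columns"])]
                    for _ in range(table["rows"])]
            for cell in table.get("cells", []):
                grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
            grids.append(grid)
    return grids
```

If a drawing's title block comes back as a table, fields such as the drawing number and revision could then be looked up by their neighbouring label cells rather than by position in a flat list of lines.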



The following link outlines the traditional challenges of doing OCR in the wild and the ways in which deep learning algorithms are being applied to transform these solutions.
https://twimlai.com/how-deep-learning-has-revolutionized-ocr-with-cha-zhang/
Resources
Computer Vision
Microsoft Form Recognizer
Paper: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper: LayoutLM: Pre-training of Text and Layout for Document Image Understanding






Attached are a few examples for which Azure OCR fails to extract the data. Kindly suggest a solution.

55.jpg (5.9 KiB)
47.jpg (4.6 KiB)
46.jpg (5.2 KiB)

@AcharyaRakesh-5831 Thanks for the details. Please try the Read API to extract the text; a screenshot of the result is attached.


image.png (59.2 KiB)