AmroGhoneim-1570 asked:

Best practices for Arabic OCR

I am trying to extract all the information on an Egyptian national ID, which mainly contains Arabic letters and digits. I am doing this with Azure's OCR API. The API works, but so far the results are inaccurate. What best practices can I use to improve my results when doing Arabic OCR?

azure-computer-vision

1 Answer

romungi-MSFT answered:

@AmroGhoneim-1570 Since you are working with Arabic, the only available option is the OCR API. The results of the OCR API depend mostly on the quality of the image, and the input should conform to these prerequisites. Is there a sample image to replicate the issue?

While the Read API supports most other languages, you can still try it if your only requirement is to extract the numbers from your input image.
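
For readers who want to reproduce the setup, here is a minimal sketch of calling the OCR API with the language set explicitly to Arabic and flattening the response into text lines. It assumes the v3.2 REST path (`/vision/v3.2/ocr`) and the usual regions, lines, words layout of the response; the function names `build_ocr_url`, `extract_lines`, and `run_ocr` are hypothetical helpers, and the endpoint and key are placeholders for your own resource.

```python
import json
import urllib.parse
import urllib.request

# Assumption: v3.2 of the Computer Vision OCR REST API; adjust to your resource.
OCR_PATH = "/vision/v3.2/ocr"


def build_ocr_url(endpoint, language="ar"):
    """Build the OCR request URL with the language passed explicitly."""
    query = urllib.parse.urlencode(
        {"language": language, "detectOrientation": "true"}
    )
    return endpoint.rstrip("/") + OCR_PATH + "?" + query


def extract_lines(result):
    """Flatten the regions -> lines -> words response into one string per line."""
    lines = []
    for region in result.get("regions", []):
        for line in region.get("lines", []):
            words = [word["text"] for word in line.get("words", [])]
            lines.append(" ".join(words))
    return lines


def run_ocr(endpoint, key, image_bytes):
    """Send raw image bytes to the OCR API and return the recognized lines."""
    request = urllib.request.Request(
        build_ocr_url(endpoint),
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return extract_lines(json.load(response))
```

Passing `language=ar` rather than relying on automatic detection is a design choice in this sketch, not a documented fix for the accuracy problem; it simply removes one source of variation when comparing results across images.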



Even the numbers to be extracted are in Arabic.

I attached a sample ID.
Even though this image is of good quality, the Arabic numbers below were not captured. After I applied some filters to increase the contrast, the API captured some of the numbers.

I also expect that the IDs provided to me will be of lower quality than this sample.
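
The exact filters used above are not specified. As an illustration of one common way to increase contrast before OCR, the sketch below applies a linear percentile stretch to a grayscale image represented as a list of rows of 0-255 values. The function name `stretch_contrast` and the percentile defaults are assumptions chosen for the example; in practice an image library would normally do this step.

```python
def stretch_contrast(pixels, low_pct=2.0, high_pct=98.0):
    """Linearly rescale grayscale values so the chosen percentiles map to 0 and 255."""
    flat = sorted(value for row in pixels for value in row)
    if not flat:
        return []
    # Pick the low and high cut points by position in the sorted values.
    lo = flat[int((len(flat) - 1) * low_pct / 100)]
    hi = flat[int((len(flat) - 1) * high_pct / 100)]
    if hi == lo:
        # A flat image has no contrast to stretch; return a copy unchanged.
        return [list(row) for row in pixels]
    scale = 255.0 / (hi - lo)
    # Values outside the cut points are clipped to the 0-255 range.
    return [
        [min(255, max(0, round((value - lo) * scale))) for value in row]
        for row in pixels
    ]
```

Clipping a small percentage at each end keeps a few very dark or very bright pixels from limiting the stretch, which can matter for low-quality scans; whether it helps on a given ID image has to be checked empirically.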

Moderator: We have removed your picture due to security concerns.
