ta-4170 asked ramr-msft edited

Azure Form Recognizer: low accuracy and long processing time

I would really appreciate some suggestions regarding Azure Form Recognizer.

I am working on an app where users upload images of ID cards (JPEG, JPG, or PDF), and I have to extract the information and map it to fields. I used Form Recognizer for this, but accuracy is low on bad images, it sometimes collects garbage, and processing takes about 45 seconds per image.

So I thought of using Azure OCR instead: I run classifier code first and then Azure OCR. Now I have to map the JSON output.

My output will look roughly like this:

public virtual int Id { get; set; }
public virtual string FirstName { get; set; }
public virtual string LastName { get; set; }
public virtual string Sex { get; set; }
public virtual DateTime BirthDate { get; set; }
public virtual string IdNumber { get; set; }
public virtual string Address { get; set; }
public virtual string Province { get; set; }
public virtual string PostalCode { get; set; }
public virtual string ProvinceLetters { get; set; }
public virtual string City { get; set; }
public virtual string DdNumber { get; set; }
public virtual string Class { get; set; }
public virtual string Rest { get; set; }
public virtual DateTime IssueDate { get; set; }
public virtual DateTime ExpiryDate { get; set; }

My questions:
Am I on the wrong path or the right one? If I am on the wrong path, what should I follow instead?
Is there any Azure classifier that I can use before Form Recognizer?
Is there any automatic JSON mapping service in Azure?
Should I move to OCR or stick with Form Recognizer? Or should I build a custom model using Python?

Could you kindly suggest what I should do to increase the accuracy and reduce the processing time?
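
As a rough illustration of the JSON mapping step described in this question, here is a minimal Python sketch that maps extracted fields onto a typed record and drops low-confidence or unparseable values. It is only a sketch under assumptions: the input shape (`{"FieldName": {"text": ..., "confidence": ...}}`), the `IdCard` class, the `map_fields` helper, the 0.5 confidence cut-off, and the date formats are all hypothetical and not taken from the Form Recognizer or OCR response format, so check the real response against the API reference. Only a few of the properties listed above are included to keep it short.

```python
from dataclasses import dataclass, fields
from datetime import date, datetime
from typing import Optional


@dataclass
class IdCard:
    # Hypothetical record mirroring a subset of the C# properties above.
    FirstName: Optional[str] = None
    LastName: Optional[str] = None
    Sex: Optional[str] = None
    BirthDate: Optional[date] = None
    IdNumber: Optional[str] = None
    ExpiryDate: Optional[date] = None


DATE_FIELDS = {"BirthDate", "ExpiryDate"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y/%m/%d")  # assumed formats
MIN_CONFIDENCE = 0.5  # assumed cut-off; tune on your own data


def parse_date(text):
    """Try each assumed format; return None if none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def map_fields(extracted, min_confidence=MIN_CONFIDENCE):
    """Map an assumed {name: {"text", "confidence"}} dict onto IdCard.

    Returns the record plus the names of fields that were present
    but rejected (low confidence, empty text, or unparseable date).
    """
    card = IdCard()
    rejected = []
    for f in fields(IdCard):
        entry = extracted.get(f.name)
        if entry is None:
            continue  # field not extracted at all
        text = (entry.get("text") or "").strip()
        if not text or entry.get("confidence", 0.0) < min_confidence:
            rejected.append(f.name)
            continue
        if f.name in DATE_FIELDS:
            value = parse_date(text)
            if value is None:
                rejected.append(f.name)
                continue
        else:
            value = text
        setattr(card, f.name, value)
    return card, rejected
```

Returning the rejected field names alongside the record makes it possible to route doubtful cards to manual review instead of silently storing garbage.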

azure-form-recognizer

@ta-4170 Thanks for the question. You can use the Azure Form Recognizer custom, prebuilt, and layout APIs to extract information from your documents in an organized manner. Could you please add more details about the model (i.e., prebuilt or custom) that you have tried?


1 Answer

ramr-msft answered ramr-msft edited

@ta-4170 Generally we start with 5 documents as the training set, and you should be able to add more documents to your training set incrementally to see an improvement in the results. If you don't see improvements after doing that, we will forward this to the Form Recognizer team. Please follow the documentation to train a custom model using the sample labeling tool.

Have you checked out the Knowledge Extraction Recipes resource? https://github.com/microsoft/knowledge-extraction-recipes-forms

Comparison of form recognition solutions: https://cazton.com/blogs/executive/form-recognition-azure-aws-gcp
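
To check whether adding training documents incrementally actually improves results, it can help to score each model version against a small hand-labelled hold-out set. The following Python sketch computes per-field accuracy. The `field_accuracy` helper and the flat `{field_name: text}` input shape are hypothetical illustrations, not part of any Azure SDK, and the comparison (exact match after trimming and case folding) is an assumed scoring rule.

```python
def field_accuracy(predictions, ground_truth):
    """Return {field_name: fraction of documents extracted correctly}.

    predictions and ground_truth are parallel lists of flat dicts
    (assumed shape: {field_name: text}), one dict per document.
    A missing prediction counts as wrong.
    """
    totals = {}
    correct = {}
    for pred, truth in zip(predictions, ground_truth):
        for name, expected in truth.items():
            totals[name] = totals.get(name, 0) + 1
            got = (pred.get(name) or "").strip().casefold()
            if got == expected.strip().casefold():
                correct[name] = correct.get(name, 0) + 1
    return {name: correct.get(name, 0) / totals[name] for name in totals}
```

Running this after each retraining round, on the same hold-out documents, shows which fields improve with more training data and which stay weak on poor-quality images.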



Hi,
Thanks for your reply. I used the custom training model and trained it with 60 images. But when I send bad images, the accuracy drops. Processing each image takes 45 seconds. Does Azure have any classifier service that can tell me whether an image is good or bad? Or do you suggest that I build the OCR, classifier, and mapping code myself? Please advise.
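
For the good-versus-bad image check mentioned here, one common local heuristic (not an Azure service) is to reject blurry images before sending them for extraction by measuring the variance of the Laplacian: sharp images have strong edges and a high variance, blurry ones a low variance. The Python sketch below is a minimal pure-Python version operating on a grayscale image given as a list of rows of 0-255 integers. The function names and the threshold of 100.0 are assumptions for illustration; the threshold in particular has to be tuned on your own good and bad samples, and loading and converting the image to grayscale is left out.

```python
from statistics import pvariance


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    gray: list of equal-length rows of ints (0-255), at least 3x3.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            responses.append(
                gray[y - 1][x] + gray[y + 1][x]
                + gray[y][x - 1] + gray[y][x + 1]
                - 4 * gray[y][x]
            )
    return pvariance(responses)


def is_sharp_enough(gray, threshold=100.0):
    """True if the image passes the (assumed, tunable) blur threshold."""
    return laplacian_variance(gray) >= threshold
```

Images that fail the check could be sent back to the user with a request to retake the photo, which avoids spending processing time on inputs that are unlikely to extract well.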


Yes please, it would be great if you could forward this to the Form Recognizer team. Thanks. :)


@ramr-msft I still need suggestions, please.


@ta-4170 Thanks for the details. Can you please share the good and bad images that you are trying, along with the output that was extracted? We have forwarded this to the product team to check further.
