Is it possible to use Azure Form Recognizer to import data (line items) from a PDF into rows in Excel?

Question

Is it possible to use Azure Form Recognizer to import data (line items) from a PDF into rows in Excel? If there is an automated tool that can do the extraction, I am confident I know the logic it would need (with a bit of direction) to perform the task, at least in my case. Automating this would save a lot of time and make the entire process efficient and worth doing.

Answer

@Akesserwani It is not possible to convert a PDF document directly to an Excel file. However, the Cognitive Services Computer Vision service can extract the text of a PDF file and return it as a JSON response: the Read API returns the text of each page in the document. You can then convert that JSON to Excel by processing it with any standard library.
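To make the "process the JSON" step concrete, here is a minimal Python sketch. It assumes the response follows the Read API v3.x layout (`analyzeResult.readResults[].lines[].text`); check the field names against the API version you actually call. The sample payload and the function names are made up for illustration, and the output is CSV (which Excel opens directly) so the sketch needs only the standard library.

```python
import csv
import io
import json


def read_result_to_rows(response):
    """Flatten a Read-style JSON response into (page, line_number, text) rows.

    Assumes the v3.x layout: analyzeResult -> readResults -> lines -> text.
    """
    rows = []
    pages = response.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for index, line in enumerate(page.get("lines", []), start=1):
            rows.append((page.get("page"), index, line.get("text", "")))
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["page", "line", "text"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample payload, trimmed to the fields used above.
sample_response = json.loads("""
{
  "status": "succeeded",
  "analyzeResult": {
    "readResults": [
      {
        "page": 1,
        "lines": [
          {"text": "Widget A 2 19.98"},
          {"text": "Widget B 1 5.00"}
        ]
      }
    ]
  }
}
""")

sample_rows = read_result_to_rows(sample_response)
sample_csv = rows_to_csv(sample_rows)
print(sample_csv)
```

Note that the Read API returns lines of text, not table cells, so splitting each line into separate columns (description, quantity, price) is logic you would still have to write for your own document layout.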

Form Recognizer is another service that can extract data from forms. You either train a custom model to pick out the specific data you need, or use a pre-built model for certain document types, such as standard receipts. The response is again JSON, which needs to be processed and converted to Excel. I hope this helps!
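As a rough illustration of the pre-built receipt route, the Python sketch below pulls line items out of a response and writes one row per item. It assumes the v2.1 pre-built receipt layout, where items sit under `analyzeResult.documentResults[].fields.Items.valueArray` with `Name`, `Quantity` and `TotalPrice` fields; a custom model or a newer API version will use different names. The sample payload and function names are hypothetical.

```python
import csv
import io


def receipt_items_to_rows(response):
    """Collect one dict per receipt line item.

    Assumes the v2.1 pre-built receipt layout described above.
    """
    rows = []
    documents = response.get("analyzeResult", {}).get("documentResults", [])
    for document in documents:
        items = document.get("fields", {}).get("Items", {}).get("valueArray", [])
        for item in items:
            fields = item.get("valueObject", {})
            rows.append({
                "name": fields.get("Name", {}).get("valueString"),
                "quantity": fields.get("Quantity", {}).get("valueNumber"),
                "total_price": fields.get("TotalPrice", {}).get("valueNumber"),
            })
    return rows


def items_to_csv(rows):
    """Serialize line items as CSV text, which Excel can open directly."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["name", "quantity", "total_price"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample payload, trimmed to the fields used above.
sample_receipt = {
    "analyzeResult": {
        "documentResults": [
            {
                "fields": {
                    "Items": {
                        "valueArray": [
                            {"valueObject": {
                                "Name": {"valueString": "Widget A"},
                                "Quantity": {"valueNumber": 2},
                                "TotalPrice": {"valueNumber": 19.98},
                            }},
                            {"valueObject": {
                                "Name": {"valueString": "Widget B"},
                                "Quantity": {"valueNumber": 1},
                                "TotalPrice": {"valueNumber": 5.0},
                            }},
                        ]
                    }
                }
            }
        ]
    }
}

receipt_rows = receipt_items_to_rows(sample_receipt)
receipt_csv = items_to_csv(receipt_rows)
print(receipt_csv)
```

If you need a real .xlsx file rather than CSV, the same rows can be handed to a spreadsheet library of your choice.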

