Extract table data into a dictionary with Azure Form Recognizer

Titanium 1 Reputation point
2022-05-16T12:17:14.1+00:00

I searched for an answer to my question but found nothing relevant.

Below is the working code I have tried:

import json

from azure.ai.formrecognizer import FormRecognizerClient
from azure.core.credentials import AzureKeyCredential

# Load the API key and endpoint from a local JSON file.
with open("creds.json") as f:
    credentials = json.load(f)

API_KEY = credentials["API_KEY"]
ENDPOINT = credentials["ENDPOINT"]

# URL of a PDF or image that contains tables.
url = "https://some_pdf_url_which_contains_tables.pdf"

form_recognizer_client = FormRecognizerClient(ENDPOINT, AzureKeyCredential(API_KEY))
poller = form_recognizer_client.begin_recognize_content_from_url(url)
form_data = poller.result()

# Print the text of every cell in every table on every page.
for page in form_data:
    for table in page.tables:
        for cell in table.cells:
            print(cell.text)

## But I need the table in dictionary format, with header names as keys and
## cell values as values, not just plain text.


1 answer

  1. Ramr-msft 17,616 Reputation points
    2022-05-17T09:13:09.123+00:00

    @Titanium Thanks for the details. Could you please share more about your input document and use case?

    Extract Column Header Information:
    Layout supports column header recognition - The updated Layout API table feature adds header recognition with column headers that can span multiple rows. Each table cell has an attribute that indicates whether it's part of a header or not. This can be used to identify which rows make up the table header.
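
As a rough illustration of the header attribute described above, the sketch below groups a table's cells into one dictionary per body row, keyed by column header. It assumes each cell exposes `row_index`, `column_index`, `text`, and `is_header`, as the `table.cells` items in the question's code do. The helper name `table_to_records` and the fallback of treating the first row as the header when no cell is flagged are assumptions made for this example, not part of the SDK. Header cells that span several columns and duplicate header names are not handled.

```python
from collections import defaultdict
from types import SimpleNamespace


def table_to_records(cells):
    """Turn a flat list of table cells into a list of {header: value} dicts.

    Hypothetical helper: `cells` is any iterable of objects with
    row_index, column_index, text and is_header attributes.
    """
    header_parts = defaultdict(list)  # column_index -> header texts, top to bottom
    body = defaultdict(dict)          # row_index -> {column_index: text}

    for cell in sorted(cells, key=lambda c: (c.row_index, c.column_index)):
        if cell.is_header:
            header_parts[cell.column_index].append(cell.text.strip())
        else:
            body[cell.row_index][cell.column_index] = cell.text.strip()

    # Assumption: if no cell is flagged as a header, use the first row instead.
    if not header_parts and body:
        first_row = min(body)
        for col, text in body.pop(first_row).items():
            header_parts[col].append(text)

    # Join multi-row headers into a single name per column.
    headers = {col: " ".join(parts) for col, parts in header_parts.items()}

    records = []
    for row_index in sorted(body):
        row = body[row_index]
        records.append({name: row.get(col, "") for col, name in sorted(headers.items())})
    return records


# Stand-in cells for a 2x2 table; with the real service, pass table.cells instead.
cells = [
    SimpleNamespace(row_index=0, column_index=0, text="Name", is_header=True),
    SimpleNamespace(row_index=0, column_index=1, text="Qty", is_header=True),
    SimpleNamespace(row_index=1, column_index=0, text="Bolt", is_header=False),
    SimpleNamespace(row_index=1, column_index=1, text="12", is_header=False),
]
records = table_to_records(cells)
print(records)
```

With the question's code, this would be called as `table_to_records(table.cells)` inside the `for table in page.tables:` loop.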

    Please follow the document that could help.

    Here is a link to the General document model, which analyzes and extracts text, tables, structure, key-value pairs, and named entities.
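
For the key-value pairs mentioned above, a minimal sketch of flattening them into a plain dictionary might look like the following. It assumes version 3.2 or later of the `azure-ai-formrecognizer` package, where the analysis result has a `key_value_pairs` list whose items carry `key.content` and an optional `value.content`. The helper name `key_value_pairs_to_dict` is hypothetical, and the sample pairs are stand-ins so the snippet runs without calling the service.

```python
from types import SimpleNamespace


def key_value_pairs_to_dict(pairs):
    """Flatten key-value pair objects into {key text: value text or None}.

    Hypothetical helper: each pair needs a `key` and a `value` attribute,
    each either None or an object with a `content` string.
    """
    result = {}
    for pair in pairs:
        if pair.key is None:
            continue  # nothing to use as a dictionary key
        key = pair.key.content.strip()
        value = pair.value.content.strip() if pair.value is not None else None
        result[key] = value
    return result


# With the real service (assumed SDK 3.2+), the pairs would come from:
#   client = DocumentAnalysisClient(ENDPOINT, AzureKeyCredential(API_KEY))
#   result = client.begin_analyze_document_from_url("prebuilt-document", url).result()
#   pairs = result.key_value_pairs
pairs = [
    SimpleNamespace(key=SimpleNamespace(content="Invoice No"),
                    value=SimpleNamespace(content="1042")),
    SimpleNamespace(key=SimpleNamespace(content="Notes"), value=None),
]
fields = key_value_pairs_to_dict(pairs)
print(fields)
```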
