Is the order of words in page elements logical or random?

Arman 45 Reputation points
2024-04-18T08:02:10.93+00:00

My question is about the words field in the data structure returned for each page of a document (see the example below).

Looking at some responses from Document Intelligence's layout mode, the order of words appears logical, but that point isn't explicitly documented anywhere.

Question: is the order of elements in the words field random or logical? If it is logical, will it remain logical when a page contains multiple columns?

Thanks

"pages": [
    {
        "pageNumber": 1,
        "angle": 0,
        "width": 915,
        "height": 1190,
        "unit": "pixel",
        "words": [],
        "lines": [],
        "spans": []
    }
]
Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. santoshkc 4,275 Reputation points Microsoft Vendor
    2024-04-18T08:44:42.18+00:00

    Hi @Arman,

    Thank you for reaching out to Microsoft Q&A forum!

    The order of elements in the words field returned by Document Intelligence's layout mode is logical, not random. The service arranges semantically contiguous elements together, even if they cross line or column boundaries. When the reading order among paragraphs and other layout elements is ambiguous, the service generally returns the content in a left-to-right, top-to-bottom order.

    Regarding your second question: when a page contains multiple columns, the service still maintains the logical order of the words. The words field contains all the words on the page, sorted in the same reading order, which keeps semantically contiguous elements together even when they cross line or column boundaries.

    See: Analyze document API response.
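
One way to check this on your own responses is sketched below. It is a minimal illustration, not an official utility. It assumes each entry in words carries a content string and a span with an offset into the top-level content string, as in the REST response. The helper names and the sample page are hypothetical and made up for the example.

```python
from typing import Any


def words_in_content_order(page: dict[str, Any]) -> bool:
    """Return True if every word's span offset is >= the previous word's offset.

    Assumption: each word looks like
    {"content": "...", "span": {"offset": int, "length": int}}.
    """
    offsets = [word["span"]["offset"] for word in page.get("words", [])]
    return all(a <= b for a, b in zip(offsets, offsets[1:]))


def page_text(page: dict[str, Any]) -> str:
    """Join the words in the order the service returned them."""
    return " ".join(word["content"] for word in page.get("words", []))


# Hypothetical sample page, not real service output.
sample_page = {
    "pageNumber": 1,
    "words": [
        {"content": "Left", "span": {"offset": 0, "length": 4}},
        {"content": "column", "span": {"offset": 5, "length": 6}},
        {"content": "Right", "span": {"offset": 12, "length": 5}},
        {"content": "column", "span": {"offset": 18, "length": 6}},
    ],
}

print(words_in_content_order(sample_page))
print(page_text(sample_page))
```

If the check returns False for a real multi-column page, the words are not listed in the same order as the top-level content string, and that page is worth inspecting by hand.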

    I hope this helps. If you have any further questions, do let us know.


    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".
