Can Document Intelligence extract information from images

Ritesh Panditi 10 Reputation points
2024-04-01T22:21:22.9733333+00:00

Is it possible to utilize Document Intelligence to extract data from images? Alternatively, does Image Analysis offer a similar function? Specifically, I would like to understand if Document Intelligence can comprehend technical information contained within charts, plots, or figures. For example, if I provide a research paper, can I extract relevant information including text, sections, tables, and image data to import into an application that uses a RAG interface? How can I extract image-related information?

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.
Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. VasaviLankipalle-MSFT 15,161 Reputation points
    2024-04-02T01:41:42.2+00:00

    Hello @Ritesh Panditi, thanks for using the Microsoft Q&A Platform.

    For your use case, I would suggest the prebuilt Layout model with the latest API version, 2024-02-29-preview. The Layout model's Markdown-formatted output is LLM-friendly and integrates smoothly into your workflows. Markdown is widely used for semantic chunking in RAG (Retrieval-Augmented Generation). You can refer to the documentation for more details.
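As a rough illustration of the semantic chunking mentioned above, here is a minimal sketch that splits Markdown text into one chunk per heading. It is not part of Document Intelligence; the function name `chunk_markdown_by_heading` and the sample text are hypothetical, and the sketch assumes the Markdown uses `#`-style headings and contains no fenced code with `#` comment lines.

```python
import re
from typing import Dict, List

# Matches ATX-style Markdown headings such as "# Title" or "## Methods".
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def chunk_markdown_by_heading(markdown: str) -> List[Dict[str, str]]:
    """Split Markdown into chunks, one per heading-delimited section.

    Hypothetical helper for illustration only. Text that appears before
    the first heading is returned with an empty heading.
    """
    chunks: List[Dict[str, str]] = []
    heading = ""
    body: List[str] = []

    def flush() -> None:
        # Store the section collected so far, skipping empty leading sections.
        text = "\n".join(body).strip()
        if heading or text:
            chunks.append({"heading": heading, "text": text})

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()
            heading = match.group(2)
            body = []
        else:
            body.append(line)
    flush()
    return chunks


# Made-up sample standing in for the Markdown content of a Layout result.
sample = "# Paper title\n\nAbstract text.\n\n## Methods\n\nWe measured things.\n"
for chunk in chunk_markdown_by_heading(sample):
    print(chunk["heading"], "->", chunk["text"])
```

Each chunk keeps its heading, so the heading can be stored as metadata next to the embedded text in a RAG index.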

    The "prebuilt-layout" model can detect figures and extract information such as their spatial locations, text spans, related text elements, and captions.
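To show how the figure information could be consumed, here is a minimal sketch that reads figures from an already-parsed analyze result. It assumes the JSON shape of the REST response (`content`, `figures`, `boundingRegions`, `spans`, `caption`); the function name `summarize_figures` and the sample dictionary are hypothetical, so please verify the field names against your actual response.

```python
from typing import Any, Dict, List


def summarize_figures(analyze_result: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect caption, page numbers, and spanned text for each figure.

    Hypothetical helper; assumes the REST-style field names noted above.
    """
    content = analyze_result.get("content", "")
    summaries = []
    for figure in analyze_result.get("figures", []):
        caption = figure.get("caption") or {}
        # Each span is an offset/length pair into the top-level content string.
        text = " ".join(
            content[s["offset"]: s["offset"] + s["length"]].strip()
            for s in figure.get("spans", [])
        ).strip()
        pages = sorted({r["pageNumber"] for r in figure.get("boundingRegions", [])})
        summaries.append(
            {"caption": caption.get("content", ""), "pages": pages, "text": text}
        )
    return summaries


# Made-up sample result, not real service output.
caption_text = "Figure 1: Loss curve."
content = "Intro. " + caption_text + " Done."
sample_result = {
    "content": content,
    "figures": [
        {
            "boundingRegions": [{"pageNumber": 2, "polygon": [1, 1, 4, 1, 4, 3, 1, 3]}],
            "spans": [{"offset": content.index(caption_text), "length": len(caption_text)}],
            "caption": {"content": caption_text},
        }
    ],
}
print(summarize_figures(sample_result))
```

Note that this returns only the text and location associated with a figure. It does not interpret the plotted data itself.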

    Additionally, the "prebuilt-layout" model can identify two types of roles in a document layout: geometric roles (such as text, tables, figures, and selection marks) and logical roles (such as titles, headings, and footers).
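The logical roles can be used to separate headings from body text before indexing. The sketch below groups paragraphs by role, assuming each paragraph in the parsed result has a `content` field and an optional `role` field. The function name `group_paragraphs_by_role`, the `"body"` label for paragraphs without a role, and the sample data are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


def group_paragraphs_by_role(analyze_result: Dict[str, Any]) -> Dict[str, List[str]]:
    """Group paragraph text by logical role.

    Hypothetical helper. Paragraphs without a "role" field are filed
    under "body", which is a label chosen here, not a service value.
    """
    grouped: Dict[str, List[str]] = defaultdict(list)
    for paragraph in analyze_result.get("paragraphs", []):
        role = paragraph.get("role", "body")
        grouped[role].append(paragraph.get("content", ""))
    return dict(grouped)


# Made-up sample result, not real service output.
sample_paragraphs = {
    "paragraphs": [
        {"role": "title", "content": "A Study of Things"},
        {"role": "sectionHeading", "content": "1. Introduction"},
        {"content": "Things are interesting."},
        {"role": "pageFooter", "content": "Page 1"},
    ]
}
print(group_paragraphs_by_role(sample_paragraphs))
```

Dropping roles such as page headers and footers before chunking is one way to keep repeated page furniture out of a RAG index.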

    The Azure AI Vision Image Analysis service can extract a wide variety of visual features from your images. For example, it can determine whether an image contains adult content, find specific brands or objects, or find human faces.

    Please review these features and choose the service that best fits your use case.

    I hope this helps.

    Regards,

    Vasavi

    Please accept the answer and vote 'yes' if you found it helpful, to support the community. Thanks.