Document Intelligence Studio

Important

  • Document Intelligence public preview releases provide early access to features that are in active development.
  • Features, approaches, and processes may change before General Availability (GA), based on user feedback.
  • The public preview version of the Document Intelligence client libraries defaults to REST API version 2024-02-29-preview.
  • Public preview version 2024-02-29-preview is currently available only in the following Azure regions:
    • East US
    • West US2
    • West Europe

This content applies to: v4.0 (preview) | Previous versions: v3.1 (GA), v3.0 (GA)

Document Intelligence Studio is an online tool for visually exploring, understanding, and integrating features from the Document Intelligence service into your applications. Use the Document Intelligence Studio to:

  • Learn more about the different capabilities in Document Intelligence.
  • Use your Document Intelligence resource to test models on sample documents or upload your own documents.
  • Experiment with different add-on and preview features to adapt the output to your needs.
  • Train custom classification models to classify documents.
  • Train custom extraction models to extract fields from documents.
  • Get sample code for the language-specific SDKs to integrate into your applications.

Use the Document Intelligence Studio quickstart to get started analyzing documents with document analysis or prebuilt models. Build custom models and reference them in your applications using one of the language-specific SDKs and other quickstarts.

The following image shows the landing page for Document Intelligence Studio.

Document Intelligence Studio Homepage

Getting started

If you're visiting the Studio for the first time, follow the getting started guide to set up the Studio for use.

Analyze options

  • Document Intelligence supports sophisticated analysis capabilities. The Studio provides a single entry point, the Analyze options button, for configuring the add-on capabilities.

  • Depending on the document extraction scenario, configure the analysis range, document page range, optional detection, and premium detection features.

    Screenshot of the analyze-options dialog window.

    Note

    Font extraction is not visualized in Document Intelligence Studio. However, you can check the styles section of the JSON output for the font detection results.
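Because font results appear only in the styles section of the JSON output, it can help to pull them out programmatically. The following is a minimal Python sketch under stated assumptions: the parsed result has a top-level content string and a styles list, and each style entry carries spans (offset and length) plus font attributes such as similarFontFamily, fontStyle, and fontWeight. These field names are assumptions about the analyze result schema and should be checked against the JSON you actually receive. The helper name font_styles is hypothetical, and the sample dictionary is made up for illustration, not real service output.

```python
# Attribute names assumed to carry font information in each style entry.
FONT_KEYS = ("similarFontFamily", "fontStyle", "fontWeight", "color", "backgroundColor")


def font_styles(analyze_result):
    """Return (text, font_attributes) pairs for style entries that carry font data."""
    content = analyze_result.get("content", "")
    results = []
    for style in analyze_result.get("styles", []):
        attrs = {key: style[key] for key in FONT_KEYS if key in style}
        if not attrs:
            continue  # for example, a handwriting-only style entry
        # Assumption: span offsets index directly into the content string.
        text = "".join(
            content[span["offset"]:span["offset"] + span["length"]]
            for span in style.get("spans", [])
        )
        results.append((text, attrs))
    return results


# Illustrative input only; not real service output.
sample_result = {
    "content": "Invoice Total",
    "styles": [
        {"spans": [{"offset": 0, "length": 7}], "fontWeight": "bold", "confidence": 0.9},
        {"spans": [{"offset": 8, "length": 5}], "isHandwritten": True, "confidence": 0.8},
    ],
}

for text, attrs in font_styles(sample_result):
    print(repr(text), attrs)
```

If the service reports span offsets in a unit other than Python string indices, the slicing step needs adjusting accordingly.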

✔️ Auto labeling documents with prebuilt models or one of your own models

  • On the custom extraction model labeling page, you can now auto label your documents using one of the Document Intelligence service prebuilt models or one of your trained models.

    Animated screenshot showing auto labeling in Studio.

  • For some documents, auto labeling can produce duplicate labels. Review the labeling page afterward and modify the labels so that no duplicates remain.

    Screenshot showing duplicate label warning after auto labeling.
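The duplicate check described above can also be sketched outside the Studio. The snippet below is a hypothetical illustration, not the Studio's own logic or file format: it assumes you have exported or collected the labels for one document as a plain list of (label name, labeled text) pairs, and it reports every label name that occurs more than once. The function name duplicate_labels and the sample data are invented for the example.

```python
from collections import Counter


def duplicate_labels(labels):
    """Return label names that appear more than once, in first-seen order."""
    counts = Counter(name for name, _ in labels)
    seen = []
    for name, _ in labels:
        if counts[name] > 1 and name not in seen:
            seen.append(name)
    return seen


# Hypothetical labels for a single document after auto labeling.
labels = [
    ("InvoiceId", "INV-100"),
    ("Total", "110.00"),
    ("InvoiceId", "INV-100"),
    ("VendorName", "Contoso"),
]

print(duplicate_labels(labels))
```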

✔️ Auto labeling tables

  • On the custom extraction model labeling page, you can now auto label the tables in a document instead of labeling them manually.

    Animated screenshot showing auto table labeling in Studio.

✔️ Add test files directly to your training dataset

  • After you train a custom extraction model, use the test page to improve model quality by uploading test documents to the training dataset as needed.

  • If a low confidence score is returned for some labels, make sure they're correctly labeled. If not, add them to the training dataset and relabel to improve the model quality.

Animated screenshot showing how to add test files to training dataset.
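The low-confidence review described above can be approximated in code when you test a model through the SDK or REST API rather than the Studio. The following Python sketch assumes the parsed analyze result contains a documents list whose entries have a fields mapping, and that each field carries a confidence value; verify that shape against your own output. The 0.8 threshold and the function name low_confidence_fields are arbitrary, hypothetical choices, and the sample result is invented.

```python
def low_confidence_fields(analyze_result, threshold=0.8):
    """Return (field_name, confidence) pairs below the threshold, lowest first."""
    flagged = []
    for document in analyze_result.get("documents", []):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged.append((name, confidence))
    return sorted(flagged, key=lambda item: item[1])


# Illustrative input only; not real service output.
test_result = {
    "documents": [
        {
            "fields": {
                "InvoiceId": {"content": "INV-100", "confidence": 0.95},
                "Total": {"content": "110.00", "confidence": 0.42},
                "DueDate": {"content": "2024-03-01", "confidence": 0.71},
            }
        }
    ]
}

for name, confidence in low_confidence_fields(test_result):
    print(f"{name}: {confidence:.2f}")
```

Documents with flagged fields are candidates for adding to the training dataset and relabeling.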

✔️ Make use of the document list options and filters in custom projects

  • On the custom extraction model labeling page, you can now navigate through your training documents by using the search, filter, and sort-by features.

  • Utilize the grid view to preview documents or use the list view to scroll through the documents more easily.

    Screenshot of document list view options and filters.

✔️ Project sharing

Document Intelligence model support

  • Read: Try out Document Intelligence's Read feature to extract text lines, words, detected languages, and handwritten style if detected. Start with the Studio Read feature. Explore with sample documents and your documents. Use the interactive visualization and JSON output to understand how the feature works. See the Read overview to learn more and get started with the Python SDK quickstart for Layout.

  • Layout: Try out Document Intelligence's Layout feature to extract text, tables, selection marks, and structure information. Start with the Studio Layout feature. Explore with sample documents and your documents. Use the interactive visualization and JSON output to understand how the feature works. See the Layout overview to learn more and get started with the Python SDK quickstart for Layout.

  • Prebuilt models: Document Intelligence's prebuilt models enable you to add intelligent document processing to your apps and flows without having to train and build your own models. As an example, start with the Studio Invoice feature. Explore with sample documents and your documents. Use the interactive visualization, extracted fields list, and JSON output to understand how the feature works. See the Models overview to learn more and get started with the Python SDK quickstart for Prebuilt Invoice.

  • Custom extraction models: Document Intelligence's custom models enable you to extract fields and values from models trained with your data, tailored to your forms and documents. Create standalone custom models or combine two or more custom models to create a composed model to extract data from multiple form types. Start with the Studio Custom models feature. Use the help wizard, labeling interface, training step, and visualizations to understand how the feature works. Test the custom model with your sample documents and iterate to improve the model. See the Custom models overview to learn more.

  • Custom classification models: Document classification is a new scenario supported by Document Intelligence. The document classifier API supports classification and splitting scenarios. Train a classification model to identify the different types of documents your application supports. The input file for the classification model can contain multiple documents, and the model classifies each document within an associated page range. See custom classification models to learn more.

  • Add-on capabilities: Document Intelligence now supports more sophisticated analysis capabilities. You can enable and disable these optional capabilities in the Studio by using the Analyze options button on each model page. Four add-on capabilities are available: highResolution, formula, font, and barcode extraction. See Add-on capabilities to learn more.
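Outside the Studio, the page range and add-on capabilities are typically passed as query parameters on the analyze request. The Python sketch below only builds such a URL and sends nothing. The URL path, the pages and features parameter names, and the feature identifiers (ocrHighResolution, formulas, styleFont, barcodes) are assumptions based on the 2024-02-29-preview REST API and should be verified against the REST reference. The endpoint is a placeholder and build_analyze_url is a hypothetical helper.

```python
from urllib.parse import urlencode

API_VERSION = "2024-02-29-preview"  # preview version named in this article


def build_analyze_url(endpoint, model_id, pages=None, features=()):
    """Build an analyze URL with an optional page range and add-on features."""
    query = {"api-version": API_VERSION}
    if pages:
        query["pages"] = pages  # for example "1-3"
    if features:
        query["features"] = ",".join(features)
    base = endpoint.rstrip("/")
    # Assumed path layout; confirm against the REST API reference.
    path = f"/documentintelligence/documentModels/{model_id}:analyze"
    return f"{base}{path}?{urlencode(query, safe=',')}"


url = build_analyze_url(
    "https://example.cognitiveservices.azure.com/",  # placeholder endpoint
    "prebuilt-layout",
    pages="1-3",
    features=["ocrHighResolution", "formulas", "styleFont", "barcodes"],
)
print(url)
```

Keeping the feature list as a parameter mirrors the Analyze options dialog, where each capability is toggled independently per request.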

Next steps