Form Recognizer custom models

Form Recognizer uses machine learning to detect and extract information from forms and documents and returns the extracted data as structured JSON. With Form Recognizer, you can use prebuilt or pretrained models, or you can train standalone custom models. Custom models extract the fields you care about from the forms and documents that are specific to your business. Standalone custom models can be combined to create composed models.

To create a custom model, you label a dataset of documents with the values you want extracted and train the model on the labeled dataset. You only need five examples of the same form or document type to get started.

Custom model types

Custom models can be one of two types: custom template (also called custom form) and custom neural (also called custom document). The labeling and training process is identical for both types, but the models differ as follows:

Custom template model

The custom template or custom form model relies on a consistent visual template to extract the labeled data. The accuracy of your model is affected by variances in the visual structure of your documents. Structured forms such as questionnaires or applications are examples of consistent visual templates.

Your training set consists of structured documents where the formatting and layout are static and constant from one document instance to the next. Custom template models support key-value pairs, selection marks, tables, signature fields, and regions. Template models can be trained on documents in any of the supported languages. For more information, see custom template models.

Tip

To confirm that your training documents present a consistent visual template, remove all the user-entered data from each form in the set. If the blank forms are identical in appearance, they represent a consistent visual template.

For more information, see Interpret and improve accuracy and confidence for custom models.

Custom neural model

The custom neural (custom document) model uses deep learning and a base model trained on a large collection of documents. The base model is then fine-tuned or adapted to your data when you train it with a labeled dataset. Custom neural models extract fields from structured, semi-structured, and unstructured documents. Custom neural models currently support English-language documents. When you're choosing between the two model types, start with a neural model to determine if it meets your functional needs. See neural models to learn more about custom document models.

Build mode

The build custom model operation now supports both template and neural custom models. Previous versions of the REST API and SDKs supported only a single build mode, which is now known as template mode.

  • Template models only accept documents that have the same basic page structure—a uniform visual appearance—or the same relative positioning of elements within the document.

  • Neural models support documents that have the same information, but different page structures. Examples of these documents include United States W2 forms, which share the same information, but may vary in appearance across companies. Neural models currently only support English text.

This table provides links to the build mode programming language SDK references and code samples on GitHub:

Programming language | SDK reference | Code sample
C#/.NET | DocumentBuildMode Struct | Sample_BuildCustomModelAsync.cs
Java | DocumentBuildMode Class | BuildModel.java
JavaScript | DocumentBuildMode type | buildModel.js
Python | DocumentBuildMode Enum | sample_build_model.py
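
As a rough illustration of how a build mode is selected, the following sketch assembles the JSON body of a build request without sending it. The field names (`modelId`, `buildMode`, `azureBlobSource`, `containerUrl`) are assumed to match the v3.0 REST API and should be verified against the REST reference for your API version. The function name, model ID, and container URL are hypothetical.

```python
import json

# The two build modes described in this section.
VALID_BUILD_MODES = ("template", "neural")


def make_build_request_body(model_id, build_mode, container_url, description=None):
    """Return the body of a build request as a dict (nothing is sent).

    Field names are an assumption based on the v3.0 REST API.
    """
    if build_mode not in VALID_BUILD_MODES:
        raise ValueError(
            f"build_mode must be one of {VALID_BUILD_MODES}, got {build_mode!r}"
        )
    body = {
        "modelId": model_id,
        "buildMode": build_mode,
        "azureBlobSource": {"containerUrl": container_url},
    }
    if description:
        body["description"] = description
    return body


if __name__ == "__main__":
    body = make_build_request_body(
        "my-w2-model",  # hypothetical model ID
        "neural",       # W2 forms vary in layout, so neural mode is used
        "https://example.blob.core.windows.net/training-data",  # placeholder URL
    )
    print(json.dumps(body, indent=2))
```

Rejecting an unknown mode before the request is sent keeps a typo such as `"neurall"` from surfacing later as a service-side error.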

Compare model features

The table below compares custom template and custom neural features:

Feature | Custom template (form) | Custom neural (document)
Document structure | Template, form, and structured | Structured, semi-structured, and unstructured
Training time | 1 to 5 minutes | 20 minutes to 1 hour
Data extraction | Key-value pairs, tables, selection marks, coordinates, and signatures | Key-value pairs and selection marks
Document variations | Requires a model per each variation | Uses a single model for all variations
Language support | Multiple language support | United States English (en-US) language support

Custom model tools

The following tools are supported by Form Recognizer v2.1:

Feature | Resources | Model ID
Custom model | | custom-model-id

The following tools are supported by Form Recognizer v3.0:

Feature | Resources | Model ID
Custom model | | custom-model-id

Try Form Recognizer

Try extracting data from your specific or unique documents using custom models. You need the following resources:

  • An Azure subscription. You can create one for free.

  • A Form Recognizer instance in the Azure portal. You can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your key and endpoint.

    Screenshot that shows the keys and endpoint location in the Azure portal.
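
To show where the key and endpoint fit, the following sketch describes an analyze call against a custom model as a plain dictionary, without sending anything. The URL path, the `Ocp-Apim-Subscription-Key` header, and the `urlSource` body field are assumed to match the v3.0 REST API. The function name and all values in the usage example are placeholders.

```python
def make_analyze_request(endpoint, key, model_id, document_url, api_version):
    """Describe an analyze call as a plain dict (no network traffic).

    The path and field names are assumptions based on the v3.0 REST API.
    """
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={api_version}"
    )
    return {
        "method": "POST",
        "url": url,
        "headers": {
            "Ocp-Apim-Subscription-Key": key,  # the key from the Azure portal
            "Content-Type": "application/json",
        },
        "json": {"urlSource": document_url},
    }


if __name__ == "__main__":
    request = make_analyze_request(
        endpoint="https://my-resource.cognitiveservices.azure.com/",  # placeholder
        key="<your-key>",
        model_id="custom-model-id",
        document_url="https://example.com/sample-form.pdf",  # placeholder
        api_version="2022-08-31",  # assumption: use the version your resource supports
    )
    print(request["url"])
```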

Form Recognizer Studio (preview)

Note

Form Recognizer Studio is available with the preview (v3.0) API.

  1. On the Form Recognizer Studio home page, select Custom form.

  2. Under My Projects, select Create a project.

  3. Complete the project details fields.

  4. Configure the service resource by adding your Storage account and Blob container to Connect your training data source.

  5. Review and create your project.

  6. Use the sample documents to build and test your custom model.

Sample Labeling tool (API v2.1)

Feature | Custom template | Custom neural
Document structure | Template, fixed form, and structured documents | Structured, semi-structured, and unstructured documents
Training time | 1 to 5 minutes | 20 to 60 minutes
Data extraction | Key-value pairs, tables, selection marks, signatures, and regions | Key-value pairs and selection marks
Models per document type | Requires one model per document-type variation | Supports a single model for all document-type variations
Language support | See custom template model language support | The custom neural model currently supports English-language documents only

Model capabilities

This table compares the supported data extraction areas:

Model | Form fields | Selection marks | Structured fields (Tables) | Signature | Region labeling
Custom template | ✔ | ✔ | ✔ | ✔ | ✔
Custom neural | ✔ | ✔ | n/a | n/a | n/a

Table symbols: ✔ = supported; ✱ = preview; n/a = currently unavailable

Tip

When choosing between the two model types, start with a custom neural model if it meets your functional needs. See custom neural to learn more about custom neural models.
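
The comparison above can be condensed into a small decision helper. The following sketch is only a heuristic reading of the tables in this article, not a rule enforced by the service. The function name and its parameters are hypothetical.

```python
def choose_model_type(fixed_layout, needs_tables=False, needs_signatures=False,
                      needs_regions=False, english_only=True):
    """Suggest "neural" or "template" and list the reasons.

    Starts from neural and falls back to template only when a requirement
    is listed as unavailable for neural models.
    """
    reasons = []
    if needs_tables:
        reasons.append("tables are extracted by template models only")
    if needs_signatures:
        reasons.append("signature fields are supported by template models only")
    if needs_regions:
        reasons.append("region labeling is supported by template models only")
    if not english_only:
        reasons.append("neural models currently support English only")

    if reasons:
        if not fixed_layout:
            reasons.append(
                "warning: layout varies, so plan for one template model per variation"
            )
        return "template", reasons
    return "neural", ["neural meets the stated needs, so start there"]


if __name__ == "__main__":
    model_type, why = choose_model_type(fixed_layout=True, needs_signatures=True)
    print(model_type)
    for reason in why:
        print(" -", reason)
```

The helper defaults to neural because the tip recommends starting there, and it only switches to template when a required capability is marked n/a for neural models.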

Custom model development options

The following table describes the features available with the associated tools and SDKs. As a best practice, ensure that you use the compatible tools listed here.

Document type | REST API | SDK | Label and Test Models
Custom form 2.1 | Form Recognizer 2.1 GA API | Form Recognizer SDK | Sample labeling tool
Custom template 3.0 | Form Recognizer 3.0 (preview) | Form Recognizer Preview SDK | Form Recognizer Studio
Custom neural | Form Recognizer 3.0 (preview) | Form Recognizer Preview SDK | Form Recognizer Studio

Note

Custom template models trained with the 3.0 API will have a few improvements over the 2.1 API stemming from improvements to the OCR engine. Datasets used to train a custom template model using the 2.1 API can still be used to train a new model using the 3.0 API.

Input requirements

  • For best results, provide one clear photo or high-quality scan per document.

  • Supported file formats are JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.

  • For PDF and TIFF files, up to 2,000 pages can be processed. With a free tier subscription, only the first two pages are processed.

  • The file size must be less than 500 MB for the paid (S0) tier and less than 4 MB for the free (F0) tier.

  • Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.

  • PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.

  • The total size of the training data must be 500 pages or less.

  • If your PDFs are password-locked, you must remove the lock before submission.

    Tip

    Training data:

    • If possible, use text-based PDF documents instead of image-based documents. Scanned PDFs are handled as images.
    • Please supply only a single instance of the form per document.
    • For filled-in forms, use examples that have all their fields filled in.
    • Use forms with different values in each field.
    • If your form images are of lower quality, use a larger dataset. For example, use 10 to 15 images.

The Sample Labeling tool doesn't support the BMP file format. This limitation relates to the tool, not the Form Recognizer service.
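
The requirements above lend themselves to a local check before upload. The following sketch covers file format, file size, image dimensions, the five-document minimum, and the 500-page training limit. The `TrainingFile` class, the function names, the file extensions mapped to each format, and the reading of "MB" as 1,048,576 bytes are all assumptions made for illustration. The sketch doesn't inspect file contents.

```python
import os
from dataclasses import dataclass
from typing import Optional

MB = 1024 * 1024  # assumption: "MB" is read as 1,048,576 bytes
# Assumed file extensions for the formats listed above.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".pdf"}
MAX_FILE_BYTES = {"S0": 500 * MB, "F0": 4 * MB}
MIN_TRAINING_DOCS = 5
MAX_TRAINING_PAGES = 500


@dataclass
class TrainingFile:
    """Hypothetical description of one training document."""
    name: str
    size_bytes: int
    pages: int = 1
    width_px: Optional[int] = None   # images only
    height_px: Optional[int] = None  # images only


def check_file(f, tier="S0", for_labeling_tool=False):
    """Return a list of problems found for a single file."""
    problems = []
    ext = os.path.splitext(f.name)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"{f.name}: unsupported format {ext or '(none)'}")
    elif ext == ".bmp" and for_labeling_tool:
        problems.append(f"{f.name}: BMP is not supported by the Sample Labeling tool")
    if f.size_bytes >= MAX_FILE_BYTES[tier]:
        problems.append(f"{f.name}: file is too large for the {tier} tier")
    if f.width_px is not None and f.height_px is not None:
        if not (50 <= f.width_px <= 10_000 and 50 <= f.height_px <= 10_000):
            problems.append(f"{f.name}: image dimensions are out of range")
    return problems


def check_dataset(files, tier="S0", for_labeling_tool=False):
    """Check every file, then the dataset-level limits."""
    problems = []
    for f in files:
        problems.extend(check_file(f, tier, for_labeling_tool))
    if len(files) < MIN_TRAINING_DOCS:
        problems.append(f"dataset: at least {MIN_TRAINING_DOCS} documents are needed")
    if sum(f.pages for f in files) > MAX_TRAINING_PAGES:
        problems.append(f"dataset: more than {MAX_TRAINING_PAGES} pages in total")
    return problems


if __name__ == "__main__":
    dataset = [TrainingFile(f"form{i}.pdf", 2 * MB, pages=3) for i in range(5)]
    dataset.append(TrainingFile("scan.bmp", 1 * MB, width_px=40, height_px=800))
    for problem in check_dataset(dataset, for_labeling_tool=True):
        print(problem)
```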

Supported languages and locales

The Form Recognizer preview version introduces more language support for custom models. For a list of the languages supported for handwritten and printed text, see Language support.

Form Recognizer v3.0 (preview)

Form Recognizer v3.0 (preview) introduces several new features and capabilities:

  • Custom model API (v3.0): This version supports signature detection for custom forms. When you train custom models, you can specify certain fields as signatures. When a document is analyzed with your custom model, it indicates whether a signature was detected or not.
  • Form Recognizer v3.0 migration guide: This guide shows you how to use the preview version in your applications and workflows.
  • REST API (preview): This reference describes the preview version and its new capabilities.

Try signature detection

  1. Build your training dataset.

  2. Go to Form Recognizer Studio. Under Custom models, select Custom form.

    Screenshot that shows selecting the Form Recognizer Studio Custom form page.

  3. Follow the workflow to create a new project:

    1. Follow the Custom model input requirements.

    2. Label your documents. For signature fields, use Region labeling for better accuracy.

      Screenshot that shows the Label signature field.

After your training set is labeled, you can train your custom model and use it to analyze documents. The signature fields specify whether a signature was detected or not.
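
To illustrate how the signature result might be read, the following sketch walks an analyze result and reports whether each signature field was detected. The result shape (`documents`, `fields`, a `type` of `signature`, and a `valueSignature` of `signed` or `unsigned`) is assumed to match the v3.0 REST API response. The sample payload, the field names in it, and the function name are invented for the example.

```python
def signature_status(analyze_result):
    """Map each signature field name to True (signed) or False (unsigned).

    Expects the "analyzeResult" object of a v3.0 response (an assumption).
    """
    status = {}
    for document in analyze_result.get("documents", []):
        for name, field in document.get("fields", {}).items():
            if field.get("type") == "signature":
                status[name] = field.get("valueSignature") == "signed"
    return status


# Hypothetical, hand-written payload for illustration only.
sample_result = {
    "documents": [
        {
            "docType": "custom-model-id",
            "fields": {
                "ApplicantName": {"type": "string", "valueString": "Jane Doe"},
                "ApplicantSignature": {"type": "signature", "valueSignature": "signed"},
                "WitnessSignature": {"type": "signature", "valueSignature": "unsigned"},
            },
        }
    ]
}

if __name__ == "__main__":
    print(signature_status(sample_result))
    # → {'ApplicantSignature': True, 'WitnessSignature': False}
```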

Next steps

Explore Form Recognizer quickstarts and REST APIs:

Quickstart | REST API
v3.0 Studio quickstart | Form Recognizer v3.0 API 2022-06-30
v2.1 quickstart | Form Recognizer API v2.1