What is Azure Form Recognizer?

Azure Form Recognizer is a cloud-based Azure Applied AI Service that uses machine-learning models to extract key-value pairs, text, and tables from your documents. Form Recognizer analyzes your forms and documents, extracts text and data, maps field relationships as key-value pairs, and returns a structured JSON output. You quickly get accurate results that are tailored to your specific content without excessive manual intervention or extensive data science expertise. Use Form Recognizer to automate your data processing in applications and workflows, enhance data-driven strategies, and enrich document search capabilities.
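To make the "structured JSON output" concrete, here is a minimal sketch of reading key-value pairs out of an analysis response. The sample payload is hypothetical and heavily trimmed, and the field names (`analyzeResult`, `keyValuePairs`, `content`, `confidence`) are an assumption modeled on the v3.0 response shape, so check them against the API reference before relying on them. The helper `extract_key_value_pairs` and its `min_confidence` parameter are illustrative names, not part of the service.

```python
import json

# Hypothetical, trimmed response (assumed shape of a v3.0 analyze result).
SAMPLE_RESPONSE = """
{
  "status": "succeeded",
  "analyzeResult": {
    "modelId": "prebuilt-document",
    "keyValuePairs": [
      {"key": {"content": "Invoice No."}, "value": {"content": "12345"}, "confidence": 0.97},
      {"key": {"content": "Due date"}, "value": {"content": "2022-09-30"}, "confidence": 0.91},
      {"key": {"content": "PO number"}, "confidence": 0.42}
    ]
  }
}
"""


def extract_key_value_pairs(response_text, min_confidence=0.5):
    """Return {key text: value text} for pairs at or above min_confidence."""
    result = json.loads(response_text).get("analyzeResult", {})
    pairs = {}
    for pair in result.get("keyValuePairs", []):
        # Skip low-confidence pairs; a missing confidence counts as 0.0.
        if pair.get("confidence", 0.0) < min_confidence:
            continue
        key_text = pair["key"]["content"]
        # A key can be detected without a value, in which case we store None.
        value_text = pair.get("value", {}).get("content")
        pairs[key_text] = value_text
    return pairs


if __name__ == "__main__":
    print(extract_key_value_pairs(SAMPLE_RESPONSE))
```

Lowering `min_confidence` to `0.0` also returns the "PO number" key, with `None` as its value because the sample contains no value for it.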

Form Recognizer uses the following models to easily identify, extract, and analyze document data:

Document analysis models

  • Read model | Extract printed and handwritten text lines, words, locations, and detected languages from documents and images.
  • Layout model | Extract text, tables, selection marks, and structure information from documents (PDF and TIFF) and images (JPG, PNG, and BMP).
  • General document model | Extract key-value pairs, selection marks, and entities from documents.

Prebuilt models

  • W-2 form model | Extract text and key information from US W-2 tax forms.
  • Invoice model | Extract text, selection marks, tables, key-value pairs, and key information from invoices.
  • Receipt model | Extract text and key information from receipts.
  • ID document model | Extract text and key information from US driver's licenses and international passports.
  • Business card model | Extract text and key information from business cards.

Custom models

  • Custom model | Extract and analyze distinct data and use cases from forms and documents specific to your business.
  • Composed model | Compose a collection of custom models and assign them to a single model built from your form types.

Which Form Recognizer feature should I use?

This section helps you decide which Form Recognizer v3.0 supported feature you should use for your application:

What type of document do you want to analyze? | How is the document formatted? | Your best solution
  • W-2 Form | Is your W-2 document composed in United States English (en-US) text?
  • Text-only document | Is your text-only document printed in a supported language or, if handwritten, is it composed in English?
  • Invoice | Is your invoice document composed in English or Spanish text?
  • Receipt or business card | Is your receipt or business card document composed in English text?
  • ID document | Is your ID document a US driver's license or an international passport?
  • Form or document | Is your form or document an industry-standard format commonly used in your business or industry?
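The language questions above can be sketched as a small lookup. This is only an illustration: the function `suggest_model` and the table `LANGUAGE_RULES` are hypothetical names, and the fallback to the general document model when a check fails is an assumption, since the recommended solution for each case is not reproduced in the text above. The ID document and generic form rows are left out because their questions are not about language.

```python
# Accepted languages per document type, taken from the questions above.
# Tags are lowercased so comparisons are case-insensitive.
LANGUAGE_RULES = {
    "w-2 form": ("W-2 form model", {"en-us"}),
    "invoice": ("Invoice model", {"en", "es"}),
    "receipt": ("Receipt model", {"en"}),
    "business card": ("Business card model", {"en"}),
}

# Assumption: fall back to the general document model when no prebuilt fits.
FALLBACK_MODEL = "General document model"


def suggest_model(document_type, language_tag):
    """Suggest a model name for a document type and a tag such as 'es-MX'."""
    rule = LANGUAGE_RULES.get(document_type.strip().lower())
    if rule is None:
        return FALLBACK_MODEL
    model_name, accepted = rule
    tag = language_tag.strip().lower()
    primary = tag.split("-")[0]  # 'es-mx' -> 'es'
    # Accept an exact tag match (en-us) or a primary-language match (en, es).
    if tag in accepted or primary in accepted:
        return model_name
    return FALLBACK_MODEL


if __name__ == "__main__":
    print(suggest_model("Invoice", "es-MX"))
    print(suggest_model("W-2 Form", "en-GB"))
```

Note that the W-2 rule requires the full `en-US` tag, so `en-GB` falls through to the assumed fallback, while the invoice rule accepts any English or Spanish variant.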

Form Recognizer features and development options

The following features and development options are supported by the Form Recognizer service v3.0. Use the links in the table to learn more about each feature and browse the API references.

Feature | Description
  • 🆕 Read | Extract text lines, words, detected languages, and handwritten style if detected.
  • 🆕 W-2 Form | Extract the information reported in each box on a W-2 form.
  • 🆕 General document model | Extract text, tables, structure, key-value pairs, and named entities.
  • Layout model | Extract text, selection marks, and table structures, along with their bounding box coordinates, from forms and documents. The Layout API has been updated to a prebuilt model.
  • Custom model (updated) | Extract and analyze data from forms and documents specific to distinct business data and use cases. Custom model API v3.0 supports signature detection for custom template (custom form) models, and offers a new model type, custom neural (custom document), for analyzing unstructured documents.
  • Invoice model | Automated data processing and extraction of key information from sales invoices.
  • Receipt model (updated) | Automated data processing and extraction of key information from sales receipts. Receipt model v3.0 supports processing of single-page hotel receipts.
  • ID document model (updated) | Automated data processing and extraction of key information from US driver's licenses and international passports. The prebuilt ID document API supports the extraction of endorsements, restrictions, and vehicle classifications from US driver's licenses.
  • Business card model | Automated data processing and extraction of key information from business cards.

Development options

  • Form Recognizer Studio
  • REST API
  • C# SDK
  • Python SDK
  • Java SDK
  • JavaScript SDK
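As an illustration of the REST API development option, the sketch below builds (but does not send) an analyze request using only the Python standard library. Several details are assumptions to verify against the REST API reference: the `2022-08-31` API version, the `/formrecognizer/documentModels/{modelId}:analyze` path, the `Ocp-Apim-Subscription-Key` header, the `urlSource` body field, and the model IDs in `MODEL_IDS`. The endpoint, key, and document URL in the usage example are placeholders.

```python
import json
import urllib.request

API_VERSION = "2022-08-31"  # assumed v3.0 API version

# Assumed v3.0 model IDs for a few of the features listed above.
MODEL_IDS = {
    "read": "prebuilt-read",
    "layout": "prebuilt-layout",
    "general document": "prebuilt-document",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
}


def build_analyze_request(endpoint, key, feature, document_url):
    """Build a POST request asking the service to analyze a document by URL."""
    model_id = MODEL_IDS[feature]
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{model_id}:analyze?api-version={API_VERSION}"
    )
    body = json.dumps({"urlSource": document_url}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


if __name__ == "__main__":
    request = build_analyze_request(
        "https://example.cognitiveservices.azure.com/",  # placeholder endpoint
        "<your-key>",                                    # placeholder key
        "invoice",
        "https://example.com/sample-invoice.pdf",        # placeholder document
    )
    print(request.get_method(), request.full_url)
```

Sending the request is deliberately left out. Analysis is understood to run asynchronously, so a real client would submit the request and then poll for the result; the SDKs listed above wrap that flow for you.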

How to use Form Recognizer documentation

This documentation contains the following article types:

  • Concepts provide in-depth explanations of the service functionality and features.
  • Quickstarts are getting-started instructions to guide you through making requests to the service.
  • How-to guides contain instructions for using the service in more specific or customized ways.
  • Tutorials are longer guides that show you how to use the service as a component in broader business solutions.

Data privacy and security

As with all Azure Cognitive Services, developers using the Form Recognizer service should be aware of Microsoft's policies on customer data. See the Data, privacy, and security for Form Recognizer page.

Next steps