Form Recognizer ID document model

The ID document model combines Optical Character Recognition (OCR) with deep learning models to analyze and extract key information from U.S. driver's licenses (all 50 states and the District of Columbia) and from the biographical pages of international passports (visas and other travel documents are excluded). The API analyzes an identity document, extracts fields such as first name, last name, address, and date of birth, and returns the result as structured JSON.

Sample U.S. Driver's License processed with Form Recognizer Studio

Screenshot: sample identification card.

Development options

The following resources are supported by Form Recognizer v2.1:

Feature Resources
ID document model

The following resources are supported by Form Recognizer v3.0:

Feature Resources Model ID
ID document model prebuilt-idDocument
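
As a rough illustration of how the two versions are addressed, the sketch below builds (but does not send) an analyze request for either API version. The URL paths, the `api-version` value, and the `Ocp-Apim-Subscription-Key` header are assumptions based on the public REST conventions for this service and are not stated on this page; `build_analyze_request` is a hypothetical helper. Check the REST API reference for your version before relying on any of them.

```python
import urllib.request

# Assumed URL shapes (not stated on this page); verify against the REST reference.
V21_PATH = "/formrecognizer/v2.1/prebuilt/idDocument/analyze"
V30_PATH = "/formrecognizer/documentModels/{model_id}:analyze"


def build_analyze_request(endpoint, api_key, image_bytes,
                          content_type="image/jpeg", version="v3.0",
                          api_version="2022-01-30-preview"):
    """Build a POST request for the ID document model without sending it."""
    base = endpoint.rstrip("/")
    if version == "v2.1":
        url = base + V21_PATH
    elif version == "v3.0":
        # The model ID comes from the v3.0 table above.
        path = V30_PATH.format(model_id="prebuilt-idDocument")
        url = base + path + "?api-version=" + api_version
    else:
        raise ValueError("version must be 'v2.1' or 'v3.0'")
    return urllib.request.Request(
        url,
        data=image_bytes,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": api_key,  # assumed header name
            "Content-Type": content_type,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the document; the sketch stops short of that so it can be inspected without a live resource.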

Try Form Recognizer

See how data, including name, birth date, machine-readable zone, and expiration date, is extracted from ID documents using the Form Recognizer Studio or our Sample Labeling tool. You'll need the following:

  • An Azure subscription—you can create one for free

  • A Form Recognizer instance in the Azure portal. You can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your API key and endpoint.

Screenshot: keys and endpoint location in the Azure portal.

Form Recognizer Studio (preview)

Note

Form Recognizer Studio is available with the preview (v3.0) API.

  1. On the Form Recognizer Studio home page, select Identity documents.

  2. You can analyze the sample ID document or select the + Add button to upload your own sample.

  3. Select the Analyze button:

    Screenshot: analyze ID document menu.

Sample Labeling tool

You will need an ID document. You can use our sample ID document.

  1. On the Sample Labeling tool home page, select Use prebuilt model to get data.

  2. Select Identity documents from the Form Type dropdown menu:

    Screenshot: Sample Labeling tool dropdown prebuilt model selection menu.

Input requirements

  • For best results, provide one clear photo or high-quality scan per document.
  • Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
  • For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
  • The file size must be less than 50 MB.
  • Image dimensions must be between 50 x 50 pixels and 10000 x 10000 pixels.
  • PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
  • The total size of the training data is 500 pages or less.
  • If your PDFs are password-locked, you must remove the lock before submission.
  • For unsupervised learning (without labeled data):
    • Data must contain keys and values.
    • Keys must appear above or to the left of the values; they can't appear below or to the right.
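
The file-level requirements above can be checked locally before a document is submitted. The following is a minimal sketch, not part of the service: `check_input` is a hypothetical helper, the extension-to-format mapping is an assumption, and "50 MB" is interpreted as 50 x 1024 x 1024 bytes. Image dimensions are passed in by the caller because reading them requires an imaging library.

```python
import os

# Assumed mapping from the supported formats to common file extensions.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".pdf"}
MAX_FILE_BYTES = 50 * 1024 * 1024      # assumption: 1 MB = 1024 * 1024 bytes
MIN_SIDE_PX, MAX_SIDE_PX = 50, 10000   # image dimension limits from the list above


def check_input(filename, size_bytes, width_px=None, height_px=None):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    ext = os.path.splitext(filename)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append("unsupported file format: " + (ext or "(none)"))
    if size_bytes >= MAX_FILE_BYTES:
        problems.append("file must be smaller than 50 MB")
    # Pixel limits apply to images; PDF page size is measured in inches instead.
    if ext != ".pdf" and width_px is not None and height_px is not None:
        in_range = all(MIN_SIDE_PX <= side <= MAX_SIDE_PX
                       for side in (width_px, height_px))
        if not in_range:
            problems.append("image must be between 50 x 50 and 10000 x 10000 pixels")
    return problems
```

For example, `check_input("license.jpg", 1_000_000, 1200, 800)` returns an empty list, while an oversized or undersized image returns one message per violated requirement.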

Note

The Sample Labeling tool does not support the BMP file format. This is a limitation of the tool, not of the Form Recognizer service.

Supported languages and locales v2.1

Model | Language and locale code | Default
ID document | English (United States), en-US (driver's license); biographical pages from international passports (excluding visa and other travel documents) | English (United States), en-US

Field extraction

Name | Type | Description | Standardized output
CountryRegion | countryRegion | Country or region code compliant with ISO 3166 standard |
DateOfBirth | Date | Date of birth | yyyy-mm-dd
DateOfExpiration | Date | Expiration date | yyyy-mm-dd
DocumentNumber | String | Relevant passport number, driver's license number, etc. |
FirstName | String | Extracted given name and middle initial if applicable |
LastName | String | Extracted surname |
Nationality | countryRegion | Country or region code compliant with ISO 3166 standard (Passport only) |
Sex | String | Possible extracted values include "M", "F" and "X" |
MachineReadableZone | Object | Extracted Passport MRZ including two lines of 44 characters each | "P<USABROOKS<<JENNIFER<<<<<<<<<<<<<<<<<<<<<<< 3400200135USA8001014F1905054710000307<715816"
DocumentType | String | Document type, for example, Passport, Driver's License | "passport"
Address | String | Extracted address (Driver's License only) |
Region | String | Extracted region, state, province, etc. (Driver's License only) |
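
The MachineReadableZone value shown above is two 44-character lines, which matches the passport (TD3) layout defined in ICAO Doc 9303. As an illustration of what that string contains, the sketch below splits the sample into its fixed-position fields and verifies the check digits. It is not the service's own parser; `parse_td3` and `check_digit` are hypothetical helpers, and dates are left in their raw YYMMDD form because the century is not encoded in the MRZ.

```python
WEIGHTS = (7, 3, 1)  # repeating weights used by the ICAO 9303 check digit


def _char_value(ch):
    """Digits keep their value, '<' counts as 0, and A..Z map to 10..35."""
    if ch.isdigit():
        return int(ch)
    if ch == "<":
        return 0
    return ord(ch) - ord("A") + 10


def check_digit(field):
    """Compute the check digit for one MRZ field."""
    return sum(_char_value(c) * WEIGHTS[i % 3] for i, c in enumerate(field)) % 10


def parse_td3(mrz_text):
    """Split a two-line, 44-character passport MRZ into named fields."""
    lines = mrz_text.split()
    if len(lines) != 2 or any(len(line) != 44 for line in lines):
        raise ValueError("expected two lines of 44 characters each")
    line1, line2 = lines
    surname, _, given = line1[5:].partition("<<")
    number, birth, expiry = line2[0:9], line2[13:19], line2[21:27]
    checks_ok = (str(check_digit(number)) == line2[9]
                 and str(check_digit(birth)) == line2[19]
                 and str(check_digit(expiry)) == line2[27])
    return {
        "document_code": line1[0],
        "issuing_country": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        "document_number": number.rstrip("<"),
        "nationality": line2[10:13],
        "birth_yymmdd": birth,
        "sex": line2[20],
        "expiry_yymmdd": expiry,
        "checks_ok": checks_ok,
    }


# The sample from the table; the first line is padded with '<' to 44 characters.
SAMPLE_MRZ = ("P<USABROOKS<<JENNIFER".ljust(44, "<")
              + " 3400200135USA8001014F1905054710000307<715816")
fields = parse_td3(SAMPLE_MRZ)
```

With the sample value, `fields["surname"]` is `"BROOKS"`, `fields["document_number"]` is `"340020013"`, and `fields["checks_ok"]` is `True`.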

Form Recognizer preview v3.0

The Form Recognizer preview introduces several new features and capabilities:

  • The ID document (v3.0) model supports extraction of endorsements, restrictions, and vehicle classifications from U.S. driver's licenses.

ID document preview field extraction

Name | Type | Description | Standardized output
🆕 Endorsements | String | Additional driving privileges granted to a driver such as Motorcycle or School bus. |
🆕 Restrictions | String | Restricted driving privileges applicable to suspended or revoked licenses. |
🆕 VehicleClassification | String | Types of vehicles that can be driven by a driver. |
CountryRegion | countryRegion | Country or region code compliant with ISO 3166 standard |
DateOfBirth | Date | Date of birth | yyyy-mm-dd
DateOfExpiration | Date | Expiration date | yyyy-mm-dd
DocumentNumber | String | Relevant passport number, driver's license number, etc. |
FirstName | String | Extracted given name and middle initial if applicable |
LastName | String | Extracted surname |
Nationality | countryRegion | Country or region code compliant with ISO 3166 standard (Passport only) |
Sex | String | Possible extracted values include "M", "F" and "X" |
MachineReadableZone | Object | Extracted Passport MRZ including two lines of 44 characters each | "P<USABROOKS<<JENNIFER<<<<<<<<<<<<<<<<<<<<<<< 3400200135USA8001014F1905054710000307<715816"
DocumentType | String | Document type, for example, Passport, Driver's License | "passport"
Address | String | Extracted address (Driver's License only) |
Region | String | Extracted region, state, province, etc. (Driver's License only) |
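
To show how the fields in this table might be consumed, the sketch below flattens an analyze result into a plain name-to-value dictionary. The JSON shape (`analyzeResult.documents[0].fields`, with `value*`, `content`, and `confidence` keys per field) is an assumption about the v3.0 preview response and is not documented on this page. `flatten_fields` is a hypothetical helper, and `SAMPLE_RESULT` is a hand-written illustration rather than real service output.

```python
def flatten_fields(result, min_confidence=0.0):
    """Map each extracted field name to its typed value (or raw content)."""
    documents = result.get("analyzeResult", {}).get("documents", [])
    if not documents:
        return {}
    flat = {}
    for name, field in documents[0].get("fields", {}).items():
        # Skip fields the model was not confident about.
        if field.get("confidence", 0.0) < min_confidence:
            continue
        # Prefer a typed value such as valueString or valueDate when present.
        value_key = next((k for k in field if k.startswith("value")), None)
        flat[name] = field[value_key] if value_key else field.get("content")
    return flat


# Hand-written illustrative payload; the values are invented for this example.
SAMPLE_RESULT = {
    "analyzeResult": {
        "documents": [{
            "fields": {
                "FirstName": {"type": "string", "valueString": "JENNIFER",
                              "content": "JENNIFER", "confidence": 0.99},
                "DateOfBirth": {"type": "date", "valueDate": "1980-01-01",
                                "content": "01/01/1980", "confidence": 0.98},
                "Endorsements": {"type": "string", "valueString": "L",
                                 "content": "L", "confidence": 0.95},
                "Restrictions": {"type": "string", "content": "B",
                                 "confidence": 0.40},
            }
        }]
    }
}

confident = flatten_fields(SAMPLE_RESULT, min_confidence=0.9)
```

Filtering on confidence is a design choice for this sketch: for identity data, it is usually safer to route low-confidence fields to manual review than to accept them silently.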

Migration guide and REST API v3.0

Next steps