Form Recognizer ID document model
The ID document model combines Optical Character Recognition (OCR) with deep learning models to analyze and extract key information from U.S. driver's licenses (all 50 states and the District of Columbia) and international passport biographical pages (excluding visas and other travel documents). The API analyzes identity documents, extracts key information, and returns a structured JSON data representation.
Sample U.S. Driver's License processed with Form Recognizer Studio
Development options
The following tools are supported by Form Recognizer v2.1:
| Feature | Resources |
|---|---|
| ID document model | |
The following tools are supported by Form Recognizer v3.0:
| Feature | Resources | Model ID |
|---|---|---|
| ID document model | | prebuilt-idDocument |
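As a rough illustration of how the model ID fits into a request, the sketch below builds the analyze URL and request headers for both API versions. The route shapes and the `Ocp-Apim-Subscription-Key` header reflect our understanding of the public REST reference and should be checked against it before use; the helper names `build_analyze_url` and `build_headers` and the example endpoint are hypothetical.

```python
from urllib.parse import urlencode

MODEL_ID = "prebuilt-idDocument"          # model ID from the v3.0 table
PREVIEW_API_VERSION = "2022-06-30-preview"


def build_analyze_url(endpoint: str, api: str = "v3.0") -> str:
    """Return the analyze URL for the prebuilt ID document model.

    The route shapes are assumptions; verify them in the REST API reference.
    """
    base = endpoint.rstrip("/")
    if api == "v2.1":
        # Assumed v2.1 route: the model is part of the path, no model ID.
        return f"{base}/formrecognizer/v2.1/prebuilt/idDocument/analyze"
    if api == "v3.0":
        # Assumed v3.0 route: the model ID selects the prebuilt model.
        query = urlencode({"api-version": PREVIEW_API_VERSION})
        return f"{base}/formrecognizer/documentModels/{MODEL_ID}:analyze?{query}"
    raise ValueError(f"unsupported API version: {api}")


def build_headers(key: str, content_type: str = "application/octet-stream") -> dict:
    """Return request headers carrying the resource key."""
    return {"Ocp-Apim-Subscription-Key": key, "Content-Type": content_type}


if __name__ == "__main__":
    # Hypothetical endpoint; substitute the one shown for your resource.
    print(build_analyze_url("https://example.cognitiveservices.azure.com/"))
```

The functions only assemble strings, so you can inspect the result before sending a document with the HTTP client of your choice.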
Try Form Recognizer
Extract data, including name, birth date, machine-readable zone, and expiration date, from ID documents using the Form Recognizer Studio or our Sample Labeling tool. You'll need the following resources:
- An Azure subscription. You can create one for free.
- A Form Recognizer instance in the Azure portal. You can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your key and endpoint.
Form Recognizer Studio (preview)
Note
Form Recognizer Studio is available with the preview (v3.0) API.
On the Form Recognizer Studio home page, select Identity documents.
You can analyze the sample ID document or select the + Add button to upload your own sample.
Select the Analyze button.
Sample Labeling tool (API v2.1)
You'll need an ID document. You can use our sample ID document.
On the Sample Labeling tool home page, select Use prebuilt model to get data.
Select Identity documents from the Form Type dropdown menu:
Input requirements
- For best results, provide one clear photo or high-quality scan per document.
- Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location. Additionally, the newest API version, 2022-06-30-preview, supports Microsoft Word (DOCX), Excel (XLS), PowerPoint (PPT), and HTML files in the Read model.
- For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
- The file size for analyzing documents must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
- Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
- PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
- If your PDFs are password-locked, you must remove the lock before submission.
- The minimum height of the text to be extracted is 12 pixels for a 1024 x 768 pixel image. This dimension corresponds to about 8-point text at 150 dots per inch (DPI).
- For custom model training, the maximum number of pages for training data is 500 for the custom template model and 50,000 for the custom neural model.
- For custom model training, the total size of training data is 50 MB for the template model and 1 GB for the neural model.
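Some of these limits can be checked locally before a document is submitted. The following is a minimal sketch, not part of the service or its SDKs: the function name `check_input` is hypothetical, and it assumes that "MB" means 1024 x 1024 bytes. It covers only file format, file size, and image dimensions; text height, PDF page size, and password locks are not checked.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Formats listed in the input requirements (Office and HTML formats omitted,
# because the text ties them to the Read model only).
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff", ".pdf"}

# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_BYTES = {"S0": 500 * 1024 * 1024, "F0": 4 * 1024 * 1024}

MIN_SIDE_PX, MAX_SIDE_PX = 50, 10_000


def check_input(
    filename: str,
    size_bytes: int,
    tier: str = "S0",
    image_size: Optional[Tuple[int, int]] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []

    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {ext or '(none)'}")

    # The file size must be strictly less than the tier limit.
    if size_bytes >= MAX_BYTES[tier]:
        problems.append(f"file too large for tier {tier}: {size_bytes} bytes")

    # Dimensions apply to images only; pass None for PDFs.
    if image_size is not None:
        width, height = image_size
        if not (MIN_SIDE_PX <= width <= MAX_SIDE_PX and MIN_SIDE_PX <= height <= MAX_SIDE_PX):
            problems.append(f"image dimensions out of range: {width} x {height}")

    return problems


if __name__ == "__main__":
    print(check_input("license.png", 2_000_000, tier="F0", image_size=(1024, 768)))
```

Returning a list of messages instead of raising on the first failure lets a caller report every issue with a file at once.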
Note
The Sample Labeling tool does not support the BMP file format. This is a limitation of the tool, not of the Form Recognizer service.
Supported languages and locales v2.1
| Model | Language and locale code | Default |
|---|---|---|
| ID document | English (United States), en-US | English (United States), en-US |
Field extraction
| Name | Type | Description | Standardized output |
|---|---|---|---|
| CountryRegion | countryRegion | Country or region code compliant with ISO 3166 standard | |
| DateOfBirth | Date | Date of birth | yyyy-mm-dd |
| DateOfExpiration | Date | Expiration date | yyyy-mm-dd |
| DocumentNumber | String | Relevant passport number, driver's license number, etc. | |
| FirstName | String | Extracted given name and middle initial if applicable | |
| LastName | String | Extracted surname | |
| Nationality | countryRegion | Country or region code compliant with ISO 3166 standard (Passport only) | |
| Sex | String | Possible extracted values include "M", "F" and "X" | |
| MachineReadableZone | Object | Extracted Passport MRZ including two lines of 44 characters each | "P<USABROOKS<<JENNIFER<<<<<<<<<<<<<<<<<<<<<<< 3400200135USA8001014F1905054710000307<715816" |
| DocumentType | String | Document type, for example, Passport, Driver's License | "passport" |
| Address | String | Extracted address (Driver's License only) | |
| Region | String | Extracted region, state, province, etc. (Driver's License only) | |
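To show how the fields in the table might be consumed, here is a small sketch that flattens a `fields` object into plain name/value pairs. It assumes each field carries its standardized value under a key that begins with `value` (for example `valueString` or `valueDate`) and the raw text under `text` or `content`; that shape is our reading of the response format and should be confirmed against a real response. The sample data and the helper name `simplify_fields` are hypothetical.

```python
from typing import Any, Dict


def simplify_fields(fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Map each field name to its standardized value, falling back to raw text."""
    simple = {}
    for name, field in fields.items():
        # Assumed convention: the typed value lives under a "value..." key.
        value_keys = [key for key in field if key.startswith("value")]
        if value_keys:
            simple[name] = field[value_keys[0]]
        else:
            # Assumed raw-text keys: "text" or "content".
            simple[name] = field.get("text", field.get("content"))
    return simple


# Hypothetical sample shaped like the fields in the table above.
sample_fields = {
    "DocumentType": {"type": "string", "valueString": "passport"},
    "FirstName": {"type": "string", "valueString": "JENNIFER"},
    "LastName": {"type": "string", "valueString": "BROOKS"},
    "DateOfBirth": {"type": "date", "valueDate": "1980-01-01", "text": "01 JAN 1980"},
    "CountryRegion": {"type": "countryRegion", "valueCountryRegion": "USA"},
    "Sex": {"type": "string", "text": "F"},
}

if __name__ == "__main__":
    for name, value in simplify_fields(sample_fields).items():
        print(f"{name}: {value}")
```

Dates keep the `yyyy-mm-dd` form listed in the Standardized output column, so they can be passed directly to `datetime.date.fromisoformat`.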
Form Recognizer preview v3.0
The Form Recognizer preview v3.0 introduces several new features and capabilities:
ID document (v3.0) prebuilt model supports extraction of endorsement, restriction, and vehicle class codes from US driver's licenses.
The ID Document 2022-06-30-preview release supports the following data extraction from US driver's licenses:
- Date issued
- Height
- Weight
- Eye color
- Hair color
- Document discriminator security code
ID document preview field extraction
| Name | Type | Description | Standardized output |
|---|---|---|---|
| 🆕 DateOfIssue | Date | Issue date | yyyy-mm-dd |
| 🆕 Height | String | Height of the holder. | |
| 🆕 Weight | String | Weight of the holder. | |
| 🆕 EyeColor | String | Eye color of the holder. | |
| 🆕 HairColor | String | Hair color of the holder. | |
| 🆕 DocumentDiscriminator | String | Document discriminator is a security code that identifies where and when the license was issued. | |
| Endorsements | String | Additional driving privileges granted to a driver, such as Motorcycle or School bus. | |
| Restrictions | String | Restricted driving privileges applicable to suspended or revoked licenses. | |
| VehicleClassification | String | Types of vehicles that can be driven by a driver. | |
| CountryRegion | countryRegion | Country or region code compliant with ISO 3166 standard | |
| DateOfBirth | Date | Date of birth | yyyy-mm-dd |
| DateOfExpiration | Date | Expiration date | yyyy-mm-dd |
| DocumentNumber | String | Relevant passport number, driver's license number, etc. | |
| FirstName | String | Extracted given name and middle initial if applicable | |
| LastName | String | Extracted surname | |
| Nationality | countryRegion | Country or region code compliant with ISO 3166 standard (Passport only) | |
| Sex | String | Possible extracted values include "M", "F" and "X" | |
| MachineReadableZone | Object | Extracted Passport MRZ including two lines of 44 characters each | "P<USABROOKS<<JENNIFER<<<<<<<<<<<<<<<<<<<<<<< 3400200135USA8001014F1905054710000307<715816" |
| DocumentType | String | Document type, for example, Passport, Driver's License | "passport" |
| Address | String | Extracted address (Driver's License only) | |
| Region | String | Extracted region, state, province, etc. (Driver's License only) | |
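The MachineReadableZone value can be cross-checked against the other extracted fields. The sketch below splits a two-line, 44-character passport MRZ and verifies three of its check digits using the weighting scheme from ICAO Doc 9303 (weights 7, 3, 1; digits count as themselves, letters A to Z as 10 to 35, and the filler `<` as 0). It is an illustration rather than something the service provides: the function names are hypothetical, the two lines are assumed to be separated by whitespace, and the optional-data and composite check digits are not verified.

```python
def char_value(ch: str) -> int:
    """Numeric value of one MRZ character for check-digit purposes."""
    if ch in "0123456789":
        return int(ch)
    if ch == "<":
        return 0
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10
    raise ValueError(f"invalid MRZ character: {ch!r}")


def check_digit(data: str) -> int:
    """Weighted sum modulo 10, with weights 7, 3, 1 repeating."""
    weights = (7, 3, 1)
    return sum(char_value(c) * weights[i % 3] for i, c in enumerate(data)) % 10


def parse_passport_mrz(mrz: str) -> dict:
    """Split a two-line, 44-character passport MRZ into its main parts."""
    line1, line2 = mrz.split()
    if len(line1) != 44 or len(line2) != 44:
        raise ValueError("expected two lines of 44 characters each")

    surname, _, given = line1[5:].partition("<<")
    checks_ok = (
        str(check_digit(line2[0:9])) == line2[9]         # document number
        and str(check_digit(line2[13:19])) == line2[19]  # date of birth
        and str(check_digit(line2[21:27])) == line2[27]  # date of expiration
    )
    return {
        "issuing_country": line1[2:5],
        "surname": surname.replace("<", " ").strip(),
        "given_names": given.replace("<", " ").strip(),
        "document_number": line2[0:9].rstrip("<"),
        "nationality": line2[10:13],
        "birth_date": line2[13:19],    # YYMMDD
        "sex": line2[20],
        "expiry_date": line2[21:27],   # YYMMDD
        "checks_ok": checks_ok,
    }


if __name__ == "__main__":
    # The example from the table; line 1 is padded to 44 characters with "<".
    example = (
        "P<USABROOKS<<JENNIFER".ljust(44, "<")
        + " 3400200135USA8001014F1905054710000307<715816"
    )
    print(parse_passport_mrz(example))
```

A mismatch in `checks_ok` suggests that a character was misread, which can be a useful signal to route the document for manual review.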
Migration guide and REST API v3.0
Follow our Form Recognizer v3.0 migration guide to learn how to use the preview version in your applications and workflows.
Explore our REST API (preview) to learn more about the preview version and new capabilities.
Next steps
Complete a Form Recognizer quickstart:
Explore our REST API: