Form Recognizer models
Azure Form Recognizer prebuilt models enable you to add intelligent document processing to your apps and flows without having to train and build your own models. Prebuilt models use optical character recognition (OCR) combined with deep learning models to identify and extract predefined text and data fields common to specific form and document types. Form Recognizer analyzes form and document data, then returns an organized, structured JSON response. Form Recognizer v2.1 supports invoice, receipt, ID document, and business card models.
Model overview
| Model | Description |
|---|---|
| Document analysis | |
| 🆕Read (preview) | Extract printed and handwritten text lines, words, locations, and detected languages. |
| 🆕General document (preview) | Extract text, tables, structure, key-value pairs, and named entities. |
| Layout | Extract text and layout information from documents. |
| Prebuilt | |
| 🆕W-2 (preview) | Extract employee, employer, wage information, etc. from US W-2 forms. |
| Invoice | Extract key information from English and Spanish invoices. |
| Receipt | Extract key information from English receipts. |
| ID document | Extract key information from US driver licenses and international passports. |
| Business card | Extract key information from English business cards. |
| Custom | |
| Custom | Extract data from forms and documents specific to your business. Custom models are trained for your distinct data and use cases. |
| Composed | Compose a collection of custom models and assign them to a single model built from your form types. |
Read (preview)
The Read API analyzes and extracts text lines, words, their locations, detected languages, and handwritten style if detected.
Sample document processed using the Form Recognizer Studio:
W-2 (preview)
The W-2 model analyzes and extracts key information reported in each box on a W-2 form. The model supports standard and customized forms from 2018 to the present, including single and multiple forms on one page.
Sample W-2 document processed using Form Recognizer Studio:
General document (preview)
The general document API supports most form types and will analyze your documents and associate values to keys and entries to tables that it discovers. It's ideal for extracting common key-value pairs from documents. You can use the general document model as an alternative to training a custom model without labels.
The general document model is pre-trained and can be invoked directly via the REST API.
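As a rough illustration of invoking the pre-trained model over REST, the following sketch only builds the request URL and headers; it doesn't send anything. The route shape (`/formrecognizer/documentModels/{modelId}:analyze`), the `api-version` value, the `Ocp-Apim-Subscription-Key` header name, and the example endpoint are assumptions that aren't stated on this page, so check them against the REST reference for your resource. The helper names `build_analyze_url` and `build_headers` are hypothetical.

```python
from urllib.parse import quote, urlencode

# Assumption: a v3.0 preview API version string. Confirm the value your resource supports.
ASSUMED_API_VERSION = "2022-01-30-preview"


def build_analyze_url(endpoint: str,
                      model_id: str = "prebuilt-document",
                      api_version: str = ASSUMED_API_VERSION) -> str:
    """Build the URL a document would be POSTed to (assumed route shape)."""
    base = endpoint.rstrip("/")
    path = f"/formrecognizer/documentModels/{quote(model_id, safe='')}:analyze"
    return f"{base}{path}?{urlencode({'api-version': api_version})}"


def build_headers(key: str, content_type: str = "application/pdf") -> dict:
    """Headers for sending raw document bytes (assumed key header name)."""
    return {"Ocp-Apim-Subscription-Key": key, "Content-Type": content_type}


if __name__ == "__main__":
    # Hypothetical endpoint, for illustration only.
    print(build_analyze_url("https://example.cognitiveservices.azure.com/"))
```

Keeping URL construction separate from the HTTP call makes it easy to swap in another model ID, such as `prebuilt-layout`, without touching the request logic.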
The general document model supports named entity recognition (NER) for several entity categories. NER is the ability to identify different entities in text and categorize them into pre-defined classes or types such as: person, location, event, product, and organization. Extracting entities can be useful in scenarios where you want to validate extracted values. The entities are extracted from the entire content.
Sample document processed using the Form Recognizer Studio:
Layout
The Layout API analyzes and extracts text, tables and headers, selection marks, and structure information from forms and documents.
Sample document processed using the Form Recognizer Studio:
Invoice
The invoice model analyzes and extracts key information from sales invoices. The API analyzes invoices in various formats and extracts key information such as customer name, billing address, due date, and amount due. Currently, the model supports both English and Spanish invoices.
Sample invoice processed using Form Recognizer Studio:
Receipt
The receipt model analyzes and extracts key information from printed and handwritten receipts.
Sample receipt processed using Form Recognizer Studio:
ID document
The ID document model analyzes and extracts key information from the following documents:
- U.S. Driver's Licenses (all 50 states and District of Columbia)
- Biographical pages from international passports (excluding visa and other travel documents)
The API analyzes identity documents and extracts
Sample U.S. Driver's License processed using Form Recognizer Studio:
Business card
The business card model analyzes and extracts key information from business card images.
Sample business card processed using Form Recognizer Studio:
Custom
The custom model analyzes and extracts data from forms and documents specific to your business. The API is a machine-learning program trained to recognize form fields within your distinct content and extract key-value pairs and table data. You only need five examples of the same form type to get started and your custom model can be trained with or without labeled datasets.
Sample custom template processed using Form Recognizer Studio:
Composed custom model
A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. You can assign up to 100 trained custom models to a single composed model, which is then called with a single model ID.
Composed model dialog window in Form Recognizer Studio:
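Before you request a composed model, it can help to check the component list against the 100-model limit on the client side. The following is a minimal sketch of that check; the function name `validate_component_models` is hypothetical, and removing duplicate IDs is an assumption made for this example, not documented service behavior.

```python
MAX_COMPONENT_MODELS = 100  # limit stated in the text above


def validate_component_models(component_model_ids):
    """Return the component IDs with duplicates removed, or raise ValueError."""
    # Assumption: duplicate IDs add nothing, so keep the first occurrence of each.
    unique_ids = list(dict.fromkeys(component_model_ids))
    if not unique_ids:
        raise ValueError("a composed model needs at least one custom model")
    if len(unique_ids) > MAX_COMPONENT_MODELS:
        raise ValueError(
            f"{len(unique_ids)} custom models given; the limit is {MAX_COMPONENT_MODELS}"
        )
    return unique_ids


if __name__ == "__main__":
    # Hypothetical custom model IDs.
    print(validate_component_models(["purchase-order-a", "purchase-order-b"]))
```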
Model data extraction
| Data extraction | Text extraction | Key-Value pairs | Fields | Selection Marks | Tables | Entities |
|---|---|---|---|---|---|---|
| 🆕 prebuilt-read | ✓ | | | | | |
| 🆕 prebuilt-tax.us.w2 | ✓ | ✓ | ✓ | ✓ | ✓ | |
| 🆕 prebuilt-document | ✓ | ✓ | | ✓ | ✓ | ✓ |
| prebuilt-layout | ✓ | | | ✓ | ✓ | |
| prebuilt-invoice | ✓ | ✓ | ✓ | ✓ | ✓ | |
| prebuilt-receipt | ✓ | ✓ | ✓ | | | |
| prebuilt-idDocument | ✓ | ✓ | ✓ | | | |
| prebuilt-businessCard | ✓ | ✓ | ✓ | | | |
| Custom | ✓ | ✓ | ✓ | ✓ | ✓ | |
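The first column of the table uses identifiers such as `prebuilt-read` rather than the display names from the model overview. A small lookup like the following sketch can translate one into the other. The pairing of display names to identifiers is inferred from the two tables on this page, and the function name `model_id_for` is hypothetical.

```python
# Display name (from the model overview) -> identifier (from the data extraction table).
PREBUILT_MODEL_IDS = {
    "read": "prebuilt-read",
    "general document": "prebuilt-document",
    "layout": "prebuilt-layout",
    "w-2": "prebuilt-tax.us.w2",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "id document": "prebuilt-idDocument",
    "business card": "prebuilt-businessCard",
}


def model_id_for(display_name: str) -> str:
    """Look up a prebuilt model identifier, ignoring case and a '(preview)' suffix."""
    key = display_name.replace("(preview)", "").strip().lower()
    try:
        return PREBUILT_MODEL_IDS[key]
    except KeyError:
        raise ValueError(f"no prebuilt model ID known for {display_name!r}") from None


if __name__ == "__main__":
    print(model_id_for("W-2 (preview)"))
```

Custom and composed models aren't in the lookup because their identifiers are assigned when you create them.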
Input requirements
- For best results, provide one clear photo or high-quality scan per document.
- Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
- For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
- The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
- Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
- PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
- The total size of the training data must be 500 pages or less.
- If your PDFs are password-locked, you must remove the lock before submission.
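Several of the requirements above can be checked locally before you upload a file. The following sketch covers file format, file size per tier, and image dimensions. It assumes that 1 MB means 1,048,576 bytes, that the dimension bounds are inclusive, and that `.tif` is accepted as a TIFF extension; none of these details is stated in the list. The function name `check_input` is hypothetical, and page counts and PDF page sizes aren't checked.

```python
from pathlib import Path

# ".tif" is an assumption; the list above names only "TIFF".
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".tif", ".pdf"}
MB = 1024 * 1024  # assumption: binary megabytes
MAX_BYTES = {"S0": 500 * MB, "F0": 4 * MB}  # paid and free tier limits
MIN_SIDE, MAX_SIDE = 50, 10_000  # pixels, assumed inclusive


def check_input(filename, size_bytes, tier="S0", image_size=None):
    """Return a list of problems found; an empty list means the checks passed."""
    if tier not in MAX_BYTES:
        raise ValueError(f"unknown tier {tier!r}")
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if size_bytes >= MAX_BYTES[tier]:
        problems.append(f"file must be smaller than {MAX_BYTES[tier] // MB} MB on {tier}")
    if image_size is not None:
        width, height = image_size
        if not (MIN_SIDE <= width <= MAX_SIDE and MIN_SIDE <= height <= MAX_SIDE):
            problems.append("image dimensions must be between 50 x 50 and 10,000 x 10,000 pixels")
    return problems


if __name__ == "__main__":
    print(check_input("receipt.png", 2 * MB, tier="F0", image_size=(1200, 1600)))
```

Returning a list of problems instead of raising on the first one lets a caller report every issue with a file at once.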
Note
The Sample Labeling tool does not support the BMP file format. This is a limitation of the tool, not the Form Recognizer service.
Form Recognizer preview v3.0
Form Recognizer v3.0 (preview) introduces several new features and capabilities:
- Read (preview) model is a new API that extracts text lines, words, their locations, detected languages, and handwritten text, if detected.
- General document (preview) model is a new API that uses a pre-trained model to extract text, tables, structure, key-value pairs, and named entities from forms and documents.
- Receipt (preview) model supports single-page hotel receipt processing.
- ID document (preview) model supports endorsements, restrictions, and vehicle classification extraction from US driver's licenses.
- W-2 (preview) model supports extracting employee, employer, and wage information from US W-2 forms.
- Custom model API (preview) supports signature detection for custom forms.
Version migration
Learn how to use Form Recognizer v3.0 in your applications by following our Form Recognizer v3.0 migration guide.
Next steps
Learn how to process your own forms and documents with our Form Recognizer sample tool.
Complete a Form Recognizer quickstart and get started creating a document processing app in the development language of your choice.