What is Form Recognizer?
TLS 1.2 is now enforced for all HTTP requests to this service. For more information, see Azure Cognitive Services security.
Azure Form Recognizer is a cognitive service that uses machine learning to identify and extract text, key/value pairs, and table data from form documents. It ingests text from forms and outputs structured data that preserves the relationships in the original file. You get accurate results tailored to your specific content without heavy manual intervention or extensive data science expertise. Form Recognizer consists of custom models, the prebuilt receipt model, and the Layout API. You can call Form Recognizer models through a REST API to reduce complexity and integrate them into your workflow or application.
Form Recognizer is made up of the following services:
- Custom models - Extract key/value pairs and table data from forms. These models are trained with your own data, so they're tailored to your forms.
- Prebuilt receipt model - Extract data from USA sales receipts using a prebuilt model.
- Layout API - Extract text and table structures, along with their bounding box coordinates, from documents.
Form Recognizer custom models are trained on your own data, and you need only five sample input forms to start. A trained model can output structured data that includes the relationships in the original form document. After you train the model, you can test and retrain it, and eventually use it to reliably extract data from more forms according to your needs.
You have two options when you train custom models: training without labeled data and training with labeled data.
Train without labels
By default, Form Recognizer uses unsupervised learning to understand the layout and relationships between fields and entries in your forms. When you submit your input forms, the algorithm clusters the forms by type, discovers what keys and tables are present, and associates values to keys and entries to tables. This doesn't require manual data labeling or intensive coding and maintenance, and we recommend you try this method first.
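As rough intuition for what "associates values to keys" can mean, here is a toy Python sketch. It is not the service's algorithm, which this article doesn't describe in detail. It borrows the positional rule from the input requirements (keys appear above or to the left of their values) and pairs each value with the nearest key that satisfies it. The `TextBox` type and `pair_values_with_keys` function are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from math import hypot
from typing import Dict, List


@dataclass
class TextBox:
    """A piece of text with the top-left corner of its box (hypothetical shape)."""
    text: str
    x: float  # distance from the left edge of the page
    y: float  # distance from the top edge of the page (grows downward)


def pair_values_with_keys(keys: List[TextBox], values: List[TextBox]) -> Dict[str, str]:
    """Toy heuristic: attach each value to the nearest key above or to its left."""
    pairs: Dict[str, str] = {}
    for value in values:
        # A candidate key is not to the right of and not below the value,
        # and is strictly above or strictly to the left of it.
        candidates = [
            k for k in keys
            if k.x <= value.x and k.y <= value.y and (k.x < value.x or k.y < value.y)
        ]
        if not candidates:
            continue  # no key satisfies the positional rule for this value
        nearest = min(candidates, key=lambda k: hypot(value.x - k.x, value.y - k.y))
        pairs[nearest.text] = value.text
    return pairs


keys = [TextBox("Name", 0, 0), TextBox("Date", 0, 20)]
values = [TextBox("Ada", 50, 0), TextBox("2020-01-01", 50, 20)]
result = pair_values_with_keys(keys, values)
```

In this example each value sits directly to the right of its key, so `result` maps "Name" to "Ada" and "Date" to "2020-01-01".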
Train with labels
When you train with labeled data, the model uses supervised learning to extract values of interest from the labeled forms you provide. This results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
Form Recognizer uses the Layout API to learn the expected sizes and positions of printed and handwritten text elements. Then it uses user-specified labels to learn the key/value associations in the documents. We recommend that you use five manually labeled forms of the same type to get started when training a new model and add more labeled data as needed to improve the model accuracy.
Prebuilt receipt model
Form Recognizer also includes a model for reading English sales receipts from the United States, the type used by restaurants, gas stations, retail stores, and so on. This model extracts key information such as the time and date of the transaction, merchant information, tax amounts, totals, and more. In addition, the prebuilt receipt model is trained to recognize and return all of the text on a receipt.
With the Layout API, Form Recognizer can also extract text and table structure (the row and column numbers associated with the text) using high-definition optical character recognition (OCR).
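To make "table structure" concrete, the sketch below shows one way to hold text with its row and column numbers and rebuild a grid from it. The `Cell` type and its field names are hypothetical and chosen for illustration; they are not the service's response schema.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Cell:
    """One recognized table cell (hypothetical shape, not the service's schema)."""
    row: int
    column: int
    text: str
    bounding_box: Tuple[float, ...]  # corner coordinates: x1, y1, x2, y2, x3, y3, x4, y4


def to_grid(cells: List[Cell]) -> List[List[str]]:
    """Arrange cells into a row-major grid, leaving missing cells as empty strings."""
    if not cells:
        return []
    n_rows = max(c.row for c in cells) + 1
    n_cols = max(c.column for c in cells) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c.row][c.column] = c.text
    return grid


cells = [
    Cell(0, 0, "Item", (0, 0, 40, 0, 40, 10, 0, 10)),
    Cell(0, 1, "Price", (40, 0, 80, 0, 80, 10, 40, 10)),
    Cell(1, 0, "Coffee", (0, 10, 40, 10, 40, 20, 0, 20)),
    Cell(1, 1, "3.50", (40, 10, 80, 10, 80, 20, 40, 20)),
]
grid = to_grid(cells)
```

Here `grid` is a list of rows, so `grid[1][1]` holds the price from the second row.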
Follow a quickstart to get started extracting data from your forms. We recommend that you use the free service when you're learning the technology. Remember that the number of free pages is limited to 500 per month.
- Client library quickstart (all languages, multiple scenarios)
- Web UI quickstarts
- REST quickstarts
Review the REST APIs
You'll use the following APIs to train models and extract structured data from forms.
|Train Custom Model||Train a new model to analyze your forms by using five forms of the same type. Set the useLabelFile parameter to
|Analyze Form||Analyze a single document passed in as a stream to extract text, key/value pairs and tables from the form with your custom model.|
|Analyze Receipt||Analyze a single receipt document to extract key information and other receipt text.|
|Analyze Layout||Analyze the layout of a form to extract text and table structure.|
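The following Python sketch shows how requests to these four operations might be assembled. It only builds the request description and sends nothing. The URL paths, the `Ocp-Apim-Subscription-Key` header, and the `source` body field are assumptions based on the v2.0 REST API and should be checked against the API reference. The table above doesn't show the value for `useLabelFile`, so treating it as a boolean here is also an assumption. `Request`, `train_custom_model`, and `analyze` are hypothetical helper names.

```python
import json
from typing import Dict, NamedTuple, Optional

# Assumed base path for the v2.0 REST API; confirm against the API reference.
API_PREFIX = "/formrecognizer/v2.0"


class Request(NamedTuple):
    method: str
    url: str
    headers: Dict[str, str]
    body: Optional[str]  # JSON text, or None when the caller attaches file bytes


def _headers(key: str, content_type: str) -> Dict[str, str]:
    # Assumed header name for the subscription key.
    return {"Ocp-Apim-Subscription-Key": key, "Content-Type": content_type}


def train_custom_model(endpoint: str, key: str, training_data_url: str,
                       use_label_file: bool = False) -> Request:
    """Describe a Train Custom Model request (field names are assumptions)."""
    body = json.dumps({"source": training_data_url, "useLabelFile": use_label_file})
    url = endpoint.rstrip("/") + API_PREFIX + "/custom/models"
    return Request("POST", url, _headers(key, "application/json"), body)


def analyze(endpoint: str, key: str, kind: str,
            content_type: str = "application/pdf",
            model_id: Optional[str] = None) -> Request:
    """Describe an Analyze Form, Analyze Receipt, or Analyze Layout request."""
    paths = {
        "form": f"/custom/models/{model_id}/analyze",
        "receipt": "/prebuilt/receipt/analyze",
        "layout": "/layout/analyze",
    }
    if kind not in paths:
        raise ValueError(f"unknown kind: {kind!r}")
    if kind == "form" and not model_id:
        raise ValueError("model_id is required to analyze with a custom model")
    url = endpoint.rstrip("/") + API_PREFIX + paths[kind]
    # The document bytes are attached by the caller, so no body is built here.
    return Request("POST", url, _headers(key, content_type), None)


train = train_custom_model("https://example.cognitiveservices.azure.com/",
                           "YOUR_KEY", "https://example.invalid/training-data",
                           use_label_file=True)
layout = analyze("https://example.cognitiveservices.azure.com", "YOUR_KEY", "layout")
```

Keeping request construction separate from sending makes the paths and headers easy to inspect and adjust before any call reaches the service.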
Form Recognizer works on input documents that meet these requirements:
- Format must be JPG, PNG, PDF (text or scanned), or TIFF. Text-embedded PDFs are best because there's no possibility of error in character extraction and location.
- If your PDFs are password-locked, you must remove the lock before submitting them.
- PDF and TIFF documents must be 200 pages or fewer, and the total size of the training data set must be 500 pages or fewer.
- For images, dimensions must be between 600 x 100 pixels and 4200 x 4200 pixels.
- If scanned from paper documents, forms should be high-quality scans.
- Text must use the Latin alphabet (English characters).
- For unsupervised learning (without labeled data), data must contain keys and values.
- For unsupervised learning (without labeled data), keys must appear above or to the left of the values; they can't appear below or to the right.
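A local pre-check can catch files that break the mechanical requirements above before you upload them. The Python sketch below is one possible shape for such a check. The `FormFile` type and function names are hypothetical. Reading "600 x 100" and "4200 x 4200" as width x height bounds is an interpretation, and accepting `jpeg` as an alias for `jpg` is an assumption. It doesn't check scan quality, alphabet, or key placement.

```python
from dataclasses import dataclass
from typing import List, Optional

ALLOWED_FORMATS = {"jpg", "jpeg", "png", "pdf", "tiff"}  # "jpeg" alias is an assumption
MAX_PAGES_PER_DOCUMENT = 200
MAX_TRAINING_SET_PAGES = 500
MIN_TRAINING_FORMS = 5


@dataclass
class FormFile:
    """Metadata the caller has already read from a file (hypothetical shape)."""
    name: str
    fmt: str
    pages: int = 1
    width_px: Optional[int] = None
    height_px: Optional[int] = None
    password_locked: bool = False


def check_file(f: FormFile) -> List[str]:
    """Return a list of requirement violations for one file (empty if none)."""
    problems: List[str] = []
    fmt = f.fmt.lower()
    if fmt not in ALLOWED_FORMATS:
        problems.append(f"unsupported format {f.fmt!r}")
    if f.password_locked:
        problems.append("password lock must be removed")
    if fmt in {"pdf", "tiff"} and f.pages > MAX_PAGES_PER_DOCUMENT:
        problems.append(f"more than {MAX_PAGES_PER_DOCUMENT} pages")
    if fmt in {"jpg", "jpeg", "png"} and f.width_px is not None and f.height_px is not None:
        # Interpretation: width in [600, 4200] and height in [100, 4200].
        if not (600 <= f.width_px <= 4200 and 100 <= f.height_px <= 4200):
            problems.append("image dimensions outside 600 x 100 to 4200 x 4200 pixels")
    return problems


def check_training_set(files: List[FormFile]) -> List[str]:
    """Check every file, plus the set-level limits on form count and total pages."""
    problems = [f"{f.name}: {p}" for f in files for p in check_file(f)]
    if len(files) < MIN_TRAINING_FORMS:
        problems.append(f"fewer than {MIN_TRAINING_FORMS} sample forms")
    if sum(f.pages for f in files) > MAX_TRAINING_SET_PAGES:
        problems.append(f"training set exceeds {MAX_TRAINING_SET_PAGES} pages")
    return problems


sample_set = [FormFile(f"form{i}.pdf", "pdf", pages=2) for i in range(5)]
sample_problems = check_training_set(sample_set)
```

For the five two-page PDFs above, `sample_problems` is an empty list.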
Form Recognizer doesn't currently support these types of input data:
- Complex tables (nested tables, merged headers or cells, and so on).
- Checkboxes or radio buttons.
Prebuilt receipt model
The input requirements for the receipt model are slightly different.
- Format must be JPEG, PNG, PDF (text or scanned) or TIFF.
- File size must be less than 20 MB.
- Image dimensions must be between 50 x 50 pixels and 10000 x 10000 pixels.
- PDF dimensions must be at most 17 x 17 inches, corresponding to Legal or A3 paper sizes and smaller.
- For PDF and TIFF, only the first 200 pages are processed (with a free tier subscription, only the first two pages are processed).
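The receipt limits can be pre-checked the same way. The Python sketch below is illustrative only. The function names are hypothetical, and treating 20 MB as 20 x 1024 x 1024 bytes is an assumption, because the list above doesn't say which definition of megabyte applies.

```python
from typing import List, Optional

RECEIPT_FORMATS = {"jpeg", "jpg", "png", "pdf", "tiff"}  # "jpg" alias is an assumption
MAX_BYTES = 20 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes
MAX_PDF_INCHES = 17.0


def receipt_problems(fmt: str, size_bytes: int,
                     width_px: Optional[int] = None,
                     height_px: Optional[int] = None,
                     pdf_width_in: Optional[float] = None,
                     pdf_height_in: Optional[float] = None) -> List[str]:
    """Return violations of the receipt-model input requirements (empty if none)."""
    problems: List[str] = []
    if fmt.lower() not in RECEIPT_FORMATS:
        problems.append(f"unsupported format {fmt!r}")
    if size_bytes >= MAX_BYTES:
        problems.append("file size must be less than 20 MB")
    if width_px is not None and height_px is not None:
        if not (50 <= width_px <= 10000 and 50 <= height_px <= 10000):
            problems.append("image dimensions outside 50 x 50 to 10000 x 10000 pixels")
    if pdf_width_in is not None and pdf_height_in is not None:
        if pdf_width_in > MAX_PDF_INCHES or pdf_height_in > MAX_PDF_INCHES:
            problems.append("PDF dimensions exceed 17 x 17 inches")
    return problems


def pages_processed(total_pages: int, free_tier: bool) -> int:
    """Number of PDF or TIFF pages that get processed under the stated page limits."""
    limit = 2 if free_tier else 200
    return min(total_pages, limit)


letter_pdf = receipt_problems("pdf", 150_000, pdf_width_in=8.5, pdf_height_in=11.0)
free_pages = pages_processed(10, free_tier=True)
```

With these inputs, `letter_pdf` is empty and `free_pages` is 2, because the free tier processes only the first two pages.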
Data privacy and security
As with all the cognitive services, developers using the Form Recognizer service should be aware of Microsoft policies on customer data. See the Cognitive Services page on the Microsoft Trust Center to learn more.