Compose custom models v2.1
Note
This how-to guide references Form Recognizer v2.1 (GA). To try Form Recognizer v3.0 (preview), see Compose custom models v3.0 (preview).
Form Recognizer uses advanced machine-learning technology to detect and extract information from document images and return the extracted data in a structured JSON output. With Form Recognizer, you can train standalone custom models or combine custom models to create composed models.
Custom models. Form Recognizer custom models enable you to analyze and extract data from forms and documents specific to your business. Custom models are trained for your distinct data and use cases.
Composed models. A composed model is created by taking a collection of custom models and assigning them to a single model that encompasses your form types. When a document is submitted to a composed model, the service performs a classification step to decide which custom model best represents the form presented for analysis.
In this article, you'll learn how to create Form Recognizer custom and composed models using our Form Recognizer Sample Labeling tool, REST APIs, or client-library SDKs.
Sample Labeling tool
Try extracting data from custom forms using our Sample Labeling tool. You'll need the following resources:
An Azure subscription—you can create one for free
A Form Recognizer instance in the Azure portal. You can use the free pricing tier (F0) to try the service. After your resource deploys, select Go to resource to get your key and endpoint.
In the Form Recognizer UI:
Select Use Custom to train a model with labels and get key-value pairs.
In the next window, select New project:
Create your models
The steps for building, training, and using custom and composed models are as follows:
- Assemble your training dataset
- Upload your training set to Azure blob storage
- Train your custom model
- Compose custom models
- Analyze documents
- Manage your custom models
Assemble your training dataset
Building a custom model begins with establishing your training dataset. You'll need a minimum of five completed forms of the same type for your sample dataset. They can be of different file types (jpg, png, pdf, tiff) and contain both text and handwriting. Your forms must follow the input requirements for Form Recognizer.
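Before uploading, it can help to check a candidate file list against the requirements above. The following is a minimal sketch, not part of Form Recognizer: the function name `check_training_files` is hypothetical, and the `.jpeg` and `.tif` spellings are an assumption added alongside the file types named in this article. It only checks file extensions and the five-form minimum; it does not verify the other input requirements.

```python
from pathlib import Path

# File types named in this article; ".jpeg" and ".tif" are assumed
# alternate spellings, not taken from the article.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".pdf", ".tiff", ".tif"}
MIN_FORMS = 5  # minimum number of completed forms of the same type


def check_training_files(filenames):
    """Split filenames into (accepted, rejected) by extension.

    Raises ValueError when fewer than MIN_FORMS files are usable.
    """
    accepted = [f for f in filenames
                if Path(f).suffix.lower() in ALLOWED_EXTENSIONS]
    rejected = [f for f in filenames
                if Path(f).suffix.lower() not in ALLOWED_EXTENSIONS]
    if len(accepted) < MIN_FORMS:
        raise ValueError(
            f"Need at least {MIN_FORMS} forms, found {len(accepted)}"
        )
    return accepted, rejected
```

Calling `check_training_files` with a list of local file names returns the usable files and the ones to leave out, or raises if the set is too small to train on.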
Upload your training dataset
You'll need to upload your training data to an Azure blob storage container. If you don't know how to create an Azure storage account with a container, see Azure Storage quickstart for Azure portal. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Train your custom model
You train your model with labeled datasets. Labeled datasets rely on the prebuilt-layout API and add human input, such as your specific labels and field locations. Start with at least five completed forms of the same type for your labeled training data.
When you train with labeled data, the model uses supervised learning to extract values of interest, using the labeled forms you provide. Labeled data results in better-performing models and can produce models that work with complex forms or forms containing values without keys.
Form Recognizer uses the Layout API to learn the expected sizes and positions of printed and handwritten text elements and to extract tables. It then uses your labels to learn the key-value associations and tables in the documents. We recommend starting with five manually labeled forms of the same type (same structure) when training a new model, and adding more labeled data as needed to improve accuracy.
Get started with Train with labels
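For readers training through the REST API instead of the Sample Labeling tool, the request can be sketched as follows. This is a hedged illustration, not an official sample: it assumes the v2.1 Train Custom Model path `/formrecognizer/v2.1/custom/models`, a JSON body with `source`, `sourceFilter`, and `useLabelFile`, and key authentication through the `Ocp-Apim-Subscription-Key` header. The function name `build_train_request` is hypothetical. The sketch only builds the request so that it can be inspected before anything is sent.

```python
import json
import urllib.request


def build_train_request(endpoint, key, container_sas_url, prefix=""):
    """Build (but do not send) a train-with-labels request.

    Assumed v2.1 REST shape; verify against the API reference.
    """
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/custom/models"
    body = {
        "source": container_sas_url,  # SAS URL of the blob container
        "sourceFilter": {"prefix": prefix, "includeSubFolders": False},
        "useLabelFile": True,  # train with labels, required for Model Compose
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the training job. Under the same assumptions, the response's `Location` header points at the new model, and its last path segment is the model ID.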
Create a composed model
Note
Model Compose is only available for custom models trained with labels. Attempting to compose unlabeled models will produce an error.
With the Model Compose operation, you can assign up to 100 trained custom models to a single model ID. When you call Analyze with the composed model ID, Form Recognizer will first classify the form you submitted, choose the best matching assigned model, and then return results for that model. This operation is useful when incoming forms may belong to one of several templates.
Using the Form Recognizer Sample Labeling tool, the REST API, or the client-library SDKs, follow the steps below to set up a composed model:
Gather your custom model IDs
Once the training process has successfully completed, your custom model will be assigned a model ID. You can retrieve a model ID as follows:
When you train models using the Form Recognizer Sample Labeling tool, the model ID is located in the Train Result window:
Compose your custom models
After you've gathered your custom models corresponding to a single form type, you can compose them into a single model.
The Sample Labeling tool enables you to quickly get started training models and composing them to a single model ID.
After you have completed training, compose your models as follows:
On the left rail menu, select the Model Compose icon (merging arrow).
In the main window, select the models you wish to assign to a single model ID. Models with the arrows icon are already composed models.
Choose the Compose button from the upper-left corner.
In the pop-up window, name your newly composed model and select Compose.
When the operation completes, your newly composed model will appear in the list.
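Outside the Sample Labeling tool, the same compose step can be sketched as a REST request. This sketch assumes the v2.1 Compose path `/formrecognizer/v2.1/custom/models/compose` with a JSON body containing `modelIds` and `modelName`; the function name `build_compose_request` is hypothetical. It enforces the 100-model limit described in this article before building the request, and it does not send anything.

```python
import json
import urllib.request

MAX_COMPOSED_MODELS = 100  # limit stated in this article


def build_compose_request(endpoint, key, model_ids, model_name):
    """Build (but do not send) a Model Compose request.

    Assumed v2.1 REST shape; verify against the API reference.
    """
    ids = list(dict.fromkeys(model_ids))  # drop duplicates, keep order
    if not ids:
        raise ValueError("Provide at least one model ID")
    if len(ids) > MAX_COMPOSED_MODELS:
        raise ValueError(
            f"A composed model accepts at most {MAX_COMPOSED_MODELS} models"
        )
    url = endpoint.rstrip("/") + "/formrecognizer/v2.1/custom/models/compose"
    body = {"modelIds": ids, "modelName": model_name}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
```

Every ID passed in should belong to a model trained with labels, since composing unlabeled models produces an error.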
Analyze documents with your custom or composed model
The custom form Analyze operation requires you to provide the modelID in the call to Form Recognizer. You can provide a single custom model ID or a composed model ID for the modelID parameter.
On the tool's left-pane menu, select the Analyze icon (light bulb).
Choose a local file or image URL to analyze.
Select the Run Analysis button.
The tool will apply tags in bounding boxes and report the confidence percentage for each tag.
Test your newly trained models by analyzing forms that weren't part of the training dataset. Depending on the reported accuracy, you may want to train further to improve the model.
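The analyze call and the confidence review described above can also be sketched in code. The following assumes the v2.1 Analyze Form path `/formrecognizer/v2.1/custom/models/{modelId}/analyze` with a `source` URL in the JSON body, and a result whose `analyzeResult.documentResults[].fields` entries carry a `confidence` value. The function names and the 0.8 threshold are hypothetical choices for illustration, not service defaults.

```python
import json
import urllib.request


def build_analyze_request(endpoint, key, model_id, document_url):
    """Build (but do not send) an analyze request for a custom or
    composed model ID. Assumed v2.1 REST shape."""
    url = (f"{endpoint.rstrip('/')}/formrecognizer/v2.1/custom/models/"
           f"{model_id}/analyze")
    return urllib.request.Request(
        url,
        data=json.dumps({"source": document_url}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


def low_confidence_fields(result, threshold=0.8):
    """Return {field_name: confidence} for fields below the threshold.

    `result` is the parsed JSON of a finished analyze operation;
    the field layout is an assumption about the v2.1 result format.
    """
    flagged = {}
    documents = result.get("analyzeResult", {}).get("documentResults", [])
    for document in documents:
        for name, field in document.get("fields", {}).items():
            if field is None:  # skip fields returned as null
                continue
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged[name] = confidence
    return flagged
```

Fields that are repeatedly flagged on forms outside the training set are a reasonable signal that the model would benefit from more labeled examples.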
Manage your custom models
You can manage your custom models throughout their lifecycle by viewing a list of all custom models under your subscription, retrieving information about a specific custom model, and deleting custom models from your account.
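The three management operations named above can be sketched as REST requests. This sketch assumes the v2.1 paths `GET /formrecognizer/v2.1/custom/models?op=full` for listing, and `GET` or `DELETE` on `/formrecognizer/v2.1/custom/models/{modelId}` for a single model. All function names are hypothetical, and the requests are built without being sent.

```python
import urllib.request


def _models_url(endpoint, model_id=None):
    """Base URL for custom models, optionally for one model ID."""
    base = endpoint.rstrip("/") + "/formrecognizer/v2.1/custom/models"
    return f"{base}/{model_id}" if model_id else base


def _request(url, key, method):
    """Attach the subscription key header to a bodiless request."""
    return urllib.request.Request(
        url, headers={"Ocp-Apim-Subscription-Key": key}, method=method
    )


def build_list_request(endpoint, key):
    """List all custom models (op=full is an assumed query parameter)."""
    return _request(_models_url(endpoint) + "?op=full", key, "GET")


def build_get_request(endpoint, key, model_id):
    """Retrieve information about one custom model."""
    return _request(_models_url(endpoint, model_id), key, "GET")


def build_delete_request(endpoint, key, model_id):
    """Delete one custom model; this cannot be undone."""
    return _request(_models_url(endpoint, model_id), key, "DELETE")
```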
Great! You've learned the steps to create custom and composed models and use them in your Form Recognizer projects and applications.
Next steps
Learn more about the Form Recognizer client library by exploring our API reference documentation.