Train a Form Recognizer model with labels using the sample labeling tool
In this quickstart, you'll use the Form Recognizer REST API with the sample labeling tool to train a custom model with manually labeled data. See the Train with labels section of the overview to learn more about this feature.
If you don't have an Azure subscription, create a free account before you begin.
To complete this quickstart, you must have:
- A set of at least six forms of the same type. You'll use this data to train the model and test a form. You can use a sample data set for this quickstart. Upload the training files to the root of a blob storage container in an Azure Storage account.
Create a Form Recognizer resource
Go to the Azure Portal and create a new Form Recognizer resource. In the Create pane, provide the following information:
|Name||A descriptive name for your resource, for example MyNameFormRecognizer.|
|Subscription||Select the Azure subscription that has been granted access.|
|Location||The location of your cognitive service instance. Different locations may introduce latency, but have no impact on the runtime availability of your resource.|
|Pricing tier||The cost of your resource depends on the pricing tier you choose and your usage. For more information, see the API pricing details.|
|Resource group||The Azure resource group that will contain your resource. You can create a new group or add it to a pre-existing group.|
Normally when you create a Cognitive Service resource in the Azure portal, you have the option to create a multi-service subscription key (used across multiple cognitive services) or a single-service subscription key (used only with a specific cognitive service). However, because Form Recognizer is a preview release, it is not included in the multi-service subscription, and you cannot create the single-service subscription unless you use the link provided in the Welcome email.
When your Form Recognizer resource finishes deploying, find and select it from the All resources list in the portal. Then select the Quick start tab to view your subscription data. Save the values of Key1 and Endpoint to a temporary location. You'll use them in the following steps.
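As a minimal sketch, the saved Key1 and Endpoint values could be kept in environment variables and loaded when needed rather than pasted into scripts. The variable names `FORM_RECOGNIZER_ENDPOINT` and `FORM_RECOGNIZER_KEY` are hypothetical; `Ocp-Apim-Subscription-Key` is the header that Cognitive Services REST calls use to carry the key.

```python
import os


def form_recognizer_settings(env=os.environ):
    """Return (endpoint, headers) built from two environment variables.

    The variable names are illustrative only; use whatever names suit you.
    """
    endpoint = env["FORM_RECOGNIZER_ENDPOINT"].rstrip("/")
    key = env["FORM_RECOGNIZER_KEY"]
    # Cognitive Services REST calls pass the key in this header.
    headers = {"Ocp-Apim-Subscription-Key": key}
    return endpoint, headers
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to try without changing your shell environment.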
Set up the sample labeling tool
You'll use the Docker engine to run the sample labeling tool. Follow these steps to set up the Docker container. For a primer on Docker and container basics, see the Docker overview.
First, install Docker on a host computer. The host computer can be your local computer (Windows, macOS, or Linux). Or, you can use a Docker hosting service in Azure, such as the Azure Kubernetes Service, Azure Container Instances, or a Kubernetes cluster deployed to an Azure Stack. The host computer must meet the following hardware requirements:
|Container|Minimum|Recommended|
|Sample labeling tool|2 core, 4-GB memory|4 core, 8-GB memory|
Get the sample labeling tool container with the docker pull command:
docker pull mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool
Now you're ready to run the container with the docker run command:
docker run -it -p 3000:80 mcr.microsoft.com/azure-cognitive-services/custom-form/labeltool eula=accept
This command will make the sample labeling tool available through a web browser. Go to http://localhost:3000.
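If the page doesn't load, a quick check like the following sketch can tell you whether anything is answering on the mapped port. The function name is hypothetical, and the default URL assumes the `-p 3000:80` port mapping used in the command above.

```python
import urllib.error
import urllib.request


def labeling_tool_is_up(url="http://localhost:3000", timeout=3.0):
    """Return True if the URL answers with HTTP 200, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

A `False` result usually means the container isn't running yet or the port mapping differs from the one assumed here.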
You can also label documents and train models using the Form Recognizer REST API. To train and analyze with the REST API, see Train with labels using the REST API and Python.
Set up input data
First, make sure all the training documents are of the same format. If you have forms in multiple formats, organize them into subfolders based on common format. When you train, you'll need to direct the API to a subfolder.
Configure cross-domain resource sharing (CORS)
Enable CORS on your storage account. Select your storage account in the Azure portal and click the CORS tab on the left pane. On the bottom line, fill in the following values. Then click Save at the top.
- Allowed origins = *
- Allowed methods = [select all]
- Allowed headers = *
- Exposed headers = *
- Max age = 200
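For reference, the values above can be written down as a single rule, which is handy if you later script the storage setup. This is only a sketch: the field names are hypothetical, the method list is an assumption about what "select all" covers in the portal, and the max age is taken to be in seconds.

```python
# Assumed set of methods that "select all" covers in the portal.
ALL_METHODS = ["DELETE", "GET", "HEAD", "MERGE", "OPTIONS", "PATCH", "POST", "PUT"]


def labeling_tool_cors_rule():
    """Return the CORS values from this quickstart as a plain dictionary."""
    return {
        "allowed_origins": ["*"],
        "allowed_methods": list(ALL_METHODS),
        "allowed_headers": ["*"],
        "exposed_headers": ["*"],
        "max_age_in_seconds": 200,
    }
```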
Connect to the sample labeling tool
The sample labeling tool connects to a source (where your original forms are) and a target (where it exports the created labels and output data).
Connections can be set up and shared across projects. They use an extensible provider model, so you can easily add new source/target providers.
To create a new connection, click the New Connections (plug) icon in the left navigation bar.
Fill in the fields with the following values:
- Display Name - The connection display name.
- Description - Your connection description.
- SAS URL - The shared access signature (SAS) URL of your Azure Blob Storage container. To retrieve the SAS URL, open the Microsoft Azure Storage Explorer, right-click your container, and select Get shared access signature. Set the expiry time to some time after you'll have used the service. Make sure the Read, Write, Delete, and List permissions are checked, and click Create. Then copy the value in the URL section. It should have the form:
https://<storage account>.blob.core.windows.net/<container name>?<SAS value>
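Before pasting the SAS URL into the tool, it can help to confirm that it has the expected shape and grants the four permissions. The sketch below assumes the permissions appear in the `sp` query parameter as the letters `r`, `w`, `d`, and `l`; the function name and the example URL are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

REQUIRED_PERMISSIONS = set("rwdl")  # read, write, delete, list


def check_sas_url(sas_url):
    """Split a container SAS URL and report any missing permissions."""
    parts = urlsplit(sas_url)
    query = parse_qs(parts.query)
    # "sp" holds the signed permissions, for example "rwdl".
    granted = set(query.get("sp", [""])[0])
    return {
        "account": parts.hostname.split(".")[0] if parts.hostname else "",
        "container": parts.path.strip("/"),
        "missing_permissions": sorted(REQUIRED_PERMISSIONS - granted),
    }


# Example with a made-up URL that grants only read and list.
report = check_sas_url(
    "https://myaccount.blob.core.windows.net/forms?sv=2019-02-02&sp=rl&sig=abc"
)
```

An empty `missing_permissions` list suggests the URL was created with all four boxes checked.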
Create a new project
In the sample labeling tool, projects store your configurations and settings. Create a new project and fill in the fields with the following values:
- Display Name - The project display name.
- Security Token - Some project settings can include sensitive values, such as API keys or other shared secrets. Each project will generate a security token that can be used to encrypt/decrypt sensitive project settings. You can find security tokens in the Application Settings by clicking the gear icon in the lower corner of the left navigation bar.
- Source Connection - The Azure Blob Storage connection you created in the previous step that you would like to use for this project.
- Folder Path - Optional - If your source forms are located in a folder on the blob container, specify the folder name here.
- Form Recognizer Service Uri - Your Form Recognizer endpoint URL.
- API Key - Your Form Recognizer subscription key.
- Description - Optional - The project description.
Label your forms
When you create or open a project, the main tag editor window opens. The tag editor consists of three parts:
- A resizable preview pane that contains a scrollable list of forms from the source connection.
- The main editor pane that allows you to apply tags.
- The tags editor pane that allows users to modify, lock, reorder, and delete tags.
Identify text elements
Click Run OCR on all files on the left pane to get the text layout information for each document. The labeling tool will draw bounding boxes around each text element.
Apply labels to text
Next, you'll create labels and apply them to the text elements that you want the model to recognize.
First, use the tags editor pane to create the tags (labels) you'd like to identify.
In the main editor, click and drag to select one or more words from the highlighted text elements.
You cannot currently select text that spans across multiple pages.
Click the tag you want to apply, or press the corresponding keyboard key. You can apply only one tag to each selected text element, and each tag can be applied only once per page.
The number keys are assigned as hotkeys for the first ten tags. You can reorder your tags using the up and down arrow icons in the tag editor pane.
Follow the above steps to label five of your forms, and then move on to the next step.
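The once-per-page rule described above can be expressed as a small check. The sketch below assumes you keep your own list of (page number, tag name) pairs; this is a hypothetical structure, not the format the labeling tool stores.

```python
from collections import Counter


def tags_applied_more_than_once(labels):
    """Return the (page, tag) pairs that occur more than once.

    labels: iterable of (page_number, tag_name) pairs, one per applied tag.
    """
    counts = Counter(labels)
    return sorted(pair for pair, n in counts.items() if n > 1)


# "Total" is applied twice on page 1, which the rule does not allow.
duplicates = tags_applied_more_than_once(
    [(1, "Total"), (1, "Date"), (1, "Total"), (2, "Total")]
)
```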
Train a custom model
Click the Train icon (the train car) on the left pane to open the Training page. Then click the Train button to begin training the model. Once the training process completes, you'll see the following information:
- Model ID - The ID of the model that was created and trained. Each training call creates a new model with its own ID. Copy this string to a secure location; you'll need it if you want to do prediction calls through the REST API.
- Average Accuracy - The model's average accuracy. You can improve model accuracy by labeling additional forms and training again to create a new model. We recommend starting by labeling five forms and adding more forms as needed.
- The list of tags, and the estimated accuracy per tag.
After training finishes, examine the Average Accuracy value. If it's low, you should add more input documents and repeat the steps above. The documents you've already labeled will remain in the project index.
You can also run the training process with a REST API call. To learn how to do this, see Train with labels using Python.
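As a rough sketch of what such a call might look like, the helper below builds (but does not send) a training request. The API path and version, the body fields `source`, `sourceFilter`, and `useLabelFile`, and the function name are assumptions not stated in this quickstart; check them against the REST API reference for your service version.

```python
import json
import urllib.request

# Assumed path and API version; not confirmed by this quickstart.
TRAIN_PATH = "/formrecognizer/v2.0-preview/custom/models"


def build_train_request(endpoint, key, sas_url, folder_prefix=""):
    """Build a POST request asking the service to train with label files."""
    body = {
        "source": sas_url,
        "sourceFilter": {"prefix": folder_prefix, "includeSubFolders": False},
        "useLabelFile": True,  # train with the labels created in the tool
    }
    return urllib.request.Request(
        url=endpoint.rstrip("/") + TRAIN_PATH,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately lets you inspect the URL and body first.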
Analyze a form
Click on the Predict (rectangles) icon on the left to test your model. Upload a form document that you haven't used in the training process. Then click the Predict button on the right to get key/value predictions for the form. The tool will apply tags in bounding boxes and will report the confidence of each tag.
You can also run the Analyze API with a REST call. To learn how to do this, see Train with labels using Python.
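A comparable sketch for the analyze step builds a request that posts a form's bytes to a trained model. The path, API version, default content type, and function name are again assumptions to verify against the REST API reference. The Model ID is the one reported on the Training page.

```python
import urllib.request


def build_analyze_request(endpoint, key, model_id, form_bytes,
                          content_type="application/pdf"):
    """Build a POST request that submits one form for analysis."""
    # Assumed path and API version; not confirmed by this quickstart.
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/v2.0-preview"
        f"/custom/models/{model_id}/analyze"
    )
    return urllib.request.Request(
        url=url,
        data=form_bytes,
        headers={
            "Content-Type": content_type,
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )
```

Analysis is likely to run asynchronously, so expect to poll a result URL returned by the service rather than receive the predictions in the first response.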
Depending on the reported accuracy, you may want to do further training to improve the model. After you've done a prediction, examine the confidence values for each of the applied tags. If the average accuracy training value was high, but the confidence scores are low (or the results are inaccurate), you should add the file used for prediction into the training set, label it, and train again.
The reported average accuracy, confidence scores, and actual accuracy can be inconsistent when the analyzed documents differ from those used in training. Keep in mind that some documents look similar when viewed by people but can look distinct to the AI model. For example, you might train with a form type that has two variations, where the training set consists of 20% variation A and 80% variation B. During prediction, the confidence scores for documents of variation A are likely to be lower.
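When reviewing confidence values across many predictions, a small helper can list the tags that fall below a cutoff you choose. This is a sketch only: the input is a hypothetical mapping of tag name to confidence, and the 0.8 threshold is an arbitrary example, not a recommended value.

```python
def tags_needing_review(predictions, threshold=0.8):
    """Return the tag names whose confidence is below the threshold.

    predictions: mapping of tag name -> confidence between 0 and 1.
    """
    return sorted(tag for tag, conf in predictions.items() if conf < threshold)


# Made-up confidence values for illustration.
low = tags_needing_review({"Total": 0.95, "Date": 0.42, "Vendor": 0.79})
```

Forms whose tags appear in this list are good candidates to label and add to the training set.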
Save a project and resume later
To resume your project at another time or in another browser, you need to save your project's security token and reenter it later.
Get project credentials
Go to your project settings page (slider icon) and take note of the security token name. Then go to your application settings (gear icon), which shows all of the security tokens in your current browser instance. Find your project's security token and copy its name and key value to a secure location.
Restore project credentials
When you want to resume your project, you first need to create a connection to the same blob storage container. Repeat the steps above to do this. Then, go to the application settings page (gear icon) and see if your project's security token is there. If it isn't, add a new security token and copy over your token name and key from the previous step. Then click Save Settings.
Resume a project
Finally, go to the main page (house icon) and click Open Cloud Project. Then select the blob storage connection, and select your project's .vott file. The application will load all of the project's settings because it has the security token.
In this quickstart, you learned how to use the Form Recognizer sample labeling tool to train a model with manually labeled data. If you'd like to integrate the labeling tool into your own application, use the REST APIs that deal with labeled data training.