Azure Form Recognizer client library for Java - version 3.1.13
Azure Cognitive Services Form Recognizer is a cloud service that uses machine learning to recognize text and table data from form documents. It includes the following main functionalities:
- Custom models - Recognize field values and table data from forms. These models are trained with your own data, so they're tailored to your forms. You can then take these custom models and recognize forms. You can also manage the custom models you've created and see how close you are to the limit of custom models your account can hold.
- Content API - Recognize text and table structures, along with their bounding box coordinates, from documents. Corresponds to the REST service's Layout API.
- Prebuilt receipt model - Recognize data from sales receipts using a prebuilt model.
- Prebuilt invoice model - Recognize data from USA sales invoices using a prebuilt model.
- Prebuilt business card model - Recognize data from business cards using a prebuilt model.
- Prebuilt identity document model - Recognize data from identity documents using a prebuilt model.
Source code | Package (Maven) | API reference documentation | Product Documentation | Samples
Getting started
Prerequisites
- A Java Development Kit (JDK), version 8 or later.
- Azure Subscription
- Cognitive Services or Form Recognizer account to use this package.
Include the Package
Include the BOM file
Include the azure-sdk-bom in your project to take a dependency on the General Availability (GA) version of the library. In the following snippet, replace the {bom_version_to_target} placeholder with the version number. To learn more about the BOM, see the AZURE SDK BOM README.
<dependencyManagement>
<dependencies>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-sdk-bom</artifactId>
<version>{bom_version_to_target}</version>
<type>pom</type>
<scope>import</scope>
</dependency>
</dependencies>
</dependencyManagement>
Then include the direct dependency in the dependencies section without the version tag.
<dependencies>
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-ai-formrecognizer</artifactId>
</dependency>
</dependencies>
Include direct dependency
If you want to take a dependency on a particular version of the library that is not present in the BOM, add the direct dependency to your project as follows.
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-ai-formrecognizer</artifactId>
<version>3.1.12</version>
</dependency>
Note: This version of the client library defaults to the v2.1 version of the service.
This table shows the relationship between SDK versions and supported API versions of the service:
| SDK version | Supported API version of service |
|---|---|
| 3.0.x | 2.0 |
| 3.1.X - Latest GA release | 2.0, 2.1 (default) |
Create a Form Recognizer resource
Form Recognizer supports both multi-service and single-service access. Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource.
You can create either resource using one of the following options:
- Option 1: Azure Portal
- Option 2: Azure CLI
Below is an example of how you can create a Form Recognizer resource using the CLI:
# Create a new resource group to hold the Form Recognizer resource -
# if using an existing resource group, skip this step
az group create --name <your-resource-group> --location <location>
# Create Form Recognizer
az cognitiveservices account create \
--name <your-form-recognizer-resource-name> \
--resource-group <your-resource-group> \
--kind FormRecognizer \
--sku <sku> \
--location <location> \
--yes
Authenticate the client
To interact with the Form Recognizer service, you need to create an instance of the Form Recognizer client.
Both the asynchronous and synchronous clients can be created by using FormRecognizerClientBuilder. Invoking buildClient()
creates the synchronous client, while invoking buildAsyncClient() creates its asynchronous counterpart.
You need an endpoint and a key to instantiate a client object.
Looking up the endpoint
You can find the endpoint for your Form Recognizer resource in the Azure Portal or by using the Azure CLI:
# Get the endpoint for the resource
az cognitiveservices account show --name "resource-name" --resource-group "resource-group-name" --query "endpoint"
Create a Form Recognizer client using AzureKeyCredential
To use AzureKeyCredential authentication, provide the key as a string to the AzureKeyCredential.
This key can be found in the Azure Portal in your created Form Recognizer
resource, or by running the following Azure CLI command to get the key from the Form Recognizer resource:
az cognitiveservices account keys list --resource-group <your-resource-group-name> --name <your-resource-name>
Use the API key as the credential parameter to authenticate the client:
FormRecognizerClient formRecognizerClient = new FormRecognizerClientBuilder()
.credential(new AzureKeyCredential("{key}"))
.endpoint("{endpoint}")
.buildClient();
FormTrainingClient formTrainingClient = new FormTrainingClientBuilder()
.credential(new AzureKeyCredential("{key}"))
.endpoint("{endpoint}")
.buildClient();
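The snippets above pass the key and endpoint as string literals. One way to keep them out of source code is to read them from environment variables before building the client. The following is a minimal sketch using only the JDK; the variable names FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY and the FormRecognizerSettings class are hypothetical and are not defined by the library.

```java
import java.net.URI;
import java.util.Map;

// Hypothetical helper: loads and validates the endpoint and key from a map of
// environment variables. The variable names are assumptions for illustration.
final class FormRecognizerSettings {
    final URI endpoint;
    final String key;

    private FormRecognizerSettings(URI endpoint, String key) {
        this.endpoint = endpoint;
        this.key = key;
    }

    static FormRecognizerSettings from(Map<String, String> env) {
        String endpoint = require(env, "FORM_RECOGNIZER_ENDPOINT");
        String key = require(env, "FORM_RECOGNIZER_KEY");
        // URI.create throws IllegalArgumentException on malformed input.
        URI uri = URI.create(endpoint);
        if (!"https".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
            throw new IllegalArgumentException("Endpoint must be an absolute https URL: " + endpoint);
        }
        return new FormRecognizerSettings(uri, key);
    }

    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            FormRecognizerSettings settings = from(System.getenv());
            // Never print the key itself; the endpoint is enough to confirm the setup.
            System.out.println("Using endpoint " + settings.endpoint);
        } catch (RuntimeException e) {
            System.out.println("Configuration problem: " + e.getMessage());
        }
    }
}
```

The resulting values can then be passed to the builder, for example `.endpoint(settings.endpoint.toString())` and `.credential(new AzureKeyCredential(settings.key))`.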
The Azure Form Recognizer client library provides a way to rotate the existing key.
AzureKeyCredential credential = new AzureKeyCredential("{key}");
FormRecognizerClient formRecognizerClient = new FormRecognizerClientBuilder()
.credential(credential)
.endpoint("{endpoint}")
.buildClient();
credential.update("{new_key}");
Create a Form Recognizer client with Azure Active Directory credential
The Azure SDK for Java supports an Azure Identity package, making it easy to get credentials from the Microsoft identity platform.
Authentication with AAD requires some initial setup:
- Add the Azure Identity package
<dependency>
<groupId>com.azure</groupId>
<artifactId>azure-identity</artifactId>
<version>1.5.2</version>
</dependency>
- Register a new Azure Active Directory application
- Grant access to Form Recognizer by assigning the "Cognitive Services User" role to your service principal.
After the setup, you can choose which type of credential from the azure-identity package to use. As an example, DefaultAzureCredential can be used to authenticate the client. Set the values of the client ID, tenant ID, and client secret of the AAD application as environment variables: AZURE_CLIENT_ID, AZURE_TENANT_ID, AZURE_CLIENT_SECRET.
Authorization is easiest using DefaultAzureCredential, which finds the best credential to use in its running environment. For more information about using Azure Active Directory authorization with Form Recognizer, refer to the associated documentation.
TokenCredential credential = new DefaultAzureCredentialBuilder().build();
FormRecognizerClient formRecognizerClient = new FormRecognizerClientBuilder()
.endpoint("{endpoint}")
.credential(credential)
.buildClient();
Key concepts
FormRecognizerClient
The FormRecognizerClient and FormRecognizerAsyncClient provide both synchronous and asynchronous operations for:
- Recognizing form fields and content using custom models trained to recognize your custom forms. These values are returned in a collection of RecognizedForm objects. See example Recognize Custom Forms.
- Recognizing form content, including tables, lines and words, without the need to train a model. Form content is returned in a collection of FormPage objects. See example Recognize Content.
- Recognizing common fields from the following form types using prebuilt models. These fields and metadata are returned in a collection of RecognizedForm objects. Supported prebuilt models:
  - Receipts
  - Business cards
  - Invoices
  - Identity Documents
  See example Prebuilt Models.
FormTrainingClient
The FormTrainingClient and FormTrainingAsyncClient provide both synchronous and asynchronous operations for:
- Training custom models to recognize all fields and values found in your custom forms. See example Train a model. A CustomFormModel is returned indicating the form types the model will recognize, and the fields it will extract for each form type. See the service's documents for a more detailed explanation.
- Training custom models to recognize specific fields and values you specify by labeling your custom forms. A CustomFormModel is returned indicating the fields the model will extract, as well as the estimated accuracy for each field. See the service's documents for a more detailed explanation.
- Managing models created in your account. See example Manage models.
- Copying a custom model from one Form Recognizer resource to another.
- Creating a composed model from a collection of existing trained models with labels.
Please note that models can also be trained using a graphical user interface such as the Form Recognizer Labeling Tool.
Long-Running Operations
Long-running operations are operations which consist of an initial request sent to the service to start an operation, followed by polling the service at intervals to determine whether the operation has completed or failed, and if it has succeeded, to get the result.
Methods that train models or recognize values from forms are modeled as long-running operations. The client exposes
a begin<MethodName> method that returns a SyncPoller or PollerFlux instance.
Callers should wait for the operation to complete by calling getFinalResult() on the poller returned from the
begin<MethodName> method. The Examples section includes code snippets that illustrate using long-running operations.
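The start, poll, and fetch-result cycle described in this section can be sketched in plain Java. This is a simplified model for illustration only, not the SyncPoller implementation: PollingSketch, PollResponse, Status, and waitForCompletion are hypothetical names, and the fake operation in main simply succeeds on its third poll.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Simplified, hypothetical model of a long-running operation; not the SDK's SyncPoller.
final class PollingSketch {
    enum Status { IN_PROGRESS, SUCCEEDED, FAILED }

    // One poll response: the current status plus the result, which is set only on success.
    static final class PollResponse<T> {
        final Status status;
        final T result;

        PollResponse(Status status, T result) {
            this.status = status;
            this.result = result;
        }
    }

    // Polls until the operation succeeds or fails, sleeping between polls.
    static <T> T waitForCompletion(Supplier<PollResponse<T>> pollOnce, long intervalMillis, int maxPolls) {
        for (int attempt = 0; attempt < maxPolls; attempt++) {
            PollResponse<T> response = pollOnce.get();
            if (response.status == Status.SUCCEEDED) {
                return response.result;
            }
            if (response.status == Status.FAILED) {
                throw new IllegalStateException("Operation failed");
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
        throw new IllegalStateException("Operation did not complete after " + maxPolls + " polls");
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Fake service: reports IN_PROGRESS twice, then SUCCEEDED with a made-up model ID.
        Supplier<PollResponse<String>> fakeService = () -> calls.incrementAndGet() < 3
            ? new PollResponse<String>(Status.IN_PROGRESS, null)
            : new PollResponse<String>(Status.SUCCEEDED, "model-123");
        String modelId = waitForCompletion(fakeService, 10L, 10);
        System.out.println(modelId + " after " + calls.get() + " polls"); // model-123 after 3 polls
    }
}
```

With the client library you do not write this loop yourself; calling getFinalResult() on the returned poller blocks until the operation reaches a terminal state.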
Examples
The following section provides several code snippets covering some of the most common Form Recognizer tasks, including:
- Recognize Forms Using a Custom Model
- Recognize Content
- Use Prebuilt Models
- Train a Model
- Manage Your Models
Recognize Forms Using a Custom Model
Recognize the name/value pairs and table data from forms. These models are trained with your own data, so they're tailored to your forms. You should only recognize forms of the same form type that the custom model was trained on.
String formUrl = "{form_url}";
String modelId = "{custom_trained_model_id}";
SyncPoller<FormRecognizerOperationResult, List<RecognizedForm>> recognizeFormPoller =
formRecognizerClient.beginRecognizeCustomFormsFromUrl(modelId, formUrl);
List<RecognizedForm> recognizedForms = recognizeFormPoller.getFinalResult();
for (int i = 0; i < recognizedForms.size(); i++) {
RecognizedForm form = recognizedForms.get(i);
System.out.printf("----------- Recognized custom form info for page %d -----------%n", i);
System.out.printf("Form type: %s%n", form.getFormType());
System.out.printf("Form type confidence: %.2f%n", form.getFormTypeConfidence());
form.getFields().forEach((label, formField) ->
System.out.printf("Field %s has value %s with confidence score of %f.%n", label,
formField.getValueData().getText(),
formField.getConfidence())
);
}
Recognize Content
Recognize text, table structures and selection marks like radio buttons and check boxes, along with their bounding box coordinates, from documents, without the need to train a model.
// recognize form content using file input stream
File form = new File("local/file_path/filename.png");
byte[] fileContent = Files.readAllBytes(form.toPath());
InputStream inputStream = new ByteArrayInputStream(fileContent);
SyncPoller<FormRecognizerOperationResult, List<FormPage>> recognizeContentPoller =
formRecognizerClient.beginRecognizeContent(inputStream, form.length());
List<FormPage> contentPageResults = recognizeContentPoller.getFinalResult();
for (int i = 0; i < contentPageResults.size(); i++) {
FormPage formPage = contentPageResults.get(i);
System.out.printf("----Recognizing content info for page %d ----%n", i);
// Page dimensions
System.out.printf("Has width: %f and height: %f, measured with unit: %s.%n", formPage.getWidth(),
formPage.getHeight(),
formPage.getUnit());
// Table information
formPage.getTables().forEach(formTable -> {
System.out.printf("Table has %d rows and %d columns.%n", formTable.getRowCount(),
formTable.getColumnCount());
formTable.getCells().forEach(formTableCell ->
System.out.printf("Cell has text %s.%n", formTableCell.getText()));
});
// Selection Mark
formPage.getSelectionMarks().forEach(selectionMark -> System.out.printf(
"Page: %s, Selection mark is %s within bounding box %s has a confidence score %.2f.%n",
selectionMark.getPageNumber(), selectionMark.getState(), selectionMark.getBoundingBox().toString(),
selectionMark.getConfidence()));
}
Use Prebuilt Models
Extract fields from certain types of common forms using prebuilt models provided by the Form Recognizer service. Supported prebuilt models are:
- Business cards. See fields found on a business card here.
- Invoices. See fields found on an invoice here.
- Identity documents. See fields found on an identity document here.
- Sales receipts. See fields found on a receipt here.
For example, to extract fields from a sales receipt, use the prebuilt Receipt model provided by the beginRecognizeReceiptsFromUrl method:
See StronglyTypedRecognizedForm for a suggested approach to extract information from receipts.
String receiptUrl = "https://raw.githubusercontent.com/Azure/azure-sdk-for-java/main/sdk/formrecognizer"
+ "/azure-ai-formrecognizer/src/samples/resources/sample-forms/receipts/contoso-allinone.jpg";
SyncPoller<FormRecognizerOperationResult, List<RecognizedForm>> syncPoller =
formRecognizerClient.beginRecognizeReceiptsFromUrl(receiptUrl);
List<RecognizedForm> receiptPageResults = syncPoller.getFinalResult();
for (int i = 0; i < receiptPageResults.size(); i++) {
RecognizedForm recognizedForm = receiptPageResults.get(i);
Map<String, FormField> recognizedFields = recognizedForm.getFields();
System.out.printf("----------- Recognizing receipt info for page %d -----------%n", i);
FormField merchantNameField = recognizedFields.get("MerchantName");
if (merchantNameField != null) {
if (FieldValueType.STRING == merchantNameField.getValue().getValueType()) {
String merchantName = merchantNameField.getValue().asString();
System.out.printf("Merchant Name: %s, confidence: %.2f%n",
merchantName, merchantNameField.getConfidence());
}
}
FormField merchantPhoneNumberField = recognizedFields.get("MerchantPhoneNumber");
if (merchantPhoneNumberField != null) {
if (FieldValueType.PHONE_NUMBER == merchantPhoneNumberField.getValue().getValueType()) {
String merchantPhoneNumber = merchantPhoneNumberField.getValue().asPhoneNumber();
System.out.printf("Merchant Phone number: %s, confidence: %.2f%n",
merchantPhoneNumber, merchantPhoneNumberField.getConfidence());
}
}
FormField transactionDateField = recognizedFields.get("TransactionDate");
if (transactionDateField != null) {
if (FieldValueType.DATE == transactionDateField.getValue().getValueType()) {
LocalDate transactionDate = transactionDateField.getValue().asDate();
System.out.printf("Transaction Date: %s, confidence: %.2f%n",
transactionDate, transactionDateField.getConfidence());
}
}
FormField receiptItemsField = recognizedFields.get("Items");
if (receiptItemsField != null) {
System.out.printf("Receipt Items: %n");
if (FieldValueType.LIST == receiptItemsField.getValue().getValueType()) {
List<FormField> receiptItems = receiptItemsField.getValue().asList();
receiptItems.stream()
.filter(receiptItem -> FieldValueType.MAP == receiptItem.getValue().getValueType())
.map(formField -> formField.getValue().asMap())
.forEach(formFieldMap -> formFieldMap.forEach((key, formField) -> {
if ("Quantity".equals(key)) {
if (FieldValueType.FLOAT == formField.getValue().getValueType()) {
Float quantity = formField.getValue().asFloat();
System.out.printf("Quantity: %f, confidence: %.2f%n",
quantity, formField.getConfidence());
}
}
}));
}
}
}
For more information and samples using prebuilt models, see the business card, identity document, invoice, and receipt samples listed under Next steps.
Train a model
Train a machine-learned model on your own form type. The resulting model will be able to recognize values from the types of forms it was trained on. Provide a container SAS url to your Azure Storage Blob container where you're storing the training documents. See details on setting this up in the service quickstart documentation.
String trainingFilesUrl = "{SAS_URL_of_your_container_in_blob_storage}";
SyncPoller<FormRecognizerOperationResult, CustomFormModel> trainingPoller =
formTrainingClient.beginTraining(trainingFilesUrl,
false,
new TrainingOptions()
.setModelName("my model trained without labels"),
Context.NONE);
CustomFormModel customFormModel = trainingPoller.getFinalResult();
// Model Info
System.out.printf("Model Id: %s%n", customFormModel.getModelId());
System.out.printf("Model name given by user: %s%n", customFormModel.getModelName());
System.out.printf("Model Status: %s%n", customFormModel.getModelStatus());
System.out.printf("Training started on: %s%n", customFormModel.getTrainingStartedOn());
System.out.printf("Training completed on: %s%n%n", customFormModel.getTrainingCompletedOn());
System.out.println("Recognized Fields:");
// looping through the subModels, which contains the fields they were trained on
// Since the given training documents are unlabeled, we still group them but they do not have a label.
customFormModel.getSubmodels().forEach(customFormSubmodel -> {
System.out.printf("Submodel Id: %s%n", customFormSubmodel.getModelId());
// Since the training data is unlabeled, we are unable to return the accuracy of this model
customFormSubmodel.getFields().forEach((field, customFormModelField) ->
System.out.printf("Field: %s Field Label: %s%n",
field, customFormModelField.getLabel()));
});
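Because training fails if the container URL is not usable, it can help to run a quick local check before calling beginTraining. The sketch below is a heuristic under stated assumptions, not a validation performed by the library: it only checks that the string is an absolute https URL whose query string carries a non-empty `sig` (signature) parameter. SasUrlCheck and looksLikeContainerSasUrl are hypothetical names, and the sample URL is made up.

```java
import java.net.URI;

// Hypothetical pre-flight check for a container SAS URL; a heuristic only.
final class SasUrlCheck {
    static boolean looksLikeContainerSasUrl(String url) {
        if (url == null) {
            return false;
        }
        URI uri;
        try {
            uri = URI.create(url.trim());
        } catch (IllegalArgumentException e) {
            return false; // not a syntactically valid URI
        }
        if (!"https".equalsIgnoreCase(uri.getScheme()) || uri.getHost() == null) {
            return false;
        }
        String query = uri.getRawQuery();
        if (query == null) {
            return false; // no SAS token appended at all
        }
        // Look for a non-empty signature parameter in the query string.
        for (String pair : query.split("&")) {
            if (pair.startsWith("sig=") && pair.length() > "sig=".length()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String sample = "https://account.blob.core.windows.net/container?sv=2020-08-04&sp=rl&sig=abc%2Fdef";
        System.out.println(looksLikeContainerSasUrl(sample)); // true
        System.out.println(looksLikeContainerSasUrl("https://account.blob.core.windows.net/container")); // false
    }
}
```

A passing check does not guarantee that the token is unexpired or grants the permissions the service needs; it only catches obviously malformed input early.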
Manage your models
Manage the custom models in your Form Recognizer account.
// First, we see how many custom models we have, and what our limit is
AccountProperties accountProperties = formTrainingClient.getAccountProperties();
System.out.printf("The account has %d custom models, and we can have at most %d custom models%n",
accountProperties.getCustomModelCount(), accountProperties.getCustomModelLimit());
// Next, we get a paged list of all of our custom models
PagedIterable<CustomFormModelInfo> customModels = formTrainingClient.listCustomModels();
System.out.println("We have following models in the account:");
customModels.forEach(customFormModelInfo -> {
System.out.printf("Model Id: %s%n", customFormModelInfo.getModelId());
// get specific custom model info
CustomFormModel customModel = formTrainingClient.getCustomModel(customFormModelInfo.getModelId());
System.out.printf("Model Status: %s%n", customModel.getModelStatus());
System.out.printf("Training started on: %s%n", customModel.getTrainingStartedOn());
System.out.printf("Training completed on: %s%n", customModel.getTrainingCompletedOn());
customModel.getSubmodels().forEach(customFormSubmodel -> {
System.out.printf("Custom Model Form type: %s%n", customFormSubmodel.getFormType());
System.out.printf("Custom Model Accuracy: %f%n", customFormSubmodel.getAccuracy());
if (customFormSubmodel.getFields() != null) {
customFormSubmodel.getFields().forEach((fieldText, customFormModelField) -> {
System.out.printf("Field Text: %s%n", fieldText);
System.out.printf("Field Accuracy: %f%n", customFormModelField.getAccuracy());
});
}
});
});
// Delete Custom Model
formTrainingClient.deleteModel("{modelId}");
For more detailed examples, refer to samples.
Troubleshooting
General
Form Recognizer clients raise HttpResponseException exceptions. For example, if you
provide an invalid file source URL, an HttpResponseException is raised with an error indicating the failure cause.
In the following code snippet, the error is handled
gracefully by catching the exception and displaying additional information about the error.
try {
formRecognizerClient.beginRecognizeContentFromUrl("invalidSourceUrl");
} catch (HttpResponseException e) {
System.out.println(e.getMessage());
}
Enable client logging
Azure SDKs for Java offer a consistent logging story to aid in troubleshooting application errors and expedite their resolution. The logs produced capture the flow of an application before it reaches the terminal state, which helps locate the root issue. View the logging wiki for guidance about enabling logging.
Default HTTP Client
All client libraries by default use the Netty HTTP client. Adding the above dependency will automatically configure the client library to use the Netty HTTP client. Configuring or changing the HTTP client is detailed in the HTTP clients wiki.
Next steps
The following section provides several code snippets illustrating common patterns used in the Form Recognizer API. These code samples show common scenario operations with the Azure Form Recognizer client library.
- Recognize business card from a URL: RecognizeBusinessCardFromUrl
- Recognize identity documents from a URL: RecognizeIdentityDocumentsFromUrl
- Recognize invoice from a URL: RecognizeInvoiceFromUrl
- Recognize receipts: RecognizeReceipts
- Recognize receipts from a URL: RecognizeReceiptsFromUrl
- Recognize content: RecognizeContent
- Recognize custom forms from a URL: RecognizeCustomFormsFromUrl
- Train a model without labels: TrainModelWithoutLabels
- Train a model with labels: TrainModelWithLabels
- Manage custom models: ManageCustomModels
- Copy a model between Form Recognizer resources: CopyModel
Async APIs
All the examples shown so far have been using synchronous APIs, but we provide full support for async APIs as well.
You'll need to use FormRecognizerAsyncClient:
FormRecognizerAsyncClient formRecognizerAsyncClient = new FormRecognizerClientBuilder()
.credential(new AzureKeyCredential("{key}"))
.endpoint("{endpoint}")
.buildAsyncClient();
- Recognize business card from a URL: RecognizeBusinessCardFromUrlAsync
- Recognize identity documents from a URL: RecognizeIdentityDocumentsFromUrlAsync
- Recognize invoice: RecognizeInvoiceAsync
- Recognize receipts: RecognizeReceiptsAsync
- Recognize receipts from a URL: RecognizeReceiptsFromUrlAsync
- Recognize content from a URL: RecognizeContentFromUrlAsync
- Recognize custom forms: RecognizeCustomFormsAsync
- Train a model without labels: TrainModelWithoutLabelsAsync
- Train a model with labels: TrainModelWithLabelsAsync
- Manage custom models: ManageCustomModelsAsync
- Copy a model between Form Recognizer resources: CopyModelAsync
- Create a composed model from a collection of models trained with labels: CreateComposedModelAsync
Additional documentation
For more extensive documentation on Azure Cognitive Services Form Recognizer, see the Form Recognizer documentation.
Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
