Quickstart: Java client library SDK v3.0 | Preview
Note
Form Recognizer v3.0 is currently in public preview. Some features may not be supported or have limited capabilities.
Reference documentation | Library source code | Package (Maven) | Samples
Get started with Azure Form Recognizer using the Java programming language. Azure Form Recognizer is a cloud-based Azure Applied AI Service that uses machine learning to extract and analyze form fields, text, and tables from your documents. You can easily call Form Recognizer models by integrating our client library SDKs into your workflows and applications. We recommend that you use the free service when you're learning the technology. Remember that the number of free pages is limited to 500 per month.
To learn more about Form Recognizer features and development options, visit our Overview page.
In this quickstart, you'll use the following features to analyze and extract data and values from forms and documents:
🆕 General document: Analyze and extract text, tables, structure, key-value pairs, and named entities.
Layout: Analyze and extract tables, lines, words, and selection marks like radio buttons and check boxes in forms documents, without the need to train a model.
Prebuilt Invoice: Analyze and extract common fields from invoices, using a pre-trained invoice model.
Prerequisites
- Azure subscription - Create one for free.
- A Java Development Kit (JDK), version 8 or later
- A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code later in the quickstart.
Set up
Create a new Gradle project
In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app called form-recognizer-app, and navigate to it.
mkdir form-recognizer-app && cd form-recognizer-app
Run the gradle init command from your working directory. This command creates essential build files for Gradle, including build.gradle.kts, which Gradle uses to build and configure your application.
gradle init --type basic
When prompted to choose a DSL, select Kotlin.
Accept the default project name (form-recognizer-app)
Install the client library
This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.
In your project's build.gradle.kts file, include the client library as an implementation statement, along with the required plugins and settings.
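plugins {
    java
    application
}
application {
    // Fully qualified name: FormRecognizer.java declares the package com.azure.ai.formrecognizer.
    mainClass.set("com.azure.ai.formrecognizer.FormRecognizer")
}
repositories {
    mavenCentral()
}
dependencies {
    implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "4.0.0-beta.1")
}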
Create a Java file
From your working directory, run the following command:
mkdir -p src/main/java
This command creates the src/main/java directory structure.
Navigate to the new folder and create a file called FormRecognizer.java. Open it in your preferred editor or IDE and add the following package declaration and import statements:
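package com.azure.ai.formrecognizer;

// DocumentAnalysisClient and DocumentAnalysisClientBuilder live in this package, so they need no import.
import com.azure.ai.formrecognizer.models.AnalyzeDocumentOptions;
import com.azure.ai.formrecognizer.models.AnalyzeResult;
import com.azure.ai.formrecognizer.models.AnalyzedDocument;
import com.azure.ai.formrecognizer.models.DocumentField;
import com.azure.ai.formrecognizer.models.DocumentFieldType;
import com.azure.ai.formrecognizer.models.DocumentOperationResult;
import com.azure.ai.formrecognizer.models.DocumentTable;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.http.HttpPipeline;
import com.azure.core.http.HttpPipelineBuilder;
import com.azure.core.util.Context;
import com.azure.core.util.polling.SyncPoller;

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.Map;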
Select a code sample to copy and paste into your application's main method:
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. See the Cognitive Services security article for more information.
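One way to keep the key out of your source code, in the spirit of the note above, is to read it from environment variables at startup. The following is a minimal sketch and not part of the Form Recognizer SDK. The class name FormRecognizerSettings and the variable names FORM_RECOGNIZER_KEY and FORM_RECOGNIZER_ENDPOINT are assumptions chosen for illustration.

```java
import java.util.Map;

// Hypothetical helper (not part of the Form Recognizer SDK): looks up
// required settings in a map of environment variables.
final class FormRecognizerSettings {

    // Assumed variable names; use whatever names your environment defines.
    static final String KEY_VAR = "FORM_RECOGNIZER_KEY";
    static final String ENDPOINT_VAR = "FORM_RECOGNIZER_ENDPOINT";

    // Returns the trimmed value of a variable, or fails fast if it is missing or blank.
    static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        String key = require(env, KEY_VAR);
        String endpoint = require(env, ENDPOINT_VAR);
        // Never print the key itself; only confirm that it was found.
        System.out.println("Endpoint: " + endpoint + ", key length: " + key.length());
    }
}
```

With a helper like this, the values returned by require could be passed to AzureKeyCredential and to the builder's endpoint method instead of string literals.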
Try it: General document model
- For this example, you'll need a form document file at a URI. You can use our sample form document for this quickstart.
- To analyze a given file at a URI, you'll use the beginAnalyzeDocumentFromUrl method and pass prebuilt-document as the model Id. The returned value is an AnalyzeResult object containing data about the submitted document.
- We've added the file URI value to the documentUrl variable in the main method.
- For simplicity, not all the entity fields that the service returns are shown here. To see the list of all supported fields and corresponding types, see our General document concept page.
Update your application's FormRecognizer class, with the following code (be certain to update the key and endpoint variables with values from your Form Recognizer instance in the Azure portal):
public class FormRecognizer {

    static final String key = "PASTE_YOUR_FORM_RECOGNIZER_SUBSCRIPTION_KEY_HERE";
    static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";

    public static void main(String[] args) {

        // Create the client from the key and endpoint defined above.
        DocumentAnalysisClient documentAnalysisClient = new DocumentAnalysisClientBuilder()
            .credential(new AzureKeyCredential(key))
            .endpoint(endpoint)
            .buildClient();

        String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
        String modelId = "prebuilt-document";

        SyncPoller<DocumentOperationResult, AnalyzeResult> analyzeDocumentPoller =
            documentAnalysisClient.beginAnalyzeDocumentFromUrl(modelId, documentUrl);

        // getFinalResult() waits until the analysis operation completes.
        AnalyzeResult analyzeResult = analyzeDocumentPoller.getFinalResult();

        for (int i = 0; i < analyzeResult.getDocuments().size(); i++) {
            final AnalyzedDocument analyzedDocument = analyzeResult.getDocuments().get(i);
            System.out.printf("----------- Analyzing document %d -----------%n", i);
            System.out.printf("Analyzed document has doc type %s with confidence : %.2f%n",
                analyzedDocument.getDocType(), analyzedDocument.getConfidence());
        }

        // pages
        analyzeResult.getPages().forEach(documentPage -> {
            System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
                documentPage.getWidth(),
                documentPage.getHeight(),
                documentPage.getUnit());

            // lines
            documentPage.getLines().forEach(documentLine ->
                System.out.printf("Line %s is within a bounding box %s.%n",
                    documentLine.getContent(),
                    documentLine.getBoundingBox().toString()));

            // words
            documentPage.getWords().forEach(documentWord ->
                System.out.printf("Word %s has a confidence score of %.2f.%n",
                    documentWord.getContent(),
                    documentWord.getConfidence()));
        });

        // tables
        List<DocumentTable> tables = analyzeResult.getTables();
        for (int i = 0; i < tables.size(); i++) {
            DocumentTable documentTable = tables.get(i);
            System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
                documentTable.getColumnCount());
            documentTable.getCells().forEach(documentTableCell -> {
                System.out.printf("Cell '%s', has row index %d and column index %d.%n",
                    documentTableCell.getContent(),
                    documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
            });
            System.out.println();
        }

        // entities
        analyzeResult.getEntities().forEach(documentEntity -> {
            System.out.printf("Entity category: %s, sub-category: %s%n",
                documentEntity.getCategory(), documentEntity.getSubCategory());
            System.out.printf("Entity content: %s%n", documentEntity.getContent());
            System.out.printf("Entity confidence: %.2f%n", documentEntity.getConfidence());
        });

        // key-value pairs
        analyzeResult.getKeyValuePairs().forEach(documentKeyValuePair -> {
            System.out.printf("Key content: %s%n", documentKeyValuePair.getKey().getContent());
            System.out.printf("Key content bounding region: %s%n",
                documentKeyValuePair.getKey().getBoundingRegions().toString());
            System.out.printf("Value content: %s%n", documentKeyValuePair.getValue().getContent());
            System.out.printf("Value content bounding region: %s%n",
                documentKeyValuePair.getValue().getBoundingRegions().toString());
        });
    }
}
Build a custom document analysis model
In 3.x.x, creating a custom model required specifying useTrainingLabels to indicate whether to use labeled data when creating the custom model with the beginTraining method.
In 4.x.x, we introduced the new general document model (prebuilt-document) to replace the train-without-labels functionality from 3.x.x. The general document model extracts entities, key-value pairs, and layout from a document. Otherwise, in 4.x.x the beginBuildModel method always returns labeled data.
Train a custom model using 3.x.x beginTraining:
// This 3.x.x snippet assumes an existing FormTrainingClient named formTrainingClient.
String trainingFilesUrl = "{SAS_URL_of_your_container_in_blob_storage}";
SyncPoller<FormRecognizerOperationResult, CustomFormModel> trainingPoller =
    formTrainingClient.beginTraining(trainingFilesUrl,
        false,
        new TrainingOptions()
            .setModelName("my model trained without labels"),
        Context.NONE);

CustomFormModel customFormModel = trainingPoller.getFinalResult();

// Model Info
System.out.printf("Model Id: %s%n", customFormModel.getModelId());
System.out.printf("Model name given by user: %s%n", customFormModel.getModelName());
System.out.printf("Model Status: %s%n", customFormModel.getModelStatus());
System.out.printf("Training started on: %s%n", customFormModel.getTrainingStartedOn());
System.out.printf("Training completed on: %s%n%n", customFormModel.getTrainingCompletedOn());

System.out.println("Recognized Fields:");
// Loop through the submodels, which contain the fields they were trained on.
// Because the training documents are unlabeled, the fields are still grouped but have no label.
customFormModel.getSubmodels().forEach(customFormSubmodel -> {
    System.out.printf("Submodel Id: %s%n", customFormSubmodel.getModelId());
    // Because the training data is unlabeled, the accuracy of this model can't be returned.
    customFormSubmodel.getFields().forEach((field, customFormModelField) ->
        System.out.printf("Field: %s Field Label: %s%n",
            field, customFormModelField.getLabel()));
});
System.out.println();
customFormModel.getTrainingDocuments().forEach(trainingDocumentInfo -> {
    System.out.printf("Document name: %s%n", trainingDocumentInfo.getName());
    System.out.printf("Document status: %s%n", trainingDocumentInfo.getStatus());
    System.out.printf("Document page count: %d%n", trainingDocumentInfo.getPageCount());
    if (!trainingDocumentInfo.getErrors().isEmpty()) {
        System.out.println("Document Errors:");
        trainingDocumentInfo.getErrors().forEach(formRecognizerError ->
            System.out.printf("Error code %s, Error message: %s%n", formRecognizerError.getErrorCode(),
                formRecognizerError.getMessage()));
    }
});
Try it: Layout model
Extract text, selection marks, text styles, and table structures, along with their bounding region coordinates from documents.
- For this example, you'll need a form document file at a URI. You can use our sample form document for this quickstart.
- To analyze a given file at a URI, you'll use the beginAnalyzeDocumentFromUrl method and pass prebuilt-layout as the model Id. The returned value is an AnalyzeResult object containing data about the submitted document.
- We've added the file URI value to the documentUrl variable in the main method.
Update your application's FormRecognizer class, with the following code (be certain to update the key and endpoint variables with values from your Form Recognizer instance in the Azure portal):
public class FormRecognizer {

    static final String key = "PASTE_YOUR_FORM_RECOGNIZER_SUBSCRIPTION_KEY_HERE";
    static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";

    public static void main(String[] args) {

        // Create the client from the key and endpoint defined above.
        DocumentAnalysisClient documentAnalysisClient = new DocumentAnalysisClientBuilder()
            .credential(new AzureKeyCredential(key))
            .endpoint(endpoint)
            .buildClient();

        String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf";
        String modelId = "prebuilt-layout";

        SyncPoller<DocumentOperationResult, AnalyzeResult> analyzeLayoutResultPoller =
            documentAnalysisClient.beginAnalyzeDocumentFromUrl(modelId, documentUrl);

        // getFinalResult() waits until the analysis operation completes.
        AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();

        // pages
        analyzeLayoutResult.getPages().forEach(documentPage -> {
            System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
                documentPage.getWidth(),
                documentPage.getHeight(),
                documentPage.getUnit());

            // lines
            documentPage.getLines().forEach(documentLine ->
                System.out.printf("Line %s is within a bounding box %s.%n",
                    documentLine.getContent(),
                    documentLine.getBoundingBox().toString()));

            // selection marks
            documentPage.getSelectionMarks().forEach(documentSelectionMark ->
                System.out.printf("Selection mark is %s and is within a bounding box %s with confidence %.2f.%n",
                    documentSelectionMark.getState().toString(),
                    documentSelectionMark.getBoundingBox().toString(),
                    documentSelectionMark.getConfidence()));
        });

        // tables
        List<DocumentTable> tables = analyzeLayoutResult.getTables();
        for (int i = 0; i < tables.size(); i++) {
            DocumentTable documentTable = tables.get(i);
            System.out.printf("Table %d has %d rows and %d columns.%n", i, documentTable.getRowCount(),
                documentTable.getColumnCount());
            documentTable.getCells().forEach(documentTableCell -> {
                System.out.printf("Cell '%s', has row index %d and column index %d.%n", documentTableCell.getContent(),
                    documentTableCell.getRowIndex(), documentTableCell.getColumnIndex());
            });
            System.out.println();
        }
    }
}
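The sample above prints each table cell with its row and column index. If you'd rather view a table as rows, those indexes are enough to rebuild a grid. The following is a minimal sketch using only the Java standard library. The Cell class is a hypothetical stand-in for the cell data the sample prints (content, row index, column index), and TableGridSketch and toGrid are illustrative names, not SDK types.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: rebuilds a two-dimensional grid from cells that carry
// their own row and column index.
final class TableGridSketch {

    // Hypothetical stand-in for the values read from a table cell in the sample.
    static final class Cell {
        final int rowIndex;
        final int columnIndex;
        final String content;

        Cell(int rowIndex, int columnIndex, String content) {
            this.rowIndex = rowIndex;
            this.columnIndex = columnIndex;
            this.content = content;
        }
    }

    // Places each cell at [rowIndex][columnIndex]; positions without a cell stay empty strings.
    static String[][] toGrid(int rowCount, int columnCount, List<Cell> cells) {
        String[][] grid = new String[rowCount][columnCount];
        for (String[] row : grid) {
            Arrays.fill(row, "");
        }
        for (Cell cell : cells) {
            grid[cell.rowIndex][cell.columnIndex] = cell.content;
        }
        return grid;
    }

    public static void main(String[] args) {
        List<Cell> cells = Arrays.asList(
            new Cell(0, 0, "Item"), new Cell(0, 1, "Qty"), new Cell(1, 0, "Pen"));
        for (String[] row : toGrid(2, 2, cells)) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

In the quickstart code, the row count, column count, and cell values would come from getRowCount, getColumnCount, and getCells on each table.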
Try it: Prebuilt model
This sample demonstrates how to analyze data from certain types of common documents with pre-trained models, using an invoice as an example.
- For this example, we'll analyze an invoice document using a prebuilt model. You can use our sample invoice document for this quickstart.
- To analyze a given file at a URI, you'll use the beginAnalyzeDocumentFromUrl method and pass prebuilt-invoice as the model Id. The returned value is an AnalyzeResult object containing data about the submitted document.
- We've added the file URI value to the invoiceUrl variable in the main method.
- For simplicity, not all the key-value pairs that the service returns are shown here. To see the list of all supported fields and corresponding types, see our Invoice concept page.
Choose the invoice prebuilt model ID
You are not limited to invoices—there are several prebuilt models to choose from, each of which has its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. Here are the model IDs for the prebuilt models currently supported by the Form Recognizer service:
- prebuilt-invoice: extracts text, selection marks, tables, key-value pairs, and key information from invoices.
- prebuilt-receipt: extracts text and key information from receipts.
- prebuilt-idDocument: extracts text and key information from driver licenses and international passports.
- prebuilt-businessCard: extracts text and key information from business cards.
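If your application handles more than one kind of document, you could keep the model IDs listed above in a single lookup. The following is a minimal sketch using only the Java standard library. The class name PrebuiltModelIds and the document-type keys ("invoice", "receipt", and so on) are assumptions for illustration; only the model ID values come from the list above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative lookup from a document type to the prebuilt model ID listed above.
final class PrebuiltModelIds {

    private static final Map<String, String> MODEL_IDS = new LinkedHashMap<>();

    static {
        // Keys are hypothetical labels; values are the model IDs from this quickstart.
        MODEL_IDS.put("invoice", "prebuilt-invoice");
        MODEL_IDS.put("receipt", "prebuilt-receipt");
        MODEL_IDS.put("idDocument", "prebuilt-idDocument");
        MODEL_IDS.put("businessCard", "prebuilt-businessCard");
    }

    // Returns the model ID for a document type, or fails for an unknown type.
    static String forDocumentType(String documentType) {
        String modelId = MODEL_IDS.get(documentType);
        if (modelId == null) {
            throw new IllegalArgumentException("No prebuilt model for: " + documentType);
        }
        return modelId;
    }

    public static void main(String[] args) {
        System.out.println(forDocumentType("receipt"));
    }
}
```

The returned string could then be passed as the model Id argument in place of a hard-coded value.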
Update your application's FormRecognizer class, with the following code (be certain to update the key and endpoint variables with values from your Form Recognizer instance in the Azure portal):
public class FormRecognizer {

    static final String key = "PASTE_YOUR_FORM_RECOGNIZER_SUBSCRIPTION_KEY_HERE";
    static final String endpoint = "PASTE_YOUR_FORM_RECOGNIZER_ENDPOINT_HERE";

    public static void main(String[] args) {

        // Create the client from the key and endpoint defined above.
        DocumentAnalysisClient documentAnalysisClient = new DocumentAnalysisClientBuilder()
            .credential(new AzureKeyCredential(key))
            .endpoint(endpoint)
            .buildClient();

        String invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf";
        String modelId = "prebuilt-invoice";

        // The synchronous client returns a SyncPoller; getFinalResult() waits for the analysis to finish.
        SyncPoller<DocumentOperationResult, AnalyzeResult> analyzeInvoicePoller =
            documentAnalysisClient.beginAnalyzeDocumentFromUrl(modelId, invoiceUrl);

        AnalyzeResult analyzeInvoiceResult = analyzeInvoicePoller.getFinalResult();

        for (int i = 0; i < analyzeInvoiceResult.getDocuments().size(); i++) {
            AnalyzedDocument analyzedInvoice = analyzeInvoiceResult.getDocuments().get(i);
            Map<String, DocumentField> invoiceFields = analyzedInvoice.getFields();
            System.out.printf("----------- Analyzing invoice %d -----------%n", i);

            DocumentField vendorNameField = invoiceFields.get("VendorName");
            if (vendorNameField != null) {
                if (DocumentFieldType.STRING == vendorNameField.getType()) {
                    String merchantName = vendorNameField.getValueString();
                    System.out.printf("Vendor Name: %s, confidence: %.2f%n",
                        merchantName, vendorNameField.getConfidence());
                }
            }

            DocumentField vendorAddressField = invoiceFields.get("VendorAddress");
            if (vendorAddressField != null) {
                if (DocumentFieldType.STRING == vendorAddressField.getType()) {
                    String merchantAddress = vendorAddressField.getValueString();
                    System.out.printf("Vendor address: %s, confidence: %.2f%n",
                        merchantAddress, vendorAddressField.getConfidence());
                }
            }

            DocumentField customerNameField = invoiceFields.get("CustomerName");
            if (customerNameField != null) {
                if (DocumentFieldType.STRING == customerNameField.getType()) {
                    String customerName = customerNameField.getValueString();
                    System.out.printf("Customer Name: %s, confidence: %.2f%n",
                        customerName, customerNameField.getConfidence());
                }
            }

            DocumentField customerAddressRecipientField = invoiceFields.get("CustomerAddressRecipient");
            if (customerAddressRecipientField != null) {
                if (DocumentFieldType.STRING == customerAddressRecipientField.getType()) {
                    String customerAddr = customerAddressRecipientField.getValueString();
                    System.out.printf("Customer Address Recipient: %s, confidence: %.2f%n",
                        customerAddr, customerAddressRecipientField.getConfidence());
                }
            }

            DocumentField invoiceIdField = invoiceFields.get("InvoiceId");
            if (invoiceIdField != null) {
                if (DocumentFieldType.STRING == invoiceIdField.getType()) {
                    String invoiceId = invoiceIdField.getValueString();
                    System.out.printf("Invoice ID: %s, confidence: %.2f%n",
                        invoiceId, invoiceIdField.getConfidence());
                }
            }

            DocumentField invoiceDateField = invoiceFields.get("InvoiceDate");
            // Check the field that is about to be read, not an unrelated one.
            if (invoiceDateField != null) {
                if (DocumentFieldType.DATE == invoiceDateField.getType()) {
                    LocalDate invoiceDate = invoiceDateField.getValueDate();
                    System.out.printf("Invoice Date: %s, confidence: %.2f%n",
                        invoiceDate, invoiceDateField.getConfidence());
                }
            }

            DocumentField invoiceTotalField = invoiceFields.get("InvoiceTotal");
            if (invoiceTotalField != null) {
                if (DocumentFieldType.FLOAT == invoiceTotalField.getType()) {
                    Float invoiceTotal = invoiceTotalField.getValueFloat();
                    System.out.printf("Invoice Total: %.2f, confidence: %.2f%n",
                        invoiceTotal, invoiceTotalField.getConfidence());
                }
            }

            DocumentField invoiceItemsField = invoiceFields.get("Items");
            if (invoiceItemsField != null) {
                System.out.printf("Invoice Items: %n");
                if (DocumentFieldType.LIST == invoiceItemsField.getType()) {
                    List<DocumentField> invoiceItems = invoiceItemsField.getValueList();
                    invoiceItems.stream()
                        .filter(invoiceItem -> DocumentFieldType.MAP == invoiceItem.getType())
                        .map(formField -> formField.getValueMap())
                        .forEach(formFieldMap -> formFieldMap.forEach((fieldName, formField) -> {
                            // See a full list of fields found on an invoice here:
                            // https://aka.ms/formrecognizer/invoicefields
                            if ("Description".equals(fieldName)) {
                                if (DocumentFieldType.STRING == formField.getType()) {
                                    String name = formField.getValueString();
                                    System.out.printf("Description: %s, confidence: %.2f%n",
                                        name, formField.getConfidence());
                                }
                            }
                            if ("Quantity".equals(fieldName)) {
                                if (DocumentFieldType.FLOAT == formField.getType()) {
                                    Float quantity = formField.getValueFloat();
                                    System.out.printf("Quantity: %f, confidence: %.2f%n",
                                        quantity, formField.getConfidence());
                                }
                            }
                            if ("UnitPrice".equals(fieldName)) {
                                if (DocumentFieldType.FLOAT == formField.getType()) {
                                    Float unitPrice = formField.getValueFloat();
                                    System.out.printf("Unit Price: %f, confidence: %.2f%n",
                                        unitPrice, formField.getConfidence());
                                }
                            }
                            if ("ProductCode".equals(fieldName)) {
                                if (DocumentFieldType.FLOAT == formField.getType()) {
                                    Float productCode = formField.getValueFloat();
                                    System.out.printf("Product Code: %f, confidence: %.2f%n",
                                        productCode, formField.getConfidence());
                                }
                            }
                        }));
                }
            }
        }
    }
}
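The invoice sample repeats the same pattern for every field: look the field up, check that it isn't null, check its type, then read the value. One way to reduce that repetition is a small helper that returns an Optional. The following is a minimal sketch using only the Java standard library. Field and FieldType are hypothetical stand-ins for the SDK's field and field-type classes, and stringValue is an illustrative name, not an SDK method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch of the "null check, then type check, then read" pattern.
final class FieldLookupSketch {

    // Hypothetical stand-in for the field types checked in the sample.
    enum FieldType { STRING, FLOAT }

    // Hypothetical stand-in for a returned field: a type tag plus a value.
    static final class Field {
        final FieldType type;
        final Object value;

        Field(FieldType type, Object value) {
            this.type = type;
            this.value = value;
        }
    }

    // Returns the string value only when the field exists and has the STRING type.
    static Optional<String> stringValue(Map<String, Field> fields, String name) {
        Field field = fields.get(name);
        if (field == null || field.type != FieldType.STRING) {
            return Optional.empty();
        }
        return Optional.ofNullable((String) field.value);
    }

    public static void main(String[] args) {
        Map<String, Field> fields = new HashMap<>();
        fields.put("VendorName", new Field(FieldType.STRING, "Contoso"));
        stringValue(fields, "VendorName")
            .ifPresent(name -> System.out.println("Vendor Name: " + name));
    }
}
```

A missing field and a field of an unexpected type are both treated as absent here, which matches how the sample silently skips them.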
Build and run your application
Navigate back to your main project directory—form-recognizer-app.
- Build your application with the build command:
gradle build
- Run your application with the run command:
gradle run
Congratulations! In this quickstart, you used the Form Recognizer Java SDK to analyze various forms and documents in different ways. Next, explore the reference documentation to learn about the Form Recognizer API in more depth.