Use the Read Model
In this how-to guide, you'll learn to use Azure Form Recognizer's read model to extract printed (typeface) and handwritten text from documents. The read model can detect lines, words, locations, and languages. You can use a programming language of your choice or the REST API. We recommend that you use the free service while you're learning the technology. Remember that the number of free pages is limited to 500 per month.
The read model is at the core of all the other Form Recognizer models. Layout, general document, custom, and prebuilt models all use the read model as a foundation for extracting text from documents.
Note
Form Recognizer v3.0 is currently in public preview. Some features may not be supported or may have limited capabilities. The current API version is 2022-06-30.
Reference documentation | Library source code | Package (NuGet) | Samples
Prerequisites
Azure subscription - Create one for free.
The current version of Visual Studio IDE.
A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:
Set up
Start Visual Studio.
On the start page, choose Create a new project.
On the Create a new project page, enter console in the search box. Choose the Console Application template, then choose Next.
In the Configure your new project dialog window, enter formRecognizer_quickstart in the Project name box, and then choose Next. In the Additional information dialog window, select .NET 6.0 (Long-term support), and then select Create.
Install the client library with NuGet
Right-click on your formRecognizer_quickstart project and select Manage NuGet Packages....
Select the Browse tab and type Azure.AI.FormRecognizer.
Select the Include prerelease checkbox, choose version 4.0.0-beta.3 from the dropdown menu, and install the package in your project.
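If you prefer a terminal to the NuGet UI, you can install the same prerelease package with the .NET CLI from the project directory (shown as an alternative, assuming the same version):
dotnet add package Azure.AI.FormRecognizer --version 4.0.0-beta.3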
Read Model
To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.
Note
- Starting with .NET 6, new projects using the console template generate different code than previous versions. The new output uses recent C# features that simplify the code you need to write for a program.
- When you use the newer version, you only need to write the body of the Main method. You don't need to include the other program elements. For more information, see New C# templates generate top-level statements.
- For this example, you'll need a form document file from a URI. You can use our sample form document for this quickstart.
- We've added the file URI value to the Uri fileUri variable at the top of the script.
- To analyze a given file at a URI, use the StartAnalyzeDocumentFromUriAsync method and pass prebuilt-read as the model ID. The returned value is an AnalyzeResult object containing data from the submitted document.
Open the Program.cs file.
Delete the pre-existing code, including the line Console.WriteLine("Hello World!"), and copy the following code sample to paste into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:
using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;
//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);
//sample document
Uri fileUri = new Uri("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png");
AnalyzeDocumentOperation operation = await client.StartAnalyzeDocumentFromUriAsync("prebuilt-read", fileUri);
await operation.WaitForCompletionAsync();
AnalyzeResult result = operation.Value;
foreach (DocumentPage page in result.Pages)
{
Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");
for (int i = 0; i < page.Lines.Count; i++)
{
DocumentLine line = page.Lines[i];
Console.WriteLine($" Line {i} has content: '{line.Content}'.");
Console.WriteLine($" Its bounding box is:");
Console.WriteLine($" Upper left => X: {line.BoundingBox[0].X}, Y= {line.BoundingBox[0].Y}");
Console.WriteLine($" Upper right => X: {line.BoundingBox[1].X}, Y= {line.BoundingBox[1].Y}");
Console.WriteLine($" Lower right => X: {line.BoundingBox[2].X}, Y= {line.BoundingBox[2].Y}");
Console.WriteLine($" Lower left => X: {line.BoundingBox[3].X}, Y= {line.BoundingBox[3].Y}");
}
}
foreach (DocumentStyle style in result.Styles)
{
// Check the style and style confidence to see if text is handwritten.
// Note that value '0.8' is used as an example.
bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;
if (isHandwritten && style.Confidence > 0.8)
{
Console.WriteLine($"Handwritten content found:");
foreach (DocumentSpan span in style.Spans)
{
Console.WriteLine($" Content: {result.Content.Substring(span.Offset, span.Length)}");
}
}
}
foreach (DocumentLanguage language in result.Languages)
{
Console.WriteLine($" Found language '{language.LanguageCode}' with confidence {language.Confidence}.");
}
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
Once you've added a code sample to your application, choose the green Start button next to formRecognizer_quickstart to build and run your program, or press F5.
Read Model Output
Here's a snippet of the expected output:
Document Page 1 has 86 line(s), 697 word(s),
Line 0 has content: 'While healthcare is still in the early stages of its Al journey, we'.
Its bounding box is:
Upper left => X: 259, Y= 55
Upper right => X: 816, Y= 56
Lower right => X: 816, Y= 79
Lower left => X: 259, Y= 77
.
.
.
Found language 'en' with confidence 0.95.
To view the entire output, visit the Azure samples repository on GitHub to view the read model output.
Next step
Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.
Reference documentation | Library source code | Package (Maven) | Samples
Prerequisites
Azure subscription - Create one for free.
The latest version of Visual Studio Code or your preferred IDE. See Java in Visual Studio Code.
Tip
- Visual Studio Code offers a Coding Pack for Java for Windows and macOS. The coding pack is a bundle of VS Code, the Java Development Kit (JDK), and a collection of suggested extensions by Microsoft. The Coding Pack can also be used to fix an existing development environment.
- If you are using VS Code and the Coding Pack for Java, install the Gradle for Java extension.
If you aren't using VS Code, make sure you have the following installed in your development environment:
A Java Development Kit (JDK) version 8 or later.
Gradle, version 6.8 or later.
A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. Later, you'll paste your key and endpoint into the code below:
Set up
Create a new Gradle project
In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app called form-recognizer-app, and navigate to it.
mkdir form-recognizer-app && cd form-recognizer-app
Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.
gradle init --type basic
When prompted to choose a DSL, select Kotlin.
Accept the default project name (form-recognizer-app).
Install the client library
This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.
Open the project's build.gradle.kts file in your IDE. Copy and paste the following code to include the client library as an implementation statement, along with the required plugins and settings.

plugins {
    java
    application
}

application {
    mainClass.set("FormRecognizer")
}

repositories {
    mavenCentral()
}

dependencies {
    implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "4.0.0-beta.4")
}
Create a Java application
To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.
From the form-recognizer-app directory, run the following command:
mkdir -p src/main/java
You'll create the following directory structure:

form-recognizer-app
└── src
    └── main
        └── java
Navigate to the java directory and create a file named FormRecognizer.java.

Tip
- You can create a new file using PowerShell.
- Open a PowerShell window in your project directory by holding down the Shift key and right-clicking the folder.
- Type the following command: New-Item FormRecognizer.java.
Read Model
- For this example, you'll need a form document file at a URI. You can use our sample form document for this quickstart.
- To analyze a given file at a URI, you'll use the beginAnalyzeDocumentFromUrl method and pass prebuilt-read as the model ID. The returned value is an AnalyzeResult object containing data about the submitted document.
- We've added the file URI value to the documentUrl variable in the main method.

Open the FormRecognizer.java file and copy the following code sample to paste into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance.
import com.azure.ai.formrecognizer.*;
import com.azure.ai.formrecognizer.models.AnalyzeResult;
import com.azure.ai.formrecognizer.models.DocumentOperationResult;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;
public class FormRecognizer {
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
private static final String endpoint = "<your-endpoint>";
private static final String key = "<your-key>";
public static void main(String[] args) {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
.credential(new AzureKeyCredential(key))
.endpoint(endpoint)
.buildClient();
// sample document
String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png";
String modelId = "prebuilt-read";
SyncPoller<DocumentOperationResult, AnalyzeResult> analyzeLayoutResultPoller =
client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);
AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();
// pages
analyzeLayoutResult.getPages().forEach(documentPage -> {
System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
documentPage.getWidth(),
documentPage.getHeight(),
documentPage.getUnit());
// lines
documentPage.getLines().forEach(documentLine ->
System.out.printf("Line %s is within a bounding box %s.%n",
documentLine.getContent(),
documentLine.getBoundingBox().toString()));
// words
documentPage.getWords().forEach(documentWord ->
System.out.printf("Word '%s' has a confidence score of %.2f.%n",
documentWord.getContent(),
documentWord.getConfidence()));
});
}
}
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
Navigate back to your main project directory, form-recognizer-app.

Build your application with the build command:
gradle build

Run your application with the run command:
gradle run
Read Model Output
Here's a snippet of the expected output:
Page has width: 915.00 and height: 1190.00, measured with unit: pixel
Line While healthcare is still in the early stages of its Al journey, we is within a bounding box [259.0, 55.0, 816.0, 56.0, 816.0, 79.0, 259.0, 77.0].
Line are seeing pharmaceutical and other life sciences organizations is within a bounding box [258.0, 83.0, 825.0, 83.0, 825.0, 106.0, 258.0, 106.0].
Line making major investments in Al and related technologies." is within a bounding box [259.0, 112.0, 784.0, 112.0, 784.0, 136.0, 259.0, 136.0].
.
.
.
Word 'While' has a confidence score of 1.00.
Word 'healthcare' has a confidence score of 1.00.
Word 'is' has a confidence score of 1.00.
To view the entire output, visit the Azure samples repository on GitHub to view the read model output.
Next step
Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.
Reference documentation | Library source code | Package (npm) | Samples
Prerequisites
Azure subscription - Create one for free.
The latest version of Visual Studio Code or your preferred IDE. For more information, see Node.js in Visual Studio Code
The latest LTS version of Node.js
A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:
Set up
Create a new Node.js application: In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app named form-recognizer-app, and navigate to it.
mkdir form-recognizer-app && cd form-recognizer-app
Run the npm init command to initialize the application and scaffold your project.
npm init
Specify your project's attributes using the prompts presented in the terminal.
- The most important attributes are name, version number, and entry point.
- We recommend keeping index.js for the entry point name. The description, test command, GitHub repository, keywords, author, and license information are optional attributes and can be skipped for this project.
- Accept the suggestions in parentheses by selecting Return or Enter.
- After you've completed the prompts, a package.json file will be created in your form-recognizer-app directory.
Install the ai-form-recognizer client library and @azure/identity npm packages:
npm install @azure/ai-form-recognizer@4.0.0-beta.3 @azure/identity

Your app's package.json file will be updated with the dependencies.
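After the install completes, the dependencies section of your package.json should look roughly like this sketch (the @azure/identity version string below is a placeholder for whatever npm resolved on your machine):

{
  "dependencies": {
    "@azure/ai-form-recognizer": "^4.0.0-beta.3",
    "@azure/identity": "<installed-version>"
  }
}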
Create a file named index.js in the application directory.

Tip
- You can create a new file using PowerShell.
- Open a PowerShell window in your project directory by holding down the Shift key and right-clicking the folder.
- Type the following command: New-Item index.js.
Read Model
To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.
- For this example, you'll need a form document file from a URL. You can use our sample form document for this quickstart.
- We've added the file URL value to the formUrl variable near the top of the file.
- To analyze a given file from a URL, you'll use the beginAnalyzeDocument method and pass in prebuilt-read as the model ID.
Open the index.js file in Visual Studio Code or your favorite IDE and copy the following code sample to paste into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
function* getTextOfSpans(content, spans) {
for (const span of spans) {
yield content.slice(span.offset, span.offset + span.length);
}
}
// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
const endpoint = "<your-endpoint>";
const key = "<your-key>";
// sample document
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png";
async function main() {
// create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
const poller = await client.beginAnalyzeDocument("prebuilt-read", formUrl);
const { content, pages, languages, styles } = await poller.pollUntilDone();
if (pages.length <= 0) {
console.log("No pages were extracted from the document.");
} else {
console.log("Pages:");
for (const page of pages) {
console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
console.log(` ${page.width}x${page.height}, angle: ${page.angle}`);
console.log(` ${page.lines.length} lines, ${page.words.length} words`);
if (page.lines.length > 0) {
console.log(" Lines:");
for (const line of page.lines) {
console.log(` - "${line.content}"`);
// The words of the line can also be iterated independently. The words are computed based on their
// corresponding spans.
for (const word of line.words()) {
console.log(` - "${word.content}"`);
}
}
}
}
}
if (languages.length <= 0) {
console.log("No language spans were extracted from the document.");
} else {
console.log("Languages:");
for (const languageEntry of languages) {
console.log(
`- Found language: ${languageEntry.languageCode} (confidence: ${languageEntry.confidence})`
);
for (const text of getTextOfSpans(content, languageEntry.spans)) {
const escapedText = text.replace(/\r?\n/g, "\\n").replace(/"/g, '\\"');
console.log(` - "${escapedText}"`);
}
}
}
if (styles.length <= 0) {
console.log("No text styles were extracted from the document.");
} else {
console.log("Styles:");
for (const style of styles) {
console.log(
`- Handwritten: ${style.isHandwritten ? "yes" : "no"} (confidence=${style.confidence})`
);
for (const word of getTextOfSpans(content, style.spans)) {
console.log(` - "${word}"`);
}
}
}
}
main().catch((error) => {
console.error("An error occurred:", error);
process.exit(1);
});
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
Once you've added a code sample to your application, navigate to the folder where you have your form recognizer application (form-recognizer-app).
Type the following command in your terminal:
node index.js
Read Model Output
Here's a snippet of the expected output:
Pages:
- Page 1 (unit: pixel)
915x1190, angle: 0
86 lines, 697 words
Lines:
- "While healthcare is still in the early stages of its Al journey, we"
- "While"
- "healthcare"
- "is"
.
.
.
Languages:
- Found language: en (confidence: 0.95)
- "While healthcare is still in the early stages of its Al journey, we\nare seeing pharmaceutical and other life sciences organizations"
- "As pharmaceutical and other life sciences organizations invest\nin and deploy advanced technologies, they are beginning to see"
- "are looking to incorporate automation and continuing smart"
.
.
.
No text styles were extracted from the document.
To view the entire output, visit the Azure samples repository on GitHub to view the read model output.
Next step
Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.
Reference documentation | Library source code | Package (PyPi) | Samples
Prerequisites
Azure subscription - Create one for free.
Python installed on your local machine. Your Python installation should include pip; you can check whether pip is installed by running pip --version on the command line. Get pip by installing the latest version of Python.
The latest version of Visual Studio Code or your preferred IDE. For more information, see Getting Started with Python in VS Code
A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:
Set up
Open a terminal window in your local environment and install the Azure Form Recognizer client library for Python with pip:
pip install azure-ai-formrecognizer==3.2.0b3
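Optionally, you can confirm which package version was installed with a standard pip command:
pip show azure-ai-formrecognizer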
Read Model
To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.
- For this example, you'll need a form document file from a URL. You can use our sample form document for this quickstart.
- We've added the file URL value to the formUrl variable in the analyze_read function.
- To analyze a given file at a URL, you'll use the begin_analyze_document_from_url method and pass in prebuilt-read as the model ID. The returned value is a result object containing data about the submitted document.
Create a new Python file called form_recognizer_quickstart.py in your preferred editor or IDE.
Open the form_recognizer_quickstart.py file and copy the following code sample to paste into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:
# import libraries
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential
# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"
def format_bounding_box(bounding_box):
if not bounding_box:
return "N/A"
return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])
def analyze_read():
# sample form document
formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png"
# create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
document_analysis_client = DocumentAnalysisClient(
endpoint=endpoint, credential=AzureKeyCredential(key)
)
poller = document_analysis_client.begin_analyze_document_from_url(
"prebuilt-read", formUrl)
result = poller.result()
print ("Document contains content: ", result.content)
for idx, style in enumerate(result.styles):
print(
"Document contains {} content".format(
"handwritten" if style.is_handwritten else "no handwritten"
)
)
for page in result.pages:
print("----Analyzing Read from page #{}----".format(page.page_number))
print(
"Page has width: {} and height: {}, measured with unit: {}".format(
page.width, page.height, page.unit
)
)
for line_idx, line in enumerate(page.lines):
print(
"...Line # {} has text content '{}' within bounding box '{}'".format(
line_idx,
line.content,
format_bounding_box(line.bounding_box),
)
)
for word in page.words:
print(
"...Word '{}' has a confidence of {}".format(
word.content, word.confidence
)
)
print("----------------------------------------")
if __name__ == "__main__":
analyze_read()
Important
Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
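As one minimal sketch of a safer approach, you could read the credentials from environment variables instead of hard-coding them. The variable names below are placeholders of our own choosing, not names the service requires:

# minimal sketch: load credentials from environment variables (names are hypothetical)
import os

endpoint = os.environ["FORM_RECOGNIZER_ENDPOINT"]  # hypothetical variable name
key = os.environ["FORM_RECOGNIZER_KEY"]            # hypothetical variable name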
Once you've added a code sample to your application, navigate to the folder where you have your form_recognizer_quickstart.py file.
Type the following command in your terminal:
python form_recognizer_quickstart.py
Read Model Output
Here's a snippet of the expected output:
Document contains content: While healthcare is still in the early stages of its Al journey, we
are seeing pharmaceutical and other life sciences organizations
making major investments in Al and related technologies."
.
.
.
----Analyzing Read from page #1----
Page has width: 915.0 and height: 1190.0, measured with unit: pixel
...Line # 0 has text content 'While healthcare is still in the early stages of its Al journey, we' within bounding box '[259.0, 55.0], [816.0, 56.0], [816.0, 79.0], [259.0, 77.0]'
...Line # 1 has text content 'are seeing pharmaceutical and other life sciences organizations' within bounding box '[258.0, 83.0], [825.0, 83.0], [825.0, 106.0], [258.0, 106.0]'
...Line # 2 has text content 'making major investments in Al and related technologies."' within bounding box '[259.0, 112.0], [784.0, 112.0], [784.0, 136.0], [259.0, 136.0]'
.
.
.
...Word 'While' has a confidence of 0.999
...Word 'healthcare' has a confidence of 0.995
...Word 'is' has a confidence of 0.997
To view the entire output, visit the Azure samples repository on GitHub to view the read model output.
Next step
Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.
Prerequisites
Azure subscription - Create one for free
cURL installed.
PowerShell version 7.*+, or a similar command-line application. To check your PowerShell version, type Get-Host | Select-Object Version.
A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service and upgrade later to a paid tier for production.
Tip
Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:
Read Model
Form Recognizer v3.0 consolidates the analyze document (POST) and get result (GET) requests into single operations. The modelId is used for POST operations and the resultId for GET operations.
- For this example, you'll need a form document file from a URI. You can use our sample form document for this quickstart.
- We've added the file URI value to the POST curl command below.
POST Request
Before you run the following cURL command, make the following changes:
- Replace {endpoint} with the endpoint value from your Azure portal Form Recognizer instance.
- Replace {key} with the key value from your Azure portal Form Recognizer instance.
curl -v -i -X POST "{endpoint}/formrecognizer/documentModels/prebuilt-read:analyze?api-version=2022-06-30" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png'}"
Operation-Location
You'll receive a 202 (Success) response that includes an Operation-Location header. The value of this header contains a resultId that can be queried to get the status of the asynchronous operation:
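The header value takes the shape below (illustrative; the {resultId} segment is the value you'll pass to the GET request in the next step):

Operation-Location: {endpoint}/formrecognizer/documentModels/prebuilt-read/analyzeResults/{resultId}?api-version=2022-06-30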
Get Request
After you've called the Analyze document API, call the Get analyze result API to get the status of the operation and the extracted data. Before you run the command, make these changes:
- Replace {endpoint} with the endpoint value from your Azure portal Form Recognizer instance.
- Replace {key} with the key value from your Azure portal Form Recognizer instance.
- Replace {resultId} with the result ID from the Operation-Location header.
curl -v -X GET "{endpoint}/formrecognizer/documentModels/prebuilt-read/analyzeResults/{resultId}?api-version=2022-06-30" -H "Ocp-Apim-Subscription-Key: {key}"
Read Model Output
You'll receive a 200 (Success) response with JSON output. The first field, "status", indicates the status of the operation. If the operation isn't complete, the value of "status" will be "running" or "notStarted", and you should call the API again, either manually or through a script. We recommend an interval of one second or more between calls.
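If you'd rather script the polling, here's a minimal sketch in Python (assuming the third-party requests package is installed); it mirrors the GET cURL command above and retries while the status is "running" or "notStarted":

# minimal polling sketch; substitute your own endpoint, key, and result ID
import time
import requests

endpoint = "<your-endpoint>"
key = "<your-key>"
result_id = "<your-result-id>"

url = f"{endpoint}/formrecognizer/documentModels/prebuilt-read/analyzeResults/{result_id}?api-version=2022-06-30"
headers = {"Ocp-Apim-Subscription-Key": key}

while True:
    body = requests.get(url, headers=headers).json()
    if body["status"] not in ("running", "notStarted"):
        break
    time.sleep(1)  # wait at least one second between calls

print(body["status"])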
{
"status": "succeeded",
"createdDateTime": "2022-04-08T00:36:48Z",
"lastUpdatedDateTime": "2022-04-08T00:36:50Z",
"analyzeResult": {
"apiVersion": "2022-06-30",
"modelId": "prebuilt-read",
"stringIndexType": "textElements",
"content": "While healthcare is still in the early stages of its Al journey, we\nare seeing...",
"pages": [
{
"pageNumber": 1,
"angle": 0,
"width": 915,
"height": 1190,
"unit": "pixel",
"words": [
{
"content": "While",
"boundingBox": [
260,
56,
307,
56,
306,
76,
260,
76
],
"confidence": 0.999,
"span": {
"offset": 0,
"length": 5
}
}
]
}
]
}
}
To view the entire output, visit the Azure samples repository on GitHub to view the read model output.
Next step
Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.