Use the Read Model

In this how-to guide, you'll learn to use Azure Form Recognizer's read model to extract printed (typeface) and handwritten text from documents. The read model can detect text lines, words, their locations, and the languages used. You can use a programming language of your choice or the REST API. We recommend that you use the free service while you're learning the technology. Remember that the number of free pages is limited to 500 per month.

The read model is at the core of all the other Form Recognizer models. The layout, general document, custom, and prebuilt models all use the read model as a foundation for extracting text from documents.

Note

Form Recognizer v3.0 is currently in public preview. Some features might not be supported or might have limited capabilities. The current API version is 2022-06-30.

Reference documentation | Library Source Code | Package (NuGet) | Samples

Prerequisites

  • Azure subscription - Create one for free.

  • The current version of Visual Studio IDE.

  • A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Tip

Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.
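If you prefer to create the resource from the command line instead of the portal, the Azure CLI offers a rough equivalent. This is a sketch, not part of the quickstart; the resource name, resource group, and region below are placeholders you choose, and it assumes the Azure CLI is installed and you've signed in with az login:

    # Sketch: create a single-service Form Recognizer resource on the free (F0) tier.
    # <resource-name>, <resource-group>, and <region> are placeholders you supply.
    az cognitiveservices account create --name <resource-name> --resource-group <resource-group> --kind FormRecognizer --sku F0 --location <region> --yes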

  • After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:

    Screenshot: keys and endpoint location in the Azure portal.

Set up

  1. Start Visual Studio.

  2. On the start page, choose Create a new project.

    Screenshot: Visual Studio start window.

  3. On the Create a new project page, enter console in the search box. Choose the Console Application template, then choose Next.

    Screenshot: Visual Studio's create new project page.

  4. In the Configure your new project dialog window, enter formRecognizer_quickstart in the Project name box. Then choose Next.

    Screenshot: Visual Studio's configure new project dialog window.

  5. In the Additional information dialog window, select .NET 6.0 (Long-term support), and then select Create.

    Screenshot: Visual Studio's additional information dialog window.

Install the client library with NuGet

  1. Right-click on your formRecognizer_quickstart project and select Manage NuGet Packages...

    Screenshot: select Manage NuGet Packages in Visual Studio.

  2. Select the Browse tab and type Azure.AI.FormRecognizer.

    Screenshot: select the Azure.AI.FormRecognizer package.

  3. Select the Include prerelease checkbox, choose version 4.0.0-beta.3 from the dropdown menu, and install the package in your project.
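Alternatively, if you prefer the command line, you can add the same package with the .NET CLI from the directory that contains your project file. This is a sketch of an equivalent step, assuming the .NET 6 SDK is installed:

    dotnet add package Azure.AI.FormRecognizer --version 4.0.0-beta.3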

Read Model

To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.

Note

  • Starting with .NET 6, new projects using the console template generate different code than previous versions.
  • The new output uses recent C# features that simplify the code you need to write for a program.
  • When you use the newer version, you only need to write the body of the Main method. You don't need to include the other program elements.
  • For more information, see New C# templates generate top-level statements.
  • For this example, you'll need a form document file from a URI. You can use our sample form document for this quickstart.
  • We've added the file URI value to the Uri fileUri variable at the top of the script.
  • To extract the layout from a given file at a URI, use the StartAnalyzeDocumentFromUriAsync method and pass prebuilt-read as the model ID. The returned value is an AnalyzeResult object containing data from the submitted document.
  1. Open the Program.cs file.

  2. Delete the pre-existing code, including the line Console.WriteLine("Hello, World!"), and copy and paste the following code sample into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:

using Azure;
using Azure.AI.FormRecognizer.DocumentAnalysis;

//set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal to create your `AzureKeyCredential` and `DocumentAnalysisClient` instance
string endpoint = "<your-endpoint>";
string key = "<your-key>";
AzureKeyCredential credential = new AzureKeyCredential(key);
DocumentAnalysisClient client = new DocumentAnalysisClient(new Uri(endpoint), credential);

//sample document
Uri fileUri = new Uri("https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png");

AnalyzeDocumentOperation operation = await client.StartAnalyzeDocumentFromUriAsync("prebuilt-read", fileUri);

await operation.WaitForCompletionAsync();

AnalyzeResult result = operation.Value;

foreach (DocumentPage page in result.Pages)
{
    Console.WriteLine($"Document Page {page.PageNumber} has {page.Lines.Count} line(s), {page.Words.Count} word(s),");

    for (int i = 0; i < page.Lines.Count; i++)
    {
        DocumentLine line = page.Lines[i];
        Console.WriteLine($"  Line {i} has content: '{line.Content}'.");

        Console.WriteLine($"    Its bounding box is:");
        Console.WriteLine($"      Upper left => X: {line.BoundingBox[0].X}, Y= {line.BoundingBox[0].Y}");
        Console.WriteLine($"      Upper right => X: {line.BoundingBox[1].X}, Y= {line.BoundingBox[1].Y}");
        Console.WriteLine($"      Lower right => X: {line.BoundingBox[2].X}, Y= {line.BoundingBox[2].Y}");
        Console.WriteLine($"      Lower left => X: {line.BoundingBox[3].X}, Y= {line.BoundingBox[3].Y}");
    }
}

foreach (DocumentStyle style in result.Styles)
{
    // Check the style and style confidence to see if text is handwritten.
    // Note that value '0.8' is used as an example.

    bool isHandwritten = style.IsHandwritten.HasValue && style.IsHandwritten == true;

    if (isHandwritten && style.Confidence > 0.8)
    {
        Console.WriteLine($"Handwritten content found:");

        foreach (DocumentSpan span in style.Spans)
        {
            Console.WriteLine($"  Content: {result.Content.Substring(span.Offset, span.Length)}");
        }
    }
}

foreach (DocumentLanguage language in result.Languages)
{
    Console.WriteLine($"  Found language '{language.LanguageCode}' with confidence {language.Confidence}.");
}

Important

  • Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
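As one way to keep the key out of your source code, you could read the endpoint and key from environment variables instead of hardcoding them. The variable names below are placeholders you define yourself; this is a minimal sketch, not the only secure option:

// Sketch: read the endpoint and key from environment variables instead of hardcoding them.
// FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY are placeholder names you set yourself.
string endpoint = Environment.GetEnvironmentVariable("FORM_RECOGNIZER_ENDPOINT")
    ?? throw new InvalidOperationException("Set the FORM_RECOGNIZER_ENDPOINT environment variable.");
string key = Environment.GetEnvironmentVariable("FORM_RECOGNIZER_KEY")
    ?? throw new InvalidOperationException("Set the FORM_RECOGNIZER_KEY environment variable.");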
  1. Once you've added a code sample to your application, choose the green Start button next to formRecognizer_quickstart to build and run your program, or press F5.

    Screenshot: run your Visual Studio program.

Read Model Output

Here's a snippet of the expected output:

Document Page 1 has 86 line(s), 697 word(s),
  Line 0 has content: 'While healthcare is still in the early stages of its Al journey, we'.
    Its bounding box is:
      Upper left => X: 259, Y= 55
      Upper right => X: 816, Y= 56
      Lower right => X: 816, Y= 79
      Lower left => X: 259, Y= 77
.
.
.
  Found language 'en' with confidence 0.95.

To view the entire output, visit the Azure samples repository on GitHub and see the read model output.

Next step

Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.

Reference documentation | Library source code | Package (Maven) | Samples

Prerequisites

  • Azure subscription - Create one for free.

  • The latest version of Visual Studio Code or your preferred IDE. See Java in Visual Studio Code.

    Tip

    • Visual Studio Code offers a Coding Pack for Java for Windows and macOS. The Coding Pack is a bundle of VS Code, the Java Development Kit (JDK), and a collection of extensions suggested by Microsoft. The Coding Pack can also be used to fix an existing development environment.
    • If you're using VS Code and the Coding Pack for Java, install the Gradle for Java extension.
  • If you aren't using VS Code, make sure you have the following installed in your development environment:

  • A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

    Tip

    Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.

  • After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. Later, you'll paste your key and endpoint into the code below:

    Screenshot: keys and endpoint location in the Azure portal.

Set up

Create a new Gradle project

  1. In console window (such as cmd, PowerShell, or Bash), create a new directory for your app called form-recognizer-app, and navigate to it.

    mkdir form-recognizer-app && cd form-recognizer-app
    
  2. Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.

    gradle init --type basic
    
  3. When prompted to choose a DSL, select Kotlin.

  4. Accept the default project name (form-recognizer-app).

Install the client library

This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.
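For reference, if you use Maven rather than Gradle, the same client library dependency would look roughly like this in your pom.xml:

    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-ai-formrecognizer</artifactId>
        <version>4.0.0-beta.4</version>
    </dependency>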

  1. Open the project's build.gradle.kts file in your IDE. Copy and paste the following code to include the client library as an implementation statement, along with the required plugins and settings.

    plugins {
        java
        application
    }
    application {
        mainClass.set("FormRecognizer")
    }
    repositories {
        mavenCentral()
    }
    dependencies {
        implementation(group = "com.azure", name = "azure-ai-formrecognizer", version = "4.0.0-beta.4")
    }
    

Create a Java application

To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.

  1. From the form-recognizer-app directory, run the following command:

    mkdir -p src/main/java
    

    You'll create the following directory structure:

    Screenshot: Java directory structure

  2. Navigate to the java directory and create a file named FormRecognizer.java.

    Tip

    • You can create a new file using PowerShell.
    • Open a PowerShell window in your project directory by holding down the Shift key and right-clicking the folder.
    • Type the following command New-Item FormRecognizer.java.

Read Model

  • For this example, you'll need a form document file at a URI. You can use our sample form document for this quickstart.
  • To analyze a given file at a URI, you'll use the beginAnalyzeDocumentFromUrl method and pass prebuilt-read as the model Id. The returned value is an AnalyzeResult object containing data about the submitted document.
  • We've added the file URI value to the documentUrl variable in the main method.
  1. Open the FormRecognizer.java file and copy and paste the following code sample into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:
import com.azure.ai.formrecognizer.*;
import com.azure.ai.formrecognizer.models.AnalyzeResult;
import com.azure.ai.formrecognizer.models.DocumentLine;
import com.azure.ai.formrecognizer.models.AnalyzedDocument;
import com.azure.ai.formrecognizer.models.DocumentOperationResult;
import com.azure.ai.formrecognizer.models.DocumentWord;
import com.azure.ai.formrecognizer.models.DocumentTable;
import com.azure.core.credential.AzureKeyCredential;
import com.azure.core.util.polling.SyncPoller;

import java.util.List;
import java.util.Arrays;

public class FormRecognizer {

    // set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
    private static final String endpoint = "<your-endpoint>";
    private static final String key = "<your-key>";

    public static void main(String[] args) {

        // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
        DocumentAnalysisClient client = new DocumentAnalysisClientBuilder()
            .credential(new AzureKeyCredential(key))
            .endpoint(endpoint)
            .buildClient();

        // sample document
        String documentUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png";

        String modelId = "prebuilt-read";

        SyncPoller<DocumentOperationResult, AnalyzeResult> analyzeLayoutResultPoller =
            client.beginAnalyzeDocumentFromUrl(modelId, documentUrl);

        AnalyzeResult analyzeLayoutResult = analyzeLayoutResultPoller.getFinalResult();

        // pages
        analyzeLayoutResult.getPages().forEach(documentPage -> {
            System.out.printf("Page has width: %.2f and height: %.2f, measured with unit: %s%n",
                documentPage.getWidth(),
                documentPage.getHeight(),
                documentPage.getUnit());

            // lines
            documentPage.getLines().forEach(documentLine ->
                System.out.printf("Line %s is within a bounding box %s.%n",
                    documentLine.getContent(),
                    documentLine.getBoundingBox().toString()));

            // words
            documentPage.getWords().forEach(documentWord ->
                System.out.printf("Word '%s' has a confidence score of %.2f.%n",
                    documentWord.getContent(),
                    documentWord.getConfidence()));
        });
    }
}

Important

Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
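One way to keep the key out of source code is to read both values from environment variables. As a minimal sketch (the variable names are placeholders you define yourself), you could replace the two constants near the top of the FormRecognizer class with:

    // Sketch: read the endpoint and key from environment variables instead of hardcoding them.
    // FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY are placeholder names you set yourself.
    private static final String endpoint = System.getenv("FORM_RECOGNIZER_ENDPOINT");
    private static final String key = System.getenv("FORM_RECOGNIZER_KEY");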

  1. Navigate back to your main project directory (form-recognizer-app).

  2. Build your application with the build command:

    gradle build
    
  3. Run your application with the run command:

    gradle run
    

Read Model Output

Here's a snippet of the expected output:

Page has width: 915.00 and height: 1190.00, measured with unit: pixel
Line While healthcare is still in the early stages of its Al journey, we is within a bounding box [259.0, 55.0, 816.0, 56.0, 816.0, 79.0, 259.0, 77.0].
Line are seeing pharmaceutical and other life sciences organizations is within a bounding box [258.0, 83.0, 825.0, 83.0, 825.0, 106.0, 258.0, 106.0].
Line making major investments in Al and related technologies." is within a bounding box [259.0, 112.0, 784.0, 112.0, 784.0, 136.0, 259.0, 136.0].
.
.
.
Word 'While' has a confidence score of 1.00.
Word 'healthcare' has a confidence score of 1.00.
Word 'is' has a confidence score of 1.00.

To view the entire output, visit the Azure samples repository on GitHub and see the read model output.

Next step

Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.

Reference documentation | Library source code | Package (npm) | Samples

Prerequisites

  • Azure subscription - Create one for free.

  • The latest version of Visual Studio Code or your preferred IDE. For more information, see Node.js in Visual Studio Code

  • The latest LTS version of Node.js

  • A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

    Tip

    Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.

  • After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:

    Screenshot: keys and endpoint location in the Azure portal.

Set up

  1. Create a new Node.js application: In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app named form-recognizer-app, and navigate to it.

    mkdir form-recognizer-app && cd form-recognizer-app
    
  2. Run the npm init command to initialize the application and scaffold your project.

    npm init
    
  3. Specify your project's attributes using the prompts presented in the terminal.

    • The most important attributes are name, version number, and entry point.
    • We recommend keeping index.js for the entry point name. The description, test command, GitHub repository, keywords, author, and license information are optional attributes—they can be skipped for this project.
    • Accept the suggestions in parentheses by pressing Return or Enter.
    • After you've completed the prompts, a package.json file will be created in your form-recognizer-app directory.
  4. Install the ai-form-recognizer client library and azure/identity npm packages:

    npm install @azure/ai-form-recognizer@4.0.0-beta.3 @azure/identity
    
    • Your app's package.json file will be updated with the dependencies.
  5. Create a file named index.js in the application directory.

    Tip

    • You can create a new file using PowerShell.
    • Open a PowerShell window in your project directory by holding down the Shift key and right-clicking the folder.
    • Type the following command New-Item index.js.

Read Model

To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.

  • For this example, you'll need a form document file from a URL. You can use our sample form document for this quickstart.
  • We've added the file URL value to the formUrl variable near the top of the file.
  • To analyze a given file from a URL, you'll use the beginAnalyzeDocument method and pass in prebuilt-read as the model Id.
  1. Open the index.js file in Visual Studio Code or your favorite IDE and copy and paste the following code sample into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:
const { AzureKeyCredential, DocumentAnalysisClient } = require("@azure/ai-form-recognizer");
 
function* getTextOfSpans(content, spans) {
  for (const span of spans) {
    yield content.slice(span.offset, span.offset + span.length);
  }
}

// set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
const endpoint = "<your-endpoint>";
const key = "<your-key>";

// sample document
const formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png"

async function main() {
  // create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
  const client = new DocumentAnalysisClient(endpoint, new AzureKeyCredential(key));
  const poller = await client.beginAnalyzeDocument("prebuilt-read", formUrl);

  const { content, pages, languages, styles } = await poller.pollUntilDone();

  if (pages.length <= 0) {
    console.log("No pages were extracted from the document.");
  } else {
    console.log("Pages:");
    for (const page of pages) {
      console.log("- Page", page.pageNumber, `(unit: ${page.unit})`);
      console.log(`  ${page.width}x${page.height}, angle: ${page.angle}`);
      console.log(`  ${page.lines.length} lines, ${page.words.length} words`);

      if (page.lines.length > 0) {
        console.log("  Lines:");

        for (const line of page.lines) {
          console.log(`  - "${line.content}"`);

          // The words of the line can also be iterated independently. The words are computed based on their
          // corresponding spans.
          for (const word of line.words()) {
            console.log(`    - "${word.content}"`);
          }
        }
      }
    }
  }

  if (languages.length <= 0) {
    console.log("No language spans were extracted from the document.");
  } else {
    console.log("Languages:");
    for (const languageEntry of languages) {
      console.log(
        `- Found language: ${languageEntry.languageCode} (confidence: ${languageEntry.confidence})`
      );
      for (const text of getTextOfSpans(content, languageEntry.spans)) {
        const escapedText = text.replace(/\r?\n/g, "\\n").replace(/"/g, '\\"');
        console.log(`  - "${escapedText}"`);
      }
    }
  }

  if (styles.length <= 0) {
    console.log("No text styles were extracted from the document.");
  } else {
    console.log("Styles:");
    for (const style of styles) {
      console.log(
        `- Handwritten: ${style.isHandwritten ? "yes" : "no"} (confidence=${style.confidence})`
      );

      for (const word of getTextOfSpans(content, style.spans)) {
        console.log(`  - "${word}"`);
      }
    }
  }
}

main().catch((error) => {
  console.error("An error occurred:", error);
  process.exit(1);
});

Important

Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
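One way to keep the key out of source code is to read both values from environment variables. As a minimal sketch (the variable names are placeholders you define yourself), you could replace the endpoint and key constants near the top of index.js with:

// Sketch: read the endpoint and key from environment variables instead of hardcoding them.
// FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY are placeholder names you set yourself.
const endpoint = process.env.FORM_RECOGNIZER_ENDPOINT;
const key = process.env.FORM_RECOGNIZER_KEY;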

  1. Once you've added a code sample to your application, navigate to the folder where you have your form recognizer application (form-recognizer-app).

  2. Type the following command in your terminal:

    node index.js
    

Read Model Output

Here's a snippet of the expected output:

Pages:
- Page 1 (unit: pixel)
  915x1190, angle: 0
  86 lines, 697 words
  Lines:
  - "While healthcare is still in the early stages of its Al journey, we"
    - "While"
    - "healthcare"
    - "is"
.
.
.
Languages:
- Found language: en (confidence: 0.95)
  - "While healthcare is still in the early stages of its Al journey, we\nare seeing pharmaceutical and other life sciences organizations"
  - "As pharmaceutical and other life sciences organizations invest\nin and deploy advanced technologies, they are beginning to see"
  - "are looking to incorporate automation and continuing smart"
.
.
.
No text styles were extracted from the document.

To view the entire output, visit the Azure samples repository on GitHub and see the read model output.

Next step

Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.

Reference documentation | Library source code | Package (PyPi) | Samples

Prerequisites

  • Azure subscription - Create one for free

  • Python 3.x

    • Your Python installation should include pip. You can check if you have pip installed by running pip --version on the command line. Get pip by installing the latest version of Python.
  • The latest version of Visual Studio Code or your preferred IDE. For more information, see Getting Started with Python in VS Code

  • A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Tip

Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.

  • After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:

    Screenshot: keys and endpoint location in the Azure portal.

Set up

Open a terminal window in your local environment and install the Azure Form Recognizer client library for Python with pip:

pip install azure-ai-formrecognizer==3.2.0b3
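Optionally, you can install the package into a Python virtual environment so it doesn't affect your global installation. This is a common pattern rather than a requirement of the quickstart; a minimal sketch:

python -m venv .venv
# Activate it: .venv\Scripts\activate on Windows, or source .venv/bin/activate on macOS/Linux.
pip install azure-ai-formrecognizer==3.2.0b3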

Read Model

To interact with the Form Recognizer service, you'll need to create an instance of the DocumentAnalysisClient class. To do so, you'll create an AzureKeyCredential with your key from the Azure portal and a DocumentAnalysisClient instance with the AzureKeyCredential and your Form Recognizer endpoint.

  • For this example, you'll need a form document file from a URL. You can use our sample form document for this quickstart.
  • We've added the file URL value to the formUrl variable in the analyze_read function.
  • To analyze a given file at a URL, you'll use the begin_analyze_document_from_url method and pass in prebuilt-read as the model Id. The returned value is a result object containing data about the submitted document.
  1. Create a new Python file called form_recognizer_quickstart.py in your preferred editor or IDE.

  2. Open the form_recognizer_quickstart.py file and copy and paste the following code sample into your application. Make sure you update the key and endpoint variables with values from your Azure portal Form Recognizer instance:

# import libraries
import os
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

# set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
endpoint = "<your-endpoint>"
key = "<your-key>"

def format_bounding_box(bounding_box):
    if not bounding_box:
        return "N/A"
    return ", ".join(["[{}, {}]".format(p.x, p.y) for p in bounding_box])

def analyze_read():
    # sample form document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png"

    # create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
    
    poller = document_analysis_client.begin_analyze_document_from_url(
            "prebuilt-read", formUrl)
    result = poller.result()

    print ("Document contains content: ", result.content)
    
    for idx, style in enumerate(result.styles):
        print(
            "Document contains {} content".format(
                "handwritten" if style.is_handwritten else "no handwritten"
            )
        )

    for page in result.pages:
        print("----Analyzing Read from page #{}----".format(page.page_number))
        print(
            "Page has width: {} and height: {}, measured with unit: {}".format(
                page.width, page.height, page.unit
            )
        )

        for line_idx, line in enumerate(page.lines):
            print(
                "...Line # {} has text content '{}' within bounding box '{}'".format(
                    line_idx,
                    line.content,
                    format_bounding_box(line.bounding_box),
                )
            )

        for word in page.words:
            print(
                "...Word '{}' has a confidence of {}".format(
                    word.content, word.confidence
                )
            )

    print("----------------------------------------")


if __name__ == "__main__":
    analyze_read()

Important

Remember to remove the key from your code when you're done, and never post it publicly. For production, use secure methods to store and access your credentials. For more information, see Cognitive Services security.
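One way to keep the key out of source code is to read both values from environment variables; the sample already imports os. As a minimal sketch (the variable names are placeholders you define yourself), you could replace the endpoint and key assignments with:

# Sketch: read the endpoint and key from environment variables instead of hardcoding them.
# FORM_RECOGNIZER_ENDPOINT and FORM_RECOGNIZER_KEY are placeholder names you set yourself.
endpoint = os.environ["FORM_RECOGNIZER_ENDPOINT"]
key = os.environ["FORM_RECOGNIZER_KEY"]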

  1. Once you've added a code sample to your application, navigate to the folder where you have your form_recognizer_quickstart.py file.

  2. Type the following command in your terminal:

    python form_recognizer_quickstart.py
    

Read Model Output

Here's a snippet of the expected output:

Document contains content:  While healthcare is still in the early stages of its Al journey, we
are seeing pharmaceutical and other life sciences organizations
making major investments in Al and related technologies."
.
.
.
----Analyzing Read from page #1----
Page has width: 915.0 and height: 1190.0, measured with unit: pixel
...Line # 0 has text content 'While healthcare is still in the early stages of its Al journey, we' within bounding box '[259.0, 55.0], [816.0, 56.0], [816.0, 79.0], [259.0, 77.0]'
...Line # 1 has text content 'are seeing pharmaceutical and other life sciences organizations' within bounding box '[258.0, 83.0], [825.0, 83.0], [825.0, 106.0], [258.0, 106.0]'
...Line # 2 has text content 'making major investments in Al and related technologies."' within bounding box '[259.0, 112.0], [784.0, 112.0], [784.0, 136.0], [259.0, 136.0]'
.
.
.
...Word 'While' has a confidence of 0.999
...Word 'healthcare' has a confidence of 0.995
...Word 'is' has a confidence of 0.997

To view the entire output, visit the Azure samples repository on GitHub and see the read model output.

Next step

Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.

Prerequisites

  • Azure subscription - Create one for free

  • cURL installed.

  • PowerShell version 7.* or later, or a similar command-line application. To check your PowerShell version, type Get-Host | Select-Object Version.

  • A Cognitive Services or Form Recognizer resource. Once you have your Azure subscription, create a single-service or multi-service Form Recognizer resource in the Azure portal to get your key and endpoint. You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Tip

Create a Cognitive Services resource if you plan to access multiple cognitive services under a single endpoint/key. For Form Recognizer access only, create a Form Recognizer resource. Please note that you'll need a single-service resource if you intend to use Azure Active Directory authentication.

  • After your resource deploys, select Go to resource. You need the key and endpoint from the resource you create to connect your application to the Form Recognizer API. You'll paste your key and endpoint into the code below later in the quickstart:

    Screenshot: keys and endpoint location in the Azure portal.

Read Model

Form Recognizer v3.0 consolidates document analysis into a pair of operations: an analyze document (POST) request and a get result (GET) request. The modelId is used in the POST operation and the resultId in the GET operation.

  • For this example, you'll need a form document file from a URI. You can use our sample form document for this quickstart.
  • We've added the file URI value to the POST curl command below.

POST Request

Before you run the following cURL command, make the following changes:

  1. Replace {endpoint} with the endpoint value from your Azure portal Form Recognizer instance.
  2. Replace {key} with the key value from your Azure portal Form Recognizer instance.
curl -v -i -X POST "{endpoint}/formrecognizer/documentModels/prebuilt-read:analyze?api-version=2022-06-30" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: {key}" --data-ascii "{'urlSource': 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/rest-api/read.png'}"

Operation-Location

You'll receive a 202 (Success) response that includes an Operation-Location header. The value of this header contains a resultId that can be queried to get the status of the asynchronous operation:

Screenshot: Operation-Location header in the response.
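For illustration, the header value has roughly the same shape as the GET URL used in the next step (your actual endpoint and result ID come from the response):

Operation-Location: {endpoint}/formrecognizer/documentModels/prebuilt-read/analyzeResults/{resultId}?api-version=2022-06-30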

Get Request

After you've called the Analyze document API, call the Get analyze result API to get the status of the operation and the extracted data. Before you run the command, make these changes:

  1. Replace {endpoint} with the endpoint value from your Azure portal Form Recognizer instance.
  2. Replace {key} with the key value from your Azure portal Form Recognizer instance.
  3. Replace {resultId} with the result ID from the Operation-Location header.
curl -v -X GET "{endpoint}/formrecognizer/documentModels/prebuilt-read/analyzeResults/{resultId}?api-version=2022-06-30" -H "Ocp-Apim-Subscription-Key: {key}"

Read Model Output

You'll receive a 200 (Success) response with JSON output. The first field, "status", indicates the status of the operation. If the operation isn't complete, the value of "status" will be "running" or "notStarted", and you should call the API again, either manually or through a script. We recommend an interval of one second or more between calls.
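If you want to script the polling, here's a minimal bash sketch built around the GET request above. It assumes you've stored your values in the ENDPOINT, KEY, and RESULT_ID shell variables and uses grep only to pull out the status field; adapt it to your environment:

# Sketch: poll the analyze result until the operation finishes.
# ENDPOINT, KEY, and RESULT_ID are shell variables you set beforehand.
while true; do
  response=$(curl -s "$ENDPOINT/formrecognizer/documentModels/prebuilt-read/analyzeResults/$RESULT_ID?api-version=2022-06-30" \
    -H "Ocp-Apim-Subscription-Key: $KEY")
  status=$(echo "$response" | grep -o '"status": *"[^"]*"' | head -n 1)
  echo "$status"
  case "$status" in
    *succeeded*|*failed*) break ;;
  esac
  sleep 1   # wait at least one second between calls
done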

{
    "status": "succeeded",
    "createdDateTime": "2022-04-08T00:36:48Z",
    "lastUpdatedDateTime": "2022-04-08T00:36:50Z",
    "analyzeResult": {
        "apiVersion": "2022-06-30",
        "modelId": "prebuilt-read",
        "stringIndexType": "textElements",
        "content": "While healthcare is still in the early stages of its Al journey, we\nare seeing...",
        "pages": [
            {
                "pageNumber": 1,
                "angle": 0,
                "width": 915,
                "height": 1190,
                "unit": "pixel",
                "words": [
                    {
                        "content": "While",
                        "boundingBox": [
                            260,
                            56,
                            307,
                            56,
                            306,
                            76,
                            260,
                            76
                        ],
                        "confidence": 0.999,
                        "span": {
                            "offset": 0,
                            "length": 5
                        }
                    }
                ]
            }
        ]
    }
}

To view the entire output, visit the Azure samples repository on GitHub and see the read model output.

Next step

Try the layout model, which can extract selection marks and table structures in addition to what the read model offers.