Install and run Form Recognizer containers

Azure Form Recognizer applies machine learning technology to identify and extract key-value pairs and tables from forms. It associates values and table entries with the key-value pairs and then outputs structured data that includes the relationships in the original file.

To reduce complexity and integrate a custom Form Recognizer model into your workflow automation process or other application, you can call the model by using a simple REST API. Training needs only five form documents (or one empty form and two filled-in forms), so you can quickly get accurate results that are tailored to your specific content. No heavy manual intervention or extensive data science expertise is necessary, and no data labeling or annotation is required.

Function: Form Recognizer

Features:
  • Processes PDF, PNG, and JPG files
  • Trains custom models with a minimum of five forms of the same layout
  • Extracts key-value pairs and table information
  • Uses the Azure Cognitive Services Computer Vision API Recognize Text feature to detect and extract printed text from images inside forms
  • Doesn't require annotation or labeling

If you don't have an Azure subscription, create a free account before you begin.

    Prerequisites

    Before you use Form Recognizer containers, you must meet the following prerequisites:

    Required: Docker Engine
    Purpose: You need the Docker Engine installed on a host computer. Docker provides packages that configure the Docker environment on macOS, Windows, and Linux. For a primer on Docker and container basics, see the Docker overview.

    Docker must be configured to allow the containers to connect with and send billing data to Azure.

    On Windows, Docker must also be configured to support Linux containers.

    Required: Familiarity with Docker
    Purpose: You should have a basic understanding of Docker concepts, such as registries, repositories, containers, and container images, and knowledge of basic docker commands.

    Required: The Azure CLI
    Purpose: Install the Azure CLI on your host.

    Required: Computer Vision API resource
    Purpose: To process scanned documents and images, you need a Computer Vision resource. You can access the Recognize Text feature as either an Azure resource (the REST API or SDK) or a cognitive-services-recognize-text container. The usual billing fees apply.

    Pass in both the API key and endpoints for your Computer Vision resource (Azure cloud or Cognitive Services container). Use this API key and the endpoint as {COMPUTER_VISION_API_KEY} and {COMPUTER_VISION_ENDPOINT_URI}.

    If you use the cognitive-services-recognize-text container, make sure that:

    • Your Computer Vision key for the Form Recognizer container is the key specified in the Computer Vision docker run command for the cognitive-services-recognize-text container.
    • Your billing endpoint is the container's endpoint (for example, http://localhost:5000). If you use both the Computer Vision container and Form Recognizer container together on the same host, they can't both be started with the default port of 5000.

    Required: Form Recognizer resource
    Purpose: To use these containers, you must have:

    An Azure Form Recognizer resource to get the associated API key and endpoint URI. Both values are available on the Azure portal Form Recognizer Overview and Keys pages, and both values are required to start the container.

    {FORM_RECOGNIZER_API_KEY}: One of the two available resource keys on the Keys page

    {FORM_RECOGNIZER_ENDPOINT_URI}: The endpoint as provided on the Overview page

    Gathering required parameters

    All Cognitive Services containers require three primary parameters. The end-user license agreement (EULA) must be present with a value of accept. Additionally, both an endpoint URI and an API key are needed.

    Endpoint URI {COMPUTER_VISION_ENDPOINT_URI} and {FORM_RECOGNIZER_ENDPOINT_URI}

    The Endpoint URI value is available on the Azure portal Overview page of the corresponding Cognitive Service resource. Navigate to the Overview page, hover over the Endpoint, and a Copy to clipboard icon will appear. Copy and use where needed.

    Keys {COMPUTER_VISION_API_KEY} and {FORM_RECOGNIZER_API_KEY}

    This key is used to start the container, and is available on the Azure portal's Keys page of the corresponding Cognitive Service resource. Navigate to the Keys page, and click on the Copy to clipboard icon.

    Important

    These subscription keys are used to access your Cognitive Service API. Do not share your keys. Store them securely, for example, using Azure Key Vault. We also recommend regenerating these keys regularly. Only one key is necessary to make an API call. When regenerating the first key, you can use the second key for continued access to the service.
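One way to keep the keys out of scripts is to read them from the environment at run time. The following is a minimal sketch, not part of the product: it assumes you have exported four environment variables named after the placeholders used in this article, and the function name `load_settings` is hypothetical.

```python
import os

# The placeholder names come from this article; treating them as
# environment variable names is an assumption made for this sketch.
REQUIRED_NAMES = (
    "FORM_RECOGNIZER_ENDPOINT_URI",
    "FORM_RECOGNIZER_API_KEY",
    "COMPUTER_VISION_ENDPOINT_URI",
    "COMPUTER_VISION_API_KEY",
)


def load_settings(environ=None):
    """Return the four required values, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_NAMES if not environ.get(name)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_NAMES}
```

Calling `load_settings()` with no arguments reads from the process environment, so the keys never need to appear in source control.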

    Request access to the container registry

    You must first complete and submit the Cognitive Services Form Recognizer Containers access request form to request access to the container. Doing so also signs you up for Computer Vision. You don't need to sign up for the Computer Vision request form separately.

    The form requests information about you, your company, and the user scenario for which you'll use the container. After you've submitted the form, the Azure Cognitive Services team reviews it to ensure that you meet the criteria for access to the private container registry.

    Important

    You must use an email address that's associated with either a Microsoft Account (MSA) or Azure Active Directory (Azure AD) account in the form.

    If your request is approved, you'll receive an email with instructions that describe how to obtain your credentials and access the private container registry.

    Use the Docker CLI to authenticate the private container registry

    You can authenticate with the private container registry for Cognitive Services Containers in any of several ways, but the recommended method from the command line is to use the Docker CLI.

    Use the docker login command, as shown in the following example, to log in to containerpreview.azurecr.io, the private container registry for Cognitive Services Containers. Replace <username> with the user name and <password> with the password that's provided in the credentials you received from the Azure Cognitive Services team.

    docker login containerpreview.azurecr.io -u <username> -p <password>
    

    If you've secured your credentials in a text file, you can pipe the contents of that text file to the docker login command by using the cat command, as shown in the following example. Replace <passwordFile> with the path and name of the text file that contains the password and <username> with the user name that's provided in your credentials.

    cat <passwordFile> | docker login containerpreview.azurecr.io -u <username> --password-stdin
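If you script the login rather than typing it, the same pattern can be sketched in Python with the standard `subprocess` module. This is an illustrative sketch only: the helper names `login_command` and `docker_login` are hypothetical, and it assumes the Docker CLI is on the PATH. The password file is connected to standard input, so the password never appears in the argument list.

```python
import subprocess

REGISTRY = "containerpreview.azurecr.io"


def login_command(username):
    """Build the docker login argument list; the password arrives on stdin."""
    return ["docker", "login", REGISTRY, "-u", username, "--password-stdin"]


def docker_login(username, password_file):
    """Run docker login, feeding the password file's contents to stdin."""
    with open(password_file, "rb") as handle:
        return subprocess.run(login_command(username), stdin=handle, check=True)
```

With `check=True`, a failed login raises `subprocess.CalledProcessError` instead of being silently ignored.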
    

    The host computer

    The host is an x64-based computer that runs the Docker container. It can be a computer on your premises or a Docker hosting service in Azure.

    Container requirements and recommendations

    The minimum and recommended CPU cores and memory to allocate for each Form Recognizer container are described in the following table:

    Container         Minimum                 Recommended
    Form Recognizer   2 cores, 4 GB memory    4 cores, 8 GB memory
    Recognize Text    1 core, 8 GB memory     2 cores, 8 GB memory
    • Each core must be at least 2.6 gigahertz (GHz) or faster.
    • Core and memory correspond to the --cpus and --memory settings, which are used as part of the docker run command.

    Note

    The minimum and recommended values are based on Docker limits and not the host machine resources.
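As a rough illustration, the minimums in the table can be encoded and checked before you pass `--cpus` and `--memory` to docker run. This is a sketch under stated assumptions: the dictionary keys and the function name `meets_minimum` are hypothetical, and only the minimum column of the table is modeled.

```python
# Minimum values copied from the table above; the key names are
# illustrative and not part of any Docker or Azure API.
MINIMUMS = {
    "form-recognizer": {"cpus": 2, "memory_gb": 4},
    "recognize-text": {"cpus": 1, "memory_gb": 8},
}


def meets_minimum(container, cpus, memory_gb):
    """Return True if the requested Docker limits satisfy the table's minimum."""
    minimum = MINIMUMS[container]
    return cpus >= minimum["cpus"] and memory_gb >= minimum["memory_gb"]
```

For example, `meets_minimum("form-recognizer", 2, 8)` is true, while `meets_minimum("recognize-text", 1, 4)` is false because the memory limit is below 8 GB.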

    Get the container images with the docker pull command

    Container images for both the Form Recognizer and Recognize Text offerings are available in the following container registry:

    Container         Fully qualified image name
    Form Recognizer   containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer:latest
    Recognize Text    containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text:latest

    You need both containers. Note that the Recognize Text container is detailed outside of this article.

    Tip

    You can use the docker images command to list your downloaded container images. For example, the following command lists the ID, repository, and tag of each downloaded container image, formatted as a table:

    docker images --format "table {{.ID}}\t{{.Repository}}\t{{.Tag}}"
    
    IMAGE ID         REPOSITORY                TAG
    <image-id>       <repository-path/name>    <tag-name>
    

    Docker pull for the Form Recognizer container

    To get the Form Recognizer container, use the following command:

    docker pull containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer:latest
    

    Docker pull for the Recognize Text container

    To get the Recognize Text container, use the following command:

    docker pull containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text:latest
    

    How to use the container

    After the container is on the host computer, use the following process to work with the container.

    1. Run the container, with the required billing settings. More examples of the docker run command are available.
    2. Query the container's prediction endpoint.

    Run the container by using the docker run command

    Use the docker run command to run the container. Refer to gathering required parameters for details on how to get the {COMPUTER_VISION_ENDPOINT_URI}, {COMPUTER_VISION_API_KEY}, {FORM_RECOGNIZER_ENDPOINT_URI} and {FORM_RECOGNIZER_API_KEY} values.

    Examples of the docker run command are available.

    Form Recognizer

    docker run --rm -it -p 5000:5000 --memory 8g --cpus 2 \
    --mount type=bind,source=c:\input,target=/input  \
    --mount type=bind,source=c:\output,target=/output \
    containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer \
    Eula=accept \
    Billing={FORM_RECOGNIZER_ENDPOINT_URI} \
    ApiKey={FORM_RECOGNIZER_API_KEY} \
    FormRecognizer:ComputerVisionApiKey={COMPUTER_VISION_API_KEY} \
    FormRecognizer:ComputerVisionEndpointUri={COMPUTER_VISION_ENDPOINT_URI}
    

    This command:

    • Runs a Form Recognizer container from the container image.
    • Allocates 2 CPU cores and 8 gigabytes (GB) of memory.
    • Exposes TCP port 5000 and allocates a pseudo-TTY for the container.
    • Automatically removes the container after it exits. The container image is still available on the host computer.
    • Mounts an /input and an /output volume to the container.
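The same command can be assembled programmatically, which makes it easier to vary the port or resource limits. The sketch below is an assumption-laden illustration: the function name `form_recognizer_run_args` and the `settings` dictionary (keyed by this article's placeholder names) are hypothetical, while the flags and options mirror the command shown above.

```python
IMAGE = "containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer"


def form_recognizer_run_args(settings, input_dir, output_dir,
                             host_port=5000, memory="8g", cpus=2):
    """Return the docker run argument list for the Form Recognizer container."""
    return [
        "docker", "run", "--rm", "-it",
        # Publish the container's port 5000 on the chosen host port.
        "-p", f"{host_port}:5000",
        "--memory", memory,
        "--cpus", str(cpus),
        "--mount", f"type=bind,source={input_dir},target=/input",
        "--mount", f"type=bind,source={output_dir},target=/output",
        IMAGE,
        # Everything after the image name is passed to the container.
        "Eula=accept",
        f"Billing={settings['FORM_RECOGNIZER_ENDPOINT_URI']}",
        f"ApiKey={settings['FORM_RECOGNIZER_API_KEY']}",
        f"FormRecognizer:ComputerVisionApiKey={settings['COMPUTER_VISION_API_KEY']}",
        f"FormRecognizer:ComputerVisionEndpointUri={settings['COMPUTER_VISION_ENDPOINT_URI']}",
    ]
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues with paths and keys.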

    Run multiple containers on the same host

    If you intend to run multiple containers with exposed ports, make sure to run each container with a different exposed port. For example, run the first container on port 5000 and the second container on port 5001.

    You can run this container and a different Azure Cognitive Services container together on the same host. You can also run multiple instances of the same Cognitive Services container.

    Run separate containers as separate docker run commands

    For the Form Recognizer and Recognize Text combination that's hosted locally on the same host, use the following two example Docker CLI commands:

    Run the first container on port 5000.

    docker run --rm -it -p 5000:5000 --memory 4g --cpus 1 \
    --mount type=bind,source=c:\input,target=/input  \
    --mount type=bind,source=c:\output,target=/output \
    containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer \
    Eula=accept \
    Billing={FORM_RECOGNIZER_ENDPOINT_URI} \
    ApiKey={FORM_RECOGNIZER_API_KEY} \
    FormRecognizer:ComputerVisionApiKey={COMPUTER_VISION_API_KEY} \
    FormRecognizer:ComputerVisionEndpointUri={COMPUTER_VISION_ENDPOINT_URI}
    

    Run the second container on port 5001.

    docker run --rm -it -p 5001:5000 --memory 4g --cpus 1 \
    containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text \
    Eula=accept \
    Billing={COMPUTER_VISION_ENDPOINT_URI} \
    ApiKey={COMPUTER_VISION_API_KEY}
    

    Each subsequent container should be on a different port.
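The port rule can be sketched as a small helper that hands out consecutive host ports while every container keeps listening on 5000 internally. The helper names `assign_host_ports` and `publish_flag` are hypothetical and exist only for this illustration.

```python
def assign_host_ports(container_names, first_port=5000):
    """Map each container name to its own host port, starting at first_port."""
    return {name: first_port + offset
            for offset, name in enumerate(container_names)}


def publish_flag(host_port, container_port=5000):
    """Return the docker run -p arguments for one container."""
    return ["-p", f"{host_port}:{container_port}"]


ports = assign_host_ports(["form-recognizer", "recognize-text"])
for name, host_port in ports.items():
    print(name, " ".join(publish_flag(host_port)))
```

For the two containers in this section, the sketch yields `-p 5000:5000` for the first and `-p 5001:5000` for the second, matching the commands above.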

    Run separate containers with Docker Compose

    For the Form Recognizer and Recognize Text combination that's hosted locally on the same host, see the following example Docker Compose YAML file. The Recognize Text {COMPUTER_VISION_API_KEY} must be the same for both the formrecognizer and ocr containers. The {COMPUTER_VISION_ENDPOINT_URI} is used only in the ocr container, because the formrecognizer container uses the ocr name and port.

    version: '3.3'
    services:   
      ocr:
        image: "containerpreview.azurecr.io/microsoft/cognitive-services-recognize-text"
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 8g
            reservations:
              cpus: '1'
              memory: 4g
        environment:
          eula: accept
          billing: "{COMPUTER_VISION_ENDPOINT_URI}"
          apikey: "{COMPUTER_VISION_API_KEY}"
    
      formrecognizer:
        image: "containerpreview.azurecr.io/microsoft/cognitive-services-form-recognizer"
        deploy:
          resources:
            limits:
              cpus: '2'
              memory: 8g
            reservations:
              cpus: '1'
              memory: 4g
        environment:
          eula: accept
          billing: "{FORM_RECOGNIZER_ENDPOINT_URI}"
          apikey: "{FORM_RECOGNIZER_API_KEY}"
          FormRecognizer__ComputerVisionApiKey: "{COMPUTER_VISION_API_KEY}"
          FormRecognizer__ComputerVisionEndpointUri: "http://ocr:5000"
          FormRecognizer__SyncProcessTaskCancelLimitInSecs: 75
        links:
          - ocr
        volumes:
          - type: bind
            source: c:\output
            target: /output
          - type: bind
            source: c:\input
            target: /input
        ports:
          - "5000:5000"
    

    Important

    The Eula, Billing, and ApiKey, as well as the FormRecognizer:ComputerVisionApiKey and FormRecognizer:ComputerVisionEndpointUri options, must be specified to run the container; otherwise, the container won't start. For more information, see Billing.

    Query the container's prediction endpoint

    Container         Endpoint
    form-recognizer   http://localhost:5000

    Form Recognizer

    The container provides websocket-based query endpoint APIs, which you access through Form Recognizer services SDK documentation.

    By default, the Form Recognizer SDK uses the online services. To use the container, you need to change the initialization method. See the examples below.

    For C#

    Change from using this Azure-cloud initialization call:

    var config =
        FormRecognizerConfig.FromSubscription(
            "YourSubscriptionKey",
            "YourServiceRegion");
    

    to this call, which uses the container endpoint:

    var config =
        FormRecognizerConfig.FromEndpoint(
            "ws://localhost:5000/formrecognizer/v1.0-preview/custom",
            "YourSubscriptionKey");
    

    For Python

    Change from using this Azure-cloud initialization call:

    formrecognizer_config = formrecognizersdk.FormRecognizerConfig(
        subscription=formrecognizer_key, region=service_region)
    

    to this call, which uses the container endpoint:

    formrecognizer_config = formrecognizersdk.FormRecognizerConfig(
        subscription=formrecognizer_key,
        endpoint="ws://localhost:5000/formrecognizer/v1.0-preview/custom")
    

    Form Recognizer

    The container provides REST endpoint APIs, which you can find on the Form Recognizer API page.

    Validate that a container is running

    There are several ways to validate that the container is running. Locate the external IP address and exposed port of the container in question, and open a web browser. Use the request URLs below to validate that the container is running. The example request URLs use http://localhost:5000, but your container's address may differ. Always use your own container's external IP address and exposed port.

    Request URL Purpose
    http://localhost:5000/ The container provides a home page.
    http://localhost:5000/status Requested with an HTTP GET, to validate that the container is running without causing an endpoint query. This request can be used for Kubernetes liveness and readiness probes.
    http://localhost:5000/swagger The container provides a full set of documentation for the endpoints and a Try it out feature. With this feature, you can enter your settings into a web-based HTML form and make the query without having to write any code. After the query returns, an example CURL command is provided to demonstrate the HTTP headers and body format that's required.
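The /status check can also be scripted instead of opened in a browser. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and treating any HTTP 200 response as "running" is an assumption, because this article doesn't describe the response body.

```python
import urllib.error
import urllib.request


def status_url(base_url):
    """Build the /status URL from the container's base address."""
    return base_url.rstrip("/") + "/status"


def container_is_running(base_url="http://localhost:5000", timeout=5.0):
    """Return True if an HTTP GET of the /status URL answers with 200."""
    try:
        with urllib.request.urlopen(status_url(base_url), timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False
```

Replace the default base URL with your container's external IP address and exposed port, for example `container_is_running("http://localhost:5001")` for a container published on port 5001.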

    Stop the container

    To shut down the container, press Ctrl+C in the command-line environment where the container is running.

    Troubleshooting

    If you run the container with an output mount and logging enabled, the container generates log files that are helpful to troubleshoot issues that happen while starting or running the container.

    Tip

    For more troubleshooting information and guidance, see Cognitive Services containers frequently asked questions (FAQ).

    Billing

    The Form Recognizer containers send billing information to Azure by using a Form Recognizer resource on your Azure account.

    Queries to the container are billed at the pricing tier of the Azure resource that's used for the <ApiKey>.

    Azure Cognitive Services containers aren't licensed to run without being connected to the billing endpoint for metering. You must enable the containers to communicate billing information with the billing endpoint at all times. Cognitive Services containers don't send customer data, such as the image or text that's being analyzed, to Microsoft.

    Connect to Azure

    The container needs the billing argument values to run. These values allow the container to connect to the billing endpoint. The container reports usage about every 10 to 15 minutes. If the container doesn't connect to Azure within the allowed time window, the container continues to run but doesn't serve queries until the billing endpoint is restored. The connection is attempted 10 times at the same time interval of 10 to 15 minutes. If it can't connect to the billing endpoint within the 10 tries, the container stops running.
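To get a feel for the timing, the retry figures above can be turned into a rough window. This sketch assumes one reading of the paragraph, namely 10 attempts spaced 10 to 15 minutes apart. The constant and function names are illustrative only.

```python
ATTEMPTS = 10
INTERVAL_MINUTES = (10, 15)  # shortest and longest reporting interval


def retry_window_minutes(attempts=ATTEMPTS, interval=INTERVAL_MINUTES):
    """Return the (shortest, longest) total time spent on all attempts."""
    shortest, longest = interval
    return attempts * shortest, attempts * longest


print(retry_window_minutes())  # (100, 150)
```

Under that reading, a container that can't reach the billing endpoint stops after roughly 100 to 150 minutes.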

    Billing arguments

    For the docker run command to start the container, all three of the following options must be specified with valid values:

    Option    Description
    ApiKey    The API key of the Cognitive Services resource that's used to track billing information. The value of this option must be set to an API key for the provisioned resource that's specified in Billing.
    Billing   The endpoint of the Cognitive Services resource that's used to track billing information. The value of this option must be set to the endpoint URI of a provisioned Azure resource.
    Eula      Indicates that you accepted the license for the container. The value of this option must be set to accept.

    For more information about these options, see Configure containers.
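Before starting a container, you could check a docker run argument list against the three rules in the table. The sketch below is a hypothetical pre-flight check, not something the container provides: the function name is invented, and it only verifies the shape of each value, not that the key or endpoint is actually valid in Azure.

```python
from urllib.parse import urlparse


def billing_argument_errors(args):
    """Return a list of problems with the Eula, Billing, and ApiKey options."""
    # Collect Name=value pairs; other arguments are ignored.
    options = dict(arg.split("=", 1) for arg in args if "=" in arg)
    errors = []
    if options.get("Eula") != "accept":
        errors.append("Eula must be set to accept")
    billing = urlparse(options.get("Billing", ""))
    if billing.scheme not in ("http", "https") or not billing.netloc:
        errors.append("Billing must be an endpoint URI")
    if not options.get("ApiKey"):
        errors.append("ApiKey must be set")
    return errors
```

An empty list means all three options are present and well formed. Any returned message points at the option to fix before you run the container.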

    Developer samples

    Developer samples are available at our GitHub repository.

    View webinar

    Join the webinar to learn about:

    • How to deploy Cognitive Services to any machine using Docker
    • How to deploy Cognitive Services to AKS

    Summary

    In this article, you learned concepts and workflow for downloading, installing, and running Form Recognizer containers. In summary:

    • Form Recognizer provides one Linux container for Docker.
    • Container images are downloaded from the private container registry in Azure.
    • Container images run in Docker.
    • You can use either the REST API or the REST SDK to call operations in the Form Recognizer container by specifying the host URI of the container.
    • You must specify the billing information when you instantiate a container.

    Important

    Cognitive Services containers are not licensed to run without being connected to Azure for metering. Customers need to enable the containers to communicate billing information with the metering service at all times. Cognitive Services containers do not send customer data (for example, the image or text that is being analyzed) to Microsoft.

    Next steps