Quickstart: Optical character recognition (OCR)

Get started with the Computer Vision Read REST API or client libraries. The Read API provides AI algorithms for extracting text from images and returning it as structured strings. Follow these steps to install a package in your application and try out the sample code for basic tasks.

Use the OCR client library to read printed and handwritten text from a remote image. The OCR service can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) overview. The code in this section uses the latest Computer Vision SDK release for Read 3.0.

Tip

You can also extract text from a local image. See the ComputerVisionClient methods, such as ReadInStreamAsync. Or, see the sample code on GitHub for scenarios involving local images.

Reference documentation | Library source code | Package (NuGet) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The Visual Studio IDE or the current version of .NET Core.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Read printed and handwritten text

  1. Create a new C# application.

    Using Visual Studio, create a new .NET Core application.

    Install the client library

    Once you've created a new project, install the client library by right-clicking the project solution in Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens, select Browse, check Include prerelease, and search for Microsoft.Azure.CognitiveServices.Vision.ComputerVision. Select version 7.0.0, and then select Install.

  2. Find the key and endpoint.

    Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

  3. From the project directory, open the Program.cs file in your preferred editor or IDE. Replace the contents of Program.cs with the following code.

    using System;
    using System.Collections.Generic;
    using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
    using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
    using System.Threading.Tasks;
    using System.IO;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;
    using System.Threading;
    using System.Linq;
    
    namespace ComputerVisionQuickstart
    {
        class Program
        {
            // Add your Computer Vision subscription key and endpoint
            static string subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
            static string endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";
    
            private const string READ_TEXT_URL_IMAGE = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/printed_text.jpg";
    
            static void Main(string[] args)
            {
                Console.WriteLine("Azure Cognitive Services Computer Vision - .NET quickstart example");
                Console.WriteLine();
    
                ComputerVisionClient client = Authenticate(endpoint, subscriptionKey);
    
                // Extract text (OCR) from a URL image using the Read API
                ReadFileUrl(client, READ_TEXT_URL_IMAGE).Wait();
            }
    
            public static ComputerVisionClient Authenticate(string endpoint, string key)
            {
                ComputerVisionClient client =
                  new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
                  { Endpoint = endpoint };
                return client;
            }
    
            public static async Task ReadFileUrl(ComputerVisionClient client, string urlFile)
            {
                Console.WriteLine("----------------------------------------------------------");
                Console.WriteLine("READ FILE FROM URL");
                Console.WriteLine();
    
                // Read text from URL
                var textHeaders = await client.ReadAsync(urlFile);
                // After the request, get the operation location (a URL that ends with the operation ID)
                string operationLocation = textHeaders.OperationLocation;
    
                // We only need the operation ID (the last 36 characters), not the full URL
                const int numberOfCharsInOperationId = 36;
                string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);
    
                // Extract the text
                ReadOperationResult results;
                Console.WriteLine($"Extracting text from URL file {Path.GetFileName(urlFile)}...");
                Console.WriteLine();
                do
                {
                    // Wait between polls without blocking the thread
                    await Task.Delay(1000);
                    results = await client.GetReadResultAsync(Guid.Parse(operationId));
                }
                while ((results.Status == OperationStatusCodes.Running ||
                    results.Status == OperationStatusCodes.NotStarted));
    
                // Display the found text.
                Console.WriteLine();
                if (results.Status != OperationStatusCodes.Succeeded)
                {
                    // A failed operation has no analyze result to print
                    Console.WriteLine($"Read operation ended with status {results.Status}.");
                    return;
                }
                var textUrlFileResults = results.AnalyzeResult.ReadResults;
                foreach (ReadResult page in textUrlFileResults)
                {
                    foreach (Line line in page.Lines)
                    {
                        Console.WriteLine(line.Text);
                    }
                }
                Console.WriteLine();
            }
    
        }
    }
    
  4. Paste your key and endpoint into the code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

    Important

    Remember to remove the key from your code when you're done, and never post it publicly. For production, consider a secure way of storing and accessing your credentials, such as Azure Key Vault.

  5. As an optional step, see How to specify the model version. For example, to explicitly specify the latest GA model, edit the ReadAsync call as shown. Skipping the parameter or using "latest" automatically uses the most recent GA model.

      // Read text from URL with a specific model version
      var textHeaders = await client.ReadAsync(urlFile,null,null,"2022-04-30");
    
  6. Run the application.

    Click the Debug button at the top of the IDE window.
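
One way to keep the key out of source code, as the Important note in step 4 recommends, is to read both values from environment variables at startup. The idea is language-neutral; the sketch below uses Python. The variable names VISION_KEY and VISION_ENDPOINT and the function name are assumptions chosen for illustration, not names the service or the client library requires.

```python
import os


def load_vision_settings(env=None):
    """Return (key, endpoint) read from environment variables.

    VISION_KEY and VISION_ENDPOINT are hypothetical names picked for this
    sketch; use whatever names fit your deployment.
    """
    env = os.environ if env is None else env
    key = env.get("VISION_KEY", "").strip()
    endpoint = env.get("VISION_ENDPOINT", "").strip()

    # Report every missing setting at once instead of failing on the first.
    missing = [name for name, value in
               (("VISION_KEY", key), ("VISION_ENDPOINT", endpoint)) if not value]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))

    # Normalize so the endpoint always ends with exactly one slash.
    return key, endpoint.rstrip("/") + "/"
```

The returned pair can then be passed to the client constructor in place of the hard-coded placeholder strings.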


Output

Azure Cognitive Services Computer Vision - .NET quickstart example

----------------------------------------------------------
READ FILE FROM URL

Extracting text from URL file printed_text.jpg...


Nutrition Facts Amount Per Serving
Serving size: 1 bar (40g)
Serving Per Package: 4
Total Fat 13g
Saturated Fat 1.5g
Amount Per Serving
Trans Fat 0g
Calories 190
Cholesterol 0mg
ories from Fat 110
Sodium 20mg
nt Daily Values are based on Vitamin A 50%
calorie diet.

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the OCR client library to read printed and handwritten text from a remote image. The OCR service can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) overview.

Tip

You can also read text from a local image. See the ComputerVisionClientOperationsMixin methods, such as read_in_stream. Or, see the sample code on GitHub for scenarios involving local images.

Reference documentation | Library source code | Package (PyPI) | Samples

Prerequisites

  • An Azure subscription - Create one for free

  • Python 3.x

    • Your Python installation should include pip. You can check if you have pip installed by running pip --version on the command line. Get pip by installing the latest version of Python.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.

    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Read printed and handwritten text

  1. Install the client library.

    You can install the client library with:

    pip install --upgrade azure-cognitiveservices-vision-computervision
    

    Also install the Pillow library.

    pip install pillow
    
  2. Create a new Python application.

    Create a new Python file, for example quickstart-file.py. Then open it in your preferred editor or IDE.

  3. Find the key and endpoint.

    Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

  4. Replace the contents of quickstart-file.py with the following code.

    from azure.cognitiveservices.vision.computervision import ComputerVisionClient
    from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
    from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
    from msrest.authentication import CognitiveServicesCredentials
    
    from array import array
    import os
    from PIL import Image
    import sys
    import time
    
    '''
    Authenticate
    Authenticates your credentials and creates a client.
    '''
    subscription_key = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE"
    endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"
    
    computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))
    '''
    END - Authenticate
    '''
    
    '''
    OCR: Read File using the Read API, extract text - remote
    This example will extract text in an image, then print results, line by line.
    This API call can also extract handwriting style text (not shown).
    '''
    print("===== Read File - remote =====")
    # Get an image with text
    read_image_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"
    
    # Call API with URL and raw response (allows you to get the operation location)
    read_response = computervision_client.read(read_image_url,  raw=True)
    
    # Get the operation location (URL with an ID at the end) from the response
    read_operation_location = read_response.headers["Operation-Location"]
    # Grab the ID from the URL
    operation_id = read_operation_location.split("/")[-1]
    
    # Call the "GET" API and wait for it to retrieve the results 
    while True:
        read_result = computervision_client.get_read_result(operation_id)
        if read_result.status not in ['notStarted', 'running']:
            break
        time.sleep(1)
    
    # Print the detected text, line by line
    if read_result.status == OperationStatusCodes.succeeded:
        for text_result in read_result.analyze_result.read_results:
            for line in text_result.lines:
                print(line.text)
                print(line.bounding_box)
    print()
    '''
    END - Read File - remote
    '''
    
    print("End of Computer Vision quickstart.")
    
    
  5. Paste your key and endpoint into the code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

    Important

    Remember to remove the key from your code when you're done, and never post it publicly. For production, consider a secure way of storing and accessing your credentials, such as Azure Key Vault.

  6. As an optional step, see How to specify the model version. For example, to explicitly specify the latest GA model, edit the read statement as shown. Skipping the parameter or using "latest" automatically uses the most recent GA model.

       # Call API with URL and raw response (allows you to get the operation location)
       read_response = computervision_client.read(read_image_url,  raw=True, model_version="2022-04-30")
    
  7. Run the application with the python command on your quickstart file.

    python quickstart-file.py
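
The polling loop in step 4 keeps calling the service until it reports a final status, so it never gives up if an operation stalls. A minimal sketch of a bounded variant follows. The function poll_until_done, its timeout_s and interval_s parameters, and the fetch_status callable are hypothetical names for illustration and are not part of the client library; the two pending status strings are the ones the quickstart code compares against.

```python
import time

PENDING_STATUSES = ("notStarted", "running")  # status strings used in the quickstart code


def poll_until_done(fetch_status, timeout_s=30.0, interval_s=1.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until it returns a non-pending status or time runs out.

    sleep and clock are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status not in PENDING_STATUSES:
            return status
        # Give up if the next wait would go past the deadline.
        if clock() + interval_s > deadline:
            raise TimeoutError(f"Read operation still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

With the quickstart client, fetch_status could be something like lambda: computervision_client.get_read_result(operation_id).status, though that wiring is untested here.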
    

Output

===== Read File - remote =====
The quick brown fox jumps
[38.0, 650.0, 2572.0, 699.0, 2570.0, 854.0, 37.0, 815.0]
Over
[184.0, 1053.0, 508.0, 1044.0, 510.0, 1123.0, 184.0, 1128.0]
the lazy dog!
[639.0, 1011.0, 1976.0, 1026.0, 1974.0, 1158.0, 637.0, 1141.0]

End of Computer Vision quickstart.
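
Each bounding box in the output is a flat list of eight numbers. Assuming they are four (x, y) corner points listed in order starting from the top-left corner, which is consistent with the sample values above, a small helper can pair them up and derive an upright rectangle. The function names are made up for this sketch.

```python
def to_corner_points(bounding_box):
    """Group a flat [x1, y1, ..., x4, y4] list into four (x, y) tuples."""
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers (4 corner points)")
    return list(zip(bounding_box[0::2], bounding_box[1::2]))


def to_axis_aligned_rect(bounding_box):
    """Return (left, top, width, height) of the smallest upright rectangle
    that contains all four corners."""
    points = to_corner_points(bounding_box)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)


# Bounding box of "The quick brown fox jumps" from the sample output.
box = [38.0, 650.0, 2572.0, 699.0, 2570.0, 854.0, 37.0, 815.0]
corners = to_corner_points(box)   # [(38.0, 650.0), (2572.0, 699.0), (2570.0, 854.0), (37.0, 815.0)]
rect = to_axis_aligned_rect(box)  # (37.0, 650.0, 2535.0, 204.0)
```

The upright rectangle is slightly larger than the text region itself because the detected line is not perfectly horizontal.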

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the OCR client library to read printed and handwritten text from a remote image. The OCR service can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) overview.

Tip

You can also read text in a local image. See the ComputerVision methods, such as read. Or, see the sample code on GitHub for scenarios involving local images.

Reference documentation | Library source code | Artifact (Maven) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of the Java Development Kit (JDK)
  • The Gradle build tool, or another dependency manager.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Read printed and handwritten text

  1. Create a new Gradle project.

    In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

    mkdir myapp && cd myapp
    

    Run the gradle init command from your working directory. This command creates essential build files for Gradle, including build.gradle.kts, which Gradle uses to build and configure your application.

    gradle init --type basic
    

    When prompted to choose a DSL, select Kotlin.

  2. Install the client library.

    This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.

    Locate build.gradle.kts and open it with your preferred IDE or text editor. Then copy in the following build configuration. This configuration defines the project as a Java application whose entry point is the class ComputerVisionQuickstart. It imports the Computer Vision library.

    plugins {
        java
        application
    }
    application { 
        mainClass.set("ComputerVisionQuickstart")
    }
    repositories {
        mavenCentral()
    }
    dependencies {
        implementation(group = "com.microsoft.azure.cognitiveservices", name = "azure-cognitiveservices-computervision", version = "1.0.6-beta")
    }
    
  3. Set up a test image.

    Create a resources/ folder in the src/main/ folder of your project, and add an image you'd like to read text from. You can download a sample image to use here.

  4. Create a Java file.

    From your working directory, run the following command to create a project source folder:

    mkdir -p src/main/java
    

    Navigate to the new folder and create a file called ComputerVisionQuickstart.java. Open it in your preferred editor or IDE.

  5. Find the key and endpoint.

    Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

  6. Replace the contents of the file with the following code. This code defines a method, ReadFromUrl, that reads the text in a remote image and prints it to the console.

    import com.microsoft.azure.cognitiveservices.vision.computervision.*;
    import com.microsoft.azure.cognitiveservices.vision.computervision.implementation.ComputerVisionImpl;
    import com.microsoft.azure.cognitiveservices.vision.computervision.models.*;
    
    import java.io.*;
    import java.nio.file.Files;
    
    import java.util.ArrayList;
    import java.util.List;
    import java.util.UUID;
    
    public class ComputerVisionQuickstart {
    
        static String subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
        static String endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";
    
        public static void main(String[] args) {
            
            System.out.println("\nAzure Cognitive Services Computer Vision - Java Quickstart Sample");
    
            // Create an authenticated Computer Vision client.
            ComputerVisionClient compVisClient = Authenticate(subscriptionKey, endpoint); 
    
            // Read from remote image
            ReadFromUrl(compVisClient);
        }
    
        public static ComputerVisionClient Authenticate(String subscriptionKey, String endpoint){
            return ComputerVisionManager.authenticate(subscriptionKey).withEndpoint(endpoint);
        }
        
        /**
         * OCR with READ : Performs a Read Operation
         * @param client instantiated vision client
         */
        private static void ReadFromUrl(ComputerVisionClient client) {
            System.out.println("-----------------------------------------------");
            
            String remoteTextImageURL = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/printed_text.jpg";
            System.out.println("Read with URL: " + remoteTextImageURL);
    
            try {
                // Cast Computer Vision to its implementation to expose the required methods
                ComputerVisionImpl vision = (ComputerVisionImpl) client.computerVision();
    
                // Read in remote image and response header
                ReadHeaders responseHeader = vision.readWithServiceResponseAsync(remoteTextImageURL, null)
                .toBlocking()
                .single()
                .headers();
    
                // Extract the operation Id from the operationLocation header
                String operationLocation = responseHeader.operationLocation();
                System.out.println("Operation Location:" + operationLocation);
    
                getAndPrintReadResult(vision, operationLocation);
    
            } catch (Exception e) {
                System.out.println(e.getMessage());
                e.printStackTrace();
            }
        }
    
        /**
         * Extracts the OperationId from an Operation-Location returned by the POST Read operation
         * @param operationLocation
         * @return operationId
         */
        private static String extractOperationIdFromOpLocation(String operationLocation) {
            if (operationLocation != null && !operationLocation.isEmpty()) {
                String[] splits = operationLocation.split("/");
    
                if (splits != null && splits.length > 0) {
                    return splits[splits.length - 1];
                }
            }
            throw new IllegalStateException("Something went wrong: Couldn't extract the operation id from the operation location");
        }
    
        /**
         * Polls for Read result and prints results to console
         * @param vision Computer Vision instance
         * @param operationLocation Operation-Location value returned in the POST Read response header
         */
        private static void getAndPrintReadResult(ComputerVision vision, String operationLocation) throws InterruptedException {
            System.out.println("Polling for Read results ...");
    
            // Extract OperationId from Operation Location
            String operationId = extractOperationIdFromOpLocation(operationLocation);
    
            boolean pollForResult = true;
            ReadOperationResult readResults = null;
    
            while (pollForResult) {
                // Poll for result every second
                Thread.sleep(1000);
                readResults = vision.getReadResult(UUID.fromString(operationId));
    
                // The results will no longer be null when the service has finished processing the request.
                if (readResults != null) {
                    // Get request status
                    OperationStatusCodes status = readResults.status();
    
                    if (status == OperationStatusCodes.FAILED || status == OperationStatusCodes.SUCCEEDED) {
                        pollForResult = false;
                    }
                }
            }
    
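            // Stop here if the operation failed; there is no analyze result to print.
            if (readResults.status() != OperationStatusCodes.SUCCEEDED) {
                System.out.println("Read operation did not succeed: " + readResults.status());
                return;
            }
    
            // Print read results, page by page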
            for (ReadResult pageResult : readResults.analyzeResult().readResults()) {
                System.out.println("");
                System.out.println("Printing Read results for page " + pageResult.page());
                StringBuilder builder = new StringBuilder();
    
                for (Line line : pageResult.lines()) {
                    builder.append(line.text());
                    builder.append("\n");
                }
    
                System.out.println(builder.toString());
            }
        }
    }
    
  7. Paste your key and endpoint into the above code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

    Important

    Remember to remove the key from your code when you're done, and never post it publicly. For production, consider a secure way of storing and accessing your credentials, such as Azure Key Vault.

  8. Build the app with the following command:

    gradle build
    

    Then, run the application with the gradle run command:

    gradle run
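
The extractOperationIdFromOpLocation helper in step 6 keeps the last path segment of the Operation-Location value, and the polling code then parses it as a UUID. The same idea can be sketched in Python with the standard library only. The function name is an assumption for illustration, and the sketch also tolerates a trailing slash or query string, which the quickstart code does not need to handle.

```python
import uuid
from urllib.parse import urlparse


def operation_id_from_location(operation_location):
    """Return the trailing path segment of an Operation-Location URL as a UUID."""
    if not operation_location:
        raise ValueError("Operation-Location header is missing or empty")
    # Work on the path only, so a query string cannot end up in the ID.
    path = urlparse(operation_location).path.rstrip("/")
    last_segment = path.rsplit("/", 1)[-1]
    try:
        return uuid.UUID(last_segment)
    except ValueError:
        raise ValueError(f"no operation ID found in {operation_location!r}") from None
```

Parsing the segment as a UUID up front surfaces a malformed header as a clear error instead of a failed request later on.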
    

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition client library to read printed and handwritten text with the Read API. The OCR service can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) overview.

Tip

You can also read text from a local image. See the ComputerVisionClient methods, such as readInStream. Or, see the sample code on GitHub for scenarios involving local images.

Reference documentation | Library source code | Package (npm) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of Node.js
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Read printed and handwritten text

  1. Create a new Node.js application.

    In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

    mkdir myapp && cd myapp
    

    Run the npm init command to create a node application with a package.json file.

    npm init
    

    Install the client library

    Install the @azure/cognitiveservices-computervision npm package:

    npm install @azure/cognitiveservices-computervision
    

    Also install the async module:

    npm install async
    

    Your app's package.json file will be updated with the dependencies.

    Create a new file, index.js, and open it in a text editor.

  2. Find the key and endpoint.

    Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your key and endpoint in the resource's key and endpoint page, under resource management.

  3. Paste the following code into your index.js file.

    'use strict';
    
    const async = require('async');
    const fs = require('fs');
    const https = require('https');
    const path = require("path");
    const createReadStream = require('fs').createReadStream;
    const sleep = require('util').promisify(setTimeout);
    const ComputerVisionClient = require('@azure/cognitiveservices-computervision').ComputerVisionClient;
    const ApiKeyCredentials = require('@azure/ms-rest-js').ApiKeyCredentials;
    /**
     * AUTHENTICATE
     * This single client is used for all examples.
     */
    const key = 'PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE';
    const endpoint = 'PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE';
    
    const computerVisionClient = new ComputerVisionClient(
      new ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': key } }), endpoint);
    /**
     * END - Authenticate
     */
    
    function computerVision() {
      async.series([
        async function () {
    
          /**
           * OCR: READ PRINTED & HANDWRITTEN TEXT WITH THE READ API
           * Extracts text from images using OCR (optical character recognition).
           */
          console.log('-------------------------------------------------');
          console.log('READ PRINTED, HANDWRITTEN TEXT AND PDF');
          console.log();
    
          // URL images containing printed and/or handwritten text. 
          // The URL can point to image files (.jpg/.png/.bmp) or multi-page files (.pdf, .tiff).
          const printedTextSampleURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/printed_text.jpg';
    
          // Recognize text in printed image from a URL
          console.log('Read printed text from URL...', printedTextSampleURL.split('/').pop());
          const printedResult = await readTextFromURL(computerVisionClient, printedTextSampleURL);
          printRecText(printedResult);
    
          // Perform read and await the result from URL
          async function readTextFromURL(client, url) {
            // To recognize text in a local image, replace client.read() with client.readInStream().
            let result = await client.read(url);
            // Operation ID is last path segment of operationLocation (a URL)
            let operation = result.operationLocation.split('/').slice(-1)[0];
    
            // Wait for read recognition to complete
            // result.status is initially undefined, since it's the result of read
            while (result.status !== "succeeded" && result.status !== "failed") { await sleep(1000); result = await client.getReadResult(operation); }
            // Stop instead of polling forever if the service reports a failure
            if (result.status === "failed") { throw new Error('Read operation failed.'); }
            return result.analyzeResult.readResults; // Return all pages of the result. Index into the array to get a single page of a multi-page file such as .pdf or .tiff.
          }
    
          // Prints all text from Read result
          function printRecText(readResults) {
            console.log('Recognized text:');
            for (const page in readResults) {
              if (readResults.length > 1) {
                console.log(`==== Page: ${page}`);
              }
              const result = readResults[page];
              if (result.lines.length) {
                for (const line of result.lines) {
                  console.log(line.words.map(w => w.text).join(' '));
                }
              }
              else { console.log('No recognized text.'); }
            }
          }
    
          /**
           * 
           * Download the specified file in the URL to the current local folder
           * 
           */
          function downloadFilesToLocal(url, localFileName) {
            return new Promise((resolve, reject) => {
              console.log('--- Downloading file to local directory from: ' + url);
              const request = https.request(url, (res) => {
                if (res.statusCode !== 200) {
                  console.log(`Download sample file failed. Status code: ${res.statusCode}, Message: ${res.statusMessage}`);
                  reject();
                }
                var data = [];
                res.on('data', (chunk) => {
                  data.push(chunk);
                });
                res.on('end', () => {
                  console.log('   ... Downloaded successfully');
                  fs.writeFileSync(localFileName, Buffer.concat(data));
                  resolve();
                });
              });
              request.on('error', function (e) {
                console.log(e.message);
                reject();
              });
              request.end();
            });
          }
    
          /**
           * END - Recognize Printed & Handwritten Text
           */
          console.log();
          console.log('-------------------------------------------------');
          console.log('End of quickstart.');
    
        },
        function () {
          return new Promise((resolve) => {
            resolve();
          })
        }
      ], (err) => {
        // Only rethrow when a step actually reported an error
        if (err) { throw err; }
      });
    }
    
    computerVision();
    
  4. Paste your key and endpoint into the above code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

    Important

    Remember to remove the key from your code when you're done, and never post it publicly. For production, consider a secure way of storing and accessing your credentials, such as Azure Key Vault.

  5. As an optional step, see How to specify the model version. For example, to explicitly specify the latest GA model, edit the read statement as shown. Skipping the parameter or using "latest" automatically uses the most recent GA model.

      let result = await client.read(url,{modelVersion:"2022-04-30"});
    
  6. Run the application with the node command on your quickstart file.

    node index.js
    

Output

-------------------------------------------------
READ PRINTED, HANDWRITTEN TEXT AND PDF

Read printed text from URL... printed_text.jpg
Recognized text:
Nutrition Facts Amount Per Serving
Serving size: 1 bar (40g)
Serving Per Package: 4
Total Fat 13g
Saturated Fat 1.5g
Amount Per Serving
Trans Fat 0g
Calories 190
Cholesterol 0mg
ories from Fat 110
Sodium 20mg
nt Daily Values are based on Vitamin A 50%
calorie diet.

-------------------------------------------------
End of quickstart.

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition REST API to read printed and handwritten text.

Note

This quickstart uses cURL commands to call the REST API. You can also call the REST API using a programming language. See the GitHub samples for examples in C#, Python, Java, and JavaScript.

Prerequisites

  • An Azure subscription - Create one for free
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • cURL installed

Extract printed and handwritten text

The OCR service can extract visible text in an image or document and convert it to a character stream. For more information on text extraction, see the Optical character recognition (OCR) overview.

Call the Read API

To create and run the sample, do the following steps:

  1. Copy the following command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value of <subscription key> with your key.
    2. Replace the first part of the request URL (westcentralus) with the text in your own endpoint URL.

      Note

      New resources created after July 1, 2019, will use custom subdomain names. For more information and a complete list of regional endpoints, see Custom subdomain names for Cognitive Services.

    3. Optionally, change the image URL in the request body (https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png) to the URL of a different image to be analyzed.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v3.2/read/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: <subscription key>" --data-ascii "{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\"}"

The response will include an Operation-Location header, whose value is a unique URL. You use this URL to query the results of the Read operation. The URL expires in 48 hours.
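
If you call the API from code rather than cURL, you may want to pull the operation ID out of the Operation-Location value. The sketch below assumes the URL ends in analyzeResults/{operationId}, which matches the GET command used later in this quickstart. The function name operationIdFromLocation is hypothetical.

```javascript
// Hypothetical helper: extracts the operation ID from an Operation-Location URL.
// Assumes the path ends with ".../read/analyzeResults/{operationId}".
function operationIdFromLocation(operationLocation) {
  const { pathname } = new URL(operationLocation);
  const segments = pathname.split('/').filter(Boolean);
  const id = segments[segments.length - 1];
  if (segments[segments.length - 2] !== 'analyzeResults' || !id) {
    throw new Error('Unexpected Operation-Location format: ' + operationLocation);
  }
  return id;
}
```

In practice you can also use the full Operation-Location URL unchanged as the target of the GET request, which avoids parsing altogether.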

Optionally, specify the model version

As an optional step, see How to specify the model version. For example, to explicitly specify the latest GA model, use model-version=2022-04-30 as the parameter. Skipping the parameter or using model-version=latest automatically uses the most recent GA model.

curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v3.2/read/analyze?model-version=2022-04-30" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: <subscription key>" --data-ascii "{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\"}"
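
The same request can be assembled in JavaScript. The following sketch only builds the URL, headers, and body that the cURL command sends. It does not send anything. The function name buildReadRequest and its parameter names are hypothetical. The path, header names, and model-version parameter are taken from the commands above.

```javascript
// Hypothetical helper: builds the pieces of a Read request without sending it.
// modelVersion is optional; omitting it leaves the service default in effect.
function buildReadRequest(endpoint, key, imageUrl, modelVersion) {
  const url = new URL('vision/v3.2/read/analyze', endpoint);
  if (modelVersion) {
    url.searchParams.set('model-version', modelVersion);
  }
  return {
    url: url.toString(),
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Ocp-Apim-Subscription-Key': key,
      },
      // The request body carries the URL of the image to analyze.
      body: JSON.stringify({ url: imageUrl }),
    },
  };
}
```

The returned url and options could be passed to an HTTP client of your choice, for example the fetch function built into recent Node.js releases.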

Get Read results

  1. Copy the following command into your text editor.
  2. Replace the URL with the Operation-Location value you copied in the previous step.
  3. Make the following changes in the command where needed:
    1. Replace the value of <subscription key> with your subscription key.
  4. Open a command prompt window.
  5. Paste the command from the text editor into the command prompt window, and then run the command.
curl -v -X GET "https://westcentralus.api.cognitive.microsoft.com/vision/v3.2/read/analyzeResults/{operationId}" -H "Ocp-Apim-Subscription-Key: <subscription key>"
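
Because the Read operation runs asynchronously, the first GET may return a status other than succeeded, in which case you repeat the request. The following sketch shows one way to poll from JavaScript. The function name waitForReadResult, the options fetchFn, intervalMs, and maxAttempts, and their default values are all hypothetical choices. The sketch assumes a Node.js version with a global fetch (18 or later) unless you pass your own fetchFn.

```javascript
// Hypothetical helper: polls the Operation-Location URL until the Read
// operation reports "succeeded" or "failed", or until maxAttempts is reached.
async function waitForReadResult(operationLocation, key, options = {}) {
  const { fetchFn = fetch, intervalMs = 1000, maxAttempts = 30 } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchFn(operationLocation, {
      headers: { 'Ocp-Apim-Subscription-Key': key },
    });
    if (!response.ok) {
      throw new Error('Request failed with HTTP status ' + response.status);
    }
    const result = await response.json();
    if (result.status === 'succeeded' || result.status === 'failed') {
      return result;
    }
    // Still queued or running: wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Read operation did not finish after ' + maxAttempts + ' attempts');
}
```

A fixed interval keeps the sketch short. A production client might prefer a longer or growing delay to stay within the rate limits of its pricing tier.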

Examine the response

A successful response is returned in JSON. cURL displays the response in the command prompt window, similar to the following example:

{
  "status": "succeeded",
  "createdDateTime": "2021-04-08T21:56:17.6819115+00:00",
  "lastUpdatedDateTime": "2021-04-08T21:56:18.4161316+00:00",
  "analyzeResult": {
    "version": "3.2",
    "readResults": [
      {
        "page": 1,
        "angle": 0,
        "width": 338,
        "height": 479,
        "unit": "pixel",
        "lines": [
          {
            "boundingBox": [
              25,
              14,
              318,
              14,
              318,
              59,
              25,
              59
            ],
            "text": "NOTHING",
            "appearance": {
              "style": {
                "name": "other",
                "confidence": 0.971
              }
            },
            "words": [
              {
                "boundingBox": [
                  27,
                  15,
                  294,
                  15,
                  294,
                  60,
                  27,
                  60
                ],
                "text": "NOTHING",
                "confidence": 0.994
              }
            ]
          }
        ]
      }
    ]
  }
}
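
To work with this response in code, you typically walk readResults, then lines, then words. The sketch below is one way to do that for a parsed response shaped like the example above. The function names extractLines and toCorners are hypothetical, and treating each boundingBox as four (x, y) corner points is an assumption based on the eight-number arrays shown in the example.

```javascript
// Hypothetical helper: pairs the eight boundingBox numbers into four [x, y] points.
function toCorners(boundingBox) {
  const corners = [];
  for (let i = 0; i < boundingBox.length; i += 2) {
    corners.push([boundingBox[i], boundingBox[i + 1]]);
  }
  return corners;
}

// Hypothetical helper: flattens a parsed Read response into one entry per line.
// Returns an empty array unless the operation succeeded.
function extractLines(readResponse) {
  if (readResponse.status !== 'succeeded') {
    return [];
  }
  const lines = [];
  for (const page of readResponse.analyzeResult.readResults) {
    for (const line of page.lines) {
      lines.push({
        page: page.page,
        text: line.text,
        corners: toCorners(line.boundingBox),
        // Lowest word confidence gives a conservative score for the whole line.
        minWordConfidence: Math.min(...line.words.map((word) => word.confidence)),
      });
    }
  }
  return lines;
}
```

For the example response, extractLines returns a single entry with the text NOTHING on page 1.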

Next steps

In this quickstart, you learned how to call the Read REST API. Next, learn more about the Read API features.

Prerequisites

  • Sign in to Vision Studio with your Azure subscription and Cognitive Services resource. See the Get started section of the overview if you need help with this step.

Read printed and handwritten text

  1. Select the Extract text tab, and select the panel titled Extract text from images.
  2. To use the try-it-out experience, you'll need to choose a resource and acknowledge it will incur usage according to your pricing tier.
  3. Select an image from the available set, or upload your own.
  4. After you select your image, you'll see the extracted text appear in the output window. You can also select the JSON tab to see the JSON output that the API call returns.
  5. Below the try-it-out experience are next steps to start using this capability in your own application.

Next steps

In this quickstart, you used Vision Studio to access the Read API. Next, learn more about the Read API features.