Quickstart: Use the Read client library or REST API

Get started with the Read REST API or client libraries. The Read service provides you with AI algorithms for extracting visible text from images and returning it as structured strings. Follow these steps to install a package to your application and try out the sample code for basic tasks.

Use the OCR client library to read printed and handwritten text from an image.

Reference documentation | Library source code | Package (NuGet) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The Visual Studio IDE or current version of .NET Core.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new C# application

Using Visual Studio, create a new .NET Core application.

Install the client library

Once you've created a new project, install the client library by right-clicking on the project solution in the Solution Explorer and selecting Manage NuGet Packages. In the package manager that opens select Browse, check Include prerelease, and search for Microsoft.Azure.CognitiveServices.Vision.ComputerVision. Select version 7.0.0, and then Install.

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

From the project directory, open the Program.cs file in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

In the application's Program class, create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

using System;
using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System.Threading.Tasks;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;
using System.Threading;
using System.Linq;

namespace ComputerVisionQuickstart
{
    class Program
    {
        // Add your Computer Vision subscription key and endpoint
        static string subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
        static string endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials. For example, Azure key vault.

In the application's Main method, add calls for the methods used in this quickstart. You will create these later.

ComputerVisionClient client = Authenticate(endpoint, subscriptionKey);

// Extract text (OCR) from a URL image using the Read API
ReadFileUrl(client, READ_TEXT_URL_IMAGE).Wait();

Object model

The following classes and interfaces handle some of the major features of the OCR .NET SDK.

Name Description
ComputerVisionClient This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.
ComputerVisionClientExtensions This class contains additional methods for the ComputerVisionClient.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for .NET:

Authenticate the client

In a new method in the Program class, instantiate a client with your endpoint and subscription key. Create a ApiKeyServiceClientCredentials object with your subscription key, and use it with your endpoint to create a ComputerVisionClient object.

/*
 * AUTHENTICATE
 * Creates a Computer Vision client used by each example.
 */
public static ComputerVisionClient Authenticate(string endpoint, string key)
{
    ComputerVisionClient client =
      new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
      { Endpoint = endpoint };
    return client;
}

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) overview. The code in this section uses the latest Computer Vision SDK release for Read 3.0 and defines a method, BatchReadFileUrl, which uses the client object to detect and extract text in the image.

Tip

You can also extract text from a local image. See the ComputerVisionClient methods, such as ReadInStreamAsync. Or, see the sample code on GitHub for scenarios involving local images.

Set up test image

In your Program class, save a reference to the URL of the image you want to extract text from. This snippet includes sample images for both printed and handwritten text.

private const string READ_TEXT_URL_IMAGE = "https://intelligentkioskstore.blob.core.windows.net/visionapi/suggestedphotos/3.png";

Call the Read API

Define the new method for reading text. Add the code below, which calls the ReadAsync method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

/*
 * READ FILE - URL 
 * Extracts text. 
 */
public static async Task ReadFileUrl(ComputerVisionClient client, string urlFile)
{
    Console.WriteLine("----------------------------------------------------------");
    Console.WriteLine("READ FILE FROM URL");
    Console.WriteLine();

    // Read text from URL
    var textHeaders = await client.ReadAsync(urlFile);
    // After the request, get the operation location (operation ID)
    string operationLocation = textHeaders.OperationLocation;
    Thread.Sleep(2000);

Get Read results

Next, get the operation ID returned from the ReadAsync call, and use it to query the service for operation results. The following code checks the operation until the results are returned. It then prints the extracted text data to the console.

// Retrieve the URI where the extracted text will be stored from the Operation-Location header.
// We only need the ID and not the full URL
const int numberOfCharsInOperationId = 36;
string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

// Extract the text
ReadOperationResult results;
Console.WriteLine($"Extracting text from URL file {Path.GetFileName(urlFile)}...");
Console.WriteLine();
do
{
    results = await client.GetReadResultAsync(Guid.Parse(operationId));
}
while ((results.Status == OperationStatusCodes.Running ||
    results.Status == OperationStatusCodes.NotStarted));

Display Read results

Add the following code to parse and display the retrieved text data, and finish the method definition.

    // Display the found text.
    Console.WriteLine();
    var textUrlFileResults = results.AnalyzeResult.ReadResults;
    foreach (ReadResult page in textUrlFileResults)
    {
        foreach (Line line in page.Lines)
        {
            Console.WriteLine(line.Text);
        }
    }
    Console.WriteLine();
}

Run the application

Run the application by clicking the Debug button at the top of the IDE window.

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition client library to read printed and handwritten text with the Read API.

Reference documentation | Library source code | Package (PiPy) | Samples

Prerequisites

  • An Azure subscription - Create one for free

  • Python 3.x

    • Your Python installation should include pip. You can check if you have pip installed by running pip --version on the command line. Get pip by installing the latest version of Python.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.

    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Install the client library

You can install the client library with:

pip install --upgrade azure-cognitiveservices-vision-computervision

Also install the Pillow library.

pip install pillow

Create a new Python application

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new Python file—quickstart-file.py, for example. Then open it in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

from array import array
import os
from PIL import Image
import sys
import time

'''
Authenticate
Authenticates your credentials and creates a client.
'''
subscription_key = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE"
endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials. For example, Azure key vault.

Object model

The following classes and interfaces handle some of the major features of the OCR Python SDK.

Name Description
ComputerVisionClientOperationsMixin This class directly handles all of the image operations, such as image analysis, text detection, and thumbnail generation.
ComputerVisionClient This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes. It implements ComputerVisionClientOperationsMixin.
VisualFeatureTypes This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Python:

Authenticate the client

Instantiate a client with your endpoint and key. Create a CognitiveServicesCredentials object with your key, and use it with your endpoint to create a ComputerVisionClient object.

computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. You do this in two parts.

Call the Read API

First, use the following code to call the read method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

'''
OCR: Read File using the Read API, extract text - remote
This example will extract text in an image, then print results, line by line.
This API call can also extract handwriting style text (not shown).
'''
print("===== Read File - remote =====")
# Get an image with text
read_image_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"

# Call API with URL and raw response (allows you to get the operation location)
read_response = computervision_client.read(read_image_url,  raw=True)

Tip

You can also read text from a local image. See the ComputerVisionClientOperationsMixin methods, such as read_in_stream. Or, see the sample code on GitHub for scenarios involving local images.

Get Read results

Next, get the operation ID returned from the read call, and use it to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

# Get the operation location (URL with an ID at the end) from the response
read_operation_location = read_response.headers["Operation-Location"]
# Grab the ID from the URL
operation_id = read_operation_location.split("/")[-1]

# Call the "GET" API and wait for it to retrieve the results 
while True:
    read_result = computervision_client.get_read_result(operation_id)
    if read_result.status not in ['notStarted', 'running']:
        break
    time.sleep(1)

# Print the detected text, line by line
if read_result.status == OperationStatusCodes.succeeded:
    for text_result in read_result.analyze_result.read_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()

Run the application

Run the application with the python command on your quickstart file.

python quickstart-file.py

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition client library to read printed and handwritten text in images.

Reference documentation | Library source code |Artifact (Maven) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of the Java Development Kit (JDK)
  • The Gradle build tool, or another dependency manager.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new Gradle project

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.

gradle init --type basic

When prompted to choose a DSL, select Kotlin.

Install the client library

This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.

Locate build.gradle.kts and open it with your preferred IDE or text editor. Then copy in the following build configuration. This configuration defines the project as a Java application whose entry point is the class ComputerVisionQuickstart. It imports the Computer Vision library.

plugins {
    java
    application
}
application { 
    mainClass.set("ComputerVisionQuickstart")
}
repositories {
    mavenCentral()
}
dependencies {
    implementation(group = "com.microsoft.azure.cognitiveservices", name = "azure-cognitiveservices-computervision", version = "1.0.6-beta")
}

Create a Java file

From your working directory, run the following command to create a project source folder:

mkdir -p src/main/java

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Navigate to the new folder and create a file called ComputerVisionQuickstart.java. Open it in your preferred editor or IDE.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Define the class ComputerVisionQuickstart. Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

import com.microsoft.azure.cognitiveservices.vision.computervision.*;
import com.microsoft.azure.cognitiveservices.vision.computervision.implementation.ComputerVisionImpl;
import com.microsoft.azure.cognitiveservices.vision.computervision.models.*;

import java.io.*;
import java.nio.file.Files;

import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

public class ComputerVisionQuickstart {

    static String subscriptionKey = "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE";
    static String endpoint = "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE";

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials. For example, Azure key vault.

In the application's main method, add calls for the methods used in this quickstart. You'll define these later.

public static void main(String[] args) {
    
    System.out.println("\nAzure Cognitive Services Computer Vision - Java Quickstart Sample");

    // Create an authenticated Computer Vision client.
    ComputerVisionClient compVisClient = Authenticate(subscriptionKey, endpoint); 

    // Read from local file
    ReadFromFile(compVisClient);
}

Object model

The following classes and interfaces handle some of the major features of the OCR Java SDK.

Name Description
ComputerVisionClient This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Java:

Authenticate the client

In a new method, instantiate a ComputerVisionClient object with your endpoint and key.

public static ComputerVisionClient Authenticate(String subscriptionKey, String endpoint){
    return ComputerVisionManager.authenticate(subscriptionKey).withEndpoint(endpoint);
}

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. This section defines a method, ReadFromFile, that takes a local file path and prints the image's text to the console.

Tip

You can also read text in a remote image referenced by URL. See the ComputerVision methods, such as read. Or, see the sample code on GitHub for scenarios involving remote images.

Set up test image

Create a resources/ folder in the src/main/ folder of your project, and add an image you'd like to read text from. You can download a sample image to use here.

Then add the following method definition to your ComputerVisionQuickstart class. Change the value of the localFilePath to match your image file.

/**
 * OCR with READ : Performs a Read Operation on a local image
 * @param client instantiated vision client
 * @param localFilePath local file path from which to perform the read operation against
 */
private static void ReadFromFile(ComputerVisionClient client) {
    System.out.println("-----------------------------------------------");
    
    String localFilePath = "src\\main\\resources\\myImage.png";
    System.out.println("Read with local file: " + localFilePath);

Call the Read API

Then, add the following code to call the readInStreamWithServiceResponseAsync method for the given image.


try {
    File rawImage = new File(localFilePath);
    byte[] localImageBytes = Files.readAllBytes(rawImage.toPath());

    // Cast Computer Vision to its implementation to expose the required methods
    ComputerVisionImpl vision = (ComputerVisionImpl) client.computerVision();

    // Read in remote image and response header
    ReadInStreamHeaders responseHeader =
            vision.readInStreamWithServiceResponseAsync(localImageBytes, null, null)
                .toBlocking()
                .single()
                .headers();

The following block of code extracts the operation ID from the response of the Read call. It uses this ID with a helper method to print the text read results to the console.

// Extract the operationLocation from the response header
String operationLocation = responseHeader.operationLocation();
System.out.println("Operation Location:" + operationLocation);

getAndPrintReadResult(vision, operationLocation);

Close out the try/catch block and the method definition.


    } catch (Exception e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    }
}

Get Read results

Then, add a definition for the helper method. This method uses the operation ID from the previous step to query the read operation and get OCR results when they're available.

/**
 * Polls for Read result and prints results to console
 * @param vision Computer Vision instance
 * @return operationLocation returned in the POST Read response header
 */
private static void getAndPrintReadResult(ComputerVision vision, String operationLocation) throws InterruptedException {
    System.out.println("Polling for Read results ...");

    // Extract OperationId from Operation Location
    String operationId = extractOperationIdFromOpLocation(operationLocation);

    boolean pollForResult = true;
    ReadOperationResult readResults = null;

    while (pollForResult) {
        // Poll for result every second
        Thread.sleep(1000);
        readResults = vision.getReadResult(UUID.fromString(operationId));

        // The results will no longer be null when the service has finished processing the request.
        if (readResults != null) {
            // Get request status
            OperationStatusCodes status = readResults.status();

            if (status == OperationStatusCodes.FAILED || status == OperationStatusCodes.SUCCEEDED) {
                pollForResult = false;
            }
        }
    }

The rest of the method parses the OCR results and prints them to the console.

    // Print read results, page per page
    for (ReadResult pageResult : readResults.analyzeResult().readResults()) {
        System.out.println("");
        System.out.println("Printing Read results for page " + pageResult.page());
        StringBuilder builder = new StringBuilder();

        for (Line line : pageResult.lines()) {
            builder.append(line.text());
            builder.append("\n");
        }

        System.out.println(builder.toString());
    }
}

Finally, add the other helper method used above, which extracts the operation ID from the initial response.

/**
 * Extracts the OperationId from a Operation-Location returned by the POST Read operation
 * @param operationLocation
 * @return operationId
 */
private static String extractOperationIdFromOpLocation(String operationLocation) {
    if (operationLocation != null && !operationLocation.isEmpty()) {
        String[] splits = operationLocation.split("/");

        if (splits != null && splits.length > 0) {
            return splits[splits.length - 1];
        }
    }
    throw new IllegalStateException("Something went wrong: Couldn't extract the operation id from the operation location");
}

Run the application

You can build the app with:

gradle build

Run the application with the gradle run command:

gradle run

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition client library to read printed and handwritten text with the Read API.

Reference documentation | Library source code | Package (npm) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of Node.js
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Install the client library

Install the ms-rest-azure and @azure/cognitiveservices-computervision NPM package:

npm install @azure/cognitiveservices-computervision

Also install the async module:

npm install async

Your app's package.json file will be updated with the dependencies.

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Create a new file, index.js, and open it in a text editor.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

'use strict';

const async = require('async');
const fs = require('fs');
const https = require('https');
const path = require("path");
const createReadStream = require('fs').createReadStream
const sleep = require('util').promisify(setTimeout);
const ComputerVisionClient = require('@azure/cognitiveservices-computervision').ComputerVisionClient;
const ApiKeyCredentials = require('@azure/ms-rest-js').ApiKeyCredentials;

/**
 * AUTHENTICATE
 * This single client is used for all examples.
 */
const key = 'PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE';
const endpoint = 'PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE';

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials. For example, Azure key vault.

Object model

The following classes and interfaces handle some of the major features of the OCR Node.js SDK.

Name Description
ComputerVisionClient This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Node.js:

Authenticate the client

Instantiate a client with your endpoint and key. Create a ApiKeyCredentials object with your key and endpoint, and use it to create a ComputerVisionClient object.

const computerVisionClient = new ComputerVisionClient(
  new ApiKeyCredentials({ inHeader: { 'Ocp-Apim-Subscription-Key': key } }), endpoint);

Then, define a function computerVision and declare an async series with primary function and callback function. At the end of the script, you'll complete this function definition and call it.

function computerVision() {
  async.series([
    async function () {

Read printed and handwritten text

The OCR service can extract the visible text in an image and convert it to a character stream. This sample uses the Read operations.

Set up test images

Save a reference of the URL of the images you want to extract text from.

// URL images containing printed and/or handwritten text. 
// The URL can point to image files (.jpg/.png/.bmp) or multi-page files (.pdf, .tiff).
const printedTextSampleURL = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample2.jpg';
const multiLingualTextURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/MultiLingual.png';
const mixedMultiPagePDFURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/MultiPageHandwrittenForm.pdf';

Note

You can also read text from a local image. See the ComputerVisionClient methods, such as readInStream. Or, see the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the following fields in your function to denote the Read call status values.

// Status strings returned from Read API. NOTE: CASING IS SIGNIFICANT.
// Before Read 3.0, these are "Succeeded" and "Failed"
const STATUS_SUCCEEDED = "succeeded";
const STATUS_FAILED = "failed"

Add the code below, which calls the readTextFromURL function for the given images.

// Recognize text in printed image from a URL
console.log('Read printed text from URL...', printedTextSampleURL.split('/').pop());
const printedResult = await readTextFromURL(computerVisionClient, printedTextSampleURL);
printRecText(printedResult);

// Recognize multi-lingual text in a PNG from a URL
console.log('\nRead printed multi-lingual text in a PNG from URL...', multiLingualTextURL.split('/').pop());
const multiLingualResult = await readTextFromURL(computerVisionClient, multiLingualTextURL);
printRecText(multiLingualResult);

// Recognize printed text and handwritten text in a PDF from a URL
console.log('\nRead printed and handwritten text from a PDF from URL...', mixedMultiPagePDFURL.split('/').pop());
const mixedPdfResult = await readTextFromURL(computerVisionClient, mixedMultiPagePDFURL);
printRecText(mixedPdfResult);

Define the readTextFromURL function. This calls the read method on the client object, which returns an operation ID and starts an asynchronous process to read the content of the image. Then it uses the operation ID to check the operation status until the results are returned. They it returns the extracted results.

// Perform read and await the result from URL
async function readTextFromURL(client, url) {
  // To recognize text in a local image, replace client.read() with readTextInStream() as shown:
  let result = await client.read(url);
  // Operation ID is last path segment of operationLocation (a URL)
  let operation = result.operationLocation.split('/').slice(-1)[0];

  // Wait for read recognition to complete
  // result.status is initially undefined, since it's the result of read
  while (result.status !== STATUS_SUCCEEDED) { await sleep(1000); result = await client.getReadResult(operation); }
  return result.analyzeResult.readResults; // Return the first page of result. Replace [0] with the desired page if this is a multi-page file such as .pdf or .tiff.
}

Then, define the helper function printRecText, which prints the results of the Read operations to the console.

// Prints all text from Read result
function printRecText(readResults) {
  console.log('Recognized text:');
  for (const page in readResults) {
    if (readResults.length > 1) {
      console.log(`==== Page: ${page}`);
    }
    const result = readResults[page];
    if (result.lines.length) {
      for (const line of result.lines) {
        console.log(line.words.map(w => w.text).join(' '));
      }
    }
    else { console.log('No recognized text.'); }
  }
}

Close the function

Close out the computerVision function and call it.


    },
    function () {
      return new Promise((resolve) => {
        resolve();
      })
    }
  ], (err) => {
    throw (err);
  });
}

computerVision();

Run the application

Run the application with the node command on your quickstart file.

node index.js

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the OCR client library to read printed and handwritten text from images.

Reference documentation | Library source code | Package

Prerequisites

  • An Azure subscription - Create one for free
  • The latest version of Go
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.

Setting up

Create a Go project directory

In a console window (cmd, PowerShell, Terminal, Bash), create a new workspace for your Go project, named my-app, and navigate to it.

mkdir -p my-app/{src, bin, pkg}  
cd my-app

Your workspace will contain three folders:

  • src - This directory will contain source code and packages. Any packages installed with the go get command will go in this directory.
  • pkg - This directory will contain the compiled Go package objects. These files all have an .a extension.
  • bin - This directory will contain the binary executable files that are created when you run go install.

Tip

To learn more about the structure of a Go workspace, see the Go language documentation. This guide includes information for setting $GOPATH and $GOROOT.

Install the client library for Go

Next, install the client library for Go:

go get -u https://github.com/Azure/azure-sdk-for-go/tree/master/services/cognitiveservices/v2.1/computervision

or if you use dep, within your repo run:

dep ensure -add https://github.com/Azure/azure-sdk-for-go/tree/master/services/cognitiveservices/v2.1/computervision

Create a Go application

Next, create a file in the src directory named sample-app.go:

cd src
touch sample-app.go

Tip

Want to view the whole quickstart code file at once? You can find it on GitHub, which contains the code examples in this quickstart.

Open sample-app.go in your preferred IDE or text editor.

Declare a context at the root of your script. You'll need this object to execute most Computer Vision function calls.

Find the subscription key and endpoint

Go to the Azure portal. If the Computer Vision resource you created in the Prerequisites section deployed successfully, click the Go to Resource button under Next Steps. You can find your subscription key and endpoint in the resource's key and endpoint page, under resource management.

Create variables for your Computer Vision subscription key and endpoint. Paste your subscription key and endpoint into the following code where indicated. Your Computer Vision endpoint has the form https://<your_computer_vision_resource_name>.cognitiveservices.azure.com/.

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.0/computervision"
    "github.com/Azure/go-autorest/autorest"
    "io"
    "log"
    "os"
    "strings"
    "time"
)

// Declare global so don't have to pass it to all of the tasks.
var computerVisionContext context.Context

func main() {
    computerVisionKey := "PASTE_YOUR_COMPUTER_VISION_SUBSCRIPTION_KEY_HERE"
    endpointURL := "PASTE_YOUR_COMPUTER_VISION_ENDPOINT_HERE"

Important

Remember to remove the subscription key from your code when you're done, and never post it publicly. For production, consider using a secure way of storing and accessing your credentials. For example, Azure key vault.

Next, you'll begin adding code to carry out different OCR operations.

Object model

The following classes and interfaces handle some of the major features of the OCR Go SDK.

Name Description
BaseClient This class is needed for all Computer Vision functionality, such as image analysis and text reading. You instantiate it with your subscription information, and you use it to do most image operations.
ReadOperationResult This type contains the results of a Batch Read operation.

Code examples

These code snippets show you how to do the following tasks with the OCR client library for Go:

Authenticate the client

Note

This step assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

Create a main function and add the following code to it to instantiate a client with your endpoint and key.

/*  
 * Configure the Computer Vision client
 */
computerVisionClient := computervision.New(endpointURL);
computerVisionClient.Authorizer = autorest.NewCognitiveServicesAuthorizer(computerVisionKey)

computerVisionContext = context.Background()
/*
 * END - Configure the Computer Vision client
 */

Read printed and handwritten text

The OCR service can read visible text in an image and convert it to a character stream. The code in this section defines a function, RecognizeTextReadAPIRemoteImage, which uses the client object to detect and extract printed or handwritten text in the image.

Add the sample image reference and function call in your main function.

// Analyze text in an image, remote
BatchReadFileRemoteImage(computerVisionClient, printedImageURL)

Tip

You can also extract text from a local image. See the BaseClient methods, such as BatchReadFileInStream. Or, see the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the new function for reading text, RecognizeTextReadAPIRemoteImage. Add the code below, which calls the BatchReadFile method for the given image. This method returns an operation ID and starts an asynchronous process to read the content of the image.

func BatchReadFileRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("BATCH READ FILE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // The response contains a field called "Operation-Location", 
    // which is a URL with an ID that you'll use for GetReadOperationResult to access OCR results.
    textHeaders, err := client.BatchReadFile(computerVisionContext, remoteImage)
    if err != nil { log.Fatal(err) }

    // Use ExtractHeader from the autorest library to get the Operation-Location URL
    operationLocation := autorest.ExtractHeaderValue("Operation-Location", textHeaders.Response)

    numberOfCharsInOperationId := 36
    operationId := string(operationLocation[len(operationLocation)-numberOfCharsInOperationId : len(operationLocation)])

Get Read results

Next, get the operation ID returned from the BatchReadFile call, and use it with the GetReadOperationResult method to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

readOperationResult, err := client.GetReadOperationResult(computerVisionContext, operationId)
if err != nil { log.Fatal(err) }

// Wait for the operation to complete.
i := 0
maxRetries := 10

fmt.Println("Recognizing text in a remote image with the batch Read API ...")
for readOperationResult.Status != computervision.Failed &&
        readOperationResult.Status != computervision.Succeeded {
    if i >= maxRetries {
        break
    }
    i++

    fmt.Printf("Server status: %v, waiting %v seconds...\n", readOperationResult.Status, i)
    time.Sleep(1 * time.Second)

    readOperationResult, err = client.GetReadOperationResult(computerVisionContext, operationId)
    if err != nil { log.Fatal(err) }
}

Display Read results

Add the following code to parse and display the retrieved text data, and finish the function definition.

// Display the results.
fmt.Println()
for _, recResult := range *(readOperationResult.RecognitionResults) {
    for _, line := range *recResult.Lines {
        fmt.Println(*line.Text)
    }
}

Run the application

Run the application from your application directory with the go run command.

go run sample-app.go

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to install the OCR client library and use the Read API. Next, learn more about the Read API features.

Use the Optical character recognition REST API to read printed and handwritten text.

Note

This quickstart uses cURL commands to call the REST API. You can also call the REST API using a programming language. See the GitHub samples for examples in C#, Python, Java, JavaScript, and Go.

Prerequisites

  • An Azure subscription - Create one for free
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • cURL installed

Extract printed and handwritten text

The OCR service can extract visible text in an image or document and convert it to a character stream. For more information on text extraction, see the Optical character recognition (OCR) overview.

Call the Read API

To create and run the sample, do the following steps:

  1. Copy the following command into a text editor.
  2. Make the following changes in the command where needed:
    1. Replace the value of <subscriptionKey> with your subscription key.
    2. Replace the first part of the request URL (westcentralus) with the text in your own endpoint URL.

      Note

      New resources created after July 1, 2019, will use custom subdomain names. For more information and a complete list of regional endpoints, see Custom subdomain names for Cognitive Services.

    3. Optionally, change the image URL in the request body (https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\) to the URL of a different image to be analyzed.
  3. Open a command prompt window.
  4. Paste the command from the text editor into the command prompt window, and then run the command.
curl -v -X POST "https://westcentralus.api.cognitive.microsoft.com/vision/v3.2/read/analyze" -H "Content-Type: application/json" -H "Ocp-Apim-Subscription-Key: <subscription key>" --data-ascii "{\"url\":\"https://upload.wikimedia.org/wikipedia/commons/thumb/a/af/Atomist_quote_from_Democritus.png/338px-Atomist_quote_from_Democritus.png\"}"

The response will include an Operation-Location header, whose value is a unique URL. You use this URL to query the results of the Read operation. The URL expires in 48 hours.

How to use preview features

For the preview languages and features, see How to specify the model version to use the latest preview. The preview model includes any enhancements to the currently GA languages and features.

Get Read results

  1. Copy the following command into your text editor.
  2. Replace the URL with the Operation-Location value you copied in the previous step.
  3. Make the following changes in the command where needed:
    1. Replace the value of <subscriptionKey> with your subscription key.
  4. Open a command prompt window.
  5. Paste the command from the text editor into the command prompt window, and then run the command.
curl -v -X GET "https://westcentralus.api.cognitive.microsoft.com/vision/v3.2/read/analyzeResults/{operationId}" -H "Ocp-Apim-Subscription-Key: {subscription key}" --data-ascii "{body}" 

Examine the response

A successful response is returned in JSON. The sample application parses and displays a successful response in the command prompt window, similar to the following example:

{
  "status": "succeeded",
  "createdDateTime": "2021-04-08T21:56:17.6819115+00:00",
  "lastUpdatedDateTime": "2021-04-08T21:56:18.4161316+00:00",
  "analyzeResult": {
    "version": "3.2",
    "readResults": [
      {
        "page": 1,
        "angle": 0,
        "width": 338,
        "height": 479,
        "unit": "pixel",
        "lines": [
          {
            "boundingBox": [
              25,
              14,
              318,
              14,
              318,
              59,
              25,
              59
            ],
            "text": "NOTHING",
            "appearance": {
              "style": {
                "name": "other",
                "confidence": 0.971
              }
            },
            "words": [
              {
                "boundingBox": [
                  27,
                  15,
                  294,
                  15,
                  294,
                  60,
                  27,
                  60
                ],
                "text": "NOTHING",
                "confidence": 0.994
              }
            ]
          }
        ]
      }
    ]
  }
}

Next steps

In this quickstart, you learned how to call the Read REST API. Next, learn more about the Read API features.