Quickstart: Use the Computer Vision client library

Get started with the Computer Vision client library. Follow these steps to install the package and try out the example code for basic tasks. Computer Vision provides you with access to advanced algorithms for processing images and returning information.

Use the Computer Vision client library to:

  • Analyze an image for tags, text description, faces, adult content, and more.
  • Recognize printed and handwritten text with the Batch Read API.

Reference documentation | Library source code | Package (NuGet) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The latest version of the .NET Core SDK.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively. Example commands for setting these variables follow this list.
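
For example, you can set the variables from a Windows command prompt (replace the placeholder values with your resource's key and endpoint):

setx COMPUTER_VISION_SUBSCRIPTION_KEY "<your-key>"
setx COMPUTER_VISION_ENDPOINT "<your-endpoint>"

Or from a bash shell:

export COMPUTER_VISION_SUBSCRIPTION_KEY="<your-key>"
export COMPUTER_VISION_ENDPOINT="<your-endpoint>"

With setx, open a new console window for the change to take effect; with export, add the lines to your shell profile if you want them to persist.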

Setting up

Create a new C# application

Create a new .NET Core application in your preferred editor or IDE.

In a console window (such as cmd, PowerShell, or Bash), use the dotnet new command to create a new console app with the name computer-vision-quickstart. This command creates a simple "Hello World" C# project with a single source file: ComputerVisionQuickstart.cs.

dotnet new console -n computer-vision-quickstart

Change your directory to the newly created app folder:
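
cd computer-vision-quickstart

You can then build the application with: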

dotnet build

The build output should contain no warnings or errors.

...
Build succeeded.
 0 Warning(s)
 0 Error(s)
...

From the project directory, open the ComputerVisionQuickstart.cs file in your preferred editor or IDE. Add the following using directives:

using System;
using System.Collections.Generic;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision;
using Microsoft.Azure.CognitiveServices.Vision.ComputerVision.Models;
using System.Threading.Tasks;
using System.IO;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

In the application's Program class, create variables for your resource's Azure endpoint and key.

// Add your Computer Vision subscription key and endpoint to your environment variables. 
// Close/reopen your project for them to take effect.
static string subscriptionKey = Environment.GetEnvironmentVariable("COMPUTER_VISION_SUBSCRIPTION_KEY");
static string endpoint = Environment.GetEnvironmentVariable("COMPUTER_VISION_ENDPOINT");

Install the client library

Within the application directory, install the Computer Vision client library for .NET with the following command:

dotnet add package Microsoft.Azure.CognitiveServices.Vision.ComputerVision --version 6.0.0-preview.1

If you're using the Visual Studio IDE, the client library is available as a downloadable NuGet package.

Object model

The following classes and interfaces handle some of the major features of the Computer Vision .NET SDK.

ComputerVisionClient: This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.
ComputerVisionClientExtensions: This class contains additional methods for the ComputerVisionClient.
VisualFeatureTypes: This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for .NET:

  • Authenticate the client
  • Analyze an image
  • Read printed and handwritten text

Authenticate the client

Note

This quickstart assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

In a new method, instantiate a client with your endpoint and key. Create an ApiKeyServiceClientCredentials object with your key, and use it with your endpoint to create a ComputerVisionClient object.

/*
 * AUTHENTICATE
 * Creates a Computer Vision client used by each example.
 */
public static ComputerVisionClient Authenticate(string endpoint, string key)
{
    ComputerVisionClient client =
      new ComputerVisionClient(new ApiKeyServiceClientCredentials(key))
      { Endpoint = endpoint };
    return client;
}

You'll likely want to call this method in the Main method.

// Create a client
ComputerVisionClient client = Authenticate(endpoint, subscriptionKey);

Analyze an image

The following code defines a method, AnalyzeImageUrl, which uses the client object to analyze a remote image and print the results. The method returns a text description, categorization, list of tags, detected faces, adult content flags, main colors, and image type.

Add the method call in your Main method.

// Analyze an image to get features and other properties.
AnalyzeImageUrl(client, ANALYZE_URL_IMAGE).Wait();

Set up test image

In your Program class, save a reference to the URL of the image you want to analyze.

// URL image used for analyzing an image (image of puppy)
private const string ANALYZE_URL_IMAGE = "https://moderatorsampleimages.blob.core.windows.net/samples/sample16.png";

Note

You can also analyze a local image. See the sample code on GitHub for scenarios involving local images.

Specify visual features

Define your new method for image analysis. Add the code below, which specifies visual features you'd like to extract in your analysis. See the VisualFeatureTypes enum for a complete list.

/* 
 * ANALYZE IMAGE - URL IMAGE
 * Analyze URL image. Extracts captions, categories, tags, objects, faces, racy/adult content,
 * brands, celebrities, landmarks, color scheme, and image types.
 */
public static async Task AnalyzeImageUrl(ComputerVisionClient client, string imageUrl)
{
    Console.WriteLine("----------------------------------------------------------");
    Console.WriteLine("ANALYZE IMAGE - URL");
    Console.WriteLine();

    // Creating a list that defines the features to be extracted from the image. 
    List<VisualFeatureTypes> features = new List<VisualFeatureTypes>()
    {
        VisualFeatureTypes.Categories, VisualFeatureTypes.Description,
        VisualFeatureTypes.Faces, VisualFeatureTypes.ImageType,
        VisualFeatureTypes.Tags, VisualFeatureTypes.Adult,
        VisualFeatureTypes.Color, VisualFeatureTypes.Brands,
        VisualFeatureTypes.Objects
    };

Insert any of the following code blocks into your AnalyzeImageUrl method to implement their features. Remember to add a closing brace at the end of the method.

}

Analyze

The AnalyzeImageAsync method returns an ImageAnalysis object that contains all of the extracted information.

Console.WriteLine($"Analyzing the image {Path.GetFileName(imageUrl)}...");
Console.WriteLine();
// Analyze the URL image 
ImageAnalysis results = await client.AnalyzeImageAsync(imageUrl, features);

The following sections show how to parse this information in detail.

Get image description

The following code gets the list of generated captions for the image. See Describe images for more details.

// Summarizes the image content.
Console.WriteLine("Summary:");
foreach (var caption in results.Description.Captions)
{
    Console.WriteLine($"{caption.Text} with confidence {caption.Confidence}");
}
Console.WriteLine();

Get image category

The following code gets the detected category of the image. See Categorize images for more details.

// Display categories the image is divided into.
Console.WriteLine("Categories:");
foreach (var category in results.Categories)
{
    Console.WriteLine($"{category.Name} with confidence {category.Score}");
}
Console.WriteLine();

Get image tags

The following code gets the set of detected tags in the image. See Content tags for more details.

// Image tags and their confidence score
Console.WriteLine("Tags:");
foreach (var tag in results.Tags)
{
    Console.WriteLine($"{tag.Name} {tag.Confidence}");
}
Console.WriteLine();

Detect objects

The following code detects common objects in the image and prints them to the console. See Object detection for more details.

// Objects
Console.WriteLine("Objects:");
foreach (var obj in results.Objects)
{
    Console.WriteLine($"{obj.ObjectProperty} with confidence {obj.Confidence} at location {obj.Rectangle.X}, " +
      $"{obj.Rectangle.X + obj.Rectangle.W}, {obj.Rectangle.Y}, {obj.Rectangle.Y + obj.Rectangle.H}");
}
Console.WriteLine();

Detect brands

The following code detects corporate brands and logos in the image and prints them to the console. See Brand detection for more details.

// Well-known (or custom, if set) brands.
Console.WriteLine("Brands:");
foreach (var brand in results.Brands)
{
    Console.WriteLine($"Logo of {brand.Name} with confidence {brand.Confidence} at location {brand.Rectangle.X}, " +
      $"{brand.Rectangle.X + brand.Rectangle.W}, {brand.Rectangle.Y}, {brand.Rectangle.Y + brand.Rectangle.H}");
}
Console.WriteLine();

Detect faces

The following code returns the detected faces in the image with their rectangle coordinates and select face attributes. See Face detection for more details.

// Faces
Console.WriteLine("Faces:");
foreach (var face in results.Faces)
{
    Console.WriteLine($"A {face.Gender} of age {face.Age} at location {face.FaceRectangle.Left}, " +
      $"{face.FaceRectangle.Left}, {face.FaceRectangle.Top + face.FaceRectangle.Width}, " +
      $"{face.FaceRectangle.Top + face.FaceRectangle.Height}");
}
Console.WriteLine();

Detect adult, racy, or gory content

The following code prints the detected presence of adult content in the image. See Adult, racy, gory content for more details.

// Adult or racy content, if any.
Console.WriteLine("Adult:");
Console.WriteLine($"Has adult content: {results.Adult.IsAdultContent} with confidence {results.Adult.AdultScore}");
Console.WriteLine($"Has racy content: {results.Adult.IsRacyContent} with confidence {results.Adult.RacyScore}");
Console.WriteLine();

Get image color scheme

The following code prints the detected color attributes in the image, like the dominant colors and accent color. See Color schemes for more details.

// Identifies the color scheme.
Console.WriteLine("Color Scheme:");
Console.WriteLine("Is black and white?: " + results.Color.IsBWImg);
Console.WriteLine("Accent color: " + results.Color.AccentColor);
Console.WriteLine("Dominant background color: " + results.Color.DominantColorBackground);
Console.WriteLine("Dominant foreground color: " + results.Color.DominantColorForeground);
Console.WriteLine("Dominant colors: " + string.Join(",", results.Color.DominantColors));
Console.WriteLine();

Get domain-specific content

Computer Vision can use specialized models to do further analysis on images. See Domain-specific content for more details.

The following code parses data about detected celebrities in the image.

// Celebrities in image, if any.
Console.WriteLine("Celebrities:");
foreach (var category in results.Categories)
{
    if (category.Detail?.Celebrities != null)
    {
        foreach (var celeb in category.Detail.Celebrities)
        {
            Console.WriteLine($"{celeb.Name} with confidence {celeb.Confidence} at location {celeb.FaceRectangle.Left}, " +
              $"{celeb.FaceRectangle.Top}, {celeb.FaceRectangle.Height}, {celeb.FaceRectangle.Width}");
        }
    }
}
Console.WriteLine();

The following code parses data about detected landmarks in the image.

// Popular landmarks in image, if any.
Console.WriteLine("Landmarks:");
foreach (var category in results.Categories)
{
    if (category.Detail?.Landmarks != null)
    {
        foreach (var landmark in category.Detail.Landmarks)
        {
            Console.WriteLine($"{landmark.Name} with confidence {landmark.Confidence}");
        }
    }
}
Console.WriteLine();

Get the image type

The following code prints information about the type of image—whether it is clip art or a line drawing.

// Detects the image types.
Console.WriteLine("Image Type:");
Console.WriteLine("Clip Art Type: " + results.ImageType.ClipArtType);
Console.WriteLine("Line Drawing Type: " + results.ImageType.LineDrawingType);
Console.WriteLine();

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream. For more information on text recognition, see the Optical character recognition (OCR) conceptual doc. The code in this section defines a method, BatchReadFileUrl, which uses the client object to detect and extract text in the image.

Add the method call in your Main method.

// Read the batch text from an image (handwriting and/or printed).
BatchReadFileUrl(client, EXTRACT_TEXT_URL_HANDW).Wait();
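
If you're building the Main method up as you go, the complete method is just the pieces shown above assembled together. Here's a sketch; it introduces no new names, everything comes from the snippets in this quickstart:

static void Main(string[] args)
{
    // Create a client.
    ComputerVisionClient client = Authenticate(endpoint, subscriptionKey);

    // Analyze an image to get features and other properties.
    AnalyzeImageUrl(client, ANALYZE_URL_IMAGE).Wait();

    // Read the batch text from an image (handwriting and/or printed).
    BatchReadFileUrl(client, EXTRACT_TEXT_URL_HANDW).Wait();
}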

Set up test image

In your Program class, save a reference to the URL of the image you want to extract text from. This snippet includes sample images for both printed and handwritten text.

// URL image used for extracting handwritten text.
private const string EXTRACT_TEXT_URL_HANDW = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg";
// URL image for extracting printed text.
private const string EXTRACT_TEXT_URL_PRINT = "https://intelligentkioskstore.blob.core.windows.net/visionapi/suggestedphotos/3.png";

Note

You can also extract text from a local image. See the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the new method for reading text. Add the code below, which calls the ReadAsync method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

/*
 * BATCH READ FILE - URL IMAGE
 * Recognizes printed and handwritten text.
 * This API call offers an improvement of results over the Recognize Text calls.
 */
public static async Task BatchReadFileUrl(ComputerVisionClient client, string urlImage)
{
    Console.WriteLine("----------------------------------------------------------");
    Console.WriteLine("BATCH READ FILE - URL IMAGE");
    Console.WriteLine();

    // Read text from URL
    var textHeaders = await client.ReadAsync(urlImage, language: "en");
    // After the request, get the operation location (operation ID)
    string operationLocation = textHeaders.OperationLocation;

Get Read results

Next, get the operation ID returned from the ReadAsync call, and use it to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

// Retrieve the URI where the recognized text will be stored from the Operation-Location header.
// We only need the ID and not the full URL
const int numberOfCharsInOperationId = 36;
string operationId = operationLocation.Substring(operationLocation.Length - numberOfCharsInOperationId);

// Extract the text
// Delay is between iterations and tries a maximum of 10 times.
int i = 0;
int maxRetries = 10;
ReadOperationResult results;
Console.WriteLine($"Extracting text from URL image {Path.GetFileName(urlImage)}...");
Console.WriteLine();
do
{
    results = await client.GetReadResultAsync(Guid.Parse(operationId));
    Console.WriteLine("Server status: {0}, waiting 1 second...", results.Status);
    await Task.Delay(1000);
    if (i == maxRetries - 1)
    {
        Console.WriteLine("Server timed out.");
    }
}
while ((results.Status == OperationStatusCodes.Running ||
    results.Status == OperationStatusCodes.NotStarted) && i++ < maxRetries);

Display Read results

Add the following code to parse and display the retrieved text data, and finish the method definition.

    // Display the found text.
    Console.WriteLine();
    var textUrlFileResults = results.AnalyzeResult.ReadResults;
    foreach (ReadResult page in textUrlFileResults)
    {
        foreach (Line line in page.Lines)
        {
            Console.WriteLine(line.Text);
        }
    }
    Console.WriteLine();
}

Run the application

Run the application from your application directory with the dotnet run command.

dotnet run

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the Computer Vision client library for .NET to do basic tasks. Next, explore the reference documentation to learn more about the library.

Reference documentation | Artifact (Maven) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of the Java Development Kit (JDK)
  • The Gradle build tool, or another dependency manager.
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively.

Setting up

Create a new Gradle project

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which you use to configure your application's build and dependencies.

gradle init --type basic

When prompted to choose a DSL, select Kotlin.

Locate build.gradle.kts and open it with your preferred IDE or text editor. Then copy in the following build configuration. This configuration defines the project as a Java application whose entry point is the class ComputerVisionQuickstarts. You'll add the Computer Vision client library as a dependency in a later step.

plugins {
    java
    application
}
application { 
    mainClassName = "ComputerVisionQuickstarts"
}
repositories {
    mavenCentral()
}

From your working directory, run the following command to create a project source folder:

mkdir -p src/main/java

Navigate to the new folder and create a file called ComputerVisionQuickstarts.java. Open it in your preferred editor or IDE and add the following import statements:

import com.microsoft.azure.cognitiveservices.vision.computervision.*;
import com.microsoft.azure.cognitiveservices.vision.computervision.models.*;

import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;

import java.util.ArrayList;
import java.util.List;

Then add a class definition for ComputerVisionQuickstarts.
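
A minimal shell for the class looks like the following sketch; the method names in the comment come from later steps in this quickstart:

public class ComputerVisionQuickstarts {
    // The main method and the AnalyzeLocalImage and RecognizeTextOCRLocal
    // methods from the rest of this quickstart go inside this class.
}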

Install the client library

This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.

In your project's build.gradle.kts file, include the Computer Vision client library as a dependency.

dependencies {
    compile(group = "com.microsoft.azure.cognitiveservices", name = "azure-cognitiveservices-computervision", version = "1.0.2-beta")
}

Object model

The following classes and interfaces handle some of the major features of the Computer Vision Java SDK.

ComputerVisionClient: This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes.
ComputerVision: This class comes from the client object and directly handles all of the image operations, such as image analysis, text detection, and thumbnail generation.
VisualFeatureTypes: This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for Java:

  • Authenticate the client
  • Analyze an image
  • Read printed and handwritten text

Authenticate the client

Note

This quickstart assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

The following code adds a main method to your class and creates variables for your resource's Azure endpoint and key, reading them from the environment variables you created earlier.

public static void main(String[] args) {
    // Add your Computer Vision subscription key and endpoint to your environment
    // variables.
    // After setting, close and then re-open your command shell or project for the
    // changes to take effect.
    String subscriptionKey = System.getenv("COMPUTER_VISION_SUBSCRIPTION_KEY");
    String endpoint = System.getenv("COMPUTER_VISION_ENDPOINT");

Next, add the following code to create a ComputerVisionClient object and pass it into the other method(s), which you'll define later.

ComputerVisionClient compVisClient = ComputerVisionManager.authenticate(subscriptionKey).withEndpoint(endpoint);
// END - Create an authenticated Computer Vision client.

System.out.println("\nAzure Cognitive Services Computer Vision - Java Quickstart Sample");

// Analyze a local image
AnalyzeLocalImage(compVisClient);

// Recognize printed text with OCR in a local image
RecognizeTextOCRLocal(compVisClient);

Note

If you created the environment variable after you launched the application, you'll need to close and reopen the editor, IDE, or shell running it to access the variable.

Analyze an image

The following code defines a method, AnalyzeLocalImage, which uses the client object to analyze a local image and print the results. The method returns a text description, categorization, list of tags, detected faces, adult content flags, main colors, and image type.

Set up test image

First, create a resources/ folder in the src/main/ folder of your project, and add an image you'd like to analyze. Then add the following method definition to your ComputerVisionQuickstarts class. If necessary, change the value of pathToLocalImage to match your image file.

public static void AnalyzeLocalImage(ComputerVisionClient compVisClient) {
    /*
     * Analyze a local image:
     *
     * Set a string variable equal to the path of a local image. The image path
     * below is a relative path.
     */
    String pathToLocalImage = "src\\main\\resources\\myImage.jpg";

Note

You can also analyze a remote image using its URL. See the sample code on GitHub for scenarios involving remote images.

Specify visual features

Next, specify which visual features you'd like to extract in your analysis. See the VisualFeatureTypes enum for a complete list.

// This list defines the features to be extracted from the image.
List<VisualFeatureTypes> featuresToExtractFromLocalImage = new ArrayList<>();
featuresToExtractFromLocalImage.add(VisualFeatureTypes.DESCRIPTION);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.CATEGORIES);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.TAGS);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.FACES);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.ADULT);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.COLOR);
featuresToExtractFromLocalImage.add(VisualFeatureTypes.IMAGE_TYPE);

Analyze

This method prints detailed results to the console for each scope of image analysis. We recommend that you surround this call with a try/catch block; a sketch of the wrapper follows the snippet. The analyzeImageInStream method returns an ImageAnalysis object that contains all of the extracted information.

// Need a byte array for analyzing a local image.
File rawImage = new File(pathToLocalImage);
byte[] imageByteArray = Files.readAllBytes(rawImage.toPath());

// Call the Computer Vision service and tell it to analyze the loaded image.
ImageAnalysis analysis = compVisClient.computerVision().analyzeImageInStream().withImage(imageByteArray)
        .withVisualFeatures(featuresToExtractFromLocalImage).execute();
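
Because Files.readAllBytes throws a checked IOException, AnalyzeLocalImage won't compile without a try/catch block (or a throws clause). Here's a sketch of the wrapper, using the same general Exception handling as the OCR method later in this quickstart:

try {
    // The byte-array loading and analyzeImageInStream call from above,
    // plus the parsing snippets from the following sections, go here.
} catch (Exception e) {
    System.out.println(e.getMessage());
    e.printStackTrace();
}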

The following sections show how to parse this information in detail.

Get image description

The following code gets the list of generated captions for the image. For more information, see Describe images.

// Display image captions and confidence values.
System.out.println("\nCaptions: ");
for (ImageCaption caption : analysis.description().captions()) {
    System.out.printf("\'%s\' with confidence %f\n", caption.text(), caption.confidence());
}

Get image category

The following code gets the detected category of the image. For more information, see Categorize images.

// Display image category names and confidence values.
System.out.println("\nCategories: ");
for (Category category : analysis.categories()) {
    System.out.printf("\'%s\' with confidence %f\n", category.name(), category.score());
}

Get image tags

The following code gets the set of detected tags in the image. For more information, see Content tags.

// Display image tags and confidence values.
System.out.println("\nTags: ");
for (ImageTag tag : analysis.tags()) {
    System.out.printf("\'%s\' with confidence %f\n", tag.name(), tag.confidence());
}

Detect faces

The following code returns the detected faces in the image with their rectangle coordinates and select face attributes. For more information, see Face detection.

// Display any faces found in the image and their location.
System.out.println("\nFaces: ");
for (FaceDescription face : analysis.faces()) {
    System.out.printf("\'%s\' of age %d at location (%d, %d), (%d, %d)\n", face.gender(), face.age(),
            face.faceRectangle().left(), face.faceRectangle().top(),
            face.faceRectangle().left() + face.faceRectangle().width(),
            face.faceRectangle().top() + face.faceRectangle().height());
}

Detect adult, racy, or gory content

The following code prints the detected presence of adult content in the image. For more information, see Adult, racy, gory content.

// Display whether any adult or racy content was detected and the confidence
// values.
System.out.println("\nAdult: ");
System.out.printf("Is adult content: %b with confidence %f\n", analysis.adult().isAdultContent(),
        analysis.adult().adultScore());
System.out.printf("Has racy content: %b with confidence %f\n", analysis.adult().isRacyContent(),
        analysis.adult().racyScore());

Get image color scheme

The following code prints the detected color attributes in the image, like the dominant colors and accent color. For more information, see Color schemes.

// Display the image color scheme.
System.out.println("\nColor scheme: ");
System.out.println("Is black and white: " + analysis.color().isBWImg());
System.out.println("Accent color: " + analysis.color().accentColor());
System.out.println("Dominant background color: " + analysis.color().dominantColorBackground());
System.out.println("Dominant foreground color: " + analysis.color().dominantColorForeground());
System.out.println("Dominant colors: " + String.join(", ", analysis.color().dominantColors()));

Get domain-specific content

Computer Vision can use specialized models to do further analysis on images. For more information, see Domain-specific content.

The following code parses data about detected celebrities in the image.

// Display any celebrities detected in the image and their locations.
System.out.println("\nCelebrities: ");
for (Category category : analysis.categories()) {
    if (category.detail() != null && category.detail().celebrities() != null) {
        for (CelebritiesModel celeb : category.detail().celebrities()) {
            System.out.printf("\'%s\' with confidence %f at location (%d, %d), (%d, %d)\n", celeb.name(),
                    celeb.confidence(), celeb.faceRectangle().left(), celeb.faceRectangle().top(),
                    celeb.faceRectangle().left() + celeb.faceRectangle().width(),
                    celeb.faceRectangle().top() + celeb.faceRectangle().height());
        }
    }
}

The following code parses data about detected landmarks in the image.

// Display any landmarks detected in the image and their locations.
System.out.println("\nLandmarks: ");
for (Category category : analysis.categories()) {
    if (category.detail() != null && category.detail().landmarks() != null) {
        for (LandmarksModel landmark : category.detail().landmarks()) {
            System.out.printf("\'%s\' with confidence %f\n", landmark.name(), landmark.confidence());
        }
    }
}

Get the image type

The following code prints information about the type of image, whether it is clip art or a line drawing.

// Display what type of clip art or line drawing the image is.
System.out.println("\nImage type:");
System.out.println("Clip art type: " + analysis.imageType().clipArtType());
System.out.println("Line drawing type: " + analysis.imageType().lineDrawingType());

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream.

Note

You can also read text in a remote image using its URL. See the sample code on GitHub for scenarios involving remote images.

Call the Recognize API

First, use the following code to call the recognizePrintedTextInStream method for the given image. When you add this code to your project, you need to replace the value of localTextImagePath with the path to your local image. You can use your own image, or a sample image from the sample code repository on GitHub.

/**
 * RECOGNIZE PRINTED TEXT: Displays text found in image with angle and orientation of
 * the block of text.
 */
private static void RecognizeTextOCRLocal(ComputerVisionClient client) {
    System.out.println("-----------------------------------------------");
    System.out.println("RECOGNIZE PRINTED TEXT");
    
    // Replace this string with the path to your own image.
    String localTextImagePath = "<local image path>";
    
    try {
        File rawImage = new File(localTextImagePath);
        byte[] localImageBytes = Files.readAllBytes(rawImage.toPath());

        // Recognize printed text in local image
        OcrResult ocrResultLocal = client.computerVision().recognizePrintedTextInStream()
                .withDetectOrientation(true).withImage(localImageBytes).withLanguage(OcrLanguages.EN).execute();

The following block of code processes the returned text and parses it to print out the first word in each line. You can use this code to quickly understand the structure of an OcrResult instance.

// Print results of local image
System.out.println();
System.out.println("Recognizing printed text from a local image with OCR ...");
System.out.println("\nLanguage: " + ocrResultLocal.language());
System.out.printf("Text angle: %1.3f\n", ocrResultLocal.textAngle());
System.out.println("Orientation: " + ocrResultLocal.orientation());

boolean firstWord = true;
// Gets entire region of text block
for (OcrRegion reg : ocrResultLocal.regions()) {
    // Get one line in the text block
    for (OcrLine line : reg.lines()) {
        for (OcrWord word : line.words()) {
            // get bounding box of first word recognized (just to demo)
            if (firstWord) {
                System.out.println("\nFirst word in first line is \"" + word.text()
                        + "\" with  bounding box: " + word.boundingBox());
                firstWord = false;
                System.out.println();
            }
            System.out.print(word.text() + " ");
        }
        System.out.println();
    }
}

Finally, close out the try/catch block and the method definition.

    } catch (Exception e) {
        System.out.println(e.getMessage());
        e.printStackTrace();
    }
}

Run the application

You can build the app with:

gradle build

Run the application with the gradle run command:

gradle run

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the Computer Vision client library for Java to do basic tasks. Next, explore the reference documentation to learn more about the library.

Reference documentation | Library source code | Package (npm) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • The current version of Node.js
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively.

Setting up

Create a new Node.js application

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the npm init command to create a node application with a package.json file.

npm init

Install the client library

Install the @azure/cognitiveservices-computervision NPM package, along with the async module that this quickstart's script uses:

npm install @azure/cognitiveservices-computervision async

Your app's package.json file will be updated with the dependencies.

Prepare the Node.js script

Create a new file, index.js, and open it in a text editor. Add the following import statements.

'use strict';

const async = require('async');
const fs = require('fs');
const path = require("path");
const createReadStream = require('fs').createReadStream;
const sleep = require('util').promisify(setTimeout);
const ComputerVisionClient = require('@azure/cognitiveservices-computervision').ComputerVisionClient;
const ApiKeyCredentials = require('@azure/ms-rest-js').ApiKeyCredentials;

Then, define a function computerVision and declare an async series with a primary function and a callback function. You will add your quickstart code into the primary function, and call computerVision at the bottom of the script.

function computerVision() {
  async.series([
    async function () {
    },
    function () {
      return new Promise((resolve) => {
        resolve();
      })
    }
  ], (err) => {
    throw (err);
  });
}

computerVision();

Object model

The following classes and interfaces handle some of the major features of the Computer Vision Node.js SDK.

ComputerVisionClient: This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to do most image operations.
VisualFeatureTypes: This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for Node.js:

  • Authenticate the client
  • Analyze an image
  • Read printed and handwritten text

Authenticate the client

Create variables for your resource's Azure endpoint and key. If you created the environment variable after you launched the application, you will need to close and reopen the editor, IDE, or shell running it to access the variable.

/**
 * AUTHENTICATE
 * This single client is used for all examples.
 */
let key = process.env['COMPUTER_VISION_SUBSCRIPTION_KEY'];
let endpoint = process.env['COMPUTER_VISION_ENDPOINT'];
if (!key || !endpoint) { throw new Error('Set your environment variables for your subscription key and endpoint.'); }

Instantiate a client with your endpoint and key. Create an ApiKeyCredentials object with your key, and use it with your endpoint to create a ComputerVisionClient object.

let computerVisionClient = new ComputerVisionClient(
    new ApiKeyCredentials({inHeader: {'Ocp-Apim-Subscription-Key': key}}), endpoint);

Analyze an image

The code in this section analyzes remote images to extract various visual features. You can do these operations as part of the analyzeImage method of the client object, or you can call them using individual methods. See the reference documentation for details.

Note

You can also analyze a local image. See the sample code on GitHub for scenarios involving local images.

Get image description

The following code gets the list of generated captions for the image. See Describe images for more details.

First, define the URL of an image to analyze:

var describeURL = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample1.jpg';

Then add the following code to get the image description and print it to the console.

// Analyze URL image
console.log('Analyzing URL image to describe...', describeURL.split('/').pop());
var caption = (await computerVisionClient.describeImage(describeURL)).captions[0];
console.log(`This may be ${caption.text} (${caption.confidence.toFixed(2)} confidence)`);

Get image category

The following code gets the detected category of the image. See Categorize images for more details.

const categoryURLImage = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample16.png';

// Analyze URL image
console.log('Analyzing category in image...', categoryURLImage.split('/').pop());
let categories = (await computerVisionClient.analyzeImage(categoryURLImage)).categories;
console.log(`Categories: ${formatCategories(categories)}`);

Define the helper function formatCategories:

// Formats the image categories
function formatCategories(categories) {
  categories.sort((a, b) => b.score - a.score);
  return categories.map(cat => `${cat.name} (${cat.score.toFixed(2)})`).join(', ');
}

Get image tags

The following code gets the set of detected tags in the image. See Content tags for more details.

console.log('-------------------------------------------------');
console.log('DETECT TAGS');
console.log();

// Image of a dog.
const tagsURL = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample16.png';

// Analyze URL image
console.log('Analyzing tags in image...', tagsURL.split('/').pop());
let tags = (await computerVisionClient.analyzeImage(tagsURL, {visualFeatures: ['Tags']})).tags;
console.log(`Tags: ${formatTags(tags)}`);

Define the helper function formatTags:

// Format tags for display
function formatTags(tags) {
  return tags.map(tag => (`${tag.name} (${tag.confidence.toFixed(2)})`)).join(', ');
}

Detect objects

The following code detects common objects in the image and prints them to the console. See Object detection for more details.

// Image of a dog
const objectURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-node-sdk-samples/master/Data/image.jpg';

// Analyze a URL image
console.log('Analyzing objects in image...', objectURL.split('/').pop());
let objects = (await computerVisionClient.analyzeImage(objectURL, {visualFeatures: ['Objects']})).objects;
console.log();

// Print objects bounding box and confidence
if (objects.length) {
    console.log(`${objects.length} object${objects.length == 1 ? '' : 's'} found:`);
    for (let obj of objects) { console.log(`    ${obj.object} (${obj.confidence.toFixed(2)}) at ${formatRectObjects(obj.rectangle)}`); }
} else { console.log('No objects found.'); }

Define the helper function formatRectObjects:

// Formats the bounding box
function formatRectObjects(rect) {
  return `top=${rect.y}`.padEnd(10) + `left=${rect.x}`.padEnd(10) + `bottom=${rect.y + rect.h}`.padEnd(12) 
  + `right=${rect.x + rect.w}`.padEnd(10) + `(${rect.w}x${rect.h})`;
}

Detect brands

The following code detects corporate brands and logos in the image and prints them to the console. See Brand detection for more details.

const brandURLImage = 'https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/images/red-shirt-logo.jpg';

// Analyze URL image
console.log('Analyzing brands in image...', brandURLImage.split('/').pop());
let brands = (await computerVisionClient.analyzeImage(brandURLImage, {visualFeatures: ['Brands']})).brands;

// Print the brands found
if (brands.length) {
    console.log(`${brands.length} brand${brands.length != 1 ? 's' : ''} found:`);
    for (let brand of brands) {
        console.log(`    ${brand.name} (${brand.confidence.toFixed(2)} confidence)`);
    }
} else { console.log(`No brands found.`); }

Detect faces

The following code returns the detected faces in the image with their rectangle coordinates and select face attributes. See Face detection for more details.

const facesImageURL = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/faces.jpg';

// Analyze URL image.
console.log('Analyzing faces in image...', facesImageURL.split('/').pop());
// Get the visual feature for 'Faces' only.
let faces = (await computerVisionClient.analyzeImage(facesImageURL, {visualFeatures: ['Faces']})).faces;

// Print the bounding box, gender, and age from the faces.
if (faces.length) {
  console.log(`${faces.length} face${faces.length == 1 ? '' : 's'} found:`);
  for (let face of faces) { console.log(`    Gender: ${face.gender}`.padEnd(20) 
    + ` Age: ${face.age}`.padEnd(10) + `at ${formatRectFaces(face.faceRectangle)}`); }
} else { console.log('No faces found.'); }

Define the helper function formatRectFaces:

// Formats the bounding box
function formatRectFaces(rect) {
  return `top=${rect.top}`.padEnd(10) + `left=${rect.left}`.padEnd(10) + `bottom=${rect.top + rect.height}`.padEnd(12) 
    + `right=${rect.left + rect.width}`.padEnd(10) + `(${rect.width}x${rect.height})`;
}

Detect adult, racy, or gory content

The following code prints the detected presence of adult content in the image. See Adult, racy, gory content for more details.

Define the URL of the image to use:

// The URL image and local images are not racy/adult. 
// Try your own racy/adult images for a more effective result.
const adultURLImage = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/celebrities.jpg';

Then add the following code to detect adult content and print the results to the console.

// Function to confirm racy or not
const isIt = flag => flag ? 'is' : "isn't";

// Analyze URL image
console.log('Analyzing image for racy/adult content...', adultURLImage.split('/').pop());
var adult = (await computerVisionClient.analyzeImage(adultURLImage, {
  visualFeatures: ['Adult']
})).adult;
console.log(`This probably ${isIt(adult.isAdultContent)} adult content (${adult.adultScore.toFixed(4)} score)`);
console.log(`This probably ${isIt(adult.isRacyContent)} racy content (${adult.racyScore.toFixed(4)} score)`);

Get image color scheme

The following code prints the detected color attributes in the image, like the dominant colors and accent color. See Color schemes for more details.

const colorURLImage = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/celebrities.jpg';

// Analyze URL image
console.log('Analyzing image for color scheme...', colorURLImage.split('/').pop());
console.log();
let color = (await computerVisionClient.analyzeImage(colorURLImage, {visualFeatures: ['Color']})).color;
printColorScheme(color);

Define the helper function printColorScheme to print the details of the color scheme to the console.

// Print a detected color scheme
function printColorScheme(colors){
  console.log(`Image is in ${colors.isBWImg ? 'black and white' : 'color'}`);
  console.log(`Dominant colors: ${colors.dominantColors.join(', ')}`);
  console.log(`Dominant foreground color: ${colors.dominantColorForeground}`);
  console.log(`Dominant background color: ${colors.dominantColorBackground}`);
  console.log(`Suggested accent color: #${colors.accentColor}`);
}

Get domain-specific content

Computer Vision can use specialized models to do further analysis on images. See Domain-specific content for more details.

First, define the URL of an image to analyze:

const domainURLImage = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/landmark.jpg';

The following code parses data about detected landmarks in the image.

// Analyze URL image
console.log('Analyzing image for landmarks...', domainURLImage.split('/').pop());
let domain = (await computerVisionClient.analyzeImageByDomain('landmarks', domainURLImage)).result.landmarks;

// Prints domain-specific, recognized objects
if (domain.length) {
  console.log(`${domain.length} ${domain.length == 1 ? 'landmark' : 'landmarks'} found:`);
  for (let obj of domain) {
    console.log(`    ${obj.name}`.padEnd(20) + `(${obj.confidence.toFixed(2)} confidence)`.padEnd(20) + `${formatRectDomain(obj.faceRectangle)}`);
  }
} else {
  console.log('No landmarks found.');
}

Define the helper function formatRectDomain to parse the location data about detected landmarks.

// Formats bounding box
function formatRectDomain(rect) {
  if (!rect) return '';
  return `top=${rect.top}`.padEnd(10) + `left=${rect.left}`.padEnd(10) + `bottom=${rect.top + rect.height}`.padEnd(12) +
    `right=${rect.left + rect.width}`.padEnd(10) + `(${rect.width}x${rect.height})`;
}

Get the image type

The following code prints information about the type of image, whether it is clip art or a line drawing.

const typeURLImage = 'https://raw.githubusercontent.com/Azure-Samples/cognitive-services-python-sdk-samples/master/samples/vision/images/make_things_happen.jpg';

// Analyze URL image
console.log('Analyzing type in image...', typeURLImage.split('/').pop());
let types = (await computerVisionClient.analyzeImage(typeURLImage, {visualFeatures: ['ImageType']})).imageType;
console.log(`Image appears to be ${describeType(types)}`);

Define the helper function describeType:

function describeType(imageType) {
  if (imageType.clipArtType && imageType.clipArtType > imageType.lineDrawingType) return 'clip art';
  if (imageType.lineDrawingType && imageType.clipArtType < imageType.lineDrawingType) return 'a line drawing';
  return 'a photograph';
}

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream.

Note

You can also read text from a local image. See the sample code on GitHub for scenarios involving local images.

Set up test images

Save a reference of the URL of the images you want to extract text from.

// URL images containing printed and handwritten text
const printedText = 'https://moderatorsampleimages.blob.core.windows.net/samples/sample2.jpg';
const handwrittenText = 'https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg';

Call the Recognize API

Add the code below, which calls the recognizeText function for the given images.

// Recognize text in printed image
console.log('Recognizing printed text...', printedText.split('/').pop());
var printed = await recognizeText(computerVisionClient, 'Printed', printedText);
printRecText(printed);

// Recognize text in handwritten image
console.log('\nRecognizing handwritten text...', handwrittenText.split('/').pop());
var handwriting = await recognizeText(computerVisionClient, 'Handwritten', handwrittenText);
printRecText(handwriting);

Define the recognizeText function. This calls the recognizeText method on the client object, which returns an operation ID and starts an asynchronous process to read the content of the image. Then it uses the operation ID to check the operation at one-second intervals until the results are returned. It then returns the extracted results.

// Perform text recognition and await the result
async function recognizeText(client, mode, url) {
  // To recognize text in a local image, replace client.recognizeText() with recognizeTextInStream() as shown:
  // result = await client.recognizeTextInStream(mode, () => createReadStream(localImagePath));
  let result = await client.recognizeText(mode, url);
  // Operation ID is last path segment of operationLocation (a URL)
  let operation = result.operationLocation.split('/').slice(-1)[0];

  // Wait for text recognition to complete
  // result.status is initially undefined, since it's the result of recognizeText
  while (result.status !== 'Succeeded') { await sleep(1000); result = await client.getTextOperationResult(operation); }
  return result.recognitionResult;
}

Then, define the helper function printRecText, which prints the results of a Recognize operation to the console.

// Prints all text from OCR result
function printRecText(ocr) {
  if (ocr.lines.length) {
      console.log('Recognized text:');
      for (let line of ocr.lines) {
          console.log(line.words.map(w => w.text).join(' '));
      }
  }
  else { console.log('No recognized text.'); }
}

Run the application

Run the application with the node command on your quickstart file.

node index.js

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the Computer Vision client library for Node.js to do basic tasks. Next, explore the reference documentation to learn more about the library.

Reference documentation | Library source code | Package (PyPI) | Samples

Prerequisites

  • An Azure subscription - Create one for free
  • Python 3.x
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively.

Note

You can download the full source code for the samples presented below, which also includes examples of every function available from ComputerVisionClient.

Setting up

Create a new Python application

Create a new Python script—quickstart-file.py, for example. Then open it in your preferred editor or IDE and import the following libraries.

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import OperationStatusCodes
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

from array import array
import os
from PIL import Image
import sys
import time

Then, create variables for your resource's Azure endpoint and key.

# Add your Computer Vision subscription key to your environment variables.
if 'COMPUTER_VISION_SUBSCRIPTION_KEY' in os.environ:
    subscription_key = os.environ['COMPUTER_VISION_SUBSCRIPTION_KEY']
else:
    print("\nSet the COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.\n**Restart your shell or IDE for changes to take effect.**")
    sys.exit()
# Add your Computer Vision endpoint to your environment variables.
if 'COMPUTER_VISION_ENDPOINT' in os.environ:
    endpoint = os.environ['COMPUTER_VISION_ENDPOINT']
else:
    print("\nSet the COMPUTER_VISION_ENDPOINT environment variable.\n**Restart your shell or IDE for changes to take effect.**")
    sys.exit()

Note

If you created the environment variable after you launched the application, you will need to close and reopen the editor, IDE, or shell running it to access the variable.

Install the client library

You can install the client library with:

pip install --upgrade azure-cognitiveservices-vision-computervision

Object model

The following classes and interfaces handle some of the major features of the Computer Vision Python SDK.

ComputerVisionClientOperationsMixin: This class directly handles all of the image operations, such as image analysis, text detection, and thumbnail generation.
ComputerVisionClient: This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes. It implements ComputerVisionClientOperationsMixin.
VisualFeatureTypes: This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for Python:

  • Authenticate the client
  • Analyze an image

Authenticate the client

Note

This quickstart assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

Instantiate a client with your endpoint and key. Create a CognitiveServicesCredentials object with your key, and use it with your endpoint to create a ComputerVisionClient object.

computervision_client = ComputerVisionClient(endpoint, CognitiveServicesCredentials(subscription_key))

Analyze an image

Save a reference to the URL of an image you want to analyze.

remote_image_url = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/landmark.jpg"

Get image description

The following code gets the list of generated captions for the image. See Describe images for more details.

'''
Describe an Image - remote
This example describes the contents of an image with the confidence score.
'''
print("===== Describe an image - remote =====")
# Call API
description_results = computervision_client.describe_image(remote_image_url)

# Get the captions (descriptions) from the response, with confidence level
print("Description of remote image: ")
if (len(description_results.captions) == 0):
    print("No description detected.")
else:
    for caption in description_results.captions:
        print("'{}' with confidence {:.2f}%".format(caption.text, caption.confidence * 100))

Get image category

The following code gets the detected category of the image. See Categorize images for more details.

'''
Categorize an Image - remote
This example extracts (general) categories from a remote image with a confidence score.
'''
print("===== Categorize an image - remote =====")
# Select the visual feature(s) you want.
remote_image_features = ["categories"]
# Call API with URL and features
categorize_results_remote = computervision_client.analyze_image(remote_image_url, remote_image_features)

# Print results with confidence score
print("Categories from remote image: ")
if (len(categorize_results_remote.categories) == 0):
    print("No categories detected.")
else:
    for category in categorize_results_remote.categories:
        print("'{}' with confidence {:.2f}%".format(category.name, category.score * 100))

Get image tags

The following code gets the set of detected tags in the image. See Content tags for more details.

'''
Tag an Image - remote
This example returns a tag (key word) for each thing in the image.
'''
print("===== Tag an image - remote =====")
# Call API with remote image
tags_result_remote = computervision_client.tag_image(remote_image_url)

# Print results with confidence score
print("Tags in the remote image: ")
if (len(tags_result_remote.tags) == 0):
    print("No tags detected.")
else:
    for tag in tags_result_remote.tags:
        print("'{}' with confidence {:.2f}%".format(tag.name, tag.confidence * 100))

Detect objects

The following code detects common objects in the image and prints them to the console. See Object detection for more details.

'''
Detect Objects - remote
This example detects different kinds of objects with bounding boxes in a remote image.
'''
print("===== Detect Objects - remote =====")
# Get URL image with different objects
remote_image_url_objects = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/objects.jpg"
# Call API with URL
detect_objects_results_remote = computervision_client.detect_objects(remote_image_url_objects)

# Print detected objects results with bounding boxes
print("Detecting objects in remote image:")
if len(detect_objects_results_remote.objects) == 0:
    print("No objects detected.")
else:
    for obj in detect_objects_results_remote.objects:
        print("object at location {}, {}, {}, {}".format( \
        obj.rectangle.x, obj.rectangle.x + obj.rectangle.w, \
        obj.rectangle.y, obj.rectangle.y + obj.rectangle.h))

Detect brands

The following code detects corporate brands and logos in the image and prints them to the console. See Brand detection for more details.

'''
Detect Brands - remote
This example detects common brands like logos and puts a bounding box around them.
'''
print("===== Detect Brands - remote =====")
# Get a URL with a brand logo
remote_image_url = "https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/images/gray-shirt-logo.jpg"
# Select the visual feature(s) you want
remote_image_features = ["brands"]
# Call API with URL and features
detect_brands_results_remote = computervision_client.analyze_image(remote_image_url, remote_image_features)

print("Detecting brands in remote image: ")
if len(detect_brands_results_remote.brands) == 0:
    print("No brands detected.")
else:
    for brand in detect_brands_results_remote.brands:
        print("'{}' brand detected with confidence {:.1f}% at location {}, {}, {}, {}".format( \
        brand.name, brand.confidence * 100, brand.rectangle.x, brand.rectangle.x + brand.rectangle.w, \
        brand.rectangle.y, brand.rectangle.y + brand.rectangle.h))

Detect faces

The following code returns the detected faces in the image with their rectangle coordinates and select face attributes. See Face detection for more details.

'''
Detect Faces - remote
This example detects faces in a remote image, gets their gender and age, 
and marks them with a bounding box.
'''
print("===== Detect Faces - remote =====")
# Get an image with faces
remote_image_url_faces = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/faces.jpg"
# Select the visual feature(s) you want.
remote_image_features = ["faces"]
# Call the API with remote URL and features
detect_faces_results_remote = computervision_client.analyze_image(remote_image_url_faces, remote_image_features)

# Print the results with gender, age, and bounding box
print("Faces in the remote image: ")
if (len(detect_faces_results_remote.faces) == 0):
    print("No faces detected.")
else:
    for face in detect_faces_results_remote.faces:
        print("'{}' of age {} at location {}, {}, {}, {}".format(face.gender, face.age, \
        face.face_rectangle.left, face.face_rectangle.top, \
        face.face_rectangle.left + face.face_rectangle.width, \
        face.face_rectangle.top + face.face_rectangle.height))

Detect adult, racy, or gory content

The following code prints the detected presence of adult content in the image. See Adult, racy, gory content for more details.

'''
Detect Adult or Racy Content - remote
This example detects adult or racy content in a remote image, then prints the adult/racy score.
The score ranges from 0.0 to 1.0, with smaller numbers indicating negative results.
'''
print("===== Detect Adult or Racy Content - remote =====")
# Select the visual feature(s) you want
remote_image_features = ["adult"]
# Call API with URL and features
detect_adult_results_remote = computervision_client.analyze_image(remote_image_url, remote_image_features)

# Print results with adult/racy score
print("Analyzing remote image for adult or racy content ... ")
print("Is adult content: {} with confidence {:.2f}".format(detect_adult_results_remote.adult.is_adult_content, detect_adult_results_remote.adult.adult_score * 100))
print("Has racy content: {} with confidence {:.2f}".format(detect_adult_results_remote.adult.is_racy_content, detect_adult_results_remote.adult.racy_score * 100))

Get image color scheme

The following code prints the detected color attributes in the image, like the dominant colors and accent color. See Color schemes for more details.

'''
Detect Color - remote
This example detects the different aspects of a remote image's color scheme.
'''
print("===== Detect Color - remote =====")
# Select the feature(s) you want
remote_image_features = ["color"]
# Call API with URL and features
detect_color_results_remote = computervision_client.analyze_image(remote_image_url, remote_image_features)

# Print results of color scheme
print("Getting color scheme of the remote image: ")
print("Is black and white: {}".format(detect_color_results_remote.color.is_bw_img))
print("Accent color: {}".format(detect_color_results_remote.color.accent_color))
print("Dominant background color: {}".format(detect_color_results_remote.color.dominant_color_background))
print("Dominant foreground color: {}".format(detect_color_results_remote.color.dominant_color_foreground))
print("Dominant colors: {}".format(detect_color_results_remote.color.dominant_colors))

Get domain-specific content

Computer Vision can use specialized models to do further analysis on images. See Domain-specific content for more details.

The following code parses data about detected celebrities in the image.

'''
Detect Domain-specific Content - remote
This example detects celebrities and landmarks in remote images.
'''
print("===== Detect Domain-specific Content - remote =====")
# URL of one or more celebrities
remote_image_url_celebs = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/faces.jpg"
# Call API with content type (celebrities) and URL
detect_domain_results_celebs_remote = computervision_client.analyze_image_by_domain("celebrities", remote_image_url_celebs)

# Print detection results with name
print("Celebrities in the remote image:")
if len(detect_domain_results_celebs_remote.result["celebrities"]) == 0:
    print("No celebrities detected.")
else:
    for celeb in detect_domain_results_celebs_remote.result["celebrities"]:
        print(celeb["name"])

The following code parses data about detected landmarks in the image.

# Call API with content type (landmarks) and URL
# Use the landmark image from the start of this quickstart; remote_image_url was
# reassigned to the brand-detection image above.
remote_image_url_landmarks = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/landmark.jpg"
detect_domain_results_landmarks = computervision_client.analyze_image_by_domain("landmarks", remote_image_url_landmarks)
print()

print("Landmarks in the remote image:")
if len(detect_domain_results_landmarks.result["landmarks"]) == 0:
    print("No landmarks detected.")
else:
    for landmark in detect_domain_results_landmarks.result["landmarks"]:
        print(landmark["name"])

Get the image type

The following code prints information about the type of image, whether it is clip art or a line drawing.

'''
Detect Image Types - remote
This example detects an image's type (clip art/line drawing).
'''
print("===== Detect Image Types - remote =====")
# Get URL of an image with a type
remote_image_url_type = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-sample-data-files/master/ComputerVision/Images/type-image.jpg"
# Select visual feature(s) you want
remote_image_features = [VisualFeatureTypes.image_type]
# Call API with URL and features
detect_type_results_remote = computervision_client.analyze_image(remote_image_url_type, remote_image_features)

# Prints type results with degree of accuracy
print("Type of remote image:")
if detect_type_results_remote.image_type.clip_art_type == 0:
    print("Image is not clip art.")
elif detect_type_results_remote.image_type.clip_art_type == 1:
    print("Image is ambiguously clip art.")
elif detect_type_results_remote.image_type.clip_art_type == 2:
    print("Image is normal clip art.")
else:
    print("Image is good clip art.")

if detect_type_results_remote.image_type.line_drawing_type == 0:
    print("Image is not a line drawing.")
else:
    print("Image is a line drawing")

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream. You do this in two parts.

Call the Read API

First, use the following code to call the read method for the given image. This returns an operation ID and starts an asynchronous process to read the content of the image.

'''
Batch Read File, recognize printed and handwritten text - remote
This example extracts text from a remote image, then prints the results line by line.
The Read API recognizes both printed and handwritten text.
'''
print("===== Batch Read File - remote =====")
# Get an image with handwritten text
remote_image_handw_text_url = "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"

# Call API with URL and raw response (allows you to get the operation location)
recognize_handw_results = computervision_client.read(remote_image_handw_text_url, raw=True)

Get Read results

Next, get the operation ID returned from the read call, and use it to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

# Get the operation location (URL with an ID at the end) from the response
operation_location_remote = recognize_handw_results.headers["Operation-Location"]
# Grab the ID from the URL
operation_id = operation_location_remote.split("/")[-1]

# Call the "GET" API and wait for it to retrieve the results 
while True:
    get_handw_text_results = computervision_client.get_read_result(operation_id)
    if get_handw_text_results.status not in ['notStarted', 'running']:
        break
    time.sleep(1)

# Print the detected text, line by line
if get_handw_text_results.status == OperationStatusCodes.succeeded:
    for text_result in get_handw_text_results.analyze_result.read_results:
        for line in text_result.lines:
            print(line.text)
            print(line.bounding_box)
print()

Run the application

Run the application with the python command on your quickstart file.

python quickstart-file.py

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.
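For example, if you created the resource in a dedicated resource group, the Azure CLI can remove the group and everything in it (the group name here is hypothetical):

az group delete --name my-computer-vision-rg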

Next steps

In this quickstart, you learned how to use the Computer Vision library for Python to do basic tasks. Next, explore the reference documentation to learn more about the library.

Reference documentation | Library source code | Package

Prerequisites

  • An Azure subscription - Create one for free
  • The latest version of Go
  • Once you have your Azure subscription, create a Computer Vision resource in the Azure portal to get your key and endpoint. After it deploys, click Go to resource.
    • You will need the key and endpoint from the resource you create to connect your application to the Computer Vision service. You'll paste your key and endpoint into the code below later in the quickstart.
    • You can use the free pricing tier (F0) to try the service, and upgrade later to a paid tier for production.
  • Create environment variables for the key and endpoint URL, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively, as shown in the example that follows.
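For example, in bash you can set both variables for the current session like this (replace the placeholder values with your own key and endpoint):

export COMPUTER_VISION_SUBSCRIPTION_KEY="<your-key>"
export COMPUTER_VISION_ENDPOINT="<your-endpoint>"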

Setting up

Create a Go project directory

In a console window (cmd, PowerShell, Terminal, Bash), create a new workspace for your Go project, named my-app, and navigate to it.

mkdir -p my-app/{src,bin,pkg}
cd my-app

Your workspace will contain three folders:

  • src - This directory will contain source code and packages. Any packages installed with the go get command will go in this directory.
  • pkg - This directory will contain the compiled Go package objects. These files all have an .a extension.
  • bin - This directory will contain the binary executable files that are created when you run go install.

Tip

To learn more about the structure of a Go workspace, see the Go language documentation. This guide includes information for setting $GOPATH and $GOROOT.
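For instance, with a classic GOPATH layout you might point GOPATH at the workspace you just created (a bash sketch; adjust the path to wherever my-app lives):

export GOPATH=$HOME/my-app
export PATH=$PATH:$GOPATH/bin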

Install the client library for Go

Next, install the client library for Go:

go get -u github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/computervision

Or, if you use dep, run the following within your repo:

dep ensure -add github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.1/computervision

Create a Go application

Next, create a file in the src directory named sample-app.go:

cd src
touch sample-app.go

Open sample-app.go in your preferred IDE or text editor. Then add the package name and import the following libraries:

package main

import (
    "context"
    "encoding/json"
    "fmt"
    "github.com/Azure/azure-sdk-for-go/services/cognitiveservices/v2.0/computervision"
    "github.com/Azure/go-autorest/autorest"
    "io"
    "log"
    "os"
    "strings"
    "time"
)

Also, declare a context at the root of your script. You'll need this object to execute most Computer Vision function calls:

// Declare a global context so you don't have to pass it to every task.
var computerVisionContext context.Context

Next, you'll begin adding code to carry out different Computer Vision operations.

Object model

The following classes and interfaces handle some of the major features of the Computer Vision Go SDK.

Name Description
BaseClient This class is needed for all Computer Vision functionality, such as image analysis and text reading. You instantiate it with your subscription information, and you use it to do most image operations.
ImageAnalysis This type contains the results of an AnalyzeImage function call. There are similar types for each of the category-specific functions.
ReadOperationResult This type contains the results of a Batch Read operation.
VisualFeatureTypes This type defines the different kinds of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for Go: authenticate the client, analyze an image, and read printed and handwritten text.

Authenticate the client

Note

This step assumes you've created environment variables for your Computer Vision key and endpoint, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT respectively.

Create a main function and add the following code to it to instantiate a client with your endpoint and key.

/*  
 * Configure the Computer Vision client
 * Set environment variables for COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT,
 * then restart your command shell or your IDE for changes to take effect.
 */
computerVisionKey := os.Getenv("COMPUTER_VISION_SUBSCRIPTION_KEY")

if computerVisionKey == "" {
    log.Fatal("\n\nPlease set a COMPUTER_VISION_SUBSCRIPTION_KEY environment variable.\n" +
        "**You may need to restart your shell or IDE after it's set.**\n")
}

endpointURL := os.Getenv("COMPUTER_VISION_ENDPOINT")
if endpointURL == "" {
    log.Fatal("\n\nPlease set a COMPUTER_VISION_ENDPOINT environment variable.\n" +
        "**You may need to restart your shell or IDE after it's set.**")
}

computerVisionClient := computervision.New(endpointURL)
computerVisionClient.Authorizer = autorest.NewCognitiveServicesAuthorizer(computerVisionKey)

computerVisionContext = context.Background()
/*
 * END - Configure the Computer Vision client
 */

Analyze an image

The following code uses the client object to analyze a remote image and print the results to the console. You can get a text description, categorization, list of tags, detected objects, detected brands, detected faces, adult content flags, main colors, and image type.

Set up test image

First, save a reference to the URL of the image you want to analyze. Put this inside your main function.

landmarkImageURL := "https://github.com/Azure-Samples/cognitive-services-sample-data-files/raw/master/ComputerVision/Images/landmark.jpg"

Note

You can also analyze a local image. See the sample code on GitHub for scenarios involving local images.

Specify visual features

The following function calls extract different visual features from the sample image. You'll define these functions in the following sections. Note that the calls also reference several other image URL variables (facesImageURL, objectsImageURL, brandsImageURL, adultRacyImageURL, detectTypeImageURL); declare each of these in your main function with a suitable image URL, as shown for brandsImageURL in the brand detection section below.

// Analyze features of an image, remote
DescribeRemoteImage(computerVisionClient, landmarkImageURL)
CategorizeRemoteImage(computerVisionClient, landmarkImageURL)
TagRemoteImage(computerVisionClient, landmarkImageURL)
DetectFacesRemoteImage(computerVisionClient, facesImageURL)
DetectObjectsRemoteImage(computerVisionClient, objectsImageURL)
DetectBrandsRemoteImage(computerVisionClient, brandsImageURL)
DetectAdultOrRacyContentRemoteImage(computerVisionClient, adultRacyImageURL)
DetectColorSchemeRemoteImage(computerVisionClient, brandsImageURL)
DetectDomainSpecificContentRemoteImage(computerVisionClient, landmarkImageURL)
DetectImageTypesRemoteImage(computerVisionClient, detectTypeImageURL)
GenerateThumbnailRemoteImage(computerVisionClient, adultRacyImageURL)
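Each helper below requests a single visual feature, but AnalyzeImage accepts several VisualFeatureTypes values in one call. Here's a minimal sketch (using the client, context, and landmarkImageURL declared above) that requests tags and a description together:

// Sketch: one AnalyzeImage call can return several visual features at once.
features := []computervision.VisualFeatureTypes{
    computervision.VisualFeatureTypesTags,
    computervision.VisualFeatureTypesDescription,
}
var remoteImage computervision.ImageURL
remoteImage.URL = &landmarkImageURL
imageAnalysis, err := computerVisionClient.AnalyzeImage(
    computerVisionContext, remoteImage, features, []computervision.Details{}, "")
if err != nil { log.Fatal(err) }
fmt.Printf("Got %v tags and %v captions in one call\n",
    len(*imageAnalysis.Tags), len(*imageAnalysis.Description.Captions))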

Get image description

The following function gets the list of generated captions for the image. For more information about image description, see Describe images.

func DescribeRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DESCRIBE IMAGE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    maxNumberDescriptionCandidates := new(int32)
    *maxNumberDescriptionCandidates = 1

    remoteImageDescription, err := client.DescribeImage(
            computerVisionContext,
            remoteImage,
            maxNumberDescriptionCandidates,
            "") // language
    if err != nil { log.Fatal(err) }

    fmt.Println("Captions from remote image: ")
    if len(*remoteImageDescription.Captions) == 0 {
        fmt.Println("No captions detected.")
    } else {
        for _, caption := range *remoteImageDescription.Captions {
            fmt.Printf("'%v' with confidence %.2f%%\n", *caption.Text, *caption.Confidence * 100)
        }
    }
    fmt.Println()
}

Get image category

The following function gets the detected category of the image. For more information, see Categorize images.

func CategorizeRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("CATEGORIZE IMAGE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesCategories}
    imageAnalysis, err := client.AnalyzeImage(
            computerVisionContext,
            remoteImage,
            features,
            []computervision.Details{},
            "")
    if err != nil { log.Fatal(err) }

    fmt.Println("Categories from remote image: ")
    if len(*imageAnalysis.Categories) == 0 {
        fmt.Println("No categories detected.")
    } else {
        for _, category := range *imageAnalysis.Categories {
            fmt.Printf("'%v' with confidence %.2f%%\n", *category.Name, *category.Score * 100)
        }
    }
    fmt.Println()
}

Get image tags

The following function gets the set of detected tags in the image. For more information, see Content tags.

func TagRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("TAG IMAGE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    remoteImageTags, err := client.TagImage(
            computerVisionContext,
            remoteImage,
            "")
    if err != nil { log.Fatal(err) }

    fmt.Println("Tags in the remote image: ")
    if len(*remoteImageTags.Tags) == 0 {
        fmt.Println("No tags detected.")
    } else {
        for _, tag := range *remoteImageTags.Tags {
            fmt.Printf("'%v' with confidence %.2f%%\n", *tag.Name, *tag.Confidence * 100)
        }
    }
    fmt.Println()
}

Detect objects

The following function detects common objects in the image and prints them to the console. For more information, see Object detection.

func DetectObjectsRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT OBJECTS - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    imageAnalysis, err := client.DetectObjects(
            computerVisionContext,
            remoteImage,
    )
    if err != nil { log.Fatal(err) }

    fmt.Println("Detecting objects in remote image: ")
    if len(*imageAnalysis.Objects) == 0 {
        fmt.Println("No objects detected.")
    } else {
        // Print the objects found with confidence level and bounding box locations.
        for _, object := range *imageAnalysis.Objects {
            fmt.Printf("'%v' with confidence %.2f%% at location (%v, %v), (%v, %v)\n",
                *object.Object, *object.Confidence * 100,
                *object.Rectangle.X, *object.Rectangle.X + *object.Rectangle.W,
                *object.Rectangle.Y, *object.Rectangle.Y + *object.Rectangle.H)
        }
    }
    fmt.Println()
}

Detect brands

The following code detects corporate brands and logos in the image and prints them to the console. For more information, see Brand detection.

First, declare a reference to a new image within your main function.

brandsImageURL := "https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/images/gray-shirt-logo.jpg"

The following code defines the brand detection function.

func DetectBrandsRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT BRANDS - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // Define the kinds of features you want returned.
    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesBrands}

    imageAnalysis, err := client.AnalyzeImage(
        computerVisionContext,
        remoteImage,
        features,
        []computervision.Details{},
        "en")
    if err != nil { log.Fatal(err) }

    fmt.Println("Detecting brands in remote image: ")
    if len(*imageAnalysis.Brands) == 0 {
        fmt.Println("No brands detected.")
    } else {
        // Get bounding box around the brand and confidence level it's correctly identified.
        for _, brand := range *imageAnalysis.Brands {
            fmt.Printf("'%v' with confidence %.2f%% at location (%v, %v), (%v, %v)\n",
                *brand.Name, *brand.Confidence * 100,
                *brand.Rectangle.X, *brand.Rectangle.X + *brand.Rectangle.W,
                *brand.Rectangle.Y, *brand.Rectangle.Y + *brand.Rectangle.H)
        }
    }
    fmt.Println()
}

Detect faces

The following function returns the detected faces in the image with their rectangle coordinates and certain face attributes. For more information, see Face detection.

func DetectFacesRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT FACES - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // Define the features you want returned with the API call.
    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesFaces}
    imageAnalysis, err := client.AnalyzeImage(
            computerVisionContext,
            remoteImage,
            features,
            []computervision.Details{},
            "")
    if err != nil { log.Fatal(err) }

    fmt.Println("Detecting faces in a remote image ...")
    if len(*imageAnalysis.Faces) == 0 {
        fmt.Println("No faces detected.")
    } else {
        // Print the bounding box locations of the found faces.
        for _, face := range *imageAnalysis.Faces {
            fmt.Printf("'%v' of age %v at location (%v, %v), (%v, %v)\n",
                face.Gender, *face.Age,
                *face.FaceRectangle.Left, *face.FaceRectangle.Top,
                *face.FaceRectangle.Left + *face.FaceRectangle.Width,
                *face.FaceRectangle.Top + *face.FaceRectangle.Height)
        }
    }
    fmt.Println()
}

Detect adult, racy, or gory content

The following function prints the detected presence of adult content in the image. For more information, see Adult, racy, gory content.

func DetectAdultOrRacyContentRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT ADULT OR RACY CONTENT - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // Define the features you want returned from the API call.
    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesAdult}
    imageAnalysis, err := client.AnalyzeImage(
            computerVisionContext,
            remoteImage,
            features,
            []computervision.Details{},
            "") // language, English is default
    if err != nil { log.Fatal(err) }

    // Print whether or not there is questionable content.
    // Confidence levels: low means content is OK, high means it's not.
    fmt.Println("Analyzing remote image for adult or racy content: ");
    fmt.Printf("Is adult content: %v with confidence %.2f%%\n", *imageAnalysis.Adult.IsAdultContent, *imageAnalysis.Adult.AdultScore * 100)
    fmt.Printf("Has racy content: %v with confidence %.2f%%\n", *imageAnalysis.Adult.IsRacyContent, *imageAnalysis.Adult.RacyScore * 100)
    fmt.Println()
}

Get image color scheme

The following function prints the detected color attributes in the image, like the dominant colors and accent color. For more information, see Color schemes.

func DetectColorSchemeRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT COLOR SCHEME - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // Define the features you'd like returned with the result.
    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesColor}
    imageAnalysis, err := client.AnalyzeImage(
            computerVisionContext,
            remoteImage,
            features,
            []computervision.Details{},
            "") // language, English is default
    if err != nil { log.Fatal(err) }

    fmt.Println("Color scheme of the remote image: ");
    fmt.Printf("Is black and white: %v\n", *imageAnalysis.Color.IsBWImg)
    fmt.Printf("Accent color: 0x%v\n", *imageAnalysis.Color.AccentColor)
    fmt.Printf("Dominant background color: %v\n", *imageAnalysis.Color.DominantColorBackground)
    fmt.Printf("Dominant foreground color: %v\n", *imageAnalysis.Color.DominantColorForeground)
    fmt.Printf("Dominant colors: %v\n", strings.Join(*imageAnalysis.Color.DominantColors, ", "))
    fmt.Println()
}

Get domain-specific content

Computer Vision can use specialized models to do further analysis on images. For more information, see Domain-specific content.

The following code parses data about detected celebrities in the image.

func DetectDomainSpecificContentRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT DOMAIN-SPECIFIC CONTENT - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    fmt.Println("Detecting domain-specific content in the local image ...")

    // Check if there are any celebrities in the image.
    celebrities, err := client.AnalyzeImageByDomain(
            computerVisionContext,
            "celebrities",
            remoteImage,
            "") // language, English is default
    if err != nil { log.Fatal(err) }

    fmt.Println("\nCelebrities: ")

    // Marshal the output from AnalyzeImageByDomain into JSON.
    data, err := json.MarshalIndent(celebrities.Result, "", "\t")
    if err != nil { log.Fatal(err) }

    // Define structs for which to unmarshal the JSON.
    type Celebrities struct {
        Name string `json:"name"`
    }

    type CelebrityResult struct {
        Celebrities	[]Celebrities `json:"celebrities"`
    }

    var celebrityResult CelebrityResult

    // Unmarshal the data.
    err = json.Unmarshal(data, &celebrityResult)
    if err != nil { log.Fatal(err) }

    // Check if any celebrities were detected.
    if len(celebrityResult.Celebrities) == 0 {
        fmt.Println("No celebrities detected.")
    } else {
        for _, celebrity := range celebrityResult.Celebrities {
            fmt.Printf("name: %v\n", celebrity.Name)
        }
    }

The following code parses data about detected landmarks in the image.

    fmt.Println("\nLandmarks: ")

    // Check if there are any landmarks in the image.
    landmarks, err := client.AnalyzeImageByDomain(
            computerVisionContext,
            "landmarks",
            remoteImage,
            "")
    if err != nil { log.Fatal(err) }

    // Marshal the output from AnalyzeImageByDomain into JSON.
    data, err = json.MarshalIndent(landmarks.Result, "", "\t")
    if err != nil { log.Fatal(err) }

    // Define structs for which to unmarshal the JSON.
    type Landmarks struct {
        Name string `json:"name"`
    }

    type LandmarkResult struct {
        Landmarks	[]Landmarks `json:"landmarks"`
    }

    var landmarkResult LandmarkResult

    // Unmarshal the data.
    err = json.Unmarshal(data, &landmarkResult)
    if err != nil { log.Fatal(err) }

    // Check if any landmarks were detected.
    if len(landmarkResult.Landmarks) == 0 {
        fmt.Println("No landmarks detected.")
    } else {
        for _, landmark := range landmarkResult.Landmarks {
            fmt.Printf("name: %v\n", landmark.Name)
        }
    }
    fmt.Println()
}

Get the image type

The following function prints information about the type of image—whether it's clip art or a line drawing.

func DetectImageTypesRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("DETECT IMAGE TYPES - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    features := []computervision.VisualFeatureTypes{computervision.VisualFeatureTypesImageType}

    imageAnalysis, err := client.AnalyzeImage(
            computerVisionContext,
            remoteImage,
            features,
            []computervision.Details{},
            "")
    if err != nil { log.Fatal(err) }

    fmt.Println("Image type of remote image:")

    fmt.Println("\nClip art type: ")
    switch *imageAnalysis.ImageType.ClipArtType {
    case 0:
        fmt.Println("Image is not clip art.")
    case 1:
        fmt.Println("Image is ambiguously clip art.")
    case 2:
        fmt.Println("Image is normal clip art.")
    case 3:
        fmt.Println("Image is good clip art.")
    }

    fmt.Println("\nLine drawing type: ")
    if *imageAnalysis.ImageType.LineDrawingType == 1 {
        fmt.Println("Image is a line drawing.")
    } else {
        fmt.Println("Image is not a line drawing.")
    }
    fmt.Println()
}

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream. The code in this section defines a function, BatchReadFileRemoteImage, which uses the client object to detect and extract printed or handwritten text in the image.

Add the sample image reference and function call in your main function.

// Analyze text in an image, remote
printedImageURL := "https://raw.githubusercontent.com/MicrosoftDocs/azure-docs/master/articles/cognitive-services/Computer-vision/Images/readsample.jpg"
BatchReadFileRemoteImage(computerVisionClient, printedImageURL)

Note

You can also extract text from a local image. See the sample code on GitHub for scenarios involving local images.

Call the Read API

Define the new function for reading text, BatchReadFileRemoteImage. Add the code below, which calls the BatchReadFile method for the given image. This method returns an operation ID and starts an asynchronous process to read the content of the image.

func BatchReadFileRemoteImage(client computervision.BaseClient, remoteImageURL string) {
    fmt.Println("-----------------------------------------")
    fmt.Println("BATCH READ FILE - remote")
    fmt.Println()
    var remoteImage computervision.ImageURL
    remoteImage.URL = &remoteImageURL

    // The response contains a field called "Operation-Location", 
    // which is a URL with an ID that you'll use for GetReadOperationResult to access OCR results.
    textHeaders, err := client.BatchReadFile(computerVisionContext, remoteImage)
    if err != nil { log.Fatal(err) }

    // Use ExtractHeader from the autorest library to get the Operation-Location URL
    operationLocation := autorest.ExtractHeaderValue("Operation-Location", textHeaders.Response)

    numberOfCharsInOperationId := 36
    operationId := operationLocation[len(operationLocation)-numberOfCharsInOperationId:]

Get Read results

Next, get the operation ID returned from the BatchReadFile call, and use it with the GetReadOperationResult method to query the service for operation results. The following code checks the operation at one-second intervals until the results are returned. It then prints the extracted text data to the console.

readOperationResult, err := client.GetReadOperationResult(computerVisionContext, operationId)
if err != nil { log.Fatal(err) }

// Wait for the operation to complete.
i := 0
maxRetries := 10

fmt.Println("Recognizing text in a remote image with the batch Read API ...")
for readOperationResult.Status != computervision.Failed &&
        readOperationResult.Status != computervision.Succeeded {
    if i >= maxRetries {
        break
    }
    i++

    fmt.Printf("Server status: %v, waiting %v seconds...\n", readOperationResult.Status, i)
    time.Sleep(1 * time.Second)

    readOperationResult, err = client.GetReadOperationResult(computerVisionContext, operationId)
    if err != nil { log.Fatal(err) }
}

Display Read results

Add the following code to parse and display the retrieved text data, and finish the function definition.

// Display the results.
fmt.Println()
for _, recResult := range *(readOperationResult.RecognitionResults) {
    for _, line := range *recResult.Lines {
        fmt.Println(*line.Text)
    }
}
}

Run the application

Run the application from your application directory with the go run command.

go run sample-app.go

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps