Quickstart: Computer Vision client library for Java

Get started with the Computer Vision client library for Java. Follow these steps to install the package and try out the example code for basic tasks. Computer Vision provides you with access to advanced algorithms for processing images and returning information.

Use the Computer Vision client library for Java to:

  • Analyze an image for tags, text description, faces, adult content, and more.
  • Recognize printed and handwritten text with the Batch Read API.

Reference documentation | Artifact (Maven) | Samples


Setting up

Create a Computer Vision Azure resource

Azure Cognitive Services are represented by Azure resources that you subscribe to. Create a resource for Computer Vision using the Azure portal or Azure CLI on your local machine. You can also:

Then, create environment variables for the key and service endpoint string, named COMPUTER_VISION_SUBSCRIPTION_KEY and COMPUTER_VISION_ENDPOINT, respectively.

Create a new Gradle project

In a console window (such as cmd, PowerShell, or Bash), create a new directory for your app, and navigate to it.

mkdir myapp && cd myapp

Run the gradle init command from your working directory. This command will create essential build files for Gradle, including build.gradle.kts, which is used at runtime to create and configure your application.

gradle init --type basic

When prompted to choose a DSL, select Kotlin.

Locate build.gradle.kts and open it with your preferred IDE or text editor. Then copy in the following build configuration. This configuration defines the project as a Java application whose entry point is the class ComputerVisionQuickstarts. It imports the Computer Vision library.

plugins {
application { 
    mainClassName = "ComputerVisionQuickstarts"
repositories {

From your working directory, run the following command to create a project source folder:

mkdir -p src/main/java

Navigate to the new folder and create a file called ComputerVisionQuickstarts.java. Open it in your preferred editor or IDE and add the following import statements:

import com.microsoft.azure.cognitiveservices.vision.computervision.*;
import com.microsoft.azure.cognitiveservices.vision.computervision.models.*;

import java.io.File;
import java.io.FileInputStream;
import java.nio.file.Files;

import java.util.ArrayList;
import java.util.List;

Then add a class definition for ComputerVisionQuickstarts.

Install the client library

This quickstart uses the Gradle dependency manager. You can find the client library and information for other dependency managers on the Maven Central Repository.

In your project's build.gradle.kts file, include the Computer Vision client library as a dependency.

dependencies {
	compile(group = "com.microsoft.azure.cognitiveservices", name = "azure-cognitiveservices-computervision", version = "1.0.2-beta")

Object model

The following classes and interfaces handle some of the major features of the Computer Vision Java SDK.

Name Description
ComputerVisionClient This class is needed for all Computer Vision functionality. You instantiate it with your subscription information, and you use it to produce instances of other classes.
ComputerVision This class comes from the client object and directly handles all of the image operations, such as image analysis, text detection, and thumbnail generation.
VisualFeatureTypes This enum defines the different types of image analysis that can be done in a standard Analyze operation. You specify a set of VisualFeatureTypes values depending on your needs.

Code examples

These code snippets show you how to do the following tasks with the Computer Vision client library for Java:

Authenticate the client


This quickstart assumes you've created an environment variable for your Computer Vision key, named COMPUTER_VISION_SUBSCRIPTION_KEY.

The following code adds a main method to your class and creates variables for your resource's Azure endpoint and key. You'll need to enter your own endpoint string, which you can find by checking the Overview section of the Azure portal.

public static void main(String[] args) {
    // Add your Computer Vision subscription key and endpoint to your environment
    // variables.
    // After setting, close and then re-open your command shell or project for the
    // changes to take effect.
    String subscriptionKey = System.getenv("COMPUTER_VISION_SUBSCRIPTION_KEY");
    String endpoint = System.getenv("COMPUTER_VISION_ENDPOINT");

Next, add the following code to create a ComputerVisionClient object and passes it into other method(s), which you'll define later.

ComputerVisionClient compVisClient = ComputerVisionManager.authenticate(subscriptionKey).withEndpoint(endpoint);
// END - Create an authenticated Computer Vision client.

System.out.println("\nAzure Cognitive Services Computer Vision - Java Quickstart Sample");

// Analyze local and remote images

// Recognize printed text with OCR for a local and remote (URL) image


If you created the environment variable after you launched the application, you'll need to close and reopen the editor, IDE, or shell running it to access the variable.

Analyze an image

The following code defines a method, AnalyzeLocalImage, which uses the client object to analyze a local image and print the results. The method returns a text description, categorization, list of tags, detected faces, adult content flags, main colors, and image type.

Set up test image

First, create a resources/ folder in the src/main/ folder of your project, and add an image you'd like to analyze. Then add the following method definition to your ComputerVisionQuickstarts class. If necessary, change the value of the pathToLocalImage to match your image file.

public static void AnalyzeLocalImage(ComputerVisionClient compVisClient) {
     * Analyze a local image:
     * Set a string variable equal to the path of a local image. The image path
     * below is a relative path.
    String pathToLocalImage = "src\\main\\resources\\myImage.jpg";


You can also analyze a remote image using its URL. See the sample code on GitHub for scenarios involving remote images.

Specify visual features

Next, specify which visual features you'd like to extract in your analysis. See the VisualFeatureTypes enum for a complete list.

// This list defines the features to be extracted from the image.
List<VisualFeatureTypes> featuresToExtractFromLocalImage = new ArrayList<>();


This method prints detailed results to the console for each scope of image analysis. We recommend you surround this method call in a Try/Catch block. The analyzeImageInStream method returns an ImageAnalysis object that contains all of extracted information.

// Need a byte array for analyzing a local image.
File rawImage = new File(pathToLocalImage);
byte[] imageByteArray = Files.readAllBytes(rawImage.toPath());

// Call the Computer Vision service and tell it to analyze the loaded image.
ImageAnalysis analysis = compVisClient.computerVision().analyzeImageInStream().withImage(imageByteArray)

The following sections show how to parse this information in detail.

Get image description

The following code gets the list of generated captions for the image. For more information, see Describe images.

// Display image captions and confidence values.
System.out.println("\nCaptions: ");
for (ImageCaption caption : analysis.description().captions()) {
    System.out.printf("\'%s\' with confidence %f\n", caption.text(), caption.confidence());

Get image category

The following code gets the detected category of the image. For more information, see Categorize images.

// Display image category names and confidence values.
System.out.println("\nCategories: ");
for (Category category : analysis.categories()) {
    System.out.printf("\'%s\' with confidence %f\n", category.name(), category.score());

Get image tags

The following code gets the set of detected tags in the image. For more information, see Content tags.

// Display image tags and confidence values.
System.out.println("\nTags: ");
for (ImageTag tag : analysis.tags()) {
    System.out.printf("\'%s\' with confidence %f\n", tag.name(), tag.confidence());

Detect faces

The following code returns the detected faces in the image with their rectangle coordinates and selects face attributes. For more information, see Face detection.

// Display any faces found in the image and their location.
System.out.println("\nFaces: ");
for (FaceDescription face : analysis.faces()) {
    System.out.printf("\'%s\' of age %d at location (%d, %d), (%d, %d)\n", face.gender(), face.age(),
            face.faceRectangle().left(), face.faceRectangle().top(),
            face.faceRectangle().left() + face.faceRectangle().width(),
            face.faceRectangle().top() + face.faceRectangle().height());

Detect adult, racy, or gory content

The following code prints the detected presence of adult content in the image. For more information, see Adult, racy, gory content.

// Display whether any adult or racy content was detected and the confidence
// values.
System.out.println("\nAdult: ");
System.out.printf("Is adult content: %b with confidence %f\n", analysis.adult().isAdultContent(),
System.out.printf("Has racy content: %b with confidence %f\n", analysis.adult().isRacyContent(),

Get image color scheme

The following code prints the detected color attributes in the image, like the dominant colors and accent color. For more information, see Color schemes.

// Display the image color scheme.
System.out.println("\nColor scheme: ");
System.out.println("Is black and white: " + analysis.color().isBWImg());
System.out.println("Accent color: " + analysis.color().accentColor());
System.out.println("Dominant background color: " + analysis.color().dominantColorBackground());
System.out.println("Dominant foreground color: " + analysis.color().dominantColorForeground());
System.out.println("Dominant colors: " + String.join(", ", analysis.color().dominantColors()));

Get domain-specific content

Computer Vision can use specialized model to do further analysis on images. For more information, see Domain-specific content.

The following code parses data about detected celebrities in the image.

// Display any celebrities detected in the image and their locations.
System.out.println("\nCelebrities: ");
for (Category category : analysis.categories()) {
    if (category.detail() != null && category.detail().celebrities() != null) {
        for (CelebritiesModel celeb : category.detail().celebrities()) {
            System.out.printf("\'%s\' with confidence %f at location (%d, %d), (%d, %d)\n", celeb.name(),
                    celeb.confidence(), celeb.faceRectangle().left(), celeb.faceRectangle().top(),
                    celeb.faceRectangle().left() + celeb.faceRectangle().width(),
                    celeb.faceRectangle().top() + celeb.faceRectangle().height());

The following code parses data about detected landmarks in the image.

// Display any landmarks detected in the image and their locations.
System.out.println("\nLandmarks: ");
for (Category category : analysis.categories()) {
    if (category.detail() != null && category.detail().landmarks() != null) {
        for (LandmarksModel landmark : category.detail().landmarks()) {
            System.out.printf("\'%s\' with confidence %f\n", landmark.name(), landmark.confidence());

Get the image type

The following code prints information about the type of image—whether it is clip art or line drawing.

// Display what type of clip art or line drawing the image is.
System.out.println("\nImage type:");
System.out.println("Clip art type: " + analysis.imageType().clipArtType());
System.out.println("Line drawing type: " + analysis.imageType().lineDrawingType());

Read printed and handwritten text

Computer Vision can read visible text in an image and convert it to a character stream.


You can also read text in a remote image using its URL. See the sample code on GitHub for scenarios involving remote images.

Call the Recognize API

First, use the following code to call the recognizePrintedTextInStream method for the given image. When you add this code to your project, you need to replace the value of localTextImagePath with the path to your local image.

// Display what type of clip art or line drawing the image is.
System.out.println("\nImage type:");
System.out.println("Clip art type: " + analysis.imageType().clipArtType());
System.out.println("Line drawing type: " + analysis.imageType().lineDrawingType());

The following block of code processes the returned text and parses it to print out the first word in each line. You can use this code to quickly understand the structure of an OcrResult instance.

// Print results of local image
System.out.println("Recognizing printed text from a local image with OCR ...");
System.out.println("\nLanguage: " + ocrResultLocal.language());
System.out.printf("Text angle: %1.3f\n", ocrResultLocal.textAngle());
System.out.println("Orientation: " + ocrResultLocal.orientation());

boolean firstWord = true;
// Gets entire region of text block
for (OcrRegion reg : ocrResultLocal.regions()) {
    // Get one line in the text block
    for (OcrLine line : reg.lines()) {
        for (OcrWord word : line.words()) {
            // get bounding box of first word recognized (just to demo)
            if (firstWord) {
                System.out.println("\nFirst word in first line is \"" + word.text()
                        + "\" with  bounding box: " + word.boundingBox());
                firstWord = false;
            System.out.print(word.text() + " ");

Finally, close out the try/catch block and the method definition.

    } catch (Exception e) {

Run the application

You can build the app with:

gradle build

Run the application with the gradle run command:

gradle run

Clean up resources

If you want to clean up and remove a Cognitive Services subscription, you can delete the resource or resource group. Deleting the resource group also deletes any other resources associated with it.

Next steps

In this quickstart, you learned how to use the Computer Vision Java library to do basis tasks. Next, explore the reference documentation to learn more about the library.