Tutorial: Create a labeling project for multi-class image classification

APPLIES TO: Basic edition, Enterprise edition

This tutorial shows you how to manage the process of labeling (also referred to as tagging) images to be used as data for building machine learning models. Data labeling in Azure Machine Learning is in public preview.

If you want to train a machine learning model to classify images, you need hundreds or even thousands of images that are correctly labeled. Azure Machine Learning helps you manage the progress of your private team of domain experts as they label your data.

In this tutorial, you'll use images of cats and dogs. Since each image is either a cat or a dog, this is a multi-class labeling project. You'll learn how to:

  • Create an Azure storage account and upload images to the account.
  • Create an Azure Machine Learning workspace.
  • Create a multi-class image labeling project.
  • Label your data. Either you or your labelers can perform this task.
  • Complete the project by reviewing and exporting the data.

Prerequisites

  • An Azure subscription. If you don't have an Azure subscription, create a free account.

Create a workspace

An Azure Machine Learning workspace is a foundational resource in the cloud that you use to experiment, train, and deploy machine learning models. It ties your Azure subscription and resource group to an easily consumed object in the service.

You create a workspace via the Azure portal, a web-based console for managing your Azure resources.

  1. Sign in to the Azure portal by using the credentials for your Azure subscription.

  2. In the upper-left corner of the Azure portal, select + Create a resource.

  3. Use the search bar to find Machine Learning.

  4. Select Machine Learning.

  5. In the Machine Learning pane, select Create to begin.

  6. Provide the following information to configure your new workspace:

    Field | Description
    Workspace name | Enter a unique name that identifies your workspace. In this example, we use docs-ws. Names must be unique across the resource group. Use a name that's easy to recall and to differentiate from workspaces created by others.
    Subscription | Select the Azure subscription that you want to use.
    Resource group | Use an existing resource group in your subscription or enter a name to create a new resource group. A resource group holds related resources for an Azure solution. In this example, we use docs-aml.
    Location | Select the location closest to your users and data resources.
    Workspace edition | Select Basic as the workspace type for this tutorial. The workspace type (Basic or Enterprise) determines the features you'll have access to and the pricing. Everything in this tutorial can be performed with either a Basic or Enterprise workspace.

  7. After you finish configuring the workspace, select Review + Create.

    Warning

    It can take several minutes to create your workspace in the cloud.

    When the process is finished, a deployment success message appears.

  8. To view the new workspace, select Go to resource.

Start a labeling project

Next, you'll manage the data labeling project in Azure Machine Learning studio, a consolidated interface of machine learning tools for data science practitioners of all skill levels. The studio is not supported on Internet Explorer browsers.

  1. Sign in to Azure Machine Learning studio.

  2. Select your subscription and the workspace you created.

Create a datastore

Azure Machine Learning datastores store connection information, such as your subscription ID and token authorization. Here you use a datastore to connect to the storage account that contains the images for this tutorial.

  1. On the left side of your workspace, select Datastores.

  2. Select + New datastore.

  3. Fill out the form with these settings:

    Field | Description
    Datastore name | Give the datastore a name. Here we use labeling_tutorial.
    Datastore type | Select the type of storage. Here we use Azure Blob Storage, the preferred storage for images.
    Account selection method | Select Enter manually.
    URL | https://azureopendatastorage.blob.core.windows.net/openimagescontainer
    Authentication type | Select SAS token.
    Account key | ?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl&se=2025-03-25T04:51:17Z&st=2020-03-24T20:51:17Z&spr=https&sig=7D7SdkQidGT6pURQ9R4SUzWGxZ%2BHlNPCstoSRRVg8OY%3D

  4. Select Create to create the datastore.
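
The SAS token entered in the datastore form is an ordinary URL query string, so you can inspect it before pasting it in. The following Python sketch uses only the standard library. The helper name describe_sas_token is made up for this illustration, and the field meanings noted in the comments (sp for permissions, st and se for the validity window, spr for the protocol) follow general Azure Storage SAS conventions rather than anything stated in this tutorial.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs

# SAS token copied from the datastore settings in this tutorial.
SAS_TOKEN = (
    "?sv=2019-02-02&ss=bfqt&srt=sco&sp=rl"
    "&se=2025-03-25T04:51:17Z&st=2020-03-24T20:51:17Z"
    "&spr=https&sig=7D7SdkQidGT6pURQ9R4SUzWGxZ%2BHlNPCstoSRRVg8OY%3D"
)


def describe_sas_token(token, now):
    """Return the permissions and validity window encoded in a SAS token.

    Hypothetical helper for illustration. It only parses the string and
    never contacts Azure.
    """
    fields = {key: values[0] for key, values in parse_qs(token.lstrip("?")).items()}

    def parse_time(value):
        # SAS timestamps are UTC, written in ISO 8601 form with a trailing "Z".
        parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
        return parsed.replace(tzinfo=timezone.utc)

    start = parse_time(fields["st"])
    expiry = parse_time(fields["se"])
    return {
        "permissions": fields["sp"],  # "rl" means read and list
        "protocol": fields["spr"],
        "start": start,
        "expiry": expiry,
        "valid_now": start <= now <= expiry,
    }


info = describe_sas_token(SAS_TOKEN, now=datetime.now(timezone.utc))
print(info["permissions"], info["expiry"].date(), info["valid_now"])
```

If valid_now is False, the current time falls outside the window given by the st and se fields, and the datastore is unlikely to be able to read the images with this token.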

Create a labeling project

Now that you have access to the data you want to have labeled, create your labeling project.

  1. At the top of the page, select Projects.

  2. Select + Add project.

Project details

  1. Use the following input for the Project details form:

    Field | Description
    Project name | Give your project a name. Here we'll use tutorial-cats-n-dogs.
    Labeling task type | Select Image Classification Multi-class.

    Select Next to continue creating the project.

Select or create a dataset

  1. On the Select or create a dataset form, select the second choice, Create a dataset, then select the link From datastore.

  2. Use the following input for the Create dataset from datastore form:

    1. On the Basic info form, add a name. Here we'll use images-for-tutorial. Add a description if you wish. Then select Next.
    2. On the Datastore selection form, use the dropdown to select your previously created datastore, for example labeling_tutorial (Azure Blob Storage).
    3. Next, still on the Datastore selection form, select Browse and then select MultiClass - DogsCats. Select Save to use /MultiClass - DogsCats as the path.
    4. Select Next to confirm details and then Create to create the dataset.
    5. Select the circle next to the dataset name in the list, for example images-for-tutorial.
  3. Select Next to continue creating the project.

Incremental refresh

If you plan to add new images to your dataset, incremental refresh will find these new images and add them to your project. When you enable this feature, the project will periodically check for new images. You won't be adding new images to the datastore for this tutorial, so leave this feature unchecked.

Select Next to continue.
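
Conceptually, incremental refresh amounts to comparing the images a project already knows about with the current contents of the datastore. The Python sketch below illustrates that idea only. It is not how the service implements the feature, and the function name and file paths are invented for the example.

```python
def find_new_images(known_paths, current_listing):
    """Return paths that appear in the listing but are not yet in the project."""
    known = set(known_paths)
    new_paths = []
    # Walk the listing in order so the result is deterministic,
    # and skip duplicates within the listing itself.
    for path in current_listing:
        if path not in known:
            known.add(path)
            new_paths.append(path)
    return new_paths


# Hypothetical state: the project has two images, the datastore now has four.
project_images = ["cats/001.jpg", "dogs/001.jpg"]
datastore_listing = ["cats/001.jpg", "cats/002.jpg", "dogs/001.jpg", "dogs/002.jpg"]

new_images = find_new_images(project_images, datastore_listing)
project_images.extend(new_images)
print(new_images)  # ['cats/002.jpg', 'dogs/002.jpg']
```

Running the check a second time against the same listing returns an empty list, which is why a periodic check is safe to repeat.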

Label classes

  1. On the Label classes form, type a label name, then select +Add label to type the next label. For this project, the labels are Cat, Dog, and Uncertain.

  2. Select Next when you have added all the labels.
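
In a multi-class project, every image receives exactly one label from the class list. The following Python sketch expresses that rule as a check you could run over a set of tags. The function name, the sample file names, and the dictionary layout are assumptions made for this illustration and are not part of the labeling tool.

```python
LABEL_CLASSES = ("Cat", "Dog", "Uncertain")  # labels defined for this project


def validate_multiclass(assignments, classes=LABEL_CLASSES):
    """Return the images whose tags break the multi-class rule.

    `assignments` maps an image name to the list of tags applied to it.
    The result maps each offending image to a short explanation.
    """
    problems = {}
    for image, tags in assignments.items():
        if len(tags) != 1:
            problems[image] = "expected exactly one label, got %d" % len(tags)
        elif tags[0] not in classes:
            problems[image] = "unknown label %r" % tags[0]
    return problems


sample = {
    "img_001.jpg": ["Cat"],
    "img_002.jpg": ["Cat", "Dog"],  # two labels: not allowed in multi-class
    "img_003.jpg": ["Bird"],        # not one of the project's classes
}
print(validate_multiclass(sample))
```

The Uncertain class gives labelers a valid single choice for ambiguous images, so the one-label rule can hold even when an image is hard to classify.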

Labeling instructions

  1. On the Labeling instructions form, you can provide a link to a website that provides detailed instructions for your labelers. We'll leave it blank for this tutorial.

  2. You can also add a short description of the task directly on the form. Type Labeling tutorial - Cats & Dogs.

  3. Select Next.

  4. If you are using an Enterprise workspace, you will see an ML assisted labeling section. Leave the checkbox unchecked. ML assisted labeling requires more data than you'll be using in this tutorial.

  5. Select Create project.

This page doesn't automatically refresh. After a pause, manually refresh the page until the project's status changes to Created.

Start labeling

You have now set up your Azure resources and configured a data labeling project. It's time to add labels to your data.

Tag the images

In this part of the tutorial, you'll switch roles from the project administrator to that of a labeler. Anyone who has contributor access to your workspace can become a labeler.

  1. In Machine Learning studio, select Data labeling on the left-hand side to find your project.

  2. Select the Label link for the project.

  3. Read the instructions, then select Tasks.

  4. Select a thumbnail layout on the right to choose how many images to display and label at one time. You must label all these images before you can move on. Only switch layouts when you have a fresh page of unlabeled data, because switching layouts clears the page's in-progress tagging work.

  5. Select one or more images, then select a tag to apply to the selection. The tag appears below the image. Continue to select and tag all images on the page. To select all the displayed images simultaneously, select Select all. Select at least one image to apply a tag.

    Tip

    You can select the first nine tags by using the number keys on your keyboard.

  6. Once all the images on the page are tagged, select Submit to submit these labels.

  7. After you submit tags for the data at hand, Azure refreshes the page with a new set of images from the work queue.

Complete the project

Now you'll switch roles back to the project administrator for the labeling project.

As a manager, you may want to review the work of your labeler.

Review labeled data

  1. In Machine Learning studio, select Data labeling on the left-hand side to find your project.

  2. Select the project name link.

  3. The Dashboard shows you the progress of your project.

  4. At the top of the page, select Data.

  5. On the left side, select Labeled data to see your tagged images.

  6. When you disagree with a label, select the image and then select Reject at the bottom of the page. The tags are removed and the image is put back in the queue of unlabeled images.
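
The label, submit, and reject cycle can be pictured as a simple work queue. The Python sketch below is a toy model written for this explanation, not the service's implementation. The class name, its methods, and the choice to return a rejected image to the back of the queue are all assumptions.

```python
from collections import deque


class LabelingQueue:
    """Toy model of the label/submit/reject cycle described above."""

    def __init__(self, images):
        self.total = len(images)
        self.unlabeled = deque(images)
        self.labeled = {}

    def next_page(self, size):
        """Take up to `size` images from the front of the work queue."""
        count = min(size, len(self.unlabeled))
        return [self.unlabeled.popleft() for _ in range(count)]

    def submit(self, tags):
        """Record one tag per image for a completed page."""
        self.labeled.update(tags)

    def reject(self, image):
        """Remove the tag and return the image to the unlabeled queue."""
        del self.labeled[image]
        self.unlabeled.append(image)

    def progress(self):
        """Fraction of all images that currently have an accepted label."""
        return len(self.labeled) / self.total if self.total else 0.0


queue = LabelingQueue(["a.jpg", "b.jpg", "c.jpg", "d.jpg"])
page = queue.next_page(2)                       # ['a.jpg', 'b.jpg']
queue.submit({"a.jpg": "Cat", "b.jpg": "Dog"})
queue.reject("b.jpg")                           # reviewer disagrees with the label
print(queue.progress(), list(queue.unlabeled))  # 0.25 ['c.jpg', 'd.jpg', 'b.jpg']
```

In this model, rejecting a label lowers the progress figure, which matches the idea that a rejected image counts as unlabeled again until someone re-tags it.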

Export labeled data

You can export the label data for Machine Learning experimentation at any time. Users often export multiple times and train different models, rather than wait for all the images to be labeled.

Image labels can be exported in COCO format or as an Azure Machine Learning dataset. The dataset format makes it easy to use for training in Azure Machine Learning.

  1. In Machine Learning studio, select Data labeling on the left-hand side to find your project.

  2. Select the project name link.

  3. Select Export and choose Export as Azure ML Dataset.

    The status of the export appears just below the Export button.

  4. Once the labels are successfully exported, select Datasets on the left side to view the results.
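
If you choose the COCO format instead, the export is a JSON document that you can inspect with a few lines of Python. The sketch below counts labels per class. The sample data is hand-written for this illustration and follows the general COCO layout (images, annotations, categories). A real export from the service may carry additional fields, and the function name count_labels is invented here.

```python
import json
from collections import Counter

# Hypothetical, hand-written sample in the general COCO layout.
coco_text = """
{
  "images": [
    {"id": 1, "file_name": "img_001.jpg"},
    {"id": 2, "file_name": "img_002.jpg"},
    {"id": 3, "file_name": "img_003.jpg"}
  ],
  "annotations": [
    {"id": 1, "image_id": 1, "category_id": 1},
    {"id": 2, "image_id": 2, "category_id": 2},
    {"id": 3, "image_id": 3, "category_id": 1}
  ],
  "categories": [
    {"id": 1, "name": "Cat"},
    {"id": 2, "name": "Dog"},
    {"id": 3, "name": "Uncertain"}
  ]
}
"""


def count_labels(coco):
    """Count how many annotations fall under each category name."""
    names = {category["id"]: category["name"] for category in coco["categories"]}
    counts = Counter(names[a["category_id"]] for a in coco["annotations"])
    # Include classes with zero labels so gaps in coverage are visible.
    return {name: counts.get(name, 0) for name in names.values()}


label_counts = count_labels(json.loads(coco_text))
print(label_counts)  # {'Cat': 2, 'Dog': 1, 'Uncertain': 0}
```

A count like this is a quick way to spot class imbalance before training on a partial export.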

Clean up resources

Important

The resources you created can be used as prerequisites to other Azure Machine Learning tutorials and how-to articles.

If you don't plan to use the resources you created, delete them, so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the far left.

  2. From the list, select the resource group you created.

  3. Select Delete resource group.

  4. Enter the resource group name. Then select Delete.

Next steps

In this tutorial, you labeled images. Now use your labeled data: