Quickstart: Create an object detection project with the Custom Vision .NET SDK

This article shows you how to get started using the Custom Vision SDK with C# to build an object detection model. After it's created, you can add tagged regions, upload images, train the project, obtain the project's default prediction endpoint URL, and use the endpoint to programmatically test an image. Use this example as a template for building your own .NET application.

Prerequisites

  • Any edition of Visual Studio 2015 or 2017
  • To use the Custom Vision Service you will need to create Custom Vision Training and Prediction resources in Azure. To do so in the Azure portal, fill out the dialog window on the Create Custom Vision page to create both a Training and Prediction resource.

Get the Custom Vision SDK and sample code

To write a .NET app that uses Custom Vision, you'll need the Custom Vision NuGet packages. These packages are included in the sample project you'll download, but you can access them individually here.

Clone or download the Cognitive Services .NET Samples project. Navigate to the CustomVision/ObjectDetection folder and open ObjectDetection.csproj in Visual Studio.

This Visual Studio project creates a new Custom Vision project named My New Project, which can be accessed through the Custom Vision website. It then uploads images to train and test an object detection model. In this project, the model is trained to detect forks and scissors in images.

Get the training and prediction keys

The project needs a valid set of subscription keys to interact with the service. You can find the items at the Custom Vision website. Sign in with the account associated with the Azure account used to create your Custom Vision resources. On the home page (the page with the option to add a new project), select the gear icon in the upper right. Find your training and prediction resources in the list and expand them. Here you can find your training key, prediction key, and prediction resource ID values. Save these values to a temporary location.

Image of the keys UI

Or, you can obtain these keys and ID from the Azure portal by viewing your Custom Vision Training and Prediction resources and navigating to the Keys tab. There you'll find your training key and prediction key. Navigate to the Properties tab of your Prediction resource to get your prediction resource ID.

Understand the code

Open the Program.cs file and inspect the code. Create environment variables for your training and prediction keys named CUSTOM_VISION_TRAINING_KEY and CUSTOM_VISION_PREDICTION_KEY, respectively. The script will look for these variables.

// Add your training & prediction key from the settings page of the portal
string trainingKey = Environment.GetEnvironmentVariable("CUSTOM_VISION_TRAINING_KEY");
string predictionKey = Environment.GetEnvironmentVariable("CUSTOM_VISION_PREDICTION_KEY");

Also, get your Endpoint URL from the Settings page of the Custom Vision website. Save it to an environment variable called CUSTOM_VISION_ENDPOINT. The script saves a reference to it at the root of your class.

string ENDPOINT = Environment.GetEnvironmentVariable("CUSTOM_VISION_ENDPOINT");

Create a new Custom Vision Service project

This next bit of code creates an object detection project. The created project will show up on the Custom Vision website that you visited earlier. See the CreateProject method to specify other options when you create your project (explained in the Build a detector web portal guide).

// Find the object detection domain
var domains = trainingApi.GetDomains();
var objDetectionDomain = domains.FirstOrDefault(d => d.Type == "ObjectDetection");

// Create a new project
Console.WriteLine("Creating new project:");
var project = trainingApi.CreateProject("My New Project", null, objDetectionDomain.Id);

Add tags to the project

// Make two tags in the new project
var forkTag = trainingApi.CreateTag(project.Id, "fork");
var scissorsTag = trainingApi.CreateTag(project.Id, "scissors");

Upload and tag images

When you tag images in object detection projects, you need to specify the region of each tagged object using normalized coordinates. The following code associates each of the sample images with its tagged region.

Dictionary<string, double[]> fileToRegionMap = new Dictionary<string, double[]>()
{
    // FileName, Left, Top, Width, Height
    {"scissors_1", new double[] { 0.4007353, 0.194068655, 0.259803921, 0.6617647 } },
    {"scissors_2", new double[] { 0.426470578, 0.185898721, 0.172794119, 0.5539216 } },
    {"scissors_3", new double[] { 0.289215684, 0.259428144, 0.403186262, 0.421568632 } },
    {"scissors_4", new double[] { 0.343137264, 0.105833367, 0.332107842, 0.8055556 } },
    {"scissors_5", new double[] { 0.3125, 0.09766343, 0.435049027, 0.71405226 } },
    {"scissors_6", new double[] { 0.379901975, 0.24308826, 0.32107842, 0.5718954 } },
    {"scissors_7", new double[] { 0.341911763, 0.20714055, 0.3137255, 0.6356209 } },
    {"scissors_8", new double[] { 0.231617644, 0.08459154, 0.504901946, 0.8480392 } },
    {"scissors_9", new double[] { 0.170343131, 0.332957536, 0.767156839, 0.403594762 } },
    {"scissors_10", new double[] { 0.204656869, 0.120539248, 0.5245098, 0.743464053 } },
    {"scissors_11", new double[] { 0.05514706, 0.159754932, 0.799019635, 0.730392158 } },
    {"scissors_12", new double[] { 0.265931368, 0.169558853, 0.5061275, 0.606209159 } },
    {"scissors_13", new double[] { 0.241421565, 0.184264734, 0.448529422, 0.6830065 } },
    {"scissors_14", new double[] { 0.05759804, 0.05027781, 0.75, 0.882352948 } },
    {"scissors_15", new double[] { 0.191176474, 0.169558853, 0.6936275, 0.6748366 } },
    {"scissors_16", new double[] { 0.1004902, 0.279036, 0.6911765, 0.477124184 } },
    {"scissors_17", new double[] { 0.2720588, 0.131977156, 0.4987745, 0.6911765 } },
    {"scissors_18", new double[] { 0.180147052, 0.112369314, 0.6262255, 0.6666667 } },
    {"scissors_19", new double[] { 0.333333343, 0.0274019931, 0.443627447, 0.852941155 } },
    {"scissors_20", new double[] { 0.158088237, 0.04047389, 0.6691176, 0.843137264 } },
    {"fork_1", new double[] { 0.145833328, 0.3509314, 0.5894608, 0.238562092 } },
    {"fork_2", new double[] { 0.294117659, 0.216944471, 0.534313738, 0.5980392 } },
    {"fork_3", new double[] { 0.09191177, 0.0682516545, 0.757352948, 0.6143791 } },
    {"fork_4", new double[] { 0.254901975, 0.185898721, 0.5232843, 0.594771266 } },
    {"fork_5", new double[] { 0.2365196, 0.128709182, 0.5845588, 0.71405226 } },
    {"fork_6", new double[] { 0.115196079, 0.133611143, 0.676470637, 0.6993464 } },
    {"fork_7", new double[] { 0.164215669, 0.31008172, 0.767156839, 0.410130739 } },
    {"fork_8", new double[] { 0.118872553, 0.318251669, 0.817401946, 0.225490168 } },
    {"fork_9", new double[] { 0.18259804, 0.2136765, 0.6335784, 0.643790841 } },
    {"fork_10", new double[] { 0.05269608, 0.282303959, 0.8088235, 0.452614367 } },
    {"fork_11", new double[] { 0.05759804, 0.0894935, 0.9007353, 0.3251634 } },
    {"fork_12", new double[] { 0.3345588, 0.07315363, 0.375, 0.9150327 } },
    {"fork_13", new double[] { 0.269607842, 0.194068655, 0.4093137, 0.6732026 } },
    {"fork_14", new double[] { 0.143382356, 0.218578458, 0.7977941, 0.295751631 } },
    {"fork_15", new double[] { 0.19240196, 0.0633497, 0.5710784, 0.8398692 } },
    {"fork_16", new double[] { 0.140931368, 0.480016381, 0.6838235, 0.240196079 } },
    {"fork_17", new double[] { 0.305147052, 0.2512582, 0.4791667, 0.5408496 } },
    {"fork_18", new double[] { 0.234068632, 0.445702642, 0.6127451, 0.344771236 } },
    {"fork_19", new double[] { 0.219362751, 0.141781077, 0.5919118, 0.6683006 } },
    {"fork_20", new double[] { 0.180147052, 0.239820287, 0.6887255, 0.235294119 } }
};

Then, this map of associations is used to upload each sample image with its region coordinates. You can upload up to 64 images in a single batch.

// Add all images for fork
var imagePath = Path.Combine("Images", "fork");
var imageFileEntries = new List<ImageFileCreateEntry>();
foreach (var fileName in Directory.EnumerateFiles(imagePath))
{
    var region = fileToRegionMap[Path.GetFileNameWithoutExtension(fileName)];
    imageFileEntries.Add(new ImageFileCreateEntry(fileName, File.ReadAllBytes(fileName), null, new List<Region>(new Region[] { new Region(forkTag.Id, region[0], region[1], region[2], region[3]) })));
}
trainingApi.CreateImagesFromFiles(project.Id, new ImageFileCreateBatch(imageFileEntries));

// Add all images for scissors
imagePath = Path.Combine("Images", "scissors");
imageFileEntries = new List<ImageFileCreateEntry>();
foreach (var fileName in Directory.EnumerateFiles(imagePath))
{
    var region = fileToRegionMap[Path.GetFileNameWithoutExtension(fileName)];
    imageFileEntries.Add(new ImageFileCreateEntry(fileName, File.ReadAllBytes(fileName), null, new List<Region>(new Region[] { new Region(scissorsTag.Id, region[0], region[1], region[2], region[3]) })));
}
trainingApi.CreateImagesFromFiles(project.Id, new ImageFileCreateBatch(imageFileEntries));

At this point, you've uploaded all the samples images and tagged each one (fork or scissors) with an associated pixel rectangle.

Train the project

This code creates the first training iteration in the project.

// Now there are images with tags start training the project
Console.WriteLine("\tTraining");
var iteration = trainingApi.TrainProject(project.Id);

// The returned iteration will be in progress, and can be queried periodically to see when it has completed
while (iteration.Status == "Training")
{
    Thread.Sleep(1000);

    // Re-query the iteration to get its updated status
    iteration = trainingApi.GetIteration(project.Id, iteration.Id);
}

Publish the current iteration

The name given to the published iteration can be used to send prediction requests. An iteration is not available in the prediction endpoint until it's published.

// The iteration is now trained. Publish it to the prediction end point.
var publishedModelName = "toolModel";
var predictionResourceId = "<target prediction resource ID>";
trainingApi.PublishIteration(project.Id, iteration.Id, publishedModelName, predictionResourceId);
Console.WriteLine("Done!\n");

Create a prediction endpoint

// Create a prediction endpoint, passing in the obtained prediction key
CustomVisionPredictionClient endpoint = new CustomVisionPredictionClient()
{
    ApiKey = predictionKey,
    Endpoint = ENDPOINT
};

Use the prediction endpoint

This part of the script loads the test image, queries the model endpoint, and outputs prediction data to the console.

// Make a prediction against the new project
Console.WriteLine("Making a prediction:");
var imageFile = Path.Combine("Images", "test", "test_image.jpg");
using (var stream = File.OpenRead(imageFile))
{
    var result = endpoint.DetectImage(project.Id, publishedModelName, File.OpenRead(imageFile));

    // Loop over each prediction and write out the results
    foreach (var c in result.Predictions)
    {
        Console.WriteLine($"\t{c.TagName}: {c.Probability:P1} [ {c.BoundingBox.Left}, {c.BoundingBox.Top}, {c.BoundingBox.Width}, {c.BoundingBox.Height} ]");
    }
}
Console.ReadKey();

Run the application

As the application runs, it should open a console window and write the following output:

Creating new project:
        Training
Done!

Making a prediction:
        fork: 98.2% [ 0.111609578, 0.184719115, 0.6607002, 0.6637112 ]
        scissors: 1.2% [ 0.112389535, 0.119195729, 0.658031344, 0.7023591 ]

You can then verify that the test image (found in Images/Test/) is tagged appropriately and that the region of detection is correct. At this point, you can press any key to exit the application.

Clean up resources

If you wish to implement your own object detection project (or try an image classification project instead), you may want to delete the fork/scissors detection project from this example. A free trial allows for two Custom Vision projects.

On the Custom Vision website, navigate to Projects and select the trash can under My New Project.

Screenshot of a panel labeled My New Project with a trash can icon

Next steps

Now you've seen how to do every step of the object detection process in code. This sample executes a single training iteration, but often you'll need to train and test your model multiple times in order to make it more accurate. The following guide deals with image classification, but its principles are similar to object detection.