Tutorial: Deploy an image classification model in Azure Container Instances

APPLIES TO: Basic edition, Enterprise edition

This tutorial is part two of a two-part tutorial series. In the previous tutorial, you trained machine learning models and then registered a model in your workspace on the cloud.

Now you're ready to deploy the model as a web service in Azure Container Instances. A web service is an image, in this case a Docker image. It encapsulates the scoring logic and the model itself.

In this part of the tutorial, you use Azure Machine Learning for the following tasks:

  • Set up your testing environment.
  • Retrieve the model from your workspace.
  • Test the model locally.
  • Deploy the model to Container Instances.
  • Test the deployed model.

Container Instances is a good fit for testing and understanding the workflow. For scalable production deployments, consider using Azure Kubernetes Service. For more information, see the article on how and where to deploy models.


Code in this article was tested with Azure Machine Learning SDK version 1.0.41.


To run the notebook, first complete the model training in Tutorial (part 1): Train an image classification model. Then open the img-classification-part2-deploy.ipynb notebook in your cloned tutorials folder.

This tutorial is also available on GitHub if you want to run it in your own local environment. Make sure that matplotlib and scikit-learn are installed in that environment.


The rest of this article contains the same content as you see in the notebook.

Switch to the Jupyter notebook now if you want to read along as you run the code. To run a single code cell in a notebook, click the code cell and hit Shift+Enter. Or, run the entire notebook by choosing Run all from the top toolbar.

Set up the environment

Start by setting up a testing environment.

Import packages

Import the Python packages needed for this tutorial:

%matplotlib inline
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
import azureml
from azureml.core import Workspace, Run

# Display the core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Retrieve the model

You registered a model in your workspace in the previous tutorial. Now load this workspace and download the model to your local directory:

from azureml.core import Workspace
from azureml.core.model import Model
import os
ws = Workspace.from_config()
model = Model(ws, 'sklearn_mnist')

model.download(target_dir=os.getcwd(), exist_ok=True)

# verify the downloaded model file
file_path = os.path.join(os.getcwd(), "sklearn_mnist_model.pkl")
# os.stat raises an error if the file is missing
os.stat(file_path)


Test the model locally

Before you deploy, make sure your model is working locally:

  • Load test data.
  • Predict test data.
  • Examine the confusion matrix.

Load test data

Load the test data from the ./data/ directory created during the training tutorial:

from utils import load_data
import os

data_folder = os.path.join(os.getcwd(), 'data')
# note we also scale the intensity values (X) from 0-255 to 0-1. This helps the model converge faster
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
y_test = load_data(os.path.join(
    data_folder, 'test-labels.gz'), True).reshape(-1)

Predict test data

To get predictions, feed the test dataset to the model:

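# sklearn.externals.joblib exists only in scikit-learn releases before 0.23;
# on newer releases, use "import joblib" instead
from sklearn.externals import joblib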

clf = joblib.load(os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))
y_hat = clf.predict(X_test)

Examine the confusion matrix

Generate a confusion matrix to see how many samples from the test set are classified correctly. Notice the misclassified value for the incorrect predictions:

from sklearn.metrics import confusion_matrix

conf_mx = confusion_matrix(y_test, y_hat)
print(conf_mx)
print('Overall accuracy:', np.average(y_hat == y_test))

The output shows the confusion matrix:

[[ 960    0    1    2    1    5    6    3    1    1]
 [   0 1112    3    1    0    1    5    1   12    0]
 [   9    8  920   20   10    4   10   11   37    3]
 [   4    0   17  921    2   21    4   12   20    9]
 [   1    2    5    3  915    0   10    2    6   38]
 [  10    2    0   41   10  770   17    7   28    7]
 [   9    3    7    2    6   20  907    1    3    0]
 [   2    7   22    5    8    1    1  950    5   27]
 [  10   15    5   21   15   27    7   11  851   12]
 [   7    8    2   13   32   13    0   24   12  898]]
Overall accuracy: 0.9204

Use matplotlib to display the confusion matrix as a graph. In this graph, the x-axis shows the predicted values, and the y-axis shows the true labels. The color of each cell shows the error rate: the lighter the color, the higher the error rate. For example, many 5's are misclassified as 3's, so you see a bright cell at row 5, column 3:

# convert counts to per-row error rates, then zero the diagonal so that
# correct predictions don't overpower the rest of the cells when visualized
row_sums = conf_mx.sum(axis=1, keepdims=True)
norm_conf_mx = conf_mx / row_sums
np.fill_diagonal(norm_conf_mx, 0)

fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
cax = ax.matshow(norm_conf_mx, cmap=plt.cm.bone)
# label both axes with the digits 0-9
ticks = np.arange(0, 10, 1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
fig.colorbar(cax)
plt.ylabel('true labels', fontsize=14)
plt.xlabel('predicted values', fontsize=14)
plt.show()

Chart showing confusion matrix
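
The numbers behind the chart can also be checked without plotting. The following is a minimal sketch in plain Python that copies the matrix from the printed output, recomputes the overall accuracy from the diagonal, and finds the off-diagonal cell with the highest row-normalized error rate, which corresponds to the brightest cell in the chart. The variable names are illustrative and are not part of the tutorial notebook.

```python
# Confusion matrix copied from the tutorial output
# (rows: true labels, columns: predicted values)
conf_mx = [
    [960, 0, 1, 2, 1, 5, 6, 3, 1, 1],
    [0, 1112, 3, 1, 0, 1, 5, 1, 12, 0],
    [9, 8, 920, 20, 10, 4, 10, 11, 37, 3],
    [4, 0, 17, 921, 2, 21, 4, 12, 20, 9],
    [1, 2, 5, 3, 915, 0, 10, 2, 6, 38],
    [10, 2, 0, 41, 10, 770, 17, 7, 28, 7],
    [9, 3, 7, 2, 6, 20, 907, 1, 3, 0],
    [2, 7, 22, 5, 8, 1, 1, 950, 5, 27],
    [10, 15, 5, 21, 15, 27, 7, 11, 851, 12],
    [7, 8, 2, 13, 32, 13, 0, 24, 12, 898],
]

# Overall accuracy is the sum of the diagonal divided by the number of samples
total = sum(sum(row) for row in conf_mx)
correct = sum(conf_mx[i][i] for i in range(len(conf_mx)))
accuracy = correct / total

# Row-normalize and skip the diagonal, mirroring the plotting code
worst_rate, worst_true, worst_pred = 0.0, None, None
for true_label, row in enumerate(conf_mx):
    row_sum = sum(row)
    for pred_label, count in enumerate(row):
        if pred_label == true_label:
            continue
        rate = count / row_sum
        if rate > worst_rate:
            worst_rate, worst_true, worst_pred = rate, true_label, pred_label

print("Overall accuracy:", accuracy)
print("Most frequent confusion: true", worst_true, "predicted", worst_pred)
```

The result agrees with the printed accuracy and with the observation that 5's are most often mistaken for 3's.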

Deploy as a web service

After you've tested the model and are satisfied with the results, deploy the model as a web service hosted in Container Instances.

To build the correct environment for Container Instances, provide the following components:

  • A scoring script to show how to use the model.
  • An environment file to show what packages need to be installed.
  • A configuration file to build the container instance.
  • The model you trained previously.

Create scoring script

Create the scoring script, called score.py. The web service call uses this script to show how to use the model.

Include these two required functions in the scoring script:

  • The init() function, which typically loads the model into a global object. This function runs only once, when the Docker container starts.

  • The run(input_data) function, which uses the model to predict a value based on the input data. Inputs and outputs of run typically use JSON for serialization and deserialization, but other formats are supported.

%%writefile score.py
import json
import numpy as np
import os
import pickle
from sklearn.externals import joblib
from sklearn.linear_model import LogisticRegression

from azureml.core.model import Model

def init():
    global model
    # retrieve the path to the model file using the model name
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'sklearn_mnist_model.pkl')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_hat = model.predict(data)
    # you can return any data type as long as it is JSON-serializable
    return y_hat.tolist()
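
To see the init() and run() contract in isolation, the following minimal sketch exercises the same two functions locally with a stand-in model. StubModel and its prediction rule are hypothetical and exist only for illustration; the real score.py loads the pickled scikit-learn model from the AZUREML_MODEL_DIR directory instead.

```python
import json


class StubModel:
    """Hypothetical stand-in for the trained classifier (illustration only)."""

    def predict(self, rows):
        # Made-up rule: the "digit" is the index of the largest value, modulo 10
        return [max(range(len(r)), key=r.__getitem__) % 10 for r in rows]


model = None


def init():
    # Runs once at container start: place the model in a global object
    global model
    model = StubModel()


def run(raw_data):
    # Runs per request: parse the JSON body, predict, return JSON-serializable data
    data = json.loads(raw_data)["data"]
    return model.predict(data)


# Simulate what the hosting container does: one init(), then run() per request
init()
payload = json.dumps({"data": [[0.0, 0.9, 0.1], [0.2, 0.1, 0.7]]})
result = run(payload)
print(json.dumps(result))
```

The request body is a JSON object whose "data" key holds one list of feature values per sample, and the return value must serialize cleanly with json.dumps.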

Create environment file

Next, create an environment file, called myenv.yml, that specifies all of the script's package dependencies. This file is used to make sure that all of those dependencies are installed in the Docker image. This model needs scikit-learn and azureml-sdk. All custom environment files need to list azureml-defaults with version >= 1.0.45 as a pip dependency. This package contains the functionality needed to host the model as a web service.

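from azureml.core.conda_dependencies import CondaDependencies

myenv = CondaDependencies()
# add the packages that the scoring script needs
myenv.add_conda_package("scikit-learn")
myenv.add_pip_package("azureml-defaults")

with open("myenv.yml", "w") as f:
    f.write(myenv.serialize_to_string())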

Review the content of the myenv.yml file:

with open("myenv.yml", "r") as f:
    print(f.read())

Create a configuration file

Create a deployment configuration file. Specify the number of CPUs and gigabytes of RAM needed for your Container Instances container. Although it depends on your model, the default of one core and 1 gigabyte of RAM is sufficient for many models. If you need more later, you have to re-create the image and redeploy the service.

from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
                                               memory_gb=1,
                                               tags={"data": "MNIST",
                                                     "method": "sklearn"},
                                               description='Predict MNIST with sklearn')

Deploy in Container Instances

The estimated time to finish deployment is about seven to eight minutes.

Configure the image and deploy. The following code goes through these steps:

  1. Build an image by using these files:
    • The scoring file, score.py.
    • The environment file, myenv.yml.
    • The model file.
  2. Register the image under the workspace.
  3. Send the image to the Container Instances container.
  4. Start up a container in Container Instances by using the image.
  5. Get the web service HTTP endpoint.

If you define your own environment file, you must list azureml-defaults with version >= 1.0.45 as a pip dependency. This package contains the functionality needed to host the model as a web service.

from azureml.core.webservice import Webservice
from azureml.core.model import InferenceConfig
from azureml.core.environment import Environment

myenv = Environment.from_conda_specification(name="myenv", file_path="myenv.yml")
inference_config = InferenceConfig(entry_script="score.py", environment=myenv)

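# 'sklearn-mnist-svc' is an assumed example name; use any name that is unique in your workspace
service = Model.deploy(workspace=ws,
                       name='sklearn-mnist-svc',
                       models=[model],
                       inference_config=inference_config,
                       deployment_config=aciconfig)

# block until the deployment finishes and show progress
service.wait_for_deployment(show_output=True)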


Get the scoring web service's HTTP endpoint, which accepts REST client calls. You can share this endpoint with anyone who wants to test the web service or integrate it into an application:

print(service.scoring_uri)


Test the deployed service

Earlier, you scored all the test data with the local version of the model. Now you can test the deployed model with a random sample of 30 images from the test data.

The following code goes through these steps:

  1. Send the data as a JSON array to the web service hosted in Container Instances.

  2. Use the SDK's run API to invoke the service. You can also make raw calls by using any HTTP tool such as curl.

  3. Print the returned predictions and plot them along with the input images. Red font and an inverted image, white on black, are used to highlight the misclassified samples.

Because the model accuracy is high, you might have to run the following code a few times before you can see a misclassified sample:

import json

# find 30 random samples from test set
n = 30
sample_indices = np.random.permutation(X_test.shape[0])[0:n]

test_samples = json.dumps({"data": X_test[sample_indices].tolist()})
test_samples = bytes(test_samples, encoding='utf8')

# predict using the deployed model
result = service.run(input_data=test_samples)

# compare actual value vs. the predicted values:
i = 0
plt.figure(figsize=(20, 1))

for s in sample_indices:
    plt.subplot(1, n, i + 1)
    # use different color for misclassified sample
    font_color = 'red' if y_test[s] != result[i] else 'black'
    clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys
    plt.text(x=10, y=-10, s=result[i], fontsize=18, color=font_color)
    plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)
    i = i + 1

This result is from one random sample of test images:

Graphic showing results

You can also send a raw HTTP request to test the web service:

import json
import requests

# send a random row from the test set to score
# (the upper bound of np.random.randint is exclusive)
random_index = np.random.randint(0, len(X_test))
# build the body with json.dumps so the numbers are always valid JSON
input_data = json.dumps({"data": [X_test[random_index].tolist()]})

headers = {'Content-Type': 'application/json'}

# for AKS deployment you'd need to add the service key in the header as well
# api_key = service.get_key()
# headers = {'Content-Type':'application/json',  'Authorization':('Bearer '+ api_key)}

resp = requests.post(service.scoring_uri, input_data, headers=headers)

print("POST to url", service.scoring_uri)
#print("input data:", input_data)
print("label:", y_test[random_index])
print("prediction:", resp.text)
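
If you want to see the shape of that request and response without a deployed service, the following sketch posts the same kind of JSON body to a local stand-in endpoint by using only the Python standard library. StubScoringHandler, the /score path, and its fake "prediction" (the number of values in each row) are assumptions made for illustration; the real service returns the model's predicted digits.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class StubScoringHandler(BaseHTTPRequestHandler):
    """Hypothetical local stand-in for the scoring endpoint (illustration only)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        rows = json.loads(self.rfile.read(length))["data"]
        # Fake "prediction": report how many values each row contains
        body = json.dumps([len(row) for row in rows]).encode("utf8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the example output quiet


# Port 0 asks the operating system for any free port
server = HTTPServer(("127.0.0.1", 0), StubScoringHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
scoring_uri = "http://127.0.0.1:%d/score" % server.server_port

# One flattened 28x28 image: a list that contains a single row of 784 values
payload = json.dumps({"data": [[0.0] * 784]}).encode("utf8")
request = urllib.request.Request(
    scoring_uri, data=payload, headers={"Content-Type": "application/json"})

# Use an opener without proxies so the local call is never routed elsewhere
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request, timeout=10) as resp:
    status = resp.status
    predictions = json.loads(resp.read())

server.shutdown()
server.server_close()
print(status, predictions)
```

Against the deployed service, scoring_uri would be service.scoring_uri and the response body would be the JSON list that run() returns.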

Clean up resources

To keep the resource group and workspace for other tutorials and exploration, you can delete only the Container Instances deployment by using this API call:

service.delete()



The resources you created can be used as prerequisites to other Azure Machine Learning tutorials and how-to articles.

If you don't plan to use the resources you created, delete them, so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the far left.

    Delete in the Azure portal

  2. From the list, select the resource group you created.

  3. Select Delete resource group.

  4. Enter the resource group name. Then select Delete.

Next steps