Tutorial: Deploy an image classification model in Azure Container Instances

This tutorial is part two of a two-part tutorial series. In the previous tutorial, you trained machine learning models and then registered a model in your workspace on the cloud.

Now you're ready to deploy the model as a web service in Azure Container Instances. A web service is an image, in this case a Docker image. It encapsulates the scoring logic and the model itself.

In this part of the tutorial, you use Azure Machine Learning service for the following tasks:

  • Set up your testing environment.
  • Retrieve the model from your workspace.
  • Test the model locally.
  • Deploy the model to Container Instances.
  • Test the deployed model.

Container Instances is a great solution for testing and understanding the workflow. For scalable production deployments, consider using Azure Kubernetes Service. For more information, see how to deploy and where.

Note

Code in this article was tested with Azure Machine Learning SDK version 1.0.41.

Prerequisites

Skip to Set the development environment to read through the notebook steps.

To run the notebook, first complete the model training in Tutorial (part 1): Train an image classification model with Azure Machine Learning service. Then run the tutorials/img-classification-part2-deploy.ipynb notebook using the same notebook server.

Set up the environment

Start by setting up a testing environment.

Import packages

Import the Python packages needed for this tutorial:

%matplotlib inline
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
 
import azureml
from azureml.core import Workspace, Run

# display the core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Retrieve the model

You registered a model in your workspace in the previous tutorial. Now load this workspace and download the model to your local directory:

from azureml.core import Workspace
from azureml.core.model import Model
import os
ws = Workspace.from_config()
model = Model(ws, 'sklearn_mnist')

model.download(target_dir=os.getcwd(), exist_ok=True)

# verify the downloaded model file
file_path = os.path.join(os.getcwd(), "sklearn_mnist_model.pkl")

os.stat(file_path)

Test the model locally

Before you deploy, make sure your model is working locally:

  • Load test data.
  • Predict test data.
  • Examine the confusion matrix.

Load test data

Load the test data from the ./data/ directory created during the training tutorial:

from utils import load_data
import os

data_folder = os.path.join(os.getcwd(), 'data')
# note we also scale the intensity values (X) from 0-255 to 0-1; this must match the scaling the model was trained with
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
y_test = load_data(os.path.join(
    data_folder, 'test-labels.gz'), True).reshape(-1)

Predict test data

To get predictions, feed the test dataset to the model:

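# sklearn.externals.joblib was removed in scikit-learn 0.23; prefer the standalone joblib package
try:
    import joblib
except ImportError:
    from sklearn.externals import joblib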

clf = joblib.load(os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))
y_hat = clf.predict(X_test)

Examine the confusion matrix

Generate a confusion matrix to see how many samples from the test set are classified correctly. Each row corresponds to a true label and each column to a predicted label, so the off-diagonal values count the misclassified samples:

from sklearn.metrics import confusion_matrix

conf_mx = confusion_matrix(y_test, y_hat)
print(conf_mx)
print('Overall accuracy:', np.average(y_hat == y_test))

The output shows the confusion matrix:

[[ 960    0    1    2    1    5    6    3    1    1]
 [   0 1112    3    1    0    1    5    1   12    0]
 [   9    8  920   20   10    4   10   11   37    3]
 [   4    0   17  921    2   21    4   12   20    9]
 [   1    2    5    3  915    0   10    2    6   38]
 [  10    2    0   41   10  770   17    7   28    7]
 [   9    3    7    2    6   20  907    1    3    0]
 [   2    7   22    5    8    1    1  950    5   27]
 [  10   15    5   21   15   27    7   11  851   12]
 [   7    8    2   13   32   13    0   24   12  898]]
Overall accuracy: 0.9204
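
The overall accuracy can also be read straight off the matrix: correct predictions lie on the diagonal, and each off-diagonal cell (row, column) counts samples of the row's class that were predicted as the column's class. The following is a minimal sketch using only the standard library and the matrix printed above; the helper names overall_accuracy and largest_confusion are illustrative and not part of any SDK.

```python
# Confusion matrix copied from the output above
# (rows: true labels, columns: predicted labels).
conf_mx = [
    [960, 0, 1, 2, 1, 5, 6, 3, 1, 1],
    [0, 1112, 3, 1, 0, 1, 5, 1, 12, 0],
    [9, 8, 920, 20, 10, 4, 10, 11, 37, 3],
    [4, 0, 17, 921, 2, 21, 4, 12, 20, 9],
    [1, 2, 5, 3, 915, 0, 10, 2, 6, 38],
    [10, 2, 0, 41, 10, 770, 17, 7, 28, 7],
    [9, 3, 7, 2, 6, 20, 907, 1, 3, 0],
    [2, 7, 22, 5, 8, 1, 1, 950, 5, 27],
    [10, 15, 5, 21, 15, 27, 7, 11, 851, 12],
    [7, 8, 2, 13, 32, 13, 0, 24, 12, 898],
]


def overall_accuracy(matrix):
    """Fraction of samples that sit on the diagonal (correct predictions)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def largest_confusion(matrix):
    """Return (true_label, predicted_label, count) for the largest off-diagonal cell."""
    best = (0, 0, -1)
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > best[2]:
                best = (i, j, count)
    return best


print("Overall accuracy:", overall_accuracy(conf_mx))
print("Largest confusion (true, predicted, count):", largest_confusion(conf_mx))
```

The second helper picks out the single most frequent mistake, which is a quick way to decide where to look first in the plot.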

Use matplotlib to display the confusion matrix as a graph. In this graph, the x-axis shows the predicted values, and the y-axis shows the actual values. The color of each cell shows the error rate. The lighter the color, the higher the error rate is. For example, many 5's are misclassified as 3's. So you see a bright cell at row 5, column 3:

# normalize each row by its total, then zero the diagonal cells so that they don't overpower the rest of the cells when visualized
row_sums = conf_mx.sum(axis=1, keepdims=True)
norm_conf_mx = conf_mx / row_sums
np.fill_diagonal(norm_conf_mx, 0)

fig = plt.figure(figsize=(8, 5))
ax = fig.add_subplot(111)
cax = ax.matshow(norm_conf_mx, cmap=plt.cm.bone)
ticks = np.arange(0, 10, 1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xticklabels(ticks)
ax.set_yticklabels(ticks)
fig.colorbar(cax)
plt.ylabel('true labels', fontsize=14)
plt.xlabel('predicted values', fontsize=14)
plt.savefig('conf.png')
plt.show()

Chart showing the confusion matrix
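
To make the normalization step concrete, here is a minimal sketch on a made-up 3-class matrix, written with plain Python lists rather than NumPy. The matrix values and the function name normalized_errors are hypothetical; the logic mirrors the row normalization and diagonal zeroing used for the plot.

```python
def normalized_errors(matrix):
    """Divide each row by its total, then zero the diagonal.

    The result holds, for each true class (row), the fraction of its
    samples that were predicted as each other class (column).
    """
    result = []
    for i, row in enumerate(matrix):
        total = sum(row)
        result.append([
            0.0 if (i == j or total == 0) else count / total
            for j, count in enumerate(row)
        ])
    return result


# Hypothetical 3-class confusion matrix (rows: true labels, columns: predictions).
toy_conf_mx = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 5, 5],
]

for row in normalized_errors(toy_conf_mx):
    print(row)
```

Without zeroing the diagonal, the correct predictions would dominate the color scale and the error cells would be hard to tell apart.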

Deploy as a web service

After you've tested the model and you're satisfied with the results, deploy the model as a web service hosted in Container Instances.

To build the correct environment for Container Instances, provide the following components:

  • A scoring script to show how to use the model.
  • An environment file to show what packages need to be installed.
  • A configuration file to build the container instance.
  • The model you trained previously.

Create scoring script

Create the scoring script, called score.py. The web service call uses this script to show how to use the model.

Include these two required functions in the scoring script:

  • The init() function, which typically loads the model into a global object. This function is run only once when the Docker container is started.

  • The run(input_data) function, which uses the model to predict a value based on the input data. Inputs and outputs of run typically use JSON for serialization and deserialization, but other formats are supported.

%%writefile score.py
import json
import numpy as np
import os
import pickle
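# sklearn.externals.joblib was removed in scikit-learn 0.23; prefer the standalone joblib package
try:
    import joblib
except ImportError:
    from sklearn.externals import joblib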
from sklearn.linear_model import LogisticRegression

from azureml.core.model import Model

def init():
    global model
    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('sklearn_mnist')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_hat = model.predict(data)
    # you can return any data type as long as it is JSON-serializable
    return y_hat.tolist()
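
Before building an image, it can help to exercise the init()/run() contract locally. The sketch below is not the tutorial's score.py: it swaps the trained model for a hypothetical StubModel so that it runs without the SDK or a model file, and only illustrates the flow of loading once in init() and then parsing JSON, predicting, and returning a JSON-serializable value in run().

```python
import json


class StubModel:
    """Hypothetical stand-in for the trained classifier.

    Predicts the index of the largest feature in each row.
    """

    def predict(self, rows):
        return [max(range(len(row)), key=row.__getitem__) for row in rows]


model = None


def init():
    # In score.py this is where the registered model file is loaded;
    # here the stub is created instead.
    global model
    model = StubModel()


def run(raw_data):
    # Parse the JSON request body, predict, and return a JSON-serializable list.
    data = json.loads(raw_data)["data"]
    y_hat = model.predict(data)
    return list(y_hat)


init()  # called once when the container starts
payload = json.dumps({"data": [[0.1, 0.9, 0.0], [0.7, 0.2, 0.1]]})
print(run(payload))  # called once per request
```

The same two calls, with the real model in place of the stub, are what the web service performs on startup and on each request.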

Create environment file

Next create an environment file, called myenv.yml, that specifies all of the script's package dependencies. This file is used to make sure that all of those dependencies are installed in the Docker image. This model needs scikit-learn and azureml-sdk:

from azureml.core.conda_dependencies import CondaDependencies

myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")

with open("myenv.yml", "w") as f:
    f.write(myenv.serialize_to_string())

Review the content of the myenv.yml file:

with open("myenv.yml", "r") as f:
    print(f.read())

Create a configuration file

Create a deployment configuration file. Specify the number of CPUs and gigabytes of RAM needed for your Container Instances container. Although it depends on your model, the default of one core and 1 gigabyte of RAM is sufficient for many models. If you need more later, you have to re-create the image and redeploy the service.

from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1,
                                               memory_gb=1,
                                               tags={"data": "MNIST",
                                                     "method": "sklearn"},
                                               description='Predict MNIST with sklearn')

Deploy in Container Instances

The estimated time to finish deployment is about seven to eight minutes.

Configure the image and deploy. The following code goes through these steps:

  1. Build an image by using these files:
    • The scoring file, score.py.
    • The environment file, myenv.yml.
    • The model file.
  2. Register the image under the workspace.
  3. Send the image to the Container Instances container.
  4. Start up a container in Container Instances by using the image.
  5. Get the web service HTTP endpoint.

%%time
from azureml.core.webservice import Webservice
from azureml.core.image import ContainerImage

# configure the image
image_config = ContainerImage.image_configuration(execution_script="score.py", 
                                                  runtime="python", 
                                                  conda_file="myenv.yml")

service = Webservice.deploy_from_model(workspace=ws,
                                       name='sklearn-mnist-svc',
                                       deployment_config=aciconfig,
                                       models=[model],
                                       image_config=image_config)

service.wait_for_deployment(show_output=True)

Get the scoring web service's HTTP endpoint, which accepts REST client calls. You can share this endpoint with anyone who wants to test the web service or integrate it into an application:

print(service.scoring_uri)

Test the deployed service

Earlier, you scored all the test data with the local version of the model. Now you can test the deployed model with a random sample of 30 images from the test data.

The following code goes through these steps:

  1. Send the data as a JSON array to the web service hosted in Container Instances.

  2. Use the SDK's run API to invoke the service. You can also make raw calls by using any HTTP tool such as curl.

  3. Print the returned predictions and plot them along with the input images. Red font and an inverse image, white on black, are used to highlight the misclassified samples.

Because the model accuracy is high, you might have to run the following code a few times before you can see a misclassified sample:

import json

# find 30 random samples from test set
n = 30
sample_indices = np.random.permutation(X_test.shape[0])[0:n]

test_samples = json.dumps({"data": X_test[sample_indices].tolist()})
test_samples = bytes(test_samples, encoding='utf8')

# predict using the deployed model
result = service.run(input_data=test_samples)

# compare actual value vs. the predicted values:
i = 0
plt.figure(figsize=(20, 1))

for s in sample_indices:
    plt.subplot(1, n, i + 1)
    plt.axhline('')
    plt.axvline('')

    # use different color for misclassified sample
    font_color = 'red' if y_test[s] != result[i] else 'black'
    clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys

    plt.text(x=10, y=-10, s=result[i], fontsize=18, color=font_color)
    plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)

    i = i + 1
plt.show()
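
If you only want to know which samples were misclassified, without plotting, the comparison can be reduced to a few lines. This is a minimal sketch with made-up data: the rows, labels, and predictions are hypothetical, and the predictions list stands in for the value returned by service.run in the code above.

```python
import json


def build_payload(rows):
    """Serialize rows as the JSON document the scoring script expects, as UTF-8 bytes."""
    return json.dumps({"data": rows}).encode("utf8")


def misclassified(labels, predictions):
    """Return the positions where the prediction differs from the true label."""
    return [
        i for i, (actual, predicted) in enumerate(zip(labels, predictions))
        if actual != predicted
    ]


# Hypothetical sample: four tiny "images" with three pixels each.
rows = [[0.0, 0.5, 1.0], [1.0, 0.0, 0.0], [0.2, 0.8, 0.1], [0.0, 0.1, 0.9]]
labels = [2, 0, 1, 2]

payload = build_payload(rows)
# Stand-in for the list that the deployed service would return for this payload.
predictions = [2, 1, 1, 2]

print("payload bytes:", len(payload))
print("misclassified positions:", misclassified(labels, predictions))
```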

This result is from one random sample of test images:

Graphic showing results

You can also send a raw HTTP request to test the web service:

import json
import requests

# send a random row from the test set to score
# (np.random.randint excludes the upper bound, so every row can be picked)
random_index = np.random.randint(0, len(X_test))
# json.dumps produces valid JSON regardless of how NumPy prints its scalars
input_data = json.dumps({"data": [X_test[random_index].tolist()]})

headers = {'Content-Type': 'application/json'}

# for an AKS deployment you'd need to send the service key in the header as well
# api_key = service.get_key()
# headers = {'Content-Type':'application/json',  'Authorization':('Bearer '+ api_key)}

resp = requests.post(service.scoring_uri, input_data, headers=headers)

print("POST to url", service.scoring_uri)
#print("input data:", input_data)
print("label:", y_test[random_index])
print("prediction:", resp.text)
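
The same request can be assembled with only the Python standard library, which is useful on a client where requests isn't installed. The sketch below builds the request without sending it. The URI is a hypothetical placeholder for the value of service.scoring_uri, and parse_predictions assumes the response body is the JSON list returned by run().

```python
import json
import urllib.request

# Hypothetical placeholder; in the tutorial this value comes from service.scoring_uri.
scoring_uri = "http://example.invalid/score"


def build_scoring_request(uri, rows):
    """Build a POST request whose body is {"data": rows} encoded as JSON."""
    body = json.dumps({"data": rows}).encode("utf8")
    return urllib.request.Request(
        uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_predictions(response_text):
    """Decode the response body, assumed to be a JSON list of predicted labels."""
    return json.loads(response_text)


request = build_scoring_request(scoring_uri, [[0.0, 0.5, 1.0]])
print(request.get_method(), request.full_url)

# Against a live endpoint you would send it with:
#   response_text = urllib.request.urlopen(request).read().decode("utf8")
print(parse_predictions("[7]"))
```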

Clean up resources

To keep the resource group and workspace for other tutorials and exploration, you can delete only the Container Instances deployment by using this API call:

service.delete()

Important

The resources you created can be used as prerequisites to other Azure Machine Learning service tutorials and how-to articles.

If you don't plan to use the resources you created, delete them, so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the far left.

    Delete in the Azure portal

  2. From the list, select the resource group you created.

  3. Select Delete resource group.

  4. Enter the resource group name. Then select Delete.

Next steps