Tutorial: Deploy an image classification model in Azure Container Instances

This tutorial is part two of a two-part tutorial series. In the previous tutorial, you trained machine learning models and then registered a model in your workspace on the cloud.

Now you're ready to deploy the model as a web service in Azure Container Instances. A web service is an image, in this case a Docker image. It encapsulates the scoring logic and the model itself.

In this part of the tutorial, you use Azure Machine Learning service for the following tasks:

  • Set up your testing environment.
  • Retrieve the model from your workspace.
  • Test the model locally.
  • Deploy the model to Container Instances.
  • Test the deployed model.

Container Instances is a good solution for testing and understanding the workflow. For scalable production deployments, consider using Azure Kubernetes Service. For more information, see how to deploy and where.

Note

Code in this article was tested with Azure Machine Learning SDK version 1.0.41.

Prerequisites

Skip to Set up the environment to read through the notebook steps.

To run the notebook, first complete the model training in Tutorial (part 1): Train an image classification model with Azure Machine Learning service. Then run the tutorials/img-classification-part2-deploy.ipynb notebook using the same notebook server.

Set up the environment

Start by setting up a testing environment.

Import packages

Import the Python packages needed for this tutorial:

%matplotlib inline
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
 
import azureml
from azureml.core import Workspace, Run

# display the core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Retrieve the model

You registered a model in your workspace in the previous tutorial. Now load this workspace and download the model to your local directory:

from azureml.core import Workspace
from azureml.core.model import Model
import os 
ws = Workspace.from_config()
model=Model(ws, 'sklearn_mnist')

model.download(target_dir=os.getcwd(), exist_ok=True)

# verify the downloaded model file
file_path = os.path.join(os.getcwd(), "sklearn_mnist_model.pkl")

os.stat(file_path)

Test the model locally

Before you deploy, make sure your model is working locally:

  • Load test data.
  • Predict test data.
  • Examine the confusion matrix.

Load test data

Load the test data from the ./data/ directory created during the training tutorial:

from utils import load_data
import os

data_folder = os.path.join(os.getcwd(), 'data')
# note: we also scale the intensity values (X) from 0-255 to 0-1, which helps the model's solver converge faster
X_test = load_data(os.path.join(data_folder, 'test-images.gz'), False) / 255.0
y_test = load_data(os.path.join(data_folder, 'test-labels.gz'), True).reshape(-1)
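The load_data helper is imported from the utils module and isn't shown in this article. As a rough sketch of what such a loader might do, assuming the test files are gzip-compressed IDX files (the format in which MNIST is distributed), the following reads the header and payload with the standard library only. The names read_idx and scale_pixels are hypothetical and aren't part of the tutorial's utils module:

```python
import gzip
import io
import struct


def read_idx(source):
    """Read a gzip-compressed IDX file (path or file object).

    Returns (dims, values) where values is a flat list of byte values.
    Hypothetical helper: only the unsigned-byte IDX variant is handled.
    """
    with gzip.open(source, "rb") as gz:
        # Header: two zero bytes, a data-type code, and the number of dimensions
        zero, dtype_code, ndim = struct.unpack(">HBB", gz.read(4))
        if zero != 0 or dtype_code != 0x08:
            raise ValueError("not an unsigned-byte IDX file")
        # One big-endian uint32 per dimension
        dims = struct.unpack(">" + "I" * ndim, gz.read(4 * ndim))
        count = 1
        for d in dims:
            count *= d
        payload = gz.read(count)
    if len(payload) != count:
        raise ValueError("truncated IDX payload")
    return dims, list(payload)


def scale_pixels(values):
    """Map 0-255 intensities to the 0-1 range, as done for X_test above."""
    return [v / 255.0 for v in values]


# Round-trip a tiny in-memory label file to show the layout
raw = struct.pack(">HBB", 0, 0x08, 1) + struct.pack(">I", 3) + bytes([7, 2, 1])
dims, labels = read_idx(io.BytesIO(gzip.compress(raw)))
print(dims, labels)
```

An image file has three dimensions (count, rows, columns) instead of one, so each image would be a slice of rows × columns values from the flat list.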

Predict test data

To get predictions, feed the test dataset to the model:

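try:
    import joblib  # standalone package; needed with scikit-learn 0.23 and later
except ImportError:
    from sklearn.externals import joblib  # bundled copy in older scikit-learn releases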

clf = joblib.load( os.path.join(os.getcwd(), 'sklearn_mnist_model.pkl'))
y_hat = clf.predict(X_test)

Examine the confusion matrix

Generate a confusion matrix to see how many samples from the test set are classified correctly. The off-diagonal cells count the incorrect predictions:

from sklearn.metrics import confusion_matrix

conf_mx = confusion_matrix(y_test, y_hat)
print(conf_mx)
print('Overall accuracy:', np.average(y_hat == y_test))

The output shows the confusion matrix:

[[ 960    0    1    2    1    5    6    3    1    1]
 [   0 1112    3    1    0    1    5    1   12    0]
 [   9    8  920   20   10    4   10   11   37    3]
 [   4    0   17  921    2   21    4   12   20    9]
 [   1    2    5    3  915    0   10    2    6   38]
 [  10    2    0   41   10  770   17    7   28    7]
 [   9    3    7    2    6   20  907    1    3    0]
 [   2    7   22    5    8    1    1  950    5   27]
 [  10   15    5   21   15   27    7   11  851   12]
 [   7    8    2   13   32   13    0   24   12  898]]
Overall accuracy: 0.9204
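The printed matrix can also be summarized numerically. The following sketch uses plain Python and the values copied from the output above to compute the overall accuracy, the recall for each digit, and the largest off-diagonal cell. The helper names are illustrative and aren't part of scikit-learn or the Azure Machine Learning SDK:

```python
# Confusion matrix copied from the output above
# (rows: true labels, columns: predicted labels)
conf_mx = [
    [960, 0, 1, 2, 1, 5, 6, 3, 1, 1],
    [0, 1112, 3, 1, 0, 1, 5, 1, 12, 0],
    [9, 8, 920, 20, 10, 4, 10, 11, 37, 3],
    [4, 0, 17, 921, 2, 21, 4, 12, 20, 9],
    [1, 2, 5, 3, 915, 0, 10, 2, 6, 38],
    [10, 2, 0, 41, 10, 770, 17, 7, 28, 7],
    [9, 3, 7, 2, 6, 20, 907, 1, 3, 0],
    [2, 7, 22, 5, 8, 1, 1, 950, 5, 27],
    [10, 15, 5, 21, 15, 27, 7, 11, 851, 12],
    [7, 8, 2, 13, 32, 13, 0, 24, 12, 898],
]


def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def worst_confusion(matrix):
    """Return (true_label, predicted_label, count) for the largest off-diagonal cell."""
    worst = (0, 0, -1)
    for i, row in enumerate(matrix):
        for j, count in enumerate(row):
            if i != j and count > worst[2]:
                worst = (i, j, count)
    return worst


total = sum(sum(row) for row in conf_mx)
correct = sum(conf_mx[i][i] for i in range(len(conf_mx)))
print("accuracy:", correct / total)
for digit, recall in enumerate(per_class_recall(conf_mx)):
    print("recall for digit", digit, "=", round(recall, 4))
print("largest confusion (true, predicted, count):", worst_confusion(conf_mx))
```

For this matrix, the largest off-diagonal cell is the 41 samples whose true label is 5 and whose predicted label is 3.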

Use matplotlib to display the confusion matrix as a graph. In this graph, the x-axis shows the predicted values, and the y-axis shows the actual values. The color in each cell shows the error rate. The lighter the color, the higher the error rate is. For example, many 5's are misclassified as 3's. So you see a bright cell at (5,3):

# convert each row to rates, then zero the diagonal so correct predictions don't overpower the error cells when visualized
row_sums = conf_mx.sum(axis=1, keepdims=True)
norm_conf_mx = conf_mx / row_sums
np.fill_diagonal(norm_conf_mx, 0)

fig = plt.figure(figsize=(8,5))
ax = fig.add_subplot(111)
cax = ax.matshow(norm_conf_mx, cmap=plt.cm.bone)
ticks = np.arange(0, 10, 1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xticklabels(ticks)
ax.set_yticklabels(ticks)
fig.colorbar(cax)
plt.ylabel('true labels', fontsize=14)
plt.xlabel('predicted values', fontsize=14)
plt.savefig('conf.png')
plt.show()

[Figure: chart showing the confusion matrix]

Deploy as a web service

After you've tested the model and are satisfied with the results, deploy the model as a web service hosted in Container Instances.

To build the correct environment for Container Instances, provide the following components:

  • A scoring script to show how to use the model.
  • An environment file to show what packages need to be installed.
  • A configuration file to build the container instance.
  • The model you trained previously.

Create scoring script

Create the scoring script, called score.py. The web service call uses this script to show how to use the model.

Include these two required functions in the scoring script:

  • The init() function, which typically loads the model into a global object. This function is run only once, when the Docker container is started.

  • The run(input_data) function, which uses the model to predict a value based on the input data. Inputs and outputs of run typically use JSON for serialization and deserialization, but other formats are supported.

%%writefile score.py
import json
import numpy as np
import os
import pickle
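try:
    import joblib  # standalone package; needed with scikit-learn 0.23 and later
except ImportError:
    from sklearn.externals import joblib  # bundled copy in older scikit-learn releases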
from sklearn.linear_model import LogisticRegression

from azureml.core.model import Model

def init():
    global model
    # retrieve the path to the model file using the model name
    model_path = Model.get_model_path('sklearn_mnist')
    model = joblib.load(model_path)

def run(raw_data):
    data = np.array(json.loads(raw_data)['data'])
    # make prediction
    y_hat = model.predict(data)
    # you can return any data type as long as it is JSON-serializable
    return y_hat.tolist()
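Before you build an image, it can help to exercise the init() and run() contract locally. The following is a minimal sketch of that idea, not the tutorial's scoring script: it replaces the registered model with a hypothetical StubModel so that it runs without a workspace, and it keeps the same JSON-in, list-out shape that score.py uses:

```python
import json


class StubModel:
    """Hypothetical stand-in for the trained classifier.

    Predicts the index of the largest value in each input row.
    """

    def predict(self, rows):
        return [max(range(len(row)), key=row.__getitem__) for row in rows]


model = None


def init():
    # score.py loads the registered model here; this sketch uses the stub instead
    global model
    model = StubModel()


def run(raw_data):
    # Same contract as score.py: a JSON document with a "data" array of rows
    data = json.loads(raw_data)["data"]
    y_hat = model.predict(data)
    # The return value must be JSON-serializable
    return list(y_hat)


init()
payload = json.dumps({"data": [[0.0, 0.9, 0.1], [0.7, 0.2, 0.1]]})
predictions = run(payload)
print(predictions)
```

With the real model, each row would hold 784 pixel values rather than three, but the request and response shapes stay the same.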

Create environment file

Next, create an environment file, called myenv.yml, that specifies all of the script's package dependencies. This file is used to make sure that all of those dependencies are installed in the Docker image. This model needs scikit-learn and azureml-sdk:

from azureml.core.conda_dependencies import CondaDependencies 

myenv = CondaDependencies()
myenv.add_conda_package("scikit-learn")

with open("myenv.yml","w") as f:
    f.write(myenv.serialize_to_string())

Review the content of the myenv.yml file:

with open("myenv.yml","r") as f:
    print(f.read())

Create a configuration file

Create a deployment configuration file. Specify the number of CPUs and gigabytes of RAM needed for your Container Instances container. Although it depends on your model, the default of one core and 1 gigabyte of RAM is sufficient for many models. If you need more later, you have to re-create the image and redeploy the service.

from azureml.core.webservice import AciWebservice

aciconfig = AciWebservice.deploy_configuration(cpu_cores=1, 
                                               memory_gb=1, 
                                               tags={"data": "MNIST",  "method" : "sklearn"}, 
                                               description='Predict MNIST with sklearn')

Deploy in Container Instances

The estimated time to finish deployment is about seven to eight minutes.

Configure the image and deploy. The following code goes through these steps:

  1. Build an image by using these files:
    • The scoring file, score.py.
    • The environment file, myenv.yml.
    • The model file.
  2. Register the image under the workspace.
  3. Send the image to the Container Instances container.
  4. Start up a container in Container Instances by using the image.
  5. Get the web service HTTP endpoint.

%%time
from azureml.core.webservice import Webservice
from azureml.core.image import ContainerImage

# configure the image
image_config = ContainerImage.image_configuration(execution_script="score.py", 
                                                  runtime="python", 
                                                  conda_file="myenv.yml")

service = Webservice.deploy_from_model(workspace=ws,
                                       name='sklearn-mnist-svc',
                                       deployment_config=aciconfig,
                                       models=[model],
                                       image_config=image_config)

service.wait_for_deployment(show_output=True)

Get the scoring web service's HTTP endpoint, which accepts REST client calls. You can share this endpoint with anyone who wants to test the web service or integrate it into an application:

print(service.scoring_uri)

Test the deployed service

Earlier, you scored all the test data with the local version of the model. Now you can test the deployed model with a random sample of 30 images from the test data.

The following code goes through these steps:

  1. Send the data as a JSON array to the web service hosted in Container Instances.

  2. Use the SDK's run API to invoke the service. You can also make raw calls by using any HTTP tool such as curl.

  3. Print the returned predictions and plot them along with the input images. Red font and an inverse image, white on black, are used to highlight the misclassified samples.

Because the model accuracy is high, you might have to run the following code a few times before you can see a misclassified sample:

import json

# find 30 random samples from test set
n = 30
sample_indices = np.random.permutation(X_test.shape[0])[0:n]

test_samples = json.dumps({"data": X_test[sample_indices].tolist()})
test_samples = bytes(test_samples, encoding='utf8')

# predict using the deployed model
result = service.run(input_data=test_samples)

# compare actual value vs. the predicted values:
i = 0
plt.figure(figsize = (20, 1))

for s in sample_indices:
    plt.subplot(1, n, i + 1)
    plt.axhline('')
    plt.axvline('')
    
    # use different color for misclassified sample
    font_color = 'red' if y_test[s] != result[i] else 'black'
    clr_map = plt.cm.gray if y_test[s] != result[i] else plt.cm.Greys
    
    plt.text(x=10, y =-10, s=result[i], fontsize=18, color=font_color)
    plt.imshow(X_test[s].reshape(28, 28), cmap=clr_map)
    
    i = i + 1
plt.show()

This result is from one random sample of test images:

[Figure: plot showing the results]
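The plot marks misclassified samples visually. To get the same information as numbers, a small comparison helper is enough. The following sketch assumes two equal-length sequences of labels. The name summarize and the sample labels are hypothetical. In the notebook, the inputs would be y_test[sample_indices].tolist() and result:

```python
def summarize(actual, predicted):
    """Compare labels and return (accuracy, mismatches).

    Each mismatch is a tuple (position, actual_label, predicted_label).
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one sample is required")
    mismatches = [
        (i, a, p) for i, (a, p) in enumerate(zip(actual, predicted)) if a != p
    ]
    accuracy = 1 - len(mismatches) / len(actual)
    return accuracy, mismatches


# Hypothetical labels for illustration only, not output from the deployed service
accuracy, mismatches = summarize([7, 2, 1, 0], [7, 2, 1, 6])
print("sample accuracy:", accuracy)
for position, actual_label, predicted_label in mismatches:
    print("sample", position, "actual", actual_label, "predicted", predicted_label)
```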

You can also send a raw HTTP request to test the web service:

import json
import requests

# send a random row from the test set to score
random_index = np.random.randint(0, len(X_test))  # upper bound is exclusive
input_data = json.dumps({"data": [X_test[random_index].tolist()]})

headers = {'Content-Type':'application/json'}

# for AKS deployment you'd also need to add the service key to the header
# api_key = service.get_key()
# headers = {'Content-Type':'application/json',  'Authorization':('Bearer '+ api_key)} 

resp = requests.post(service.scoring_uri, input_data, headers=headers)

print("POST to url", service.scoring_uri)
#print("input data:", input_data)
print("label:", y_test[random_index])
print("prediction:", resp.text)
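The request body and headers can be assembled separately from the network call, which makes them easy to inspect or reuse with another HTTP tool. The following sketch shows one way to do that. The function names are hypothetical, and the optional bearer key mirrors the commented-out AKS lines above rather than anything that Container Instances requires:

```python
import json


def build_scoring_request(rows, api_key=None):
    """Return (body, headers) for a POST to the scoring endpoint."""
    # The scoring script expects a JSON document with a "data" array of rows
    body = json.dumps({"data": [[float(v) for v in row] for row in rows]})
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when the service has key authentication turned on
        headers["Authorization"] = "Bearer " + api_key
    return body, headers


def parse_predictions(response_text):
    """Decode the response body into a Python list of predictions."""
    parsed = json.loads(response_text)
    # Assumption: if a scoring script returns an already-serialized string,
    # the body is JSON inside a JSON string and needs a second decode
    if isinstance(parsed, str):
        parsed = json.loads(parsed)
    return parsed


body, headers = build_scoring_request([[0, 1]], api_key="example-key")
print(body)
print(headers)
print(parse_predictions("[7]"))
```

The returned body and headers can be passed to requests.post(service.scoring_uri, body, headers=headers), and resp.text can be passed to parse_predictions.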

Clean up resources

To keep the resource group and workspace for other tutorials and exploration, you can delete only the Container Instances deployment by using this API call:

service.delete()

Important

The resources you created can be used as prerequisites to other Azure Machine Learning service tutorials and how-to articles.

If you don't plan to use the resources you created, delete them, so you don't incur any charges:

  1. In the Azure portal, select Resource groups on the far left.

    [Figure: Delete in the Azure portal]

  2. From the list, select the resource group you created.

  3. Select Delete resource group.

  4. Enter the resource group name. Then select Delete.

Next steps