How to operationalize a training pipeline with batch endpoints

04/10/2024

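Applies to: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

This article explains how to operationalize a training pipeline under a batch endpoint. The pipeline uses multiple components (steps), including model training, data preprocessing, and model evaluation.

You'll learn to:

- Create and test a training pipeline
- Deploy the pipeline to a batch endpoint
- Modify the pipeline and create a new deployment in the same endpoint
- Test the new deployment and set it as the default deployment

About this example

This example deploys a training pipeline that takes labeled input training data and produces a predictive model, the evaluation results, and the transformations applied during preprocessing. The pipeline trains an XGBoost model on tabular data from the UCI Heart Disease Data Set. A data preprocessing component prepares the data before it is sent to the training component, which fits and evaluates the model.

[Figure: visualization of the pipeline]

The example in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy and paste YAML and other files, first clone the repository, and then change directories to the folder: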

Azure CLI
Python

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/sdk/python

The files for this example are in:

cd endpoints/batch/deploy-pipelines/training-with-components

Follow along in a Jupyter notebook

You can follow along with the Python SDK version of this example by opening the sdk-deploy-and-test.ipynb notebook in the cloned repository.

Prerequisites

Before following the steps in this article, make sure you have the following prerequisites:

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.
An Azure Machine Learning workspace. If you don't have one, use the steps in the Manage Azure Machine Learning workspaces article to create one.
Make sure you have the following permissions in the workspace:
- Create or manage batch endpoints and deployments: Use an Owner or Contributor role, or a custom role allowing Microsoft.MachineLearningServices/workspaces/batchEndpoints/*.
- Create ARM deployments in the workspace resource group: Use an Owner or Contributor role, or a custom role allowing Microsoft.Resources/deployments/write in the resource group where the workspace is deployed.
To work with Azure Machine Learning, you need to install the following software:
- Azure CLI
- Python
The Azure CLI and the ml extension for Azure Machine Learning.
```
az extension add -n ml
```
Note

Pipeline component deployments for batch endpoints were introduced in version 2.7 of the ml extension for the Azure CLI. Use az extension update --name ml to get the latest version.
The Azure Machine Learning SDK for Python.
```
pip install azure-ai-ml
```
Note

The classes ModelBatchDeployment and PipelineComponentBatchDeployment were introduced in version 1.7.0 of the SDK. Use pip install -U azure-ai-ml to get the latest version.
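
The minimum versions named in these notes can also be checked programmatically. The following is a minimal sketch rather than part of the original sample: the helper names parse_version and meets_minimum are hypothetical, and the parsing is deliberately simplified (it reads only the leading digits of each dot-separated segment and stops at the first segment without digits).

```python
import re


def parse_version(text):
    """Turn a string such as '1.12.0' into a tuple of ints, e.g. (1, 12, 0)."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first segment that does not start with digits
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum version."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.7' and '1.7.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


print(meets_minimum("1.12.0", "1.7.0"))
print(meets_minimum("1.6.2", "1.7.0"))
```

Comparing integer tuples avoids the pitfall of comparing version strings directly, where "1.12.0" would sort before "1.7.0". In a real environment, the installed version string can be read with importlib.metadata.version("azure-ai-ml") from the standard library and passed as the first argument.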

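Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning. It provides a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, you connect to the workspace in which you perform the deployment tasks.

Azure CLI
Python

Pass in the values for your subscription ID, workspace, location, and resource group in the following code: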

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Import the required libraries:

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.core.exceptions import ResourceExistsError
from azure.identity import DefaultAzureCredential

Configure the workspace details and get a handle to the workspace.

Pass in the values for your subscription ID, workspace, and resource group in the following code:
```
subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
```

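Create the training pipeline component

In this section, you create all the assets required for the training pipeline. First, you create an environment that includes the libraries needed to train the model. Next, you create the compute cluster on which the batch deployment runs. Finally, you register the input data as a data asset.

Create the environment

The components in this example use an environment with the XGBoost and scikit-learn libraries. The environment/conda.yml file contains the environment's configuration:

environment/conda.yml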

channels:
- conda-forge
dependencies:
- python=3.8.5
- pip
- pip:
  - mlflow
  - azureml-mlflow
  - datasets
  - jobtools
  - cloudpickle==1.6.0
  - dask==2023.2.0
  - scikit-learn==1.1.2
  - xgboost==1.3.3
  - pandas==1.4
name: mlflow-env

Create the environment as follows:

Define the environment:

Azure CLI
Python

environment/xgboost-sklearn-py38.yml

$schema: https://azuremlschemas.azureedge.net/latest/environment.schema.json
name: xgboost-sklearn-py38
image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
conda_file: conda.yml
description: An environment for models built with XGBoost and Scikit-learn.

environment = Environment(
    name="xgboost-sklearn-py38",
    description="An environment for models built with XGBoost and Scikit-learn.",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
    conda_file="environment/conda.yml",
)

Create the environment:

Azure CLI
Python

az ml environment create -f environment/xgboost-sklearn-py38.yml

try:
    ml_client.environments.create_or_update(environment)
except ResourceExistsError:
    pass

Create a compute cluster

Batch endpoints and deployments run on compute clusters. They can run on any Azure Machine Learning compute cluster that already exists in the workspace, so multiple batch deployments can share the same compute infrastructure. In this example, you work on an Azure Machine Learning compute cluster called batch-cluster. Verify that the compute exists in the workspace, and create it if it doesn't.

Azure CLI
Python

az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5

compute_name = "batch-cluster"
if not any(filter(lambda m: m.name == compute_name, ml_client.compute.list())):
    compute_cluster = AmlCompute(
        name=compute_name,
        description="Batch endpoints compute cluster",
        min_instances=0,
        max_instances=5,
    )
    ml_client.begin_create_or_update(compute_cluster).result()

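Register the training data as a data asset

The training data is represented in CSV files. To mimic a more production-level workload, register the training data in the heart.csv file as a data asset in the workspace. This data asset is later indicated as an input to the endpoint.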

Azure CLI
Python

az ml data create --name heart-classifier-train --type uri_folder --path data/train

data_path = "data/train"
dataset_name = "heart-dataset-train"

heart_dataset_train = Data(
    path=data_path,
    type=AssetTypes.URI_FOLDER,
    description="A training dataset for heart classification",
    name=dataset_name,
)

Create the data asset:

ml_client.data.create_or_update(heart_dataset_train)

Get a reference to the new data asset:

heart_dataset_train = ml_client.data.get(name=dataset_name, label="latest")

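Create the pipeline

The pipeline you want to operationalize takes one input, the training data, and produces three outputs: the trained model, the evaluation results, and the data transformations applied as preprocessing. The pipeline consists of two components:

preprocess_job: This step reads the input data and returns the prepared data and the applied transformations. The step receives three inputs:
- data: A folder containing the input data to transform and score.
- transformations: (Optional) The path to the transformations to apply, if available. If the path isn't provided, the transformations are learned from the input data. Because the transformations input is optional, the preprocess_job component can be used during both training and scoring.
- categorical_encoding: The encoding strategy for the categorical features (ordinal or onehot).
train_job: This step trains an XGBoost model on the prepared data and returns the evaluation results and the trained model. The step receives three inputs:
- data: The preprocessed data.
- target_column: The column to predict.
- eval_size: The proportion of the input data used for evaluation.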
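
To make the eval_size input concrete, here is a minimal sketch of holding out a fraction of the rows for evaluation. It is an illustration under stated assumptions, not the implementation in the train_xgb component: the function name split_for_evaluation, the seed parameter, and the shuffle-then-slice strategy are all hypothetical.

```python
import random


def split_for_evaluation(rows, eval_size=0.3, seed=42):
    """Shuffle the rows and hold out a fraction of them for evaluation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_eval = round(len(shuffled) * eval_size)
    # The first n_eval shuffled rows are held out; the rest are used for training.
    return shuffled[n_eval:], shuffled[:n_eval]


train_rows, eval_rows = split_for_evaluation(range(10), eval_size=0.3)
print(len(train_rows), len(eval_rows))
```

With eval_size set to 0.3, as in this article's pipeline, roughly 30% of the input rows are kept aside to compute the evaluation results and the remaining rows are used to fit the model.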

Azure CLI
Python

The pipeline configuration is defined in the deployment-ordinal/pipeline.yml file:

deployment-ordinal/pipeline.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json
type: pipeline

name: uci-heart-train-pipeline
display_name: uci-heart-train
description: This pipeline demonstrates how to train a machine learning classifier over the UCI heart dataset.

inputs:
  input_data:
    type: uri_folder

outputs: 
  model:
    type: mlflow_model
    mode: upload
  evaluation_results:
    type: uri_folder
    mode: upload
  prepare_transformations:
    type: uri_folder
    mode: upload

jobs:
  preprocess_job:
    type: command
    component: ../components/prepare/prepare.yml
    inputs:
      data: ${{parent.inputs.input_data}}
      categorical_encoding: ordinal
    outputs:
      prepared_data:
      transformations_output: ${{parent.outputs.prepare_transformations}}
  
  train_job:
    type: command
    component: ../components/train_xgb/train_xgb.yml
    inputs:
      data: ${{parent.jobs.preprocess_job.outputs.prepared_data}}
      target_column: target
      register_best_model: false
      eval_size: 0.3
    outputs:
      model: 
        mode: upload
        type: mlflow_model
        path: ${{parent.outputs.model}}
      evaluation_results:
        mode: upload
        type: uri_folder
        path: ${{parent.outputs.evaluation_results}}

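Note

In the pipeline.yml file, the transformations input is missing from preprocess_job. Therefore, the script learns the transformation parameters from the input data.

The configurations for the pipeline components are in the prepare.yml and train_xgb.yml files. Load the components: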

prepare_data = load_component(source="components/prepare/prepare.yml")
train_xgb = load_component(source="components/train_xgb/train_xgb.yml")

Construct the pipeline:

@pipeline()
def uci_heart_classifier_trainer(input_data: Input(type=AssetTypes.URI_FOLDER)):
    prepared_data = prepare_data(data=input_data)
    trained_model = train_xgb(
        data=prepared_data.outputs.prepared_data,
        target_column="target",
        register_best_model=False,
        eval_size=0.3,
    )

    return {
        "model": trained_model.outputs.model,
        "evaluation_results": trained_model.outputs.evaluation_results,
        "transformations_output": prepared_data.outputs.transformations_output,
    }

Note

In the pipeline, the transformations input is missing. Therefore, the script learns the parameters from the input data.
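
The "learn when missing, reuse when provided" behavior of the optional transformations input can be sketched as follows. This is a simplified illustration, not the code in the prepare component: the function name prepare, the dictionary format of the transformations, and the sample category values are assumptions made for the example.

```python
def prepare(values, transformations=None):
    """Encode categories as integers, learning the mapping only if none is given."""
    if transformations is None:
        # Training: learn the category-to-code mapping from the input data.
        categories = sorted(set(values))
        transformations = {name: index + 1 for index, name in enumerate(categories)}
    # Training and scoring: apply the (learned or supplied) mapping.
    encoded = [transformations[value] for value in values]
    return encoded, transformations


# Training run: no transformations are supplied, so they are learned.
train_encoded, learned = prepare(["typical", "asymptomatic", "typical"])

# Scoring run: the previously learned transformations are reused.
score_encoded, _ = prepare(["asymptomatic"], transformations=learned)

print(train_encoded, learned, score_encoded)
```

Returning the learned mapping alongside the encoded data mirrors why the pipeline publishes the applied transformations as an output: a later scoring run can apply exactly the same encoding. This sketch raises KeyError for a category that was not seen during training; a real component would need a policy for that case.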

[Figure: visualization of the pipeline]

Test the pipeline

Test the pipeline with some sample data. To do that, create a job using the pipeline and the batch-cluster compute cluster created previously.

Azure CLI
Python

The following pipeline-job.yml file contains the configuration for the pipeline job:

deployment-ordinal/pipeline-job.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

experiment_name: uci-heart-train-pipeline
display_name: uci-heart-train-job
description: This pipeline demonstrates how to train a machine learning classifier over the UCI heart dataset.

compute: batch-cluster
component: pipeline.yml
inputs:
  input_data:
    type: uri_folder
outputs: 
  model:
    type: mlflow_model
    mode: upload
  evaluation_results:
    type: uri_folder
    mode: upload
  prepare_transformations:
    mode: upload

pipeline_job = uci_heart_classifier_trainer(
    Input(type="uri_folder", path=heart_dataset_train.id)
)

Next, configure some run settings to run the test:

pipeline_job.settings.default_datastore = "workspaceblobstore"
pipeline_job.settings.default_compute = "batch-cluster"

Create the test job:

Azure CLI
Python

az ml job create -f deployment-ordinal/pipeline-job.yml --set inputs.input_data.path=azureml:heart-classifier-train@latest

pipeline_job_run = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="uci-heart-train-pipeline"
)
pipeline_job_run

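Create a batch endpoint

Provide a name for the endpoint. A batch endpoint's name must be unique in each region because the name is used to construct the invocation URI. To ensure uniqueness, append some trailing characters to the name specified in the following code.
- Azure CLI
- Python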
```
ENDPOINT_NAME="uci-classifier-train"
```
```
endpoint_name = "uci-classifier-train"
```
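
One way to append trailing characters is to generate a short random suffix. The following is a minimal sketch; the helper name with_random_suffix, the suffix length, and the lowercase-letters-and-digits alphabet are assumptions chosen for illustration, so check them against the naming rules for batch endpoints before relying on them.

```python
import random
import string


def with_random_suffix(base_name, length=5):
    """Append a random lowercase alphanumeric suffix to reduce name collisions."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(length))
    return f"{base_name}-{suffix}"


endpoint_name = with_random_suffix("uci-classifier-train")
print(endpoint_name)
```

A random suffix makes a collision unlikely but doesn't guarantee uniqueness, so endpoint creation can still fail if the generated name is already taken in the region.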

Configure the endpoint:

Azure CLI
Python

The endpoint.yml file contains the endpoint's configuration.

endpoint.yml

$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: uci-classifier-train
description: An endpoint to perform training of the Heart Disease Data Set prediction task.
auth_mode: aad_token

endpoint = BatchEndpoint(
    name=endpoint_name,
    description="An endpoint to perform training of the Heart Disease Data Set prediction task",
)

Create the endpoint:

Azure CLI
Python

az ml batch-endpoint create --name $ENDPOINT_NAME -f endpoint.yml

ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

Query the endpoint URI:

Azure CLI
Python

az ml batch-endpoint show --name $ENDPOINT_NAME

endpoint = ml_client.batch_endpoints.get(name=endpoint_name)
print(endpoint)

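Deploy the pipeline component

To deploy the pipeline component, you have to create a batch deployment. A deployment is a set of resources required for hosting the asset that does the actual work.

Configure the deployment:

Azure CLI
Python

The deployment-ordinal/deployment.yml file contains the deployment's configuration. For extra properties, check the full batch endpoint YAML schema.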

deployment-ordinal/deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: uci-classifier-train-xgb
description: A sample deployment that trains an XGBoost model for the UCI dataset.
endpoint_name: uci-classifier-train
type: pipeline
component: pipeline.yml
settings:
    continue_on_step_failure: false
    default_compute: batch-cluster

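The pipeline is defined in a function. To convert it to a component, use the component property of the object that the function returns. Pipeline components are reusable compute graphs that can be included in batch deployments or used to compose more complex pipelines.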

pipeline_component = ml_client.components.create_or_update(
    uci_heart_classifier_trainer().component
)

Now, define the deployment:

deployment = PipelineComponentBatchDeployment(
    name="uci-classifier-train-xgb",
    description="A sample deployment that trains an XGBoost model for the UCI dataset.",
    endpoint_name=endpoint.name,
    component=pipeline_component,
    settings={"continue_on_step_failure": False, "default_compute": compute_name},
)

Create the deployment:
- Azure CLI
- Python
Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.
```
az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment-ordinal/deployment.yml --set-default
```
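Tip

Notice the use of the --set-default flag to indicate that this new deployment is now the default.
This command starts the deployment creation and returns a confirmation response while the deployment creation continues.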
```
ml_client.batch_deployments.begin_create_or_update(deployment).result()
```
Once the deployment is created, configure it as the default:
```
endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
```
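Your deployment is ready for use.

Test the deployment

Once the deployment is created, it's ready to receive jobs. Follow these steps to test it:

The deployment requires that you indicate one data input.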
- Azure CLI
- Python
The inputs.yml file contains the definition for the input data asset:

inputs.yml
```
inputs:
  input_data:
    type: uri_folder
    path: azureml:heart-classifier-train@latest
```
Define the input data asset:
```
input_data = Input(type=AssetTypes.URI_FOLDER, path=heart_dataset_train.id)
```
Tip

To learn more about how to indicate inputs, see Create jobs and input data for batch endpoints.
You can invoke the default deployment as follows:
- Azure CLI
- Python
```
JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --file inputs.yml --query name -o tsv)
```
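Tip

What's the difference between the inputs and input parameters when you invoke an endpoint?

In general, you can pass a dictionary, inputs = {}, to the invoke method to provide any number of required inputs to a batch endpoint that contains a model deployment or a pipeline deployment.

A model deployment always takes exactly one data input, so for a model deployment you can use the input parameter as a shorter way to specify the location of the deployment's input data.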
```
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name, inputs={"input_data": input_data}
)
```
You can monitor the job's progress and stream its logs using:
- Azure CLI
- Python
```
az ml job stream -n $JOB_NAME
```
```
ml_client.jobs.get(job.name)
```
To wait for the job to finish, run the following code:
```
ml_client.jobs.stream(name=job.name)
```

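Notably, only the pipeline's inputs are published as inputs of the batch endpoint. For instance, categorical_encoding is an input of a step of the pipeline, but not an input of the pipeline itself. Use this fact to control which inputs you expose to your clients and which ones you hide.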
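
The distinction between pipeline-level inputs and step-level inputs can be sketched as a client-side check. The helper below is hypothetical and isn't part of the Azure Machine Learning SDK: check_invocation_inputs and the PIPELINE_INPUTS constant are names made up for this illustration, with input_data taken from the pipeline definition in this article.

```python
# Inputs declared at the pipeline level; only these are exposed by the endpoint.
PIPELINE_INPUTS = {"input_data"}


def check_invocation_inputs(inputs, exposed=PIPELINE_INPUTS):
    """Return the (unknown, missing) input names for a planned invocation."""
    unknown = sorted(set(inputs) - set(exposed))  # not exposed by the endpoint
    missing = sorted(set(exposed) - set(inputs))  # exposed but not supplied
    return unknown, missing


# categorical_encoding belongs to a step, not to the pipeline, so it is flagged.
unknown, missing = check_invocation_inputs(
    {"input_data": "azureml:heart-classifier-train@latest", "categorical_encoding": "onehot"}
)
print(unknown, missing)
```

A check like this is optional. Its only purpose here is to show that a client can set input_data but has no way to set categorical_encoding, which stays under the control of the deployment.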

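Access job outputs

Once the job is completed, you can access some of its outputs. This pipeline produces the following outputs for its components:

preprocess_job: The output is transformations_output.
train_job: The outputs are model and evaluation_results.

You can download the associated results using: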

Azure CLI
Python

az ml job download --name $JOB_NAME --output-name prepare_transformations
az ml job download --name $JOB_NAME --output-name model
az ml job download --name $JOB_NAME --output-name evaluation_results

ml_client.jobs.download(
    name=job.name, download_path=".", output_name="transformations_output"
)
ml_client.jobs.download(name=job.name, download_path=".", output_name="model")
ml_client.jobs.download(
    name=job.name, download_path=".", output_name="evaluation_results"
)

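Create a new deployment in the endpoint

Endpoints can host multiple deployments at once, but only one deployment can be the default. Therefore, you can iterate over different models, deploy them to the endpoint and test them, and finally switch the default deployment to the model deployment that works best.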
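
The relationship between an endpoint, its deployments, and the default deployment can be pictured with a small in-memory model. This is a conceptual sketch only and doesn't call Azure: the EndpointModel class and its methods are hypothetical, and the rule that a deployment becomes the default only when explicitly requested is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EndpointModel:
    """Toy model of an endpoint that hosts several deployments with one default."""

    deployments: set = field(default_factory=set)
    default: Optional[str] = None

    def add(self, name, set_default=False):
        self.deployments.add(name)
        if set_default:
            self.default = name

    def resolve(self, deployment_name=None):
        """Pick the deployment that would serve an invocation."""
        target = deployment_name or self.default
        if target not in self.deployments:
            raise KeyError(f"No such deployment: {target!r}")
        return target


endpoint_model = EndpointModel()
endpoint_model.add("uci-classifier-train-xgb", set_default=True)
endpoint_model.add("uci-classifier-train-onehot")

print(endpoint_model.resolve())
print(endpoint_model.resolve("uci-classifier-train-onehot"))
```

Invocations that don't name a deployment go to the default, while a new deployment can be tested by naming it explicitly. Switching the default afterward changes where unnamed invocations go without changing the endpoint's name or URI.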

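Change the way preprocessing is done in the pipeline to see whether you get a model that performs better.

Change a parameter in the pipeline's preprocessing component

The preprocessing component has an input called categorical_encoding, which can have the value ordinal or onehot. These values correspond to two different ways of encoding categorical features.

ordinal: Encodes the feature values with numeric values (ordinals) from [1:n], where n is the number of categories in the feature. Ordinal encoding implies that there's a natural rank order among the feature categories.
onehot: Encodes each category as its own binary indicator column. This encoding doesn't imply any natural rank order, but it introduces a dimensionality problem when the number of categories is large.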
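
The difference between the two strategies can be seen on a tiny example. The following sketch uses plain Python rather than the encoders in the prepare component, and the feature values are made-up sample data; assigning ordinal codes in alphabetical order is also an assumption of the sketch.

```python
def ordinal_encode(values):
    """Map each category to an integer in 1..n, using alphabetical order."""
    categories = sorted(set(values))
    codes = {name: index + 1 for index, name in enumerate(categories)}
    return [codes[value] for value in values]


def onehot_encode(values):
    """Map each category to a binary vector with one column per category."""
    categories = sorted(set(values))
    return [[1 if value == name else 0 for name in categories] for value in values]


chest_pain = ["typical", "asymptomatic", "nonanginal", "typical"]

print(ordinal_encode(chest_pain))  # [3, 1, 2, 3]
print(onehot_encode(chest_pain))   # [[0, 0, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
```

The ordinal result is a single column whose values suggest an order (1 < 2 < 3) that may not exist in the data. The one-hot result avoids that implication but needs one column per category, which is where the dimensionality cost comes from.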

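The earlier deployment used ordinal by default. Now change the categorical encoding to use onehot and see how the model performs.

Tip

Alternatively, the categorical_encoding input could have been exposed to clients as an input to the pipeline job itself. However, this example changes the parameter value in the preprocessing step instead, so that the parameter stays hidden and controlled inside the deployment and the endpoint's ability to host multiple deployments can be put to use.

Modify the pipeline. It looks as follows: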

Azure CLI
Python

The pipeline configuration is defined in the deployment-onehot/pipeline.yml file:

deployment-onehot/pipeline.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponent.schema.json
type: pipeline

name: uci-heart-train-pipeline
display_name: uci-heart-train
description: This pipeline demonstrates how to train a machine learning classifier over the UCI heart dataset.

inputs:
  input_data:
    type: uri_folder

outputs: 
  model:
    type: mlflow_model
    mode: upload
  evaluation_results:
    type: uri_folder
    mode: upload
  prepare_transformations:
    type: uri_folder
    mode: upload

jobs:
  preprocess_job:
    type: command
    component: ../components/prepare/prepare.yml
    inputs:
      data: ${{parent.inputs.input_data}}
      categorical_encoding: onehot
    outputs:
      prepared_data:
      transformations_output: ${{parent.outputs.prepare_transformations}}
  
  train_job:
    type: command
    component: ../components/train_xgb/train_xgb.yml
    inputs:
      data: ${{parent.jobs.preprocess_job.outputs.prepared_data}}
      target_column: target
      eval_size: 0.3
    outputs:
      model: 
        type: mlflow_model
        path: ${{parent.outputs.model}}
      evaluation_results:
        type: uri_folder
        path: ${{parent.outputs.evaluation_results}}

@pipeline()
def uci_heart_classifier_onehot(input_data: Input(type=AssetTypes.URI_FOLDER)):
    prepared_data = prepare_data(data=input_data, categorical_encoding="onehot")
    trained_model = train_xgb(
        data=prepared_data.outputs.prepared_data,
        target_column="target",
        register_best_model=False,
        eval_size=0.3,
    )

    return {
        "model": trained_model.outputs.model,
        "evaluation_results": trained_model.outputs.evaluation_results,
        "transformations_output": prepared_data.outputs.transformations_output,
    }

Configure the deployment:

Azure CLI
Python

The deployment-onehot/deployment.yml file contains the deployment's configuration. For extra properties, check the full batch endpoint YAML schema.

deployment-onehot/deployment.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: uci-classifier-train-onehot
description: A sample deployment that trains an XGBoost model for the UCI dataset using onehot encoding for variables.
endpoint_name: uci-classifier-train
type: pipeline
component: pipeline.yml
settings:
    continue_on_step_failure: false
    default_compute: batch-cluster

The pipeline is defined in a function. To convert it to a component, use the build() method. Pipeline components are reusable compute graphs that can be included in batch deployments or used to compose more complex pipelines.

pipeline_component = uci_heart_classifier_onehot._pipeline_builder.build()

Now, define the deployment:

deployment_onehot = PipelineComponentBatchDeployment(
    name="uci-classifier-train-onehot",
    description="A sample deployment that trains an XGBoost model for the UCI dataset with one hot encoding of categorical variables.",
    endpoint_name=endpoint.name,
    component=pipeline_component,
    settings={"continue_on_step_failure": False, "default_compute": compute_name},
)

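Create the deployment:
- Azure CLI
- Python
Run the following code to create a batch deployment under the batch endpoint. The command doesn't include the --set-default flag, so the existing default deployment stays the default.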
```
az ml batch-deployment create --endpoint $ENDPOINT_NAME -f deployment-onehot/deployment.yml
```
Your deployment is ready for use.
This command starts the deployment creation and returns a confirmation response while the deployment creation continues.
```
ml_client.batch_deployments.begin_create_or_update(deployment_onehot).result()
```
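Your deployment is ready for use.

Test a nondefault deployment

Once the deployment is created, it's ready to receive jobs. You can test it in the same way as before, but this time invoke a specific deployment.

Invoke the deployment as follows, specifying the deployment parameter to trigger the specific deployment uci-classifier-train-onehot: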
- Azure CLI
- Python
```
DEPLOYMENT_NAME="uci-classifier-train-onehot"
JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME -d $DEPLOYMENT_NAME --file inputs.yml --query name -o tsv)
```
```
job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
    deployment_name=deployment_onehot.name,
    inputs={"input_data": input_data},
)
```
You can monitor the job's progress and stream its logs using:
- Azure CLI
- Python
```
az ml job stream -n $JOB_NAME
```
```
ml_client.jobs.get(name=job.name)
```
To wait for the job to finish, run the following code:
```
ml_client.jobs.stream(name=job.name)
```

Configure the new deployment as the default

Once you're satisfied with the performance of the new deployment, you can set it as the default:

Azure CLI
Python

az ml batch-endpoint update --name $ENDPOINT_NAME --set defaults.deployment_name=$DEPLOYMENT_NAME

endpoint = ml_client.batch_endpoints.get(endpoint_name)
endpoint.defaults.deployment_name = deployment_onehot.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

Delete the old deployment

Once you're done, you can delete the previous deployment if you no longer need it:

Azure CLI
Python

az ml batch-deployment delete --name uci-classifier-train-xgb --endpoint-name $ENDPOINT_NAME --yes

ml_client.batch_deployments.begin_delete(
    name=deployment.name, endpoint_name=endpoint.name
).result()

Clean up resources

Once you're done, delete the associated resources from the workspace:

Azure CLI
Python

Run the following code to delete the batch endpoint and its underlying deployments. The --yes flag confirms the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

Delete the endpoint:

ml_client.batch_endpoints.begin_delete(endpoint_name).result()

(Optional) Delete the compute, unless you plan to reuse the compute cluster with later deployments.

Azure CLI
Python

az ml compute delete -n batch-cluster

ml_client.compute.begin_delete(name="batch-cluster")
