CLI (v2) batch deployment YAML schema

Article
04/07/2024

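The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.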

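YAML syntax

Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of the file lets you invoke schema and resource completions.
`name`	string	Required. Name of the deployment.
`description`	string	Description of the deployment.
`tags`	object	Dictionary of tags for the deployment.
`endpoint_name`	string	Required. Name of the endpoint to create the deployment under.
`type`	string	Required. Type of the batch deployment. Use `model` for model deployments and `pipeline` for pipeline component deployments. New in version 1.7.	`model`, `pipeline`	`model`
`settings`	object	Configuration of the deployment. For allowed values, see the specific YAML reference for model and pipeline component deployments. New in version 1.7.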

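Tip

The key `type` was introduced in version 1.7 of the CLI extension and is available in later versions. To fully support backward compatibility, this property defaults to `model`. However, if `type` isn't explicitly indicated, the key `settings` isn't enforced, and all the properties for the model deployment settings should be indicated at the root of the YAML specification.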

YAML syntax for model deployments

When `type: model`, the following syntax is enforced:

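Key	Type	Description	Allowed values	Default value
`model`	string or object	Required. The model to use for the deployment. This value can be either a reference to an existing versioned model in the workspace or an inline model specification. To reference an existing model, use the `azureml:<model-name>:<version>` syntax. To define a model inline, follow the model schema. As a best practice for production scenarios, create the model separately and reference it here.
`code_configuration`	object	Configuration for the scoring code logic. This property isn't required if your model is in MLflow format.
`code_configuration.code`	string	The directory that contains all the Python source code used to score the model.
`code_configuration.scoring_script`	string	The Python file in the directory above. This file must have an `init()` function and a `run()` function. Use the `init()` function for any costly or common preparation (for example, loading the model into memory). `init()` is called only once, at the start of the process. Use `run(mini_batch)` to score each entry; the value of `mini_batch` is a list of file paths. The `run()` function should return a Pandas DataFrame or an array. Each returned element indicates one successful run of an input element in the `mini_batch`. For more information about authoring a scoring script, see Understanding the scoring script.
`environment`	string or object	The environment to use for the deployment. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. This property isn't required if your model is in MLflow format. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. To define an environment inline, follow the environment schema. As a best practice for production scenarios, create the environment separately and reference it here.
`compute`	string	Required. Name of the compute target to run the batch scoring jobs on. This value should be a reference to an existing compute in the workspace, using the `azureml:<compute-name>` syntax.
`resources.instance_count`	integer	The number of nodes to use for each batch scoring job.		`1`
`settings`	object	Specific configuration of the model deployment. Changed in version 1.7.
`settings.max_concurrency_per_instance`	integer	The maximum number of parallel `scoring_script` runs per instance.		`1`
`settings.error_threshold`	integer	The number of file failures that should be ignored. If the error count for the entire input goes above this value, the batch scoring job is terminated. `error_threshold` applies to the entire input, not to individual mini-batches. If omitted, any number of file failures is allowed without terminating the job.		`-1`
`settings.logging_level`	string	The log verbosity level.	`warning`, `info`, `debug`	`info`
`settings.mini_batch_size`	integer	The number of files the `code_configuration.scoring_script` can process in one `run()` call.		`10`
`settings.retry_settings`	object	Retry settings for scoring each mini-batch.
`settings.retry_settings.max_retries`	integer	The maximum number of retries for a failed or timed-out mini-batch.		`3`
`settings.retry_settings.timeout`	integer	The timeout in seconds for scoring a single mini-batch. Use larger values when the mini-batch size is bigger or the model is more expensive to run.		`30`
`settings.output_action`	string	Indicates how the output should be organized in the output file. Use `summary_only` if you generate the output files yourself, as described in Customize outputs in model deployments. Use `append_row` if you return predictions as part of the `run()` function's `return` statement.	`append_row`, `summary_only`	`append_row`
`settings.output_file_name`	string	Name of the batch scoring output file.		`predictions.csv`
`settings.environment_variables`	object	Dictionary of environment variable key-value pairs to set for each batch scoring job.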
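The `init()` and `run(mini_batch)` contract described for `code_configuration.scoring_script` can be sketched as follows. This is a minimal illustration, not an official template: the word-count "model" and the `_state` dictionary are hypothetical stand-ins, and the fallback to `"."` when `AZUREML_MODEL_DIR` is unset exists only so the sketch can run locally. It returns a plain list, one element per successfully processed file, rather than a Pandas DataFrame.

```python
import os
from typing import Callable, Dict, List

# Module-level state shared between init() and run().
_state: Dict[str, object] = {}


def init() -> None:
    """Called once when the process starts: do costly setup here."""
    # Azure Machine Learning sets AZUREML_MODEL_DIR at run time; the "."
    # fallback is an assumption made so this sketch runs outside the service.
    _state["model_dir"] = os.environ.get("AZUREML_MODEL_DIR", ".")
    # Hypothetical stand-in for a real model: counts the words in a text.
    model: Callable[[str], int] = lambda text: len(text.split())
    _state["model"] = model


def run(mini_batch: List[str]) -> List[str]:
    """Score every file path in the mini-batch and return one row per file."""
    model = _state["model"]
    results = []
    for file_path in mini_batch:
        with open(file_path, encoding="utf-8") as handle:
            prediction = model(handle.read())
        # One returned element per successfully processed input file.
        results.append(f"{os.path.basename(file_path)},{prediction}")
    return results
```

With `output_action: append_row`, each returned element would correspond to one row in the output file named by `output_file_name`.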

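YAML syntax for pipeline component deployments

When `type: pipeline`, the following syntax is enforced:

Key	Type	Description	Allowed values	Default value
`component`	string or object	Required. The pipeline component used for the deployment. This value can be either a reference to an existing versioned pipeline component in the workspace or in a registry, or an inline pipeline specification. To reference an existing component, use the `azureml:<component-name>:<version>` syntax. To define a pipeline component inline, follow the pipeline component schema. As a best practice for production scenarios, create the component separately and reference it here. New in version 1.7.
`settings`	object	Default settings for the pipeline job. For the set of configurable properties, see Attributes of the settings key. New in version 1.7.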

Remarks

The `az ml batch-deployment` commands can be used for managing Azure Machine Learning batch deployments.

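Examples

Examples are available in the examples GitHub repository. Several are referenced below:

YAML: MLflow model deployment

A model deployment containing an MLflow model, which doesn't require `code_configuration` or `environment`: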

$schema: https://azuremlschemas.azureedge.net/latest/modelBatchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
type: model
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 2
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 300
  error_threshold: -1
  logging_level: info
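The example above sets `error_threshold: -1`. As a rough illustration of the rule stated in the syntax table (failures are counted across the entire input, and the job is terminated once the count goes above the threshold), the check can be sketched in Python. The helper `should_terminate` is hypothetical and is not part of the CLI or the service.

```python
def should_terminate(failed_files: int, error_threshold: int = -1) -> bool:
    """Return True when the job-wide failure count exceeds the threshold.

    failed_files counts failures across the entire input, not per mini-batch.
    The default of -1 allows any number of file failures.
    """
    if error_threshold == -1:
        return False
    return failed_files > error_threshold
```

Under this reading, `error_threshold: 0` would terminate the job on the first failed file, while `error_threshold: 5` tolerates up to five.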

YAML: Custom model deployment with scoring script

A model deployment that indicates the scoring script and environment to use:

$schema: https://azuremlschemas.azureedge.net/latest/modelBatchDeployment.schema.json
name: mnist-torch-dpl
description: A deployment using Torch to solve the MNIST classification dataset.
endpoint_name: mnist-batch
type: model
model:
  name: mnist-classifier-torch
  path: model
code_configuration:
  code: code
  scoring_script: batch_driver.py
environment:
  name: batch-torch-py38
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest
  conda_file: environment/conda.yaml
compute: azureml:batch-cluster
resources:
  instance_count: 1
settings:
  max_concurrency_per_instance: 2
  mini_batch_size: 10
  output_action: append_row
  output_file_name: predictions.csv
  retry_settings:
    max_retries: 3
    timeout: 30
  error_threshold: -1
  logging_level: info

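YAML: Legacy model deployments

If the attribute `type` isn't indicated in the YAML, a model deployment is inferred. However, the key `settings` isn't available, and the properties should be placed at the root of the YAML, as shown in this example. It's highly advisable to always specify the property `type`.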

$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
endpoint_name: heart-classifier-batch
name: classifier-xgboost-mlflow
description: A heart condition classifier based on XGBoost
model: azureml:heart-classifier-mlflow@latest
compute: azureml:batch-cluster
resources:
  instance_count: 2
max_concurrency_per_instance: 2
mini_batch_size: 2
output_action: append_row
output_file_name: predictions.csv
retry_settings:
  max_retries: 3
  timeout: 300
error_threshold: -1
logging_level: info
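The difference between this legacy layout and the `type: model` layout is only where the settings properties live. A minimal Python sketch of that relationship, operating on an already-parsed YAML mapping, might look like the following. The helper `to_typed_spec` is hypothetical, and the list of keys is taken from the `settings.*` rows of the model deployment table. The sketch leaves `$schema` untouched, although the typed examples in this article point to a different schema file.

```python
from typing import Any, Dict

# Keys that belong under `settings` when `type: model` is specified.
SETTINGS_KEYS = (
    "max_concurrency_per_instance",
    "error_threshold",
    "logging_level",
    "mini_batch_size",
    "retry_settings",
    "output_action",
    "output_file_name",
    "environment_variables",
)


def to_typed_spec(legacy: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a legacy spec with root-level settings nested."""
    # Keep everything that is not a settings property at the root.
    spec = {k: v for k, v in legacy.items() if k not in SETTINGS_KEYS}
    # A legacy spec has no `type`; a model deployment is inferred.
    spec.setdefault("type", "model")
    # Move the root-level settings properties under `settings`.
    moved = {k: legacy[k] for k in SETTINGS_KEYS if k in legacy}
    if moved:
        spec["settings"] = {**spec.get("settings", {}), **moved}
    return spec
```

Applied to the legacy example above, this would produce the same structure as the MLflow model deployment example, with `resources` staying at the root.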

YAML: Pipeline component deployment

A simple pipeline component deployment:

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: hello-batch-dpl
endpoint_name: hello-pipeline-batch
type: pipeline
component: azureml:hello_batch@latest
settings:
    default_compute: batch-cluster

Next steps

Install and use the CLI (v2)
