CLI (v2) command job YAML schema
The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/commandJob.schema.json.
Important
This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Note
The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.
YAML syntax
| Key | Type | Description | Allowed values | Default value |
|---|---|---|---|---|
| `$schema` | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of your file enables you to invoke schema and resource completions. | | |
| `type` | const | The type of job. | `command` | `command` |
| `name` | string | Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure ML will autogenerate a GUID for the name. | | |
| `display_name` | string | Display name of the job in the studio UI. Can be non-unique within the workspace. If omitted, Azure ML will autogenerate a human-readable adjective-noun identifier for the display name. | | |
| `experiment_name` | string | Experiment name to organize the job under. Each job's run record will be organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure ML will default it to the name of the working directory where the job was created. | | |
| `description` | string | Description of the job. | | |
| `tags` | object | Dictionary of tags for the job. | | |
| `command` | string | Required. The command to execute. | | |
| `code.local_path` | string | Local path to the source code directory to be uploaded and used for the job. | | |
| `environment` | string or object | Required. The environment to use for the job. This can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment use the `azureml:<environment_name>:<environment_version>` syntax. To define an environment inline please follow the Environment schema. Exclude the `name` and `version` properties as they are not supported for inline environments. | | |
| `environment_variables` | object | Dictionary of environment variable name-value pairs to set on the process where the command is executed. | | |
| `distribution` | object | The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration. | | |
| `compute` | string | Name of the compute target to execute the job on. This can be either a reference to an existing compute in the workspace (using the `azureml:<compute_name>` syntax) or `local` to designate local execution. | | `local` |
| `resources.instance_count` | integer | The number of nodes to use for the job. | | `1` |
| `limits.timeout` | integer | The maximum time in seconds the job is allowed to run. Once this limit is reached the system will cancel the job. | | |
| `inputs` | object | Dictionary of inputs to the job. The key is a name for the input within the context of the job and the value is the input value. Inputs can be referenced in the `command` using the `${{ inputs.<input_name> }}` expression. | | |
| `inputs.<input_name>` | number, integer, boolean, string or object | One of a literal value (of type number, integer, boolean, or string), JobInputUri, or JobInputDataset. | | |
| `outputs` | object | Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. Outputs can be referenced in the `command` using the `${{ outputs.<output_name> }}` expression. | | |
| `outputs.<output_name>` | object | You can either specify an optional `mode` or leave the object empty. For each named output specified in the `outputs` dictionary, Azure ML will autogenerate an output location. | | |
| `outputs.<output_name>.mode` | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode the output directory will be a mounted directory. For upload mode the files written to the output directory will get uploaded at the end of the job. | `rw_mount`, `upload` | `rw_mount` |
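
As a rough illustration of the table above, the following sketch sets a run-time limit and switches a named output to upload mode. It assumes that the dotted keys `limits.timeout` and `outputs.<output_name>.mode` nest the same way `code.local_path` does in the examples in this article. The output name `hello_output`, the 600-second timeout, and the `cpu-cluster` compute name are illustrative values only.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
environment:
  image: python:latest
# Assumes a compute target named cpu-cluster exists in the workspace.
compute: azureml:cpu-cluster
limits:
  # Illustrative value: the job is cancelled once it has run for 600 seconds.
  timeout: 600
outputs:
  hello_output:
    # Files written to the output directory are uploaded at the end of the job.
    mode: upload
```

Leaving `mode` out would fall back to the default listed in the table, `rw_mount`.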
Distribution configurations
MpiConfiguration
| Key | Type | Description | Allowed values |
|---|---|---|---|
| `type` | const | Required. Distribution type. | `mpi` |
| `process_count_per_instance` | integer | Required. The number of processes per node to launch for the job. | |
PyTorchConfiguration
| Key | Type | Description | Allowed values | Default value |
|---|---|---|---|---|
| `type` | const | Required. Distribution type. | `pytorch` | |
| `process_count_per_instance` | integer | The number of processes per node to launch for the job. | | `1` |
TensorFlowConfiguration
| Key | Type | Description | Allowed values | Default value |
|---|---|---|---|---|
| `type` | const | Required. Distribution type. | `tensorflow` | |
| `worker_count` | integer | The number of workers to launch for the job. | | Defaults to `resources.instance_count`. |
| `parameter_server_count` | integer | The number of parameter servers to launch for the job. | | `0` |
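
The fragment below is a minimal sketch of a TensorFlowConfiguration that sets both optional keys from the table. The counts are illustrative assumptions; the table does not say how workers and parameter servers are placed across the nodes requested with `resources.instance_count`, so check that against your own cluster setup.

```yaml
resources:
  # Illustrative node count.
  instance_count: 3
distribution:
  type: tensorflow
  # Overrides the default, which is resources.instance_count.
  worker_count: 2
  # Overrides the default of 0 parameter servers.
  parameter_server_count: 1
```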
Job inputs
JobInputUri
| Key | Type | Description | Allowed values | Default value |
|---|---|---|---|---|
| `file` | string | URI to a single file to use as input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, `adl`. See Core yaml syntax for more information on how to use the `azureml://` URI format. One of `file` or `folder` is required. | | |
| `folder` | string | URI to a folder to use as input. Supported URI types are `azureml`, `wasbs`, `abfss`, `adl`. See Core yaml syntax for more information on how to use the `azureml://` URI format. One of `file` or `folder` is required. | | |
| `mode` | string | Mode of how the data should be delivered to the compute target. For read-only mount and read-write mount the data will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as a file. For download mode the data will be consumed as a downloaded path. | `ro_mount`, `rw_mount`, `download` | `ro_mount` |
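
As a small sketch of the keys above, this `inputs` fragment delivers a single file in download mode instead of the default read-only mount. The input name `iris_csv` is arbitrary, and the URL is the sample file used in the examples later in this article.

```yaml
inputs:
  iris_csv:
    # Exactly one of file or folder is given; here a single https file.
    file: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
    # The file is downloaded to the compute target rather than mounted.
    mode: download
```

The command can then refer to the downloaded path as `${{inputs.iris_csv}}`.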
JobInputDataset
| Key | Type | Description | Allowed values | Default value |
|---|---|---|---|---|
| `dataset` | string or object | Required. A dataset to use as input. This can be either a reference to an existing versioned dataset in the workspace or an inline dataset specification. To reference an existing dataset use the `azureml:<dataset_name>:<dataset_version>` syntax. To define a dataset inline please follow the Dataset schema. Exclude the `name` and `version` properties as they are not supported for inline datasets. | | |
| `mode` | string | Mode of how the dataset should be delivered to the compute target. For read-only mount the dataset will be consumed as a mount path. A folder will be mounted as a folder and a file will be mounted as the parent folder. For download mode the dataset will be consumed as a downloaded path. | `ro_mount`, `download` | `ro_mount` |
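
The following fragment sketches a JobInputDataset that references a versioned dataset and asks for download delivery. It assumes a dataset named `cifar-10-example` with version `1` is already registered in the workspace, as in the distributed PyTorch example later in this article; the input name `cifar` is arbitrary.

```yaml
inputs:
  cifar:
    # Reference to an existing dataset: azureml:<dataset_name>:<dataset_version>
    dataset: azureml:cifar-10-example:1
    # Allowed values for datasets are ro_mount (the default) and download.
    mode: download
```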
Remarks
The az ml job command can be used for managing Azure Machine Learning jobs.
Examples
Examples are available in the examples GitHub repository. Several are shown below.
YAML: hello world
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: python:latest
compute: azureml:cpu-cluster
YAML: display name, experiment name, description, and tags
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: python:latest
compute: azureml:cpu-cluster
tags:
  hello: world
display_name: hello-world-example
experiment_name: hello-world-example
description: |
  # Azure Machine Learning "hello world" job
  This is a "hello world" job running in the cloud via Azure Machine Learning!
  ## Description
  Markdown is supported in the studio for job descriptions! You can edit the description there or via CLI.
YAML: environment variables
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo $hello_env_var
environment:
  image: python:latest
compute: azureml:cpu-cluster
environment_variables:
  hello_env_var: "hello world"
YAML: source code
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls
code:
  local_path: src
environment:
  image: python:latest
compute: azureml:cpu-cluster
YAML: literal inputs
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo ${{inputs.hello_string}}
  echo ${{inputs.hello_number}}
environment:
  image: python:latest
inputs:
  hello_string: "hello world"
  hello_number: 42
compute: azureml:cpu-cluster
YAML: write to default outputs
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ./outputs/helloworld.txt
environment:
  image: python:latest
compute: azureml:cpu-cluster
YAML: write to named data output
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
outputs:
  hello_output:
environment:
  image: python
compute: azureml:cpu-cluster
YAML: datastore URI file input
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code:
  local_path: src
inputs:
  iris_csv:
    file: azureml://datastores/workspaceblobstore/paths/example-data/iris.csv
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:3
compute: azureml:cpu-cluster
YAML: datastore URI folder input
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code:
  local_path: src
inputs:
  data_dir:
    folder: azureml://datastores/workspaceblobstore/paths/example-data/
    mode: rw_mount
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:3
compute: azureml:cpu-cluster
YAML: URI file input
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code:
  local_path: src
inputs:
  iris_csv:
    file: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:3
compute: azureml:cpu-cluster
YAML: URI folder input
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code:
  local_path: src
inputs:
  data_dir:
    folder: wasbs://datasets@azuremlexamples.blob.core.windows.net/
environment: azureml:AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:3
compute: azureml:cpu-cluster
YAML: Notebook via papermill
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  pip install ipykernel papermill
  papermill hello-notebook.ipynb outputs/out.ipynb -k python
code:
  local_path: src
environment:
  image: python:latest
compute: azureml:cpu-cluster
YAML: basic Python model training
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python main.py
  --iris-csv ${{inputs.iris_csv}}
  --C ${{inputs.C}}
  --kernel ${{inputs.kernel}}
  --coef0 ${{inputs.coef0}}
inputs:
  iris_csv:
    file: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
  C: 0.8
  kernel: "rbf"
  coef0: 0.1
environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:9
compute: azureml:cpu-cluster
display_name: sklearn-iris-example
experiment_name: sklearn-iris-example
description: Train a scikit-learn SVM on the Iris dataset.
YAML: basic R model training with local Docker build context
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >
  Rscript train.R
  --data_folder ${{inputs.iris}}
code:
  local_path: src
inputs:
  iris:
    file: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment:
  build:
    local_path: docker-context
compute: azureml:cpu-cluster
display_name: r-iris-example
experiment_name: r-iris-example
description: Train an R model on the Iris dataset.
YAML: distributed PyTorch
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --learning-rate ${{inputs.learning_rate}}
  --data-dir ${{inputs.cifar}}
inputs:
  epochs: 1
  learning_rate: 0.2
  cifar:
    dataset: azureml:cifar-10-example:1
environment: azureml:AzureML-pytorch-1.9-ubuntu18.04-py37-cuda11-gpu:6
compute: azureml:gpu-cluster
distribution:
  type: pytorch
  process_count_per_instance: 2
resources:
  instance_count: 2
display_name: pytorch-cifar-distributed-example
experiment_name: pytorch-cifar-distributed-example
description: Train a basic convolutional neural network (CNN) with PyTorch on the CIFAR-10 dataset, distributed via PyTorch.
YAML: distributed TensorFlow
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --model-dir ${{inputs.model_dir}}
inputs:
  epochs: 1
  model_dir: outputs/keras-model
environment: azureml:AzureML-tensorflow-2.4-ubuntu18.04-py37-cuda11-gpu:14
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: tensorflow
  worker_count: 2
display_name: tensorflow-mnist-distributed-example
experiment_name: tensorflow-mnist-distributed-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via TensorFlow.
YAML: distributed MPI
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
inputs:
  epochs: 1
environment: azureml:AzureML-tensorflow-2.4-ubuntu18.04-py37-cuda11-gpu:14
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: mpi
  process_count_per_instance: 2
display_name: tensorflow-mnist-distributed-horovod-example
experiment_name: tensorflow-mnist-distributed-horovod-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via Horovod.