CLI (v2) command job YAML schema

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/commandJob.schema.json.

Important

This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

YAML syntax

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| $schema | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including $schema at the top of your file enables you to invoke schema and resource completions. | | |
| type | const | The type of job. | command | command |
| name | string | Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure ML will autogenerate a GUID for the name. | | |
| display_name | string | Display name of the job in the studio UI. Can be non-unique within the workspace. If omitted, Azure ML will autogenerate a human-readable adjective-noun identifier for the display name. | | |
| experiment_name | string | Experiment name to organize the job under. Each job's run record will be organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure ML will default it to the name of the working directory where the job was created. | | |
| description | string | Description of the job. | | |
| tags | object | Dictionary of tags for the job. | | |
| command | string | Required. The command to execute. | | |
| code.local_path | string | Local path to the source code directory to be uploaded and used for the job. | | |
| environment | string or object | Required. The environment to use for the job. This can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the azureml:<environment_name>:<environment_version> syntax. To define an environment inline, follow the Environment schema; exclude the name and version properties, as they are not supported for inline environments. | | |
| environment_variables | object | Dictionary of environment variable name-value pairs to set on the process where the command is executed. | | |
| distribution | object | The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration. | | |
| compute | string | Name of the compute target to execute the job on. This can be either a reference to an existing compute in the workspace (using the azureml:<compute_name> syntax) or local to designate local execution. | | local |
| resources.instance_count | integer | The number of nodes to use for the job. | | 1 |
| limits.timeout | integer | The maximum time in seconds the job is allowed to run. Once this limit is reached, the system will cancel the job. | | |
| inputs | object | Dictionary of inputs to the job. The key is a name for the input within the context of the job and the value is the input value. Inputs can be referenced in the command using the ${{ inputs.<input_name> }} expression. | | |
| inputs.<input_name> | number, integer, boolean, string, or object | One of a literal value (of type number, integer, boolean, or string), JobInputUri, or JobInputDataset. | | |
| outputs | object | Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. Outputs can be referenced in the command using the ${{ outputs.<output_name> }} expression. | | |
| outputs.<output_name> | object | You can either specify an optional mode or leave the object empty. For each named output specified in the outputs dictionary, Azure ML will autogenerate an output location. | | |
| outputs.<output_name>.mode | string | Mode of how output file(s) will get delivered to the destination storage. For read-write mount mode, the output directory will be a mounted directory. For upload mode, the files written to the output directory will get uploaded at the end of the job. | rw_mount, upload | rw_mount |
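
As a minimal sketch of how these keys fit together, the following specification combines a literal input, a timeout limit, and environment variables. The environment reference azureml:my-env:1 is a placeholder, not an environment assumed to exist in your workspace; cpu-cluster matches the compute name used in the examples below.

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo ${{inputs.message}}
inputs:
  message: "hello"
environment: azureml:my-env:1
compute: azureml:cpu-cluster
limits:
  timeout: 3600
environment_variables:
  MY_ENV_VAR: "example-value"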

Distribution configurations

MpiConfiguration

| Key | Type | Description | Allowed values |
| --- | --- | --- | --- |
| type | const | Required. Distribution type. | mpi |
| process_count_per_instance | integer | Required. The number of processes per node to launch for the job. | |

PyTorchConfiguration

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| type | const | Required. Distribution type. | pytorch | |
| process_count_per_instance | integer | The number of processes per node to launch for the job. | | 1 |

TensorFlowConfiguration

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| type | const | Required. Distribution type. | tensorflow | |
| worker_count | integer | The number of workers to launch for the job. | | resources.instance_count |
| parameter_server_count | integer | The number of parameter servers to launch for the job. | | 0 |
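
As an illustration only, a distribution block that also launches parameter servers might look like the following sketch. The counts here are arbitrary; suitable values depend on how your training script uses the parameter-server strategy. (The distributed TensorFlow example later in this article uses worker_count alone.)

distribution:
  type: tensorflow
  worker_count: 2
  parameter_server_count: 1
resources:
  instance_count: 2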

Job inputs

JobInputUri

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| file | string | URI to a single file to use as input. Supported URI types are azureml, https, wasbs, abfss, adl. See Core yaml syntax for more information on how to use the azureml:// URI format. One of file or folder is required. | | |
| folder | string | URI to a folder to use as input. Supported URI types are azureml, wasbs, abfss, adl. See Core yaml syntax for more information on how to use the azureml:// URI format. One of file or folder is required. | | |
| mode | string | Mode of how the data should be delivered to the compute target. For read-only mount and read-write mount, the data will be consumed as a mount path: a folder will be mounted as a folder and a file will be mounted as a file. For download mode, the data will be consumed as a downloaded path. | ro_mount, rw_mount, download | ro_mount |
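
For example, to consume a datastore folder as a download rather than a mount, an input block might look like the following sketch. The workspaceblobstore path is the same placeholder path used in the datastore URI examples later in this article.

inputs:
  raw_data:
    folder: azureml://datastores/workspaceblobstore/paths/example-data/
    mode: download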

JobInputDataset

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| dataset | string or object | Required. A dataset to use as input. This can be either a reference to an existing versioned dataset in the workspace or an inline dataset specification. To reference an existing dataset, use the azureml:<dataset_name>:<dataset_version> syntax. To define a dataset inline, follow the Dataset schema; exclude the name and version properties, as they are not supported for inline datasets. | | |
| mode | string | Mode of how the dataset should be delivered to the compute target. For read-only mount, the dataset will be consumed as a mount path: a folder will be mounted as a folder and a file will be mounted as the parent folder. For download mode, the dataset will be consumed as a downloaded path. | ro_mount, download | ro_mount |
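
As a sketch, the following input references an existing versioned dataset and overrides the default ro_mount mode. It reuses the azureml:cifar-10-example:1 reference from the distributed PyTorch example below, which stands in for a dataset registered in your workspace.

inputs:
  cifar:
    dataset: azureml:cifar-10-example:1
    mode: download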

Remarks

You can use the az ml job commands to manage Azure Machine Learning jobs.
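
For example, assuming the CLI's ml extension is installed and a default resource group and workspace are configured, you might create a job from a YAML file and then follow its logs like this (<job_name> is a placeholder for the name of the created job):

az ml job create --file job.yml
az ml job show --name <job_name>
az ml job stream --name <job_name>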

Examples

Examples are available in the examples GitHub repository. Several are shown below.

YAML: hello world

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: display name, experiment name, description, and tags

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: python:latest
compute: azureml:cpu-cluster
tags:
  hello: world
display_name: hello-world-example
experiment_name: hello-world-example
description: |
  # Azure Machine Learning "hello world" job

  This is a "hello world" job running in the cloud via Azure Machine Learning!

  ## Description

  Markdown is supported in the studio for job descriptions! You can edit the description there or via CLI.

YAML: environment variables

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo $hello_env_var
environment:
  image: python:latest
compute: azureml:cpu-cluster
environment_variables:
  hello_env_var: "hello world"

YAML: source code

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls
code:
  local_path: src
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: literal inputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  echo ${{inputs.hello_string}}
  &&
  echo ${{inputs.hello_number}}
environment:
  image: python:latest
inputs:
  hello_string: "hello world"
  hello_number: 42
compute: azureml:cpu-cluster

YAML: write to default outputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ./outputs/helloworld.txt
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: write to named data output

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
outputs:
  hello_output:
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: datastore URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  echo "--iris-csv: ${{inputs.iris_csv}}"
  &&
  pip install pandas
  &&
  python hello-iris.py
  --iris-csv ${{inputs.iris_csv}}
code:
  local_path: src
inputs:
  iris_csv: 
    file: azureml://datastores/workspaceblobstore/paths/example-data/iris.csv
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: datastore URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  ls ${{inputs.data_dir}}
  &&
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  &&
  pip install pandas
  &&
  python hello-iris.py
  --iris-csv ${{inputs.data_dir}}/iris.csv
code:
  local_path: src
inputs:
  data_dir: 
    folder: azureml://datastores/workspaceblobstore/paths/example-data/
    mode: rw_mount
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  echo "--iris-csv: ${{inputs.iris_csv}}"
  &&
  pip install pandas
  &&
  python hello-iris.py
  --iris-csv ${{inputs.iris_csv}}
code:
  local_path: src
inputs:
  iris_csv: 
    file: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  ls ${{inputs.data_dir}}
  &&
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  &&
  pip install pandas
  &&
  python hello-iris.py
  --iris-csv ${{inputs.data_dir}}/iris.csv
code:
  local_path: src
inputs:
  data_dir: 
    folder: wasbs://datasets@azuremlexamples.blob.core.windows.net/
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: Notebook via papermill

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  pip install ipykernel papermill
  &&
  papermill hello-notebook.ipynb outputs/out.ipynb -k python
code:
  local_path: src
environment:
  image: python:latest
compute: azureml:cpu-cluster

YAML: basic Python model training

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: 
  local_path: src
command: >-
  python main.py 
  --iris-csv ${{inputs.iris_csv}}
  --C ${{inputs.C}}
  --kernel ${{inputs.kernel}}
  --coef0 ${{inputs.coef0}}
inputs:
  iris_csv: 
    file: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
  C: 0.8
  kernel: "rbf"
  coef0: 0.1
environment: azureml:AzureML-sklearn-0.24-ubuntu18.04-py37-cpu:9
compute: azureml:cpu-cluster
display_name: sklearn-iris-example
experiment_name: sklearn-iris-example
description: Train a scikit-learn SVM on the Iris dataset.

YAML: basic R model training with local Docker build context

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >-
  Rscript train.R 
  --data_folder ${{inputs.iris}}
code:
  local_path: src
inputs:
  iris: 
    file: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment:
  build:
    local_path: docker-context
compute: azureml:cpu-cluster
display_name: r-iris-example
experiment_name: r-iris-example
description: Train an R model on the Iris dataset.

YAML: distributed PyTorch

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: 
  local_path: src
command: >-
  python train.py 
  --epochs ${{inputs.epochs}}
  --learning-rate ${{inputs.learning_rate}}
  --data-dir ${{inputs.cifar}}
inputs:
  epochs: 1
  learning_rate: 0.2
  cifar:
    dataset: azureml:cifar-10-example:1
environment: azureml:AzureML-pytorch-1.9-ubuntu18.04-py37-cuda11-gpu:6
compute: azureml:gpu-cluster
distribution:
  type: pytorch 
  process_count_per_instance: 2
resources:
  instance_count: 2
display_name: pytorch-cifar-distributed-example
experiment_name: pytorch-cifar-distributed-example
description: Train a basic convolutional neural network (CNN) with PyTorch on the CIFAR-10 dataset, distributed via PyTorch.

YAML: distributed TensorFlow

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python train.py 
  --epochs ${{inputs.epochs}}
  --model-dir ${{inputs.model_dir}}
inputs:
  epochs: 1
  model_dir: outputs/keras-model
environment: azureml:AzureML-tensorflow-2.4-ubuntu18.04-py37-cuda11-gpu:14
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: tensorflow
  worker_count: 2
display_name: tensorflow-mnist-distributed-example
experiment_name: tensorflow-mnist-distributed-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via TensorFlow.

YAML: distributed MPI

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code:
  local_path: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
inputs:
  epochs: 1
environment: azureml:AzureML-tensorflow-2.4-ubuntu18.04-py37-cuda11-gpu:14
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: mpi
  process_count_per_instance: 2
display_name: tensorflow-mnist-distributed-horovod-example
experiment_name: tensorflow-mnist-distributed-horovod-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via Horovod.

Next steps