azureml.pipeline.steps.mpi_step.MpiStep class - Azure Machine Learning Python

name: str

default value: None

[Required] The name of the module.

source_directory: str

default value: None

[Required] A folder that contains the Python script, conda environment, and other resources used in the step.

script_name: str

default value: None

[Required] The name of a Python script relative to source_directory.

arguments: list

default value: None

[Required] A list of command-line arguments.
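As a rough illustration of how script_name and arguments fit together: the script named by script_name would typically parse the list passed in arguments as its command line. The sketch below stands in for a hypothetical train.py. The flag names --epochs and --learning-rate and the helper name parse_step_arguments are assumptions for illustration, not part of MpiStep.

```python
import argparse


def parse_step_arguments(argv):
    """Parse the command-line list that a step would pass to the script."""
    parser = argparse.ArgumentParser(description="Hypothetical training script")
    parser.add_argument("--epochs", type=int, default=1)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    return parser.parse_args(argv)


if __name__ == "__main__":
    # This list mirrors what might be supplied as arguments=[...] on the step.
    args = parse_step_arguments(["--epochs", "5", "--learning-rate", "0.1"])
    print(args.epochs, args.learning_rate)
```

In a real script, passing None instead of an explicit list makes argparse read sys.argv[1:].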

compute_target: AmlCompute, str

default value: None

[Required] A compute target to use.

node_count: int

default value: None

[Required] The number of nodes in the compute target used for training. If greater than 1, an MPI distributed job will be run. Only the AmlCompute compute target is supported for distributed jobs. PipelineParameter values are supported.

process_count_per_node: int

default value: None

[Required] The number of processes per node. If greater than 1, an MPI distributed job will be run. Only the AmlCompute compute target is supported for distributed jobs. PipelineParameter values are supported.
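The two counts multiply: with node_count=2 and process_count_per_node=4, the job would be expected to start eight MPI processes in total. A script can usually tell which of those processes it is from the launcher's environment. The sketch below assumes an Open MPI launcher, which sets OMPI_COMM_WORLD_RANK, OMPI_COMM_WORLD_SIZE, and OMPI_COMM_WORLD_LOCAL_RANK; other MPI implementations use different variable names. The helper name mpi_process_info is hypothetical.

```python
import os


def mpi_process_info(env=None):
    """Return (rank, world_size, local_rank), defaulting to a single process."""
    env = os.environ if env is None else env
    # These variable names are specific to Open MPI (an assumption here).
    rank = int(env.get("OMPI_COMM_WORLD_RANK", 0))
    world_size = int(env.get("OMPI_COMM_WORLD_SIZE", 1))
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", 0))
    return rank, world_size, local_rank


if __name__ == "__main__":
    rank, world_size, local_rank = mpi_process_info()
    print(f"process {rank} of {world_size} (local rank {local_rank})")
```

Passing a plain dictionary as env makes the helper easy to exercise without an MPI launcher.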

inputs: list[Union[InputPortBinding, DataReference, PortDataReference, PipelineData, PipelineOutputAbstractDataset, DatasetConsumptionConfig]]

default value: None

A list of input port bindings.

outputs: list[Union[PipelineData, PipelineOutputAbstractDataset, OutputPortBinding]]

default value: None

A list of output port bindings.

params: dict

Required

A dictionary of name-value pairs. Each pair is registered as an environment variable whose name has the prefix "AML_PARAMETER_".
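Inside the step's script, values passed through params could be read back from the environment. The following is a minimal sketch, assuming each entry appears as a variable named "AML_PARAMETER_" followed by the original key; the helper name read_step_params and the key batch_size are hypothetical.

```python
import os

PREFIX = "AML_PARAMETER_"


def read_step_params(env=None):
    """Collect variables that start with the prefix, with the prefix removed."""
    env = os.environ if env is None else env
    return {
        key[len(PREFIX):]: value
        for key, value in env.items()
        if key.startswith(PREFIX)
    }


if __name__ == "__main__":
    # Simulated environment; environment variable values are always strings.
    fake_env = {"AML_PARAMETER_batch_size": "32", "PATH": "/usr/bin"}
    print(read_step_params(fake_env))
```

Because environment variables are strings, the script has to convert values such as "32" to the type it needs.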

allow_reuse: bool

default value: True

Indicates whether the step should reuse previous results when re-run with the same settings. Reuse is enabled by default. If the step contents (scripts/dependencies) as well as inputs and parameters remain unchanged, the output from the previous run of this step is reused. When reusing the step, instead of submitting the job to compute, the results from the previous run are immediately made available to any subsequent steps. If you use Azure Machine Learning datasets as inputs, reuse is determined by whether the dataset's definition has changed, not by whether the underlying data has changed.
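The reuse rule described above can be pictured as comparing a fingerprint of everything that affects the step. The sketch below is only an illustration of that idea and is not how Azure Machine Learning computes it; the function names and the choice of SHA-256 over a JSON payload are assumptions. Dataset inputs are represented by their definitions (for example a path pattern), which is why changed underlying data alone would not alter the fingerprint.

```python
import hashlib
import json


def step_fingerprint(script_text, arguments, params, input_definitions):
    """Hash the script text, arguments, params, and input definitions."""
    payload = json.dumps(
        {
            "script": script_text,
            "arguments": arguments,
            "params": params,
            "inputs": input_definitions,
        },
        sort_keys=True,  # stable ordering so equal settings hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def can_reuse(previous_fingerprint, current_fingerprint, allow_reuse=True):
    """Reuse only when reuse is allowed and nothing tracked has changed."""
    return allow_reuse and previous_fingerprint == current_fingerprint
```

With allow_reuse=False the comparison is skipped and the step always runs again.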

version: str

default value: None

An optional version tag to denote a change in functionality for the module.

hash_paths: list

default value: None

DEPRECATED: no longer needed.

A list of paths to hash when checking for changes to the step contents. If no changes are detected, the pipeline reuses the step contents from a previous run. By default, the contents of source_directory are hashed, except for files listed in .amlignore or .gitignore.

use_gpu: bool

Required

Indicates whether the environment to run the experiment should support GPUs. If True, a GPU-based default Docker image will be used in the environment. If False, a CPU-based image will be used. Default Docker images (CPU or GPU) will be used only if the custom_docker_image parameter is not set. This setting is used only in Docker-enabled compute targets.

use_docker: bool

Required

Indicates whether the environment to run the experiment should be Docker-based.

custom_docker_image: str

Required

The name of the Docker image from which the image to use for training will be built. If not set, a default CPU-based image will be used as the base image.

image_registry_details: ContainerRegistry

Required

The details of the Docker image registry.

user_managed: bool

Required

Indicates whether Azure ML reuses an existing Python environment; False means that Azure ML will create a Python environment based on the conda dependencies specification.

conda_packages: list

Required

A list of strings representing conda packages to be added to the Python environment.

pip_packages: list

Required

A list of strings representing pip packages to be added to the Python environment.

pip_requirements_file_path: str

Required

The relative path to the pip requirements text file. This parameter can be specified in combination with the pip_packages parameter.

environment_definition: EnvironmentDefinition

Required

The EnvironmentDefinition for the experiment. It includes PythonSection, DockerSection, and environment variables. Any environment option not directly exposed through other MpiStep constructor parameters can be set using the environment_definition parameter. If this parameter is specified, it takes precedence over other environment-related parameters such as use_gpu, custom_docker_image, conda_packages, or pip_packages, and errors are reported for these invalid combinations.
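Putting the constructor parameters above together, a step might be built roughly as follows. This is a sketch under stated assumptions, not a definitive recipe: the step name, folder, script name, script arguments, and package list are all hypothetical, and the import is done inside the function so the module can be loaded without the Azure ML SDK installed. environment_definition is deliberately left out, because it would take precedence over use_gpu and pip_packages.

```python
def build_mpi_step(compute_target, source_directory="./train"):
    """Build an MpiStep; calling this requires the Azure ML pipeline SDK."""
    # Imported lazily so this module loads even when the SDK is absent.
    from azureml.pipeline.steps import MpiStep

    return MpiStep(
        name="distributed-train",            # hypothetical step name
        source_directory=source_directory,   # hypothetical folder
        script_name="train.py",              # hypothetical script in the folder
        arguments=["--epochs", "5"],         # hypothetical script arguments
        compute_target=compute_target,       # an AmlCompute object or its name
        node_count=2,                        # greater than 1: distributed job
        process_count_per_node=4,
        allow_reuse=True,
        use_gpu=False,                       # CPU-based default Docker image
        pip_packages=["numpy"],              # hypothetical dependency
    )
```

The caller supplies the compute target, either as an AmlCompute object or as its name, as the compute_target parameter description allows.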

name: str

Required

[Required] The name of the module.

source_directory: str

Required

[Required] A folder that contains the Python script, conda environment, and other resources used in the step.

script_name: str

Required

[Required] The name of a Python script relative to source_directory.

arguments: list

Required

[Required] A list of command-line arguments.

compute_target: AmlCompute, str

Required

[Required] A compute target to use.

node_count: int

Required

[Required] The number of nodes in the compute target used for training. If greater than 1, an MPI distributed job will be run. Only the AmlCompute compute target is supported for distributed jobs. PipelineParameter values are supported.

process_count_per_node: int

Required

[Required] The number of processes per node. If greater than 1, an MPI distributed job will be run. Only the AmlCompute compute target is supported for distributed jobs. PipelineParameter values are supported.

inputs: list[Union[InputPortBinding, DataReference, PortDataReference, PipelineData, PipelineOutputAbstractDataset, DatasetConsumptionConfig]]

Required

A list of input port bindings.

outputs: list[Union[PipelineData, OutputDatasetConfig, PipelineOutputAbstractDataset, OutputPortBinding]]

Required

A list of output port bindings.

params: dict

Required

A dictionary of name-value pairs. Each pair is registered as an environment variable whose name has the prefix "AML_PARAMETER_".

allow_reuse: bool

Required

Indicates whether the step should reuse previous results when re-run with the same settings. If the step contents (scripts/dependencies) as well as inputs and parameters remain unchanged, the output from the previous run of this step is reused. When reusing the step, instead of submitting the job to compute, the results from the previous run are immediately made available to any subsequent steps. If you use Azure Machine Learning datasets as inputs, reuse is determined by whether the dataset's definition has changed, not by whether the underlying data has changed.

version: str

Required

An optional version tag to denote a change in functionality for the module.

hash_paths: list

Required

DEPRECATED: no longer needed.

A list of paths to hash when checking for changes to the step contents. If no changes are detected, the pipeline reuses the step contents from a previous run. By default, the contents of source_directory are hashed, except for files listed in .amlignore or .gitignore.

use_gpu: bool

Required

Indicates whether the environment to run the experiment should support GPUs. If True, a GPU-based default Docker image will be used in the environment. If False, a CPU-based image will be used. Default Docker images (CPU or GPU) will be used only if the custom_docker_image parameter is not set. This setting is used only in Docker-enabled compute targets.

use_docker: bool

Required

Indicates whether the environment to run the experiment should be Docker-based.

custom_docker_image: str

Required

The name of the Docker image from which the image to use for training will be built. If not set, a default CPU-based image will be used as the base image.

image_registry_details: ContainerRegistry

Required

The details of the Docker image registry.

user_managed: bool

Required

Indicates whether Azure ML reuses an existing Python environment; False means that Azure ML will create a Python environment based on the conda dependencies specification.

conda_packages: list

Required

A list of strings representing conda packages to be added to the Python environment.

pip_packages: list

Required

A list of strings representing pip packages to be added to the Python environment.

pip_requirements_file_path: str

Required

The relative path to the pip requirements text file. This parameter can be specified in combination with the pip_packages parameter.

environment_definition: EnvironmentDefinition

Required

The EnvironmentDefinition for the experiment. It includes PythonSection, DockerSection, and environment variables. Any environment option not directly exposed through other MpiStep constructor parameters can be set using the environment_definition parameter. If this parameter is specified, it takes precedence over other environment-related parameters such as use_gpu, custom_docker_image, conda_packages, or pip_packages, and errors are reported for these invalid combinations.

