steps package

Contains pre-built steps that can be executed in an Azure Machine Learning Pipeline.

Azure ML Pipeline steps can be configured together to construct a Pipeline, which represents a shareable and reusable Azure Machine Learning workflow. Each step of a pipeline can be configured to allow reuse of its previous run results if the step contents (scripts and dependencies) as well as inputs and parameters remain unchanged.
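The reuse rule above can be illustrated with a small, self-contained model. This is only a conceptual sketch, not how the Azure ML service decides on reuse: the names step_fingerprint and ReuseCache are hypothetical, and the script, its inputs, and its parameters are reduced to a single hash.

```python
import hashlib
import json


def step_fingerprint(script_text, inputs, params):
    """Hash everything that should invalidate a previous step result."""
    payload = json.dumps(
        {"script": script_text, "inputs": sorted(inputs), "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ReuseCache:
    """Toy in-memory store mapping fingerprints to earlier step outputs."""

    def __init__(self):
        self._results = {}

    def run(self, script_text, inputs, params, compute, allow_reuse=True):
        """Return (result, reused); call compute() only when nothing matches."""
        key = step_fingerprint(script_text, inputs, params)
        if allow_reuse and key in self._results:
            return self._results[key], True
        result = compute()
        self._results[key] = result
        return result, False
```

In this model, a second call with the same script, inputs, and parameters returns the stored result without calling compute again, while changing any of the three, or passing allow_reuse=False, forces a fresh run.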

The classes in this package are typically used together with the classes in the core package. The core package contains classes for configuring data (PipelineData), scheduling (Schedule), and managing the output of steps (StepRun).
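As a rough illustration of how a step class and the core classes fit together, the sketch below chains two PythonScriptStep objects through a PipelineData output. The function name, the script names prep.py and train.py, and the argument flags are hypothetical; the workspace and compute target are assumed to exist already.

```python
def build_two_step_pipeline(workspace, compute_target, source_directory):
    """Chain two PythonScriptStep objects through a PipelineData output."""
    # Imported inside the function so the sketch can be loaded
    # even where the Azure ML SDK is not installed.
    from azureml.pipeline.core import Pipeline, PipelineData
    from azureml.pipeline.steps import PythonScriptStep

    # Intermediate data written by the first step and read by the second.
    prepared = PipelineData(
        "prepared", datastore=workspace.get_default_datastore()
    )

    prep_step = PythonScriptStep(
        name="prepare",
        script_name="prep.py",  # hypothetical script in source_directory
        arguments=["--out", prepared],
        outputs=[prepared],
        compute_target=compute_target,
        source_directory=source_directory,
        allow_reuse=True,
    )
    train_step = PythonScriptStep(
        name="train",
        script_name="train.py",  # hypothetical script in source_directory
        arguments=["--in", prepared],
        inputs=[prepared],
        compute_target=compute_target,
        source_directory=source_directory,
    )
    return Pipeline(workspace=workspace, steps=[prep_step, train_step])
```

The returned Pipeline would typically be submitted through an Experiment, for example Experiment(workspace, "my-experiment").submit(pipeline), where the experiment name is a placeholder.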

The pre-built steps in this package cover many common scenarios encountered in machine learning workflows.

Modules

adla_step

Contains functionality to create an Azure ML Pipeline step to run a U-SQL script with Azure Data Lake Analytics.

azurebatch_step

Contains functionality to create an Azure ML Pipeline step that runs a Windows executable in Azure Batch.

data_transfer_step

Contains functionality to create an Azure ML Pipeline step that transfers data between storage options.

databricks_step

Contains functionality to create an Azure ML pipeline step to run a Databricks notebook or Python script on DBFS.

estimator_step

Contains functionality to create a pipeline step that runs an Estimator for Machine Learning model training.

hyper_drive_step

Contains functionality for creating and managing Azure ML Pipeline steps that run hyperparameter tuning.

module_step

Contains functionality to add an Azure Machine Learning Pipeline step using an existing version of a Module.

mpi_step

Contains functionality to add an Azure ML Pipeline step to run an MPI job for Machine Learning model training.

python_script_step

Contains functionality to create an Azure ML Pipeline step that runs a Python script.

r_script_step

Contains functionality to create an Azure ML Pipeline step that runs an R script.

Classes

AdlaStep

Creates an Azure ML Pipeline step to run a U-SQL script with Azure Data Lake Analytics.

For an example of using AdlaStep, see the notebook https://aka.ms/pl-adla.

AzureBatchStep

Creates an Azure ML Pipeline step for submitting jobs to Azure Batch.

Note: This step does not support upload/download of directories and their contents.

For an example of using AzureBatchStep, see the notebook https://aka.ms/pl-azbatch.

DatabricksStep

Creates an Azure ML Pipeline step to add a Databricks notebook, Python script, or JAR as a node.

For an example of using DatabricksStep, see the notebook https://aka.ms/pl-databricks.

DataTransferStep

Creates an Azure ML Pipeline step that transfers data between storage options.

This step supports the following storage types as sources and sinks except where noted:

  • Azure Blob Storage

  • Azure Data Lake Storage Gen1 and Gen2

  • Azure SQL Database

  • Azure Database for PostgreSQL

  • Azure Database for MySQL

For an example of using DataTransferStep, see the notebook https://aka.ms/pl-data-trans.

EstimatorStep

Creates an Azure ML Pipeline step to run an Estimator for Machine Learning model training.

For an example of using EstimatorStep, see the notebook https://aka.ms/pl-estimator.

Supported runconfig property names for runconfig_pipeline_params: 'NodeCount', 'MpiProcessCountPerNode', 'TensorflowWorkerCount', 'TensorflowParameterServerCount'

HyperDriveStep

Creates an Azure ML Pipeline step to run hyperparameter tuning for Machine Learning model training.

For an example of using HyperDriveStep, see the notebook https://aka.ms/pl-hyperdrive.

HyperDriveStepRun

Manage, check status, and retrieve run details for a HyperDriveStep pipeline step.

HyperDriveStepRun provides the functionality of HyperDriveRun with the additional support of StepRun. The HyperDriveStepRun class enables you to manage, check status, and retrieve run details for the HyperDrive run and each of its generated child runs. The StepRun class enables you to do this once the parent pipeline run is submitted and the pipeline has submitted the step run.
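A minimal sketch of that workflow, assuming a pipeline run that has already been submitted and a HyperDriveStep whose name is known. The helper name best_run_for_hyperdrive_step is hypothetical; find_step_run and get_best_run_by_primary_metric are used here as they appear on PipelineRun and HyperDriveRun.

```python
def best_run_for_hyperdrive_step(pipeline_run, step_name):
    """Return the best child run of a HyperDriveStep in a submitted pipeline run."""
    # The step run only exists once the pipeline has submitted that step.
    step_runs = pipeline_run.find_step_run(step_name)
    if not step_runs:
        raise ValueError(
            f"no step run named {step_name!r} has been submitted yet"
        )

    # Imported lazily so the sketch can be loaded without the SDK installed.
    from azureml.pipeline.steps import HyperDriveStepRun

    # Wrap the generic StepRun to gain the HyperDriveRun functionality.
    hd_step_run = HyperDriveStepRun(step_run=step_runs[0])
    return hd_step_run.get_best_run_by_primary_metric()
```

The early check reflects the ordering described above: the step run cannot be looked up until the parent pipeline run has submitted it.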

ModuleStep

Creates an Azure Machine Learning pipeline step to run a specific version of a Module.

Module objects define reusable computations, such as scripts or executables, that can be used in different machine learning scenarios and by different users. To use a specific version of a Module in a pipeline, create a ModuleStep. A ModuleStep is a step in a pipeline that uses an existing ModuleVersion.

For an example of using ModuleStep, see the notebook https://aka.ms/pl-modulestep.

MpiStep

Creates an Azure ML pipeline step to run an MPI job.

For an example of using MpiStep, see the notebook https://aka.ms/pl-style-trans.

PythonScriptStep

Creates an Azure ML Pipeline step that runs a Python script.

For an example of using PythonScriptStep, see the notebook https://aka.ms/pl-get-started.

Supported runconfig property names for runconfig_pipeline_params: 'NodeCount', 'MpiProcessCountPerNode', 'TensorflowWorkerCount', 'TensorflowParameterServerCount'

RScriptStep

Creates an Azure ML Pipeline step that runs an R script.

Supported runconfig property names for runconfig_pipeline_params: 'NodeCount', 'MpiProcessCountPerNode', 'TensorflowWorkerCount', 'TensorflowParameterServerCount'