Deploy existing pipeline jobs to batch endpoints

11/15/2023

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

Batch endpoints allow you to deploy pipeline components, providing a convenient way to operationalize pipelines in Azure Machine Learning. However, if you already have a pipeline job that runs successfully, Azure Machine Learning can accept that job as input to your batch endpoint and create the pipeline component automatically for you. In this article, you'll learn how to use your existing pipeline job as input for batch deployment.

You'll learn to:

Create and run the pipeline job that you want to deploy
Create a batch deployment from the existing job
Test the deployment

About this example

In this example, we're going to deploy a pipeline consisting of a simple command job that prints "hello world!". Instead of registering the pipeline component before deployment, we indicate an existing pipeline job to use for deployment. Azure Machine Learning will then create the pipeline component automatically and deploy it as a batch endpoint pipeline component deployment.

The example in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, first clone the repo and then change directories to the folder:

Azure CLI
Python

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/sdk/python

The files for this example are in:

cd endpoints/batch/deploy-pipelines/hello-batch

Prerequisites

Before following the steps in this article, make sure you have the following prerequisites:

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.
An Azure Machine Learning workspace. If you don't have one, use the steps in the Manage Azure Machine Learning workspaces article to create one.
Ensure that you have the following permissions in the workspace:
- Create or manage batch endpoints and deployments: Use an Owner, Contributor, or Custom role that allows Microsoft.MachineLearningServices/workspaces/batchEndpoints/*.
- Create ARM deployments in the workspace resource group: Use an Owner, Contributor, or Custom role that allows Microsoft.Resources/deployments/write in the resource group where the workspace is deployed.
You need to install the following software to work with Azure Machine Learning:
- Azure CLI
- Python
The Azure CLI and the ml extension for Azure Machine Learning.
```
az extension add -n ml
```
Note

Pipeline component deployments for Batch Endpoints were introduced in version 2.7 of the ml extension for Azure CLI. Use az extension update --name ml to get the latest version.
The Azure Machine Learning SDK for Python.
```
pip install azure-ai-ml
```
Note

Classes ModelBatchDeployment and PipelineComponentBatchDeployment were introduced in version 1.7.0 of the SDK. Use pip install -U azure-ai-ml to get the latest version.
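
If you're not sure whether your environment meets the minimum SDK version mentioned in the note above, a check along the following lines can help. This is a minimal sketch using only the Python standard library. The helper names (parse_version, meets_minimum, installed_sdk_version) are illustrative and not part of the SDK, and the comparison looks only at the numeric part of the version, so a pre-release such as 1.7.0b1 is treated like 1.7.0.

```python
from importlib import metadata


def parse_version(text):
    """Turn a dotted version such as '1.7.0' into a tuple of ints: (1, 7, 0).

    Parsing stops at the first piece that doesn't start with a digit, so
    suffixes like 'b1' or 'post1' are ignored (a simplification).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="1.7.0"):
    """Return True when the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad the shorter tuple with zeros so that '1.7' compares equal to '1.7.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_sdk_version(package="azure-ai-ml"):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_sdk_version()
if version is None:
    print("azure-ai-ml is not installed")
elif not meets_minimum(version):
    print(f"azure-ai-ml {version} is older than 1.7.0; run pip install -U azure-ai-ml")
else:
    print(f"azure-ai-ml {version} is recent enough")
```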

Connect to your workspace

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, we'll connect to the workspace in which you'll perform deployment tasks.

Azure CLI
Python

Pass in the values for your subscription ID, workspace, location, and resource group in the following code:

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

Import the required libraries:

from azure.ai.ml import MLClient, Input, load_component
from azure.ai.ml.entities import BatchEndpoint, ModelBatchDeployment, ModelBatchDeploymentSettings, PipelineComponentBatchDeployment, Model, AmlCompute, Data, BatchRetrySettings, CodeConfiguration, Environment
from azure.ai.ml.constants import AssetTypes, BatchDeploymentOutputAction
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential

Configure the workspace details and get a handle to the workspace:

Pass in the values for your subscription ID, workspace, and resource group in the following code:

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
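
If you prefer not to hard-code the workspace details, one option is to read them from environment variables before creating the client. The following is only a sketch: the environment variable names and the WorkspaceSettings and load_workspace_settings helpers are hypothetical choices for illustration, not names that the SDK defines or reads on its own.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkspaceSettings:
    subscription_id: str
    resource_group: str
    workspace: str


# Hypothetical variable names, chosen for this example only.
ENV_NAMES = {
    "subscription_id": "AZURE_SUBSCRIPTION_ID",
    "resource_group": "AZURE_RESOURCE_GROUP",
    "workspace": "AZUREML_WORKSPACE_NAME",
}


def load_workspace_settings(env=None):
    """Read the three workspace values from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    missing = [var for var in ENV_NAMES.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return WorkspaceSettings(**{field: env[var] for field, var in ENV_NAMES.items()})


# Example with an explicit mapping, so the sketch runs without any variables set.
settings = load_workspace_settings(
    {
        "AZURE_SUBSCRIPTION_ID": "<subscription>",
        "AZURE_RESOURCE_GROUP": "<resource-group>",
        "AZUREML_WORKSPACE_NAME": "<workspace>",
    }
)
print(settings)
```

The resulting values can then be passed to MLClient in the same order as shown above: subscription ID, resource group, and workspace.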

Run the pipeline job you want to deploy

In this section, we begin by running a pipeline job:

Azure CLI
Python

The following pipeline-job.yml file contains the configuration for the pipeline job:

pipeline-job.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

experiment_name: hello-pipeline-batch
display_name: hello-pipeline-batch-job
description: This job demonstrates how to run a pipeline component in a pipeline job. You can use this example to test a component in a standalone job before deploying it in an endpoint.

compute: batch-cluster
component: hello-component/hello.yml

Load the pipeline component and instantiate it:

hello_batch = load_component(source="hello-component/hello.yml")
pipeline_job = hello_batch()

Now, configure some run settings to run the test. This article assumes you have a compute cluster named batch-cluster. If your cluster has a different name, use that name instead.

pipeline_job.settings.default_compute = "batch-cluster"
pipeline_job.settings.default_datastore = "workspaceblobstore"

Create the pipeline job:

Azure CLI
Python

JOB_NAME=$(az ml job create -f pipeline-job.yml --query name -o tsv)

pipeline_job_run = ml_client.jobs.create_or_update(
    pipeline_job, experiment_name="hello-batch-pipeline"
)
pipeline_job_run

Create a batch endpoint

Before we deploy the pipeline job, we need to create a batch endpoint to host the deployment.

Provide a name for the endpoint. A batch endpoint's name needs to be unique in each region since the name is used to construct the invocation URI. To ensure uniqueness, append any trailing characters to the name specified in the following code.
- Azure CLI
- Python
```
ENDPOINT_NAME="hello-batch"
```
```
endpoint_name="hello-batch"
```
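
One way to satisfy the uniqueness requirement is to append a short random suffix to the base name. The following is a minimal sketch rather than a required step. The helper name unique_endpoint_name and the five-character suffix are illustrative choices, and limiting the suffix to lowercase letters and digits is a conservative assumption about which characters are safe in an endpoint name.

```python
import secrets
import string


def unique_endpoint_name(base="hello-batch", suffix_length=5):
    """Append a random lowercase alphanumeric suffix to the base name."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    return f"{base}-{suffix}"


endpoint_name = unique_endpoint_name()
print(endpoint_name)
```

If you use a generated name, pass the same value wherever this article refers to the endpoint name.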

Configure the endpoint:

Azure CLI
Python

The endpoint.yml file contains the endpoint's configuration.

endpoint.yml

$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: hello-batch
description: A hello world endpoint for component deployments.
auth_mode: aad_token

endpoint = BatchEndpoint(
    name=endpoint_name,
    description="A hello world endpoint for component deployments",
)

Create the endpoint:

Azure CLI
Python

az ml batch-endpoint create --name $ENDPOINT_NAME -f endpoint.yml

ml_client.batch_endpoints.begin_create_or_update(endpoint).result()

Query the endpoint URI:

Azure CLI
Python

az ml batch-endpoint show --name $ENDPOINT_NAME

endpoint = ml_client.batch_endpoints.get(name=endpoint_name)
print(endpoint)

Deploy the pipeline job

To deploy the pipeline component, we have to create a batch deployment from the existing job.

We need to tell Azure Machine Learning the name of the job that we want to deploy. In our case, that job is indicated in the following variable:
- Azure CLI
- Python
```
echo $JOB_NAME
```
```
print(pipeline_job_run.name)
```

Configure the deployment.

Azure CLI
Python

The deployment-from-job.yml file contains the deployment's configuration. Notice how we use the key job_definition instead of component to indicate that this deployment is created from a pipeline job:

deployment-from-job.yml

$schema: https://azuremlschemas.azureedge.net/latest/pipelineComponentBatchDeployment.schema.json
name: hello-batch-from-job
endpoint_name: hello-pipeline-batch
type: pipeline
job_definition: azureml:job_name_placeholder
settings:
    continue_on_step_failure: false
    default_compute: batch-cluster

Notice how we use the property job_definition instead of component:

deployment = PipelineComponentBatchDeployment(
    name="hello-batch-from-job",
    description="A hello world deployment with a single step. This deployment is created from a pipeline job.",
    endpoint_name=endpoint.name,
    job_definition=pipeline_job_run,
    settings={
        "default_compute": "batch-cluster",
        "continue_on_step_failure": False
    }
)

Tip

This configuration assumes you have a compute cluster named batch-cluster. You can replace this value with the name of your cluster.

Create the deployment:
- Azure CLI
- Python
Run the following code to create a batch deployment under the batch endpoint and set it as the default deployment.
```
az ml batch-deployment create --endpoint $ENDPOINT_NAME --set job_definition=azureml:$JOB_NAME -f deployment-from-job.yml
```
Tip

Notice the use of --set job_definition=azureml:$JOB_NAME. Because job names are unique, the --set option is used here to replace the placeholder job name in the YAML file with the name of the job you ran in your workspace.
This command starts the deployment creation and returns a confirmation response while the deployment creation continues.
```
ml_client.batch_deployments.begin_create_or_update(deployment).result()
```
Once created, let's configure this new deployment as the default one:
```
endpoint = ml_client.batch_endpoints.get(endpoint.name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
```
Your deployment is ready for use.
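
If you script the Azure CLI step from Python, the job_definition override can be assembled as an argument list, which avoids shell quoting issues with the job name. The sketch below only builds and prints the command and doesn't run it. The function name deployment_create_command and the sample job name are hypothetical, while the flags are the ones used in the CLI command above.

```python
import shlex


def deployment_create_command(endpoint_name, job_name, config_file="deployment-from-job.yml"):
    """Build the argument list for creating the deployment from an existing job."""
    return [
        "az", "ml", "batch-deployment", "create",
        "--endpoint", endpoint_name,
        # Replaces the placeholder value of job_definition in the YAML file.
        "--set", f"job_definition=azureml:{job_name}",
        "-f", config_file,
    ]


# "my_job_name" is a stand-in for the value stored in JOB_NAME.
command = deployment_create_command("hello-batch", "my_job_name")
print(shlex.join(command))
```

The list can be passed to subprocess.run(command, check=True) if you want Python to execute the command.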

Test the deployment

Once the deployment is created, it's ready to receive jobs. You can invoke the default deployment as follows:

Azure CLI
Python

JOB_NAME=$(az ml batch-endpoint invoke -n $ENDPOINT_NAME --query name -o tsv)

job = ml_client.batch_endpoints.invoke(
    endpoint_name=endpoint.name,
)

You can monitor the progress of the job and stream the logs using:

Azure CLI
Python

az ml job stream -n $JOB_NAME

ml_client.jobs.get(name=job.name)

To wait for the job to finish, run the following code:

ml_client.jobs.stream(name=job.name)
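
If you'd rather wait without streaming logs, a simple polling loop is an alternative. The following sketch assumes that "Completed", "Failed", and "Canceled" are the terminal job statuses and takes the status lookup as a callable, so it runs here against a stub. The helper wait_for_job is illustrative and not part of the SDK.

```python
import time

# Assumed terminal statuses for a job; adjust if your jobs report others.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(get_status, poll_seconds=10.0, timeout_seconds=3600.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in status {status!r} after {timeout_seconds} seconds")
        sleep(poll_seconds)


# Stubbed status sequence so the sketch runs without a workspace.
fake_statuses = iter(["Queued", "Running", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_seconds=0)
print(final_status)
```

Against a real workspace, the callable could be lambda: ml_client.jobs.get(name=job.name).status, reusing the ml_client and job objects from the earlier steps.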

Clean up resources

Once you're done, delete the associated resources from the workspace:

Azure CLI
Python

Run the following code to delete the batch endpoint and its underlying deployment. --yes is used to confirm the deletion.

az ml batch-endpoint delete -n $ENDPOINT_NAME --yes

Delete the endpoint:

ml_client.batch_endpoints.begin_delete(endpoint.name).result()