Use batch endpoints (preview) for batch scoring
Learn how to use batch endpoints (preview) to do batch scoring. Batch endpoints simplify the process of hosting your models for batch scoring, so you can focus on machine learning, not infrastructure. For more information, see What are Azure Machine Learning endpoints (preview)?.
In this article, you learn to do the following tasks:
- Create a batch endpoint and a default batch deployment
- Start a batch scoring job using Azure CLI
- Monitor batch scoring job execution progress and check scoring results
- Deploy a new MLflow model with auto-generated code and environment to an existing endpoint without impacting the existing flow
- Test the new deployment and set it as the default deployment
- Delete the endpoint and deployment that are no longer in use
Important
This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.
Prerequisites
- You must have an Azure subscription to use Azure Machine Learning. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.
- Install the Azure CLI and the ml extension. Follow the installation steps in Install, set up, and use the CLI (v2) (preview).
- Create an Azure resource group if you don't have one. You (or the service principal you use) must have Contributor permission on it. For resource group creation, see Install, set up, and use the CLI (v2) (preview).
- Create an Azure Machine Learning workspace if you don't have one. For workspace creation, see Install, set up, and use the CLI (v2) (preview).
- Configure your default workspace and resource group for the Azure CLI. Machine Learning CLI commands require the --workspace/-w and --resource-group/-g parameters. Configuring the defaults lets you avoid passing in the values multiple times. You can override these on the command line. Run the following code to set up your defaults. For more information, see Install, set up, and use the CLI (v2) (preview).
az account set -s "<subscription ID>"
az configure --defaults group="<resource group>" workspace="<workspace name>" location="<location>"
Clone the example repository
Run the following commands to clone the AzureML Example repository and go to the cli directory. This article uses the assets in /cli/endpoints/batch, and the end-to-end working example is /cli/batch-score.sh.
git clone https://github.com/Azure/azureml-examples
cd azureml-examples/cli
Set your endpoint name. Replace YOUR_ENDPOINT_NAME with a unique name within an Azure region.
For Unix, run this command:
export ENDPOINT_NAME="<YOUR_ENDPOINT_NAME>"
For Windows, run this command:
set ENDPOINT_NAME="<YOUR_ENDPOINT_NAME>"
Note
Batch endpoint names need to be unique within an Azure region. For example, there can be only one batch endpoint with the name mybatchendpoint in westus2.
Create compute
Batch endpoints run only on cloud computing resources, not locally. The cloud computing resource is a reusable virtual computer cluster. Run the following code to create an Azure Machine Learning compute cluster. The examples that follow in this article use the compute created here, named batch-cluster. Adjust as needed and reference your compute using azureml:<your-compute-name>.
az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 5
Note
You are not charged for compute at this point, because the cluster remains at 0 nodes until a batch endpoint is invoked and a batch scoring job is submitted. Learn more about how to manage and optimize cost for AmlCompute.
Understand batch endpoints and batch deployments
A batch endpoint is an HTTPS endpoint that clients can call to trigger a batch scoring job. A batch scoring job is a job that scores multiple inputs (for more, see What are batch endpoints?). A batch deployment is a set of compute resources hosting the model that does the actual batch scoring. One batch endpoint can have multiple batch deployments.
Tip
One of the batch deployments will serve as the default deployment for the endpoint. The default deployment will be used to do the actual batch scoring when the endpoint is invoked. Learn more about batch endpoints and batch deployment.
The following YAML file defines a batch endpoint, which you can include in the CLI command for batch endpoint creation. In the repository, this file is located at /cli/endpoints/batch/batch-endpoint.yml.
$schema: https://azuremlschemas.azureedge.net/latest/batchEndpoint.schema.json
name: mybatchedp
description: my sample batch endpoint
auth_mode: aad_token
The following table describes the key properties of the endpoint YAML. For the full batch endpoint YAML schema, see CLI (v2) batch endpoint YAML schema.
| Key | Description |
|---|---|
| $schema | [Optional] The YAML schema. You can view the schema in the above example in a browser to see all available options for a batch endpoint YAML file. |
| name | The name of the batch endpoint. Needs to be unique at the Azure region level. |
| auth_mode | The authentication method for the batch endpoint. Currently only Azure Active Directory token-based authentication (aad_token) is supported. |
| defaults.deployment_name | The name of the deployment that will serve as the default deployment for the endpoint. |
To create a batch deployment, you need all the following items:
- Model files, or a registered model in your workspace referenced using azureml:<model-name>:<model-version>.
- The code to score the model.
- The environment in which the model runs. It can be a Docker image with Conda dependencies, or an environment already registered in your workspace referenced using azureml:<environment-name>:<environment-version>.
- The pre-created compute referenced using azureml:<compute-name> and resource settings.
For more information about how to reference an Azure ML entity, see Referencing an Azure ML entity.
The example repository contains all the required files. The following YAML file defines a batch deployment with all the required inputs and optional settings. You can include this file in your CLI command to create your batch deployment. In the repository, this file is located at /cli/endpoints/batch/nonmlflow-deployment.yml.
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: nonmlflowdp
endpoint_name: mybatchedp
model:
  local_path: ./mnist/model/
code_configuration:
  code:
    local_path: ./mnist/code/
  scoring_script: digit_identification.py
environment:
  conda_file: ./mnist/environment/conda.yml
  image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest
compute: azureml:batch-cluster
resources:
  instance_count: 1
max_concurrency_per_instance: 2
mini_batch_size: 10
output_action: append_row
output_file_name: predictions.csv
retry_settings:
  max_retries: 3
  timeout: 30
error_threshold: -1
logging_level: info
The following table describes the key properties of the deployment YAML. For the full batch deployment YAML schema, see CLI (v2) batch deployment YAML schema.
| Key | Description |
|---|---|
| $schema | [Optional] The YAML schema. You can view the schema in the above example in a browser to see all available options for a batch deployment YAML file. |
| name | The name of the deployment. |
| endpoint_name | The name of the endpoint to create the deployment under. |
| model | The model to be used for batch scoring. The example defines a model inline using local_path. Model files will be automatically uploaded and registered with an autogenerated name and version. Follow the Model schema for more options. As a best practice for production scenarios, you should create the model separately and reference it here. To reference an existing model, use the azureml:<model-name>:<model-version> syntax. |
| code_configuration.code.local_path | The directory that contains all the Python source code to score the model. |
| code_configuration.scoring_script | The Python file in the above directory. This file must have an init() function and a run() function. Use the init() function for any costly or common preparation (for example, loading the model into memory). init() is called only once, at the beginning of the process. Use run(mini_batch) to score each entry; the value of mini_batch is a list of file paths. The run() function should return a pandas DataFrame or an array. Each returned element indicates one successful run of an input element in the mini_batch. Make sure that enough data is included in your run() response to correlate the input with the output. |
| environment | The environment to score the model. The example defines an environment inline using conda_file and image. The conda_file dependencies will be installed on top of the image. The environment will be automatically registered with an autogenerated name and version. Follow the Environment schema for more options. As a best practice for production scenarios, you should create the environment separately and reference it here. To reference an existing environment, use the azureml:<environment-name>:<environment-version> syntax. |
| compute | The compute to run batch scoring. The example uses the batch-cluster created at the beginning and references it using the azureml:<compute-name> syntax. |
| resources.instance_count | The number of instances to be used for each batch scoring job. |
| max_concurrency_per_instance | [Optional] The maximum number of parallel scoring_script runs per instance. |
| mini_batch_size | [Optional] The number of files the scoring_script can process in one run() call. |
| output_action | [Optional] How the output should be organized in the output file. append_row will merge all run() returned output results into one single file named output_file_name. summary_only will not merge the output results and only calculate error_threshold. |
| output_file_name | [Optional] The name of the batch scoring output file for append_row output_action. |
| retry_settings.max_retries | [Optional] The maximum number of retries for a failed scoring_script run(). |
| retry_settings.timeout | [Optional] The timeout in seconds for a scoring_script run() for scoring a mini batch. |
| error_threshold | [Optional] The number of input file scoring failures that should be ignored. If the error count for the entire input goes above this value, the batch scoring job will be terminated. The example uses -1, which indicates that any number of failures is allowed without terminating the batch scoring job. |
| logging_level | [Optional] Log verbosity. Values in increasing verbosity are: WARNING, INFO, and DEBUG. |
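The error_threshold rule in the table can be read as a simple comparison. The following Python sketch only restates that description; the function name should_terminate is hypothetical, and the service's actual implementation is not shown in this article.

```python
def should_terminate(error_count: int, error_threshold: int) -> bool:
    """Restate the error_threshold rule from the table (illustrative only)."""
    if error_threshold == -1:
        # -1 means any number of failures is allowed.
        return False
    # The job is terminated once the error count goes above the threshold.
    return error_count > error_threshold


print(should_terminate(error_count=4, error_threshold=3))
print(should_terminate(error_count=4, error_threshold=-1))
```

Under this reading, a threshold of 3 tolerates exactly three failed input files and stops the job on the fourth.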
Understand the scoring script
As mentioned earlier, the code_configuration.scoring_script must contain two functions:
- init(): Use this function for any costly or common preparation. For example, use it to load the model into a global object. This function will be called once at the beginning of the process.
- run(mini_batch): This function will be called for each mini_batch and do the actual scoring.
  - mini_batch: The mini_batch value is a list of file paths.
  - response: The run() method should return a pandas DataFrame or an array. Each returned output element indicates one successful run of an input element in the input mini_batch. Make sure that enough data (for example, an identifier of each input element) is included in the run() response to correlate an input with an output result.
The example uses /cli/endpoints/batch/mnist/code/digit_identification.py. The model is loaded in init() from AZUREML_MODEL_DIR, which is the path to the model folder created during deployment. run(mini_batch) iterates over each file in mini_batch, scores it with the model, and then returns the output results.
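To make the init() and run(mini_batch) contract concrete, here is a minimal sketch of a scoring script. It is not the repository's digit_identification.py. The "model" is a hypothetical placeholder that derives a digit from the file size, and returning a plain Python list as the array result is an assumption.

```python
import os

# Global state populated once by init(); a stand-in for a real model object.
model = None


def init():
    """Called once at the beginning of the process: do costly setup here."""
    global model
    # AZUREML_MODEL_DIR points to the model folder created during deployment.
    # The "." fallback is an assumption for local experiments only.
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")
    # Hypothetical placeholder: a real script would load model files from model_dir.
    model = {"model_dir": model_dir, "predict": lambda data: len(data) % 10}


def run(mini_batch):
    """Called once per mini batch; mini_batch is a list of file paths."""
    results = []
    for file_path in mini_batch:
        with open(file_path, "rb") as f:
            data = f.read()
        prediction = model["predict"](data)
        # Include the file name so each output row can be correlated with its input.
        results.append(f"{os.path.basename(file_path)}: {prediction}")
    return results
```

Each returned element carries the input file name, which is one way to satisfy the requirement that outputs can be traced back to inputs.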
Deploy with batch endpoints and run batch scoring
Now, let's deploy the model with batch endpoints and run batch scoring.
Create a batch endpoint
The simplest way to create a batch endpoint is to run the following code providing only a --name.
az ml batch-endpoint create --name $ENDPOINT_NAME
You can also create a batch endpoint using a YAML file. Add the --file parameter to the above command and specify the YAML file path.
Create a batch deployment
Run the following code to create a batch deployment named nonmlflowdp under the batch endpoint and set it as the default deployment.
az ml batch-deployment create --name nonmlflowdp --endpoint-name $ENDPOINT_NAME --file endpoints/batch/nonmlflow-deployment.yml --set-default
Tip
The --set-default parameter sets the newly created deployment as the default deployment of the endpoint. It's a convenient way to create a new default deployment of the endpoint, especially for the first deployment creation. As a best practice for production scenarios, you may want to create a new deployment without setting it as default, verify it, and update the default deployment later. For more information, see the Deploy a new model section.
Check batch endpoint and deployment details
Use show to check endpoint and deployment details.
To check a batch deployment, run the following code:
az ml batch-deployment show --name nonmlflowdp --endpoint-name $ENDPOINT_NAME
To check a batch endpoint, run the following code. As the newly created deployment is set as the default deployment, you should see nonmlflowdp in defaults.deployment_name from the response.
az ml batch-endpoint show --name $ENDPOINT_NAME
Invoke the batch endpoint to start a batch scoring job
Invoking a batch endpoint triggers a batch scoring job. A job name is returned in the invoke response and can be used to track the batch scoring progress. The batch scoring job runs for a period of time. It splits the entire input into multiple mini batches and processes them in parallel on the compute cluster. Each scoring_script run() call takes one mini_batch and is handled by one process on an instance. The batch scoring job outputs are stored in cloud storage, either in the workspace's default blob storage or in the storage you specified.
Invoke the batch endpoint with different input options
You can use either the CLI or REST to invoke the endpoint. For the REST experience, see Use batch endpoints (preview) with REST.
There are three options to specify the data inputs in CLI invoke.
Option 1: Data in the cloud
Use --input-path to specify a folder (use prefix folder:) or a file (use prefix file:) in an Azure Machine Learning registered datastore. The syntax for the data URI is folder:azureml://datastores/<datastore-name>/paths/<data-path>/ for a folder, and file:azureml://datastores/<datastore-name>/paths/<data-path>/<file-name> for a specific file. For more information about data URI, see Azure Machine Learning data reference URI.
The example uses publicly available data in a folder from https://pipelinedata.blob.core.windows.net/sampledata/mnist, which contains thousands of hand-written digits. The name of the batch scoring job is returned in the invoke response. Run the following code to invoke the batch endpoint using this data. --query name is added to return only the job name from the invoke response, and it will be used later to Monitor batch scoring job execution progress and Check batch scoring results. Remove --query name -o tsv if you want to see the full invoke response. For more information on the --query parameter, see Query Azure CLI command output.
JOB_NAME=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-path folder:https://pipelinedata.blob.core.windows.net/sampledata/mnist --query name -o tsv)
Option 2: Registered dataset
Use --input-dataset to pass in an Azure Machine Learning registered dataset. To create a dataset, check az ml dataset create -h for instructions, and follow the Dataset schema.
Note
FileDataset that is created using the preceding version of the CLI and Python SDK can also be used. TabularDataset is not supported.
az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-dataset azureml:<dataset-name>:<dataset-version>
Option 3: Data stored locally
Use --input-local-path to pass in data files stored locally. The data files will be automatically uploaded and registered with an autogenerated name and version.
az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-local-path <local-path>
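The datastore URI syntax from Option 1 can be assembled with a small helper. The following Python sketch is illustrative only: the function name datastore_uri is hypothetical and is not part of the CLI or any SDK. It just follows the folder: and file: patterns given above.

```python
def datastore_uri(datastore_name, data_path, file_name=None):
    """Build a data URI following the folder:/file: patterns described above."""
    # Strip stray slashes so the path joins cleanly.
    base = f"azureml://datastores/{datastore_name}/paths/{data_path.strip('/')}/"
    if file_name is None:
        # No file name: reference the whole folder.
        return "folder:" + base
    # A file name was given: reference that specific file.
    return "file:" + base + file_name


print(datastore_uri("workspaceblobstore", "mnist"))
print(datastore_uri("workspaceblobstore", "nytaxi", "taxi-tip-data.csv"))
```

The resulting string is what you would pass as the value of --input-path.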
Configure the output location and overwrite settings
The batch scoring results are by default stored in the workspace's default blob store within a folder named by job name (a system-generated GUID). You can configure where to store the scoring outputs when you invoke the batch endpoint. Use --output-path to specify any folder (with the folder: prefix) in an Azure Machine Learning registered datastore. The syntax for --output-path folder: is the same as for --input-path folder:. Use --set output_file_name=<your-file-name> to configure a new output file name if you prefer having one output file containing all scoring results (that is, you specified output_action=append_row in your deployment YAML).
Important
You must use a unique output location. If the output file exists, the batch scoring job will fail.
Some settings can be overwritten when you invoke the endpoint, to make the best use of the compute resources and to improve performance:
- Use --instance-count to overwrite instance_count. For example, for a larger volume of data inputs, you may want to use more instances to speed up the end-to-end batch scoring.
- Use --mini-batch-size to overwrite mini_batch_size. The number of mini batches is decided by the total input file count and mini_batch_size. A smaller mini_batch_size generates more mini batches. Mini batches can be run in parallel, but there might be extra scheduling and invocation overhead.
- Use --set to overwrite other settings, including max_retries, timeout, and error_threshold. These settings might impact the end-to-end batch scoring time for different workloads.
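The trade-off between mini_batch_size and parallelism can be estimated with simple arithmetic. The Python sketch below assumes the input files are split into even chunks (ceiling division) and that the number of concurrent run() calls is instance_count times max_concurrency_per_instance. Both are assumptions for illustration, and both function names are hypothetical.

```python
import math


def mini_batch_count(total_files: int, mini_batch_size: int) -> int:
    """Estimate how many mini batches a job produces (assumes even chunking)."""
    if mini_batch_size <= 0:
        raise ValueError("mini_batch_size must be positive")
    return math.ceil(total_files / mini_batch_size)


def parallel_slots(instance_count: int, max_concurrency_per_instance: int) -> int:
    """Estimate how many run() calls can execute at the same time."""
    return instance_count * max_concurrency_per_instance


# 1,000 input files with mini_batch_size 20, on 5 instances with 2 processes each.
print(mini_batch_count(1000, 20), parallel_slots(5, 2))
```

Comparing the two numbers gives a rough sense of how many rounds of scheduling a job needs, which is where the extra overhead of very small mini batches shows up.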
To specify the output location and overwrite settings when you invoke the endpoint, run the following code. The example stores the outputs in a folder with the same name as the endpoint in the workspace's default blob storage, and uses a random file name to keep the output location unique. The code should work in Unix. Replace the folder and file name with your own unique values.
export OUTPUT_FILE_NAME=predictions_`echo $RANDOM`.csv
JOB_NAME=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --input-path folder:https://pipelinedata.blob.core.windows.net/sampledata/mnist --output-path folder:azureml://datastores/workspaceblobstore/paths/$ENDPOINT_NAME --set output_file_name=$OUTPUT_FILE_NAME --mini-batch-size 20 --instance-count 5 --query name -o tsv)
Monitor batch scoring job execution progress
Batch scoring jobs usually take some time to process the entire set of inputs.
You can use CLI job show to view the job. Run the following code to check job status from the previous endpoint invoke. To learn more about job commands, run az ml job -h.
STATUS=$(az ml job show -n $JOB_NAME --query status -o tsv)
echo $STATUS
if [[ $STATUS == "Completed" ]]
then
  echo "Job completed"
elif [[ $STATUS == "Failed" ]]
then
  echo "Job failed"
  exit 1
else
  echo "Job status not failed or completed"
  exit 2
fi
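The script above checks the status once. If you want to wait until the job finishes, a polling loop is one option. The following Python sketch wraps the same az ml job show command. The helper names, the polling interval, and the "Canceled" terminal status are assumptions for illustration; "Completed" and "Failed" are the statuses used in the script above.

```python
import subprocess
import time

# "Completed" and "Failed" come from the script above; "Canceled" is an assumption.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def az_job_status(job_name):
    """Return the job status by calling the same CLI command as the script above."""
    result = subprocess.run(
        ["az", "ml", "job", "show", "-n", job_name, "--query", "status", "-o", "tsv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def wait_for_job(job_name, get_status=az_job_status, interval_seconds=30.0, max_polls=120):
    """Poll until the job reaches a terminal status or max_polls is exhausted."""
    for _ in range(max_polls):
        status = get_status(job_name)
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval_seconds)
    raise TimeoutError(f"Job {job_name} did not finish after {max_polls} polls")
```

Calling wait_for_job with the job name returned by the invoke command returns the final status as a string. The get_status parameter exists so that the loop can be exercised without calling the CLI.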
Check batch scoring results
Follow the steps below to view the scoring results in Azure Storage Explorer when the job is completed:
1. Run the following code to open the batch scoring job in Azure Machine Learning studio. The job studio link is also included in the response of invoke, as the value of interactionEndpoints.Studio.endpoint.
az ml job show -n $JOB_NAME --web
2. In the graph of the run, select the batchscoring step.
3. Select the Outputs + logs tab and then select Show data outputs.
4. From Data outputs, select the icon to open Storage Explorer.
The scoring results in Storage Explorer are similar to the following sample page:
Deploy a new model
Once you have a batch endpoint, you can continue to refine your model and add new deployments.
Create a new batch deployment hosting an MLflow model
To create a new batch deployment under the existing batch endpoint but not set it as the default deployment, run the following code:
az ml batch-deployment create --name mlflowdp --endpoint-name $ENDPOINT_NAME --file endpoints/batch/mlflow-deployment.yml
Notice that --set-default is not used. If you show the batch endpoint again, you should see no change to defaults.deployment_name.
The example uses a model (/cli/endpoints/batch/autolog_nyc_taxi) trained and tracked with MLflow. The scoring_script and environment can be auto-generated from the model's metadata, so you don't need to specify them in the YAML file. For more about MLflow, see Train and track ML models with MLflow and Azure Machine Learning (preview).
Below is the YAML file the example uses to deploy an MLflow model, which contains only the minimum required properties. The source file in the repository is /cli/endpoints/batch/mlflow-deployment.yml.
$schema: https://azuremlschemas.azureedge.net/latest/batchDeployment.schema.json
name: mlflowdp
endpoint_name: mybatchedp
model:
  local_path: ./autolog_nyc_taxi
compute: azureml:batch-cluster
Note
Auto-generation of scoring_script and environment supports only the Python Function model flavor and column-based model signatures.
Test a non-default batch deployment
To test the new non-default deployment, run the following code. The example uses a different model that accepts a publicly available csv file from https://pipelinedata.blob.core.windows.net/sampledata/nytaxi/taxi-tip-data.csv.
JOB_NAME=$(az ml batch-endpoint invoke --name $ENDPOINT_NAME --deployment-name mlflowdp --input-path file:https://pipelinedata.blob.core.windows.net/sampledata/nytaxi/taxi-tip-data.csv --query name -o tsv)
az ml job show -n $JOB_NAME --web
az ml job stream -n $JOB_NAME
STATUS=$(az ml job show -n $JOB_NAME --query status -o tsv)
echo $STATUS
if [[ $STATUS == "Completed" ]]
then
  echo "Job completed"
elif [[ $STATUS == "Failed" ]]
then
  echo "Job failed"
  exit 1
else
  echo "Job status not failed or completed"
  exit 2
fi
Notice --deployment-name is used to specify the new deployment name. This parameter allows you to invoke a non-default deployment, and it will not update the default deployment of the batch endpoint.
Update the default batch deployment
To update the default batch deployment of the endpoint, run the following code:
az ml batch-endpoint update --name $ENDPOINT_NAME --defaults deployment_name=mlflowdp
Now, if you show the batch endpoint again, you should see defaults.deployment_name is set to mlflowdp. You can invoke the batch endpoint directly without the --deployment-name parameter.
(Optional) Update the deployment
If you want to update the deployment (for example, update code, model, environment, or settings), update the YAML file, and then run az ml batch-deployment update. You can also update without the YAML file by using --set. Check az ml batch-deployment update -h for more information.
Delete the batch endpoint and the deployment
If you aren't going to use the old batch deployment, you should delete it by running the following code. --yes is used to confirm the deletion.
az ml batch-deployment delete --name nonmlflowdp --endpoint-name $ENDPOINT_NAME --yes
Run the following code to delete the batch endpoint and all the underlying deployments. Batch scoring jobs will not be deleted.
az ml batch-endpoint delete --name $ENDPOINT_NAME --yes