Deploy models with REST (preview) for batch scoring

Learn how to use the Azure Machine Learning REST API to deploy models for batch scoring (preview).

Important

This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Note

This article utilizes the latest version of the CLI v2 that's in public preview. For guidance to update and install the latest version, see the Install and set up CLI (v2) doc.

The REST API uses standard HTTP verbs to create, retrieve, update, and delete resources. The REST API works with any language or tool that can make HTTP requests. REST's straightforward structure makes it a good choice in scripting environments and for MLOps automation.

In this article, you learn how to use the new REST APIs to:

  • Create machine learning assets
  • Create a batch endpoint and a batch deployment
  • Invoke a batch endpoint to start a batch scoring job

Prerequisites

Important

The code snippets in this article assume that you are using the Bash shell.

The code snippets are pulled from the /cli/batch-score-rest.sh file in the AzureML Example repository.

Set endpoint name

Note

Batch endpoint names need to be unique at the Azure region level. For example, there can be only one batch endpoint with the name mybatchendpoint in westus2.

export ENDPOINT_NAME=endpt-`echo $RANDOM`

Azure Machine Learning batch endpoints

Batch endpoints (preview) simplify the process of hosting your models for batch scoring, so you can focus on machine learning, not infrastructure. In this article, you'll create a batch endpoint and deployment, and invoking it to start a batch scoring job. But first you'll have to register the assets needed for deployment, including model, code, and environment.

There are many ways to create an Azure Machine Learning batch endpoint, including the Azure CLI, and visually with the studio. The following example creates a batch endpoint and deployment with the REST API.

Create machine learning assets

First, set up your Azure Machine Learning assets to configure your job.

In the following REST API calls, we use SUBSCRIPTION_ID, RESOURCE_GROUP, LOCATION, and WORKSPACE as placeholders. Replace the placeholders with your own values.

Administrative REST requests a service principal authentication token. Replace TOKEN with your own value. You can retrieve this token with the following command:

TOKEN=$(az account get-access-token --query accessToken -o tsv)

The service provider uses the api-version argument to ensure compatibility. The api-version argument varies from service to service. Set the API version as a variable to accommodate future versions:

API_VERSION="2021-10-01"

Create compute

Batch scoring runs only on cloud computing resources, not locally. The cloud computing resource is a reusable virtual computer cluster where you can run batch scoring workflows.

Create a compute cluster:

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/computes/batch-cluster?api-version=$API_VERSION" \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
    \"properties\":{
        \"computeType\": \"AmlCompute\",
        \"properties\": {
            \"osType\": \"Linux\",
            \"vmSize\": \"STANDARD_D2_V2\",
            \"scaleSettings\": {
                \"maxNodeCount\": 5,
                \"minNodeCount\": 0
            },
            \"remoteLoginPortPublicAccess\": \"NotSpecified\"
        },
    },
    \"location\": \"$LOCATION\"
}")

Tip

If you want to use an existing compute instead, you must specify the full Azure Resource Manager ID when creating the batch deployment. The full ID uses the format /subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/computes/<your-compute-name>.

Get storage account details

To register the model and code, first they need to be uploaded to a storage account. The details of the storage account are available in the data store. In this example, you get the default datastore and Azure Storage account for your workspace. Query your workspace with a GET request to get a JSON file with the information.

You can use the tool jq to parse the JSON result and get the required values. You can also use the Azure portal to find the same information:

# Get values for storage account
response=$(curl --location --request GET "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/datastores?api-version=$API_VERSION&isDefault=true" \
--header "Authorization: Bearer $TOKEN")
DATASTORE_PATH=$(echo $response | jq -r '.value[0].id')
BLOB_URI_PROTOCOL=$(echo $response | jq -r '.value[0].properties.protocol')
BLOB_URI_ENDPOINT=$(echo $response | jq -r '.value[0].properties.endpoint')
AZUREML_DEFAULT_DATASTORE=$(echo $response | jq -r '.value[0].name')
AZUREML_DEFAULT_CONTAINER=$(echo $response | jq -r '.value[0].properties.containerName')
AZURE_STORAGE_ACCOUNT=$(echo $response | jq -r '.value[0].properties.accountName')
export AZURE_STORAGE_ACCOUNT $(echo $AZURE_STORAGE_ACCOUNT)
STORAGE_RESPONSE=$(echo az storage account show-connection-string --name $AZURE_STORAGE_ACCOUNT)
AZURE_STORAGE_CONNECTION_STRING=$($STORAGE_RESPONSE | jq -r '.connectionString')

BLOB_URI_ROOT="$BLOB_URI_PROTOCOL://$AZURE_STORAGE_ACCOUNT.blob.$BLOB_URI_ENDPOINT/$AZUREML_DEFAULT_CONTAINER"

Upload & register code

Now that you have the datastore, you can upload the scoring script. Use the Azure Storage CLI to upload a blob into your default container:

az storage blob upload-batch -d $AZUREML_DEFAULT_CONTAINER/score -s endpoints/batch/mnist/code/  --connection-string $AZURE_STORAGE_CONNECTION_STRING

Tip

You can also use other methods to upload, such as the Azure portal or Azure Storage Explorer.

Once you upload your code, you can specify your code with a PUT request:

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/codes/score-mnist/versions/1?api-version=$API_VERSION" \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
  \"properties\": {
    \"description\": \"Score code\",
    \"codeUri\": \"$BLOB_URI_ROOT/score\"
  }
}")

Upload and register model

Similar to the code, Upload the model files:

az storage blob upload-batch -d $AZUREML_DEFAULT_CONTAINER/model -s endpoints/batch/mnist/model  --connection-string $AZURE_STORAGE_CONNECTION_STRING

Now, register the model:

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/models/mnist/versions/1?api-version=$API_VERSION" \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
    \"properties\": {
        \"modelUri\":\"azureml://subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/workspaces/$WORKSPACE/datastores/$AZUREML_DEFAULT_DATASTORE/paths/model\"
    }
}")

Create environment

The deployment needs to run in an environment that has the required dependencies. Create the environment with a PUT request. Use a docker image from Microsoft Container Registry. You can configure the docker image with image and add conda dependencies with condaFile.

Run the following code to read the condaFile defined in json. The source file is at /cli/endpoints/batch/mnist/environment/conda.json in the example repository:

CONDA_FILE=$(cat endpoints/batch/mnist/environment/conda.json | sed 's/"/\\"/g')

Now, run the following snippet to create an environment:

ENV_VERSION=$RANDOM
response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/environments/mnist-env/versions/$ENV_VERSION?api-version=$API_VERSION" \
--header "Authorization: Bearer $TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
    \"properties\":{
        \"condaFile\": $(echo \"$CONDA_FILE\"),
        \"image\": \"mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:latest\"
    }
}")

Deploy with batch endpoints

Next, create the batch endpoint, a deployment, and set the default deployment.

Create batch endpoint

Create the batch endpoint:

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/batchEndpoints/$ENDPOINT_NAME?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN" \
--data-raw "{
    \"properties\": {
        \"authMode\": \"aadToken\"
    },
    \"location\": \"$LOCATION\"
}")

Create batch deployment

Create a batch deployment under the endpoint:

DEPLOYMENT_NAME="nonmlflowedp"
response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/batchEndpoints/$ENDPOINT_NAME/deployments/$DEPLOYMENT_NAME?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN" \
--data-raw "{
    \"location\": \"$LOCATION\",
    \"properties\": {        
        \"model\": {
            \"referenceType\": \"Id\",
            \"assetId\": \"/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/models/mnist/versions/1\"
        },
        \"codeConfiguration\": {
            \"codeId\": \"/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/codes/score-mnist/versions/1\",
            \"scoringScript\": \"digit_identification.py\"
        },
        \"environmentId\": \"/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/environments/mnist-env/versions/$ENV_VERSION\",
        \"compute\": \"/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/computes/batch-cluster\",
        \"resources\": {
            \"instanceCount\": 1
        },
        \"maxConcurrencyPerInstance\": \"4\",
        \"retrySettings\": {
            \"maxRetries\": 3,
            \"timeout\": \"PT30S\"
        },
        \"errorThreshold\": \"10\",
        \"loggingLevel\": \"info\",
        \"miniBatchSize\": \"5\",
    }
}")

Set the default batch deployment under the endpoint

There's only one default batch deployment under one endpoint, which will be used when invoke to run batch scoring job.

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/batchEndpoints/$ENDPOINT_NAME?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN" \
--data-raw "{
    \"properties\": {
        \"authMode\": \"aadToken\",
        \"defaults\": {
            \"deploymentName\": \"$DEPLOYMENT_NAME\"
        }
    },
    \"location\": \"$LOCATION\"
}")

operation_id=$(echo $response | jq -r '.properties' | jq -r '.properties' | jq -r '.AzureAsyncOperationUri')
wait_for_completion $operation_id $TOKEN

Run batch scoring

Invoking a batch endpoint triggers a batch scoring job. A job id is returned in the response, and can be used to track the batch scoring progress. In the following snippets, jq is used to get the job id.

Invoke the batch endpoint to start a batch scoring job

Get the scoring uri and access token to invoke the batch endpoint. First get the scoring uri:

response=$(curl --location --request GET "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/batchEndpoints/$ENDPOINT_NAME?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN")

SCORING_URI=$(echo $response | jq -r '.properties.scoringUri')

Get the batch endpoint access token:

SCORING_TOKEN=$(az account get-access-token --resource https://ml.azure.com --query accessToken -o tsv)

Now, invoke the batch endpoint to start a batch scoring job. The following example scores data publicly available in the cloud:

response=$(curl --location --request POST $SCORING_URI \
--header "Authorization: Bearer $SCORING_TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
    \"properties\": {
        \"dataset\": {
            \"dataInputType\": \"DataUrl\",
            \"Path\": \"https://pipelinedata.blob.core.windows.net/sampledata/mnist\"
        }
    }
}")

JOB_ID=$(echo $response | jq -r '.id')
JOB_ID_SUFFIX=$(echo ${JOB_ID##/*/})

If your data is stored in an Azure Machine Learning registered datastore, you can invoke the batch endpoint with a dataset. The following code creates a new dataset:

DATASET_NAME="mnist"
DATASET_VERSION=$RANDOM

response=$(curl --location --request PUT "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/datasets/$DATASET_NAME/versions/$DATASET_VERSION?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN" \
--data-raw "{
    \"properties\": {
        \"paths\": [
            {
                \"folder\": \"https://pipelinedata.blob.core.windows.net/sampledata/mnist\"
            }
        ]
    }
}")

Next, reference the dataset when invoking the batch endpoint:

response=$(curl --location --request POST $SCORING_URI \
--header "Authorization: Bearer $SCORING_TOKEN" \
--header "Content-Type: application/json" \
--data-raw "{
    \"properties\": {
        \"dataset\": {
            \"dataInputType\": \"DatasetVersion\",
            \"datasetName\": \"$DATASET_NAME\",
            \"datasetVersion\": \"$DATASET_VERSION\"
        },
        \"outputDataset\": {
            \"datastoreId\": \"/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/datastores/workspaceblobstore\",
            \"path\": \"$ENDPOINT_NAME\"
        },
        \"outputFileName\": \"$OUTPUT_FILE_NAME\"
    }
}")

In the previous code snippet, a custom output location is provided by using datastoreId, path, and outputFileName. These settings allow you to configure where to store the batch scoring results.

Important

You must provide a unique output location. If the output file already exists, the batch scoring job will fail.

For this example, the output is stored in the default blob storage for the workspace. The folder name is the same as the endpoint name, and the file name is randomly generated by the following code:

export OUTPUT_FILE_NAME=predictions_`echo $RANDOM`.csv

Check the batch scoring job

Batch scoring jobs usually take some time to process the entire set of inputs. Monitor the job status and check the results after it's completed:

Tip

The example invokes the default deployment of the batch endpoint. To invoke a non-default deployment, use the azureml-model-deployment HTTP header and set the value to the deployment name. For example, using a parameter of --header "azureml-model-deployment: $DEPLOYMENT_NAME" with curl.

wait_for_completion $SCORING_URI/$JOB_ID_SUFFIX $SCORING_TOKEN

Check batch scoring results

For information on checking the results, see Check batch scoring results.

Delete the batch endpoint

If you aren't going use the batch endpoint, you should delete it with the below command (it deletes the batch endpoint and all the underlying deployments):

curl --location --request DELETE "https://management.azure.com/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.MachineLearningServices/workspaces/$WORKSPACE/batchEndpoints/$ENDPOINT_NAME?api-version=$API_VERSION" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer $TOKEN" || true

Next steps