Safe rollout for online endpoints (preview)

You have an existing model deployed in production and want to deploy a new version of it. How do you roll out the new ML model without causing disruption? A good answer is blue-green deployment, an approach in which a new version of a web service is introduced to production by rolling out the change to a small subset of users or requests before rolling it out completely. This article assumes you're using online endpoints; for more information, see What are Azure Machine Learning endpoints (preview)?.

In this article, you'll learn to:

  • Deploy a new online deployment called "blue" that serves version 1 of the model
  • Scale this deployment so that it can handle more requests
  • Deploy version 2 of the model to a deployment called "green" that accepts no live traffic
  • Test the green deployment in isolation
  • Send 10% of live traffic to the green deployment
  • Fully cut over all live traffic to the green deployment
  • Delete the now-unused v1 blue deployment


This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.


  • To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.

  • You must install and configure the Azure CLI and ML extension. For more information, see Install, set up, and use the CLI (v2) (preview).

  • You must have an Azure resource group in which you (or the service principal you use) have Contributor access. You'll have such a resource group if you configured your ML extension per the above article.

  • You must have an Azure Machine Learning workspace. You'll have such a workspace if you configured your ML extension per the above article.

  • If you haven't already set the defaults for the Azure CLI, save your default settings. To avoid passing in the values repeatedly, run:

    az account set --subscription <subscription id>
    az configure --defaults workspace=<azureml workspace name> group=<resource group>
  • An existing online endpoint and deployment. This article assumes that your deployment is as described in Deploy and score a machine learning model with a managed online endpoint (preview).

  • If you haven't already set the environment variable $ENDPOINT_NAME, do so now:

  • (Recommended) Clone the samples repository and switch to the repository's cli/ directory:

    git clone
    cd azureml-examples/cli

The commands in this tutorial are in the file and the YAML configuration files are in the endpoints/online/managed/sample/ subdirectory.
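The $ENDPOINT_NAME prerequisite above can be sketched as follows. The name my-endpoint is a hypothetical placeholder, not a value from this article; substitute the name of the endpoint you already created.

```shell
# Keep an existing value if one is already exported; otherwise fall back to a
# hypothetical placeholder name (an assumption, replace it with your own).
export ENDPOINT_NAME="${ENDPOINT_NAME:-my-endpoint}"

echo "Using endpoint: $ENDPOINT_NAME"
```

Using the ${VAR:-default} form means that rerunning the snippet in a shell where the variable is already set doesn't overwrite it.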

Confirm your existing deployment is created

You can view the status of your existing endpoint and deployment by running:

az ml online-endpoint show --name $ENDPOINT_NAME 

az ml online-deployment show --name blue --endpoint $ENDPOINT_NAME 

You should see the endpoint identified by $ENDPOINT_NAME and a deployment called blue.
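If you'd rather check the deployment status from a script than read the full output, a minimal sketch might look like the following. It assumes that the show output includes a provisioning_state field whose value is Succeeded for a healthy deployment; verify this against the output of your CLI version. The function name deployment_ready is hypothetical.

```shell
# Hypothetical helper: succeeds only if the deployment reports
# provisioning_state == Succeeded (field name and value are assumptions).
# Arguments: $1 = endpoint name, $2 = deployment name.
deployment_ready() {
  _state=$(az ml online-deployment show --name "$2" --endpoint "$1" \
    --query provisioning_state -o tsv)
  [ "$_state" = "Succeeded" ]
}

# Usage (requires a logged-in Azure CLI with the ML extension):
# deployment_ready "$ENDPOINT_NAME" blue && echo "blue is ready"
```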

Scale your existing deployment to handle more traffic

In the deployment described in Deploy and score a machine learning model with a managed online endpoint (preview), you set the instance_count to the value 1 in the deployment YAML file. You can scale out using the update command:

az ml online-deployment update --name blue --endpoint $ENDPOINT_NAME --set instance_count=2


Notice that the command above uses --set to override the deployment configuration. Alternatively, you can update the YAML file and pass it to the update command using the --file option.

Deploy a new model, but send it no traffic yet

Create a new deployment named green:

az ml online-deployment create --name green --endpoint $ENDPOINT_NAME -f endpoints/online/managed/sample/green-deployment.yml

Because you haven't explicitly allocated any traffic to green, it has zero traffic allocated to it. You can verify this with the command:

az ml online-endpoint show -n $ENDPOINT_NAME --query traffic

Test the new deployment

Though green has 0% of traffic allocated, you can invoke it directly by specifying its name with --deployment:

az ml online-endpoint invoke --name $ENDPOINT_NAME --deployment green --request-file endpoints/online/model-2/sample-request.json

If you want to use a REST client to invoke the deployment directly without going through traffic rules, set the following HTTP header: azureml-model-deployment: <deployment-name>. The following code snippet uses curl to invoke the deployment directly. It should work in Unix/WSL environments:

# get the scoring uri
SCORING_URI=$(az ml online-endpoint show -n $ENDPOINT_NAME -o tsv --query scoring_uri)
# use curl to invoke the endpoint
curl --request POST "$SCORING_URI" --header "Authorization: Bearer $ENDPOINT_KEY" --header 'Content-Type: application/json' --header "azureml-model-deployment: green" --data @endpoints/online/model-2/sample-request.json
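The curl call above relies on $ENDPOINT_KEY already being set. One way to populate it, assuming the endpoint uses key-based authentication, is sketched below. The function name get_endpoint_key is hypothetical.

```shell
# Hypothetical helper: print the primary key of the endpoint named in $1.
# Assumes key-based authentication is enabled on the endpoint.
get_endpoint_key() {
  az ml online-endpoint get-credentials --name "$1" \
    --query primaryKey -o tsv
}

# Usage (requires a logged-in Azure CLI with the ML extension):
# ENDPOINT_KEY=$(get_endpoint_key "$ENDPOINT_NAME")
```

Treat the key as a secret: avoid echoing it to the terminal or committing it to scripts.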

Test the new deployment with a small percentage of live traffic

Once you've tested your green deployment, allocate a small percentage of traffic to it:

az ml online-endpoint update --name $ENDPOINT_NAME --traffic "blue=90 green=10"

Now, your green deployment will receive 10% of requests.
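A typo in the traffic string is easy to make, so you may want to sanity-check it before calling update. The sketch below is a hypothetical helper, not part of the Azure CLI. It assumes you want the split to use whole-number percentages that total exactly 100.

```shell
# Hypothetical helper: check that a traffic string such as "blue=90 green=10"
# consists of name=percentage pairs whose integer percentages add up to 100.
validate_traffic() {
  _total=0
  for _pair in $1; do
    # Every entry must have the form name=percentage.
    case "$_pair" in
      *=*) ;;
      *) echo "missing '=' in '$_pair'" >&2; return 1 ;;
    esac
    _pct="${_pair#*=}"
    # Reject empty values, non-digits, and leading zeros such as "08".
    case "$_pct" in
      ''|*[!0-9]*|0[0-9]*) echo "invalid percentage in '$_pair'" >&2; return 1 ;;
    esac
    _total=$((_total + _pct))
  done
  [ "$_total" -eq 100 ]
}

# Usage: only send the update if the split is well formed.
# TRAFFIC="blue=90 green=10"
# validate_traffic "$TRAFFIC" && \
#   az ml online-endpoint update --name "$ENDPOINT_NAME" --traffic "$TRAFFIC"
```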

Send all traffic to your new deployment

Once you're fully satisfied with your green deployment, switch all traffic to it:

az ml online-endpoint update --name $ENDPOINT_NAME --traffic "blue=0 green=100"

Remove the old deployment

az ml online-deployment delete --name blue --endpoint $ENDPOINT_NAME --yes --no-wait
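Deleting a deployment that still receives traffic would disrupt live requests, so a cautious script can confirm the allocation first. The following is a minimal sketch with a hypothetical function name, delete_if_idle. It assumes the deployment name is a simple identifier such as blue; names containing hyphens would need quoting in the --query expression.

```shell
# Hypothetical guard: delete a deployment only if the endpoint currently
# routes 0% of traffic to it.
# Arguments: $1 = endpoint name, $2 = deployment name.
delete_if_idle() {
  _endpoint="$1"
  _deployment="$2"
  _pct=$(az ml online-endpoint show --name "$_endpoint" \
    --query "traffic.$_deployment" -o tsv)
  if [ "$_pct" = "0" ]; then
    az ml online-deployment delete --name "$_deployment" \
      --endpoint "$_endpoint" --yes --no-wait
  else
    echo "refusing to delete '$_deployment': traffic is '$_pct', not 0" >&2
    return 1
  fi
}

# Usage (requires a logged-in Azure CLI with the ML extension):
# delete_if_idle "$ENDPOINT_NAME" blue
```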

Delete the endpoint and deployment

If you aren't going to use the deployment, you should delete it with:

az ml online-endpoint delete --name $ENDPOINT_NAME --yes --no-wait

Next steps