Safe rollout for online endpoints (preview)

You have an existing model deployed in production and you want to deploy a new version of the model. How do you roll out your new ML model without causing any disruption? A good answer is blue-green deployment, an approach in which a new version of a web service is introduced to production by rolling out the change to a small subset of users/requests before rolling it out completely. This article assumes you're using online endpoints; for more information, see What are Azure Machine Learning endpoints (preview)?.

In this article, you'll learn to:

  • Deploy a new online deployment called "blue" that serves version 1 of the model
  • Scale this deployment so that it can handle more requests
  • Deploy version 2 of the model to a deployment called "green" that accepts no live traffic
  • Test the green deployment in isolation
  • Send 10% of live traffic to the green deployment
  • Fully cut over all live traffic to the green deployment
  • Delete the now-unused v1 blue deployment

Important

This feature is currently in public preview. This preview version is provided without a service-level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Prerequisites

  • To use Azure Machine Learning, you must have an Azure subscription. If you don't have one, create a free account before you begin, or try the free or paid version of Azure Machine Learning.

  • You must install and configure the Azure CLI and ML extension. For more information, see Install, set up, and use the CLI (v2) (preview).

  • You must have an Azure resource group, in which you (or the service principal you use) need Contributor access. You'll have such a resource group if you configured your ML extension per the article above.

  • You must have an Azure Machine Learning workspace. You'll have such a workspace if you configured your ML extension per the above article.

  • If you haven't already set the defaults for the Azure CLI, save your default settings now. To avoid repeatedly passing in these values, run:

az account set --subscription <subscription id>
az configure --defaults workspace=<azureml workspace name> group=<resource group>
export ENDPOINT_NAME="<YOUR_ENDPOINT_NAME>"
  • (Recommended) Clone the samples repository and switch to the repository's cli/ directory:
git clone https://github.com/Azure/azureml-examples
cd azureml-examples/cli

The commands in this tutorial are in the file deploy-declarative-safe-rollout-online-endpoints.sh and the YAML configuration files are in the endpoints/online/managed/canary-declarative-flow/ subdirectory.

Confirm your existing deployment is created

You can view the status of your existing deployment by running:

az ml endpoint show --name $ENDPOINT_NAME 

You should see the endpoint identified by $ENDPOINT_NAME and a deployment called blue.
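For scripting, you can inspect the JSON that the show command prints. The exact output schema can vary by CLI version; the shape below is a hypothetical sample used only to illustrate the kind of check you might make:

```python
import json

# Hypothetical sample of `az ml endpoint show` output. Real field names and
# nesting may differ depending on your CLI version; this is illustrative only.
sample_output = """
{
  "name": "my-endpoint",
  "traffic": {"blue": 100},
  "deployments": [{"name": "blue", "provisioning_state": "Succeeded"}]
}
"""

endpoint = json.loads(sample_output)
deployment_names = [d["name"] for d in endpoint["deployments"]]

# Confirm the blue deployment exists and currently receives all traffic.
assert "blue" in deployment_names
assert endpoint["traffic"].get("blue", 0) == 100
```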

Scale your existing deployment to handle more traffic

In the deployment described in Deploy and score a machine learning model with a managed online endpoint (preview), you set the instance_count to the value 1. To handle more traffic, the second version of the YAML file (2-scale-blue.yml) changes the value to 2:

instance_count: 2

Update the deployment with:

az ml endpoint update -n $ENDPOINT_NAME -f endpoints/online/managed/canary-declarative-flow/2-scale-blue.yml

Important

Updating with the YAML file is declarative. That is, changes in the YAML are reflected in the underlying Azure Resource Manager resources (endpoints and deployments). This approach facilitates GitOps: all changes to endpoints and deployments go through the YAML (even instance_count). As a side effect, if you remove a deployment from the YAML and run az ml endpoint update using the file, that deployment will be deleted.
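The declarative behavior can be pictured as a set difference between the deployments in the YAML (desired state) and those currently on the endpoint (actual state). The sketch below is purely illustrative, with hypothetical names; it is not how the service is actually implemented:

```python
# Illustrative sketch of declarative reconciliation: the YAML file is the
# desired state, and anything on the endpoint that is absent from the YAML
# gets removed. All names here are hypothetical, not service internals.
def plan_update(desired: dict, actual: dict) -> tuple:
    """Return (to_create, to_update, to_delete) sets of deployment names."""
    desired_names = set(desired)
    actual_names = set(actual)
    to_create = desired_names - actual_names
    to_delete = actual_names - desired_names          # removed from YAML -> deleted
    to_update = {n for n in desired_names & actual_names
                 if desired[n] != actual[n]}          # e.g. instance_count changed
    return to_create, to_update, to_delete

# Example: the YAML now contains only "green", but the endpoint has "blue".
create, update, delete = plan_update(
    desired={"green": {"instance_count": 2}},
    actual={"blue": {"instance_count": 2}},
)
# "green" is created, "blue" is deleted -- the side effect the note warns about.
```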

Deploy a new model, but send it no traffic yet

To deploy your new model, add a new entry to the deployments section of your configuration file, and specify in the traffic section that it should receive 0% of traffic. The file 3-create-green.yml incorporates these changes:

traffic:
  blue: 100
  green: 0
- name: green
  model:
    name: model-2
    version: 1
    local_path: ../../model-2/model/sklearn_regression_model.pkl
  code_configuration:
    code: 
      local_path: ../../model-2/onlinescoring/
    scoring_script: score.py
  environment:   
    name: env-model2
    version: 1      
    path: .
    conda_file: file:../../model-2/environment/conda.yml
    docker:
      image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1
  instance_type: Standard_F2s_v2
  scale_settings:
    scale_type: manual
    instance_count: 2
    min_instances: 1
    max_instances: 2

Update the deployment:

az ml endpoint update -n $ENDPOINT_NAME -f endpoints/online/managed/canary-declarative-flow/3-create-green.yml

Test the new deployment

The configuration specified 0% traffic to your just-created green deployment. To test it, you can invoke it directly by specifying its name with the --deployment flag:

az ml endpoint invoke --name $ENDPOINT_NAME --deployment green --request-file endpoints/online/model-2/sample-request.json

If you want to use a REST client to invoke the deployment directly without going through traffic rules, set the following HTTP header: azureml-model-deployment: <deployment-name>.
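As a sketch of that direct REST call (the scoring URI, key, and payload below are placeholders, not real values, and the request is constructed but not sent), the header would be set like this:

```python
import json
import urllib.request

# Placeholder values -- substitute your endpoint's actual scoring URI and key,
# which you can retrieve with the CLI (e.g. `az ml endpoint show`).
scoring_uri = "https://my-endpoint.region.inference.ml.azure.com/score"
api_key = "<YOUR_API_KEY>"
payload = {"data": [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]]}

request = urllib.request.Request(
    scoring_uri,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        # Bypass the traffic rules and hit the green deployment directly:
        "azureml-model-deployment": "green",
    },
    method="POST",
)
# urllib.request.urlopen(request) would send it; not executed here because the
# URI above is a placeholder.
```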

Test the new deployment with a small percentage of live traffic

Once you have tested your green deployment, the 4-flight-green.yml file demonstrates how to route a small percentage of live traffic to it by modifying the traffic section of the configuration file:

traffic:
  blue: 90
  green: 10

Other than the traffic values shown above, the configuration file is unchanged. Update your deployment with:

az ml endpoint update -n $ENDPOINT_NAME -f endpoints/online/managed/canary-declarative-flow/4-flight-green.yml

Now, your green deployment will receive 10% of requests.
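The 90/10 split applies per request rather than as a strict round-robin, so green's share of any finite sample only approximates 10%. A toy simulation (illustrative only, not Azure ML's actual routing code) makes this concrete:

```python
import random

# Toy weighted router, illustrative only -- not the service's routing logic.
def route(traffic: dict, rng: random.Random) -> str:
    """Pick a deployment name with probability proportional to its weight."""
    names = list(traffic)
    weights = [traffic[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
traffic = {"blue": 90, "green": 10}
hits = {"blue": 0, "green": 0}
for _ in range(10_000):
    hits[route(traffic, rng)] += 1

green_share = hits["green"] / 10_000  # close to, but rarely exactly, 0.10
```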

Send all traffic to your new deployment

Once you're fully satisfied with your green deployment, switch all traffic to it. The following snippet shows only the relevant lines from the configuration file, which is otherwise unchanged:

traffic:
  blue: 0
  green: 100

And update the deployment:

az ml endpoint update -n $ENDPOINT_NAME -f endpoints/online/managed/canary-declarative-flow/5-full-green.yml
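Whatever the split, the traffic weights in every configuration in this article total 100. A small validation helper (hypothetical, assuming that invariant) can catch arithmetic slips before you run az ml endpoint update:

```python
# Hypothetical pre-flight check, assuming traffic weights are non-negative
# integers that must sum to 100, as in every configuration in this article.
def validate_traffic(traffic: dict) -> None:
    """Raise ValueError unless weights are non-negative ints summing to 100."""
    for name, weight in traffic.items():
        if not isinstance(weight, int) or weight < 0:
            raise ValueError(f"invalid weight for {name!r}: {weight}")
    total = sum(traffic.values())
    if total != 100:
        raise ValueError(f"traffic weights sum to {total}, expected 100")

validate_traffic({"blue": 0, "green": 100})  # the full cut-over config: OK
```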

Remove the old deployment

Complete the swap-over to your new model by deleting the older blue deployment. The final configuration file looks like:

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
type: online
auth_mode: key
traffic:
  green: 100

deployments:
  #green deployment
  - name: green
    model:
      name: model-2
      version: 1
      local_path: ../../model-2/model/sklearn_regression_model.pkl
    code_configuration:
      code: 
        local_path: ../../model-2/onlinescoring/
      scoring_script: score.py
    environment:
      name: env-model2
      version: 1
      path: .
      conda_file: file:../../model-2/environment/conda.yml
      docker:
        image: mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04:20210727.v1
    instance_type: Standard_F2s_v2
    scale_settings:
      scale_type: manual
      instance_count: 2
      min_instances: 1
      max_instances: 2

Update the deployment with:

az ml endpoint update -n $ENDPOINT_NAME -f endpoints/online/managed/canary-declarative-flow/6-delete-blue.yml

Delete the endpoint and deployment

If you aren't going to use the endpoint and its deployments, delete them with:

az ml endpoint delete -n $ENDPOINT_NAME --yes --no-wait

Next steps