Deploy models with the Azure Machine Learning service

Learn how to deploy your machine learning model as a web service in the Azure cloud, or to IoT Edge devices.

The workflow is similar regardless of where you deploy your model:

  1. Register the model.
  2. Prepare to deploy (specify assets, usage, compute target).
  3. Deploy the model to the compute target.
  4. Test the deployed model, also called a web service.

For more information on the concepts involved in the deployment workflow, see Manage, deploy, and monitor models with Azure Machine Learning Service.

Prerequisites

Connect to your workspace

The following code demonstrates how to connect to an Azure Machine Learning service workspace using information cached to the local development environment:

Using the SDK

from azureml.core import Workspace
ws = Workspace.from_config(path=".file-path/ws_config.json")

For more information on using the SDK to connect to a workspace, see the Azure Machine Learning SDK for Python.

Using the CLI

When using the CLI, use the -w or --workspace-name parameter to specify the workspace for the command.
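
For example, the following command lists the models registered in a specific workspace (a minimal sketch; myworkspace and myresourcegroup are placeholder names):

az ml model list -w myworkspace -g myresourcegroup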

Using VS Code

When using VS Code, the workspace is selected using a graphical interface. For more information, see Deploy and manage models in the VS Code extension documentation.

Register your model

A registered model is a logical container for one or more files that make up your model. For example, if you have a model that is stored in multiple files, you can register them as a single model in the workspace. After registration, you can then download or deploy the registered model and receive all the files that were registered.

Tip

When registering a model, you provide either a path to a cloud location (from a training run) or a local directory. This path is just to locate the files for upload as part of the registration process; it does not need to match the path used in the entry script. For more information, see What is get_model_path.

Machine learning models are registered in your Azure Machine Learning workspace. The model can come from Azure Machine Learning or from elsewhere. The following examples demonstrate how to register a model:

Register a model from an Experiment Run

The code snippets in this section demonstrate registering a model from a training run:

Important

These snippets assume that you have previously performed a training run and have access to the run object (SDK example) or run ID value (CLI example). For more information on training models, see Create and use compute targets for model training.

  • Using the SDK

    model = run.register_model(model_name='sklearn_mnist', model_path='outputs/sklearn_mnist_model.pkl')
    print(model.name, model.id, model.version, sep='\t')
    

    The model_path refers to the cloud location of the model. In this example, the path to a single file is used. To include multiple files in the model registration, set model_path to the directory that contains the files.

  • Using the CLI

    az ml model register -n sklearn_mnist  --asset-path outputs/sklearn_mnist_model.pkl  --experiment-name myexperiment --run-id myrunid
    

    Tip

    If you get an error that the ml extension is not installed, use the following command to install it:

    az extension add -n azure-cli-ml
    

    The --asset-path refers to the cloud location of the model. In this example, the path to a single file is used. To include multiple files in the model registration, set --asset-path to the directory that contains the files.

  • Using VS Code

    Register models using any model files or folders with the VS Code extension.

Register a model from a local file

You can register a model by providing a local path to the model. You can provide either a folder or a single file. You can use this method to register models that were trained with the Azure Machine Learning service and then downloaded, as well as models that were trained outside Azure Machine Learning.

Important

You should only use models that you create or obtain from a trusted source. Serialized models should be treated as code, as security vulnerabilities have been discovered in a number of popular formats. Further, models may be intentionally trained with malicious intent to provide biased or inaccurate output.

  • ONNX example with the Python SDK:

    import os
    import urllib.request
    from azureml.core import Model
    # Download model
    onnx_model_url = "https://www.cntk.ai/OnnxModels/mnist/opset_7/mnist.tar.gz"
    urllib.request.urlretrieve(onnx_model_url, filename="mnist.tar.gz")
    os.system('tar xvzf mnist.tar.gz')
    # Register model
    model = Model.register(workspace=ws,
                           model_path="mnist/model.onnx",
                           model_name="onnx_mnist",
                           tags={"onnx": "demo"},
                           description="MNIST image classification CNN from ONNX Model Zoo")
    

    To include multiple files in the model registration, set model_path to the directory that contains the files.

  • Using the CLI

    az ml model register -n onnx_mnist -p mnist/model.onnx
    

    To include multiple files in the model registration, set -p to the directory that contains the files.
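
To register an entire folder of files as a single model with the SDK, point model_path at the directory instead of a single file. The following is a minimal sketch, assuming the extracted mnist folder from the ONNX example above contains everything you want to include:

model = Model.register(workspace=ws,
                       model_path="mnist",
                       model_name="onnx_mnist",
                       description="All files in the mnist folder registered as a single model")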

Time estimate: Approximately 10 seconds.

For more information, see the reference documentation for the Model class.

For more information on working with models trained outside Azure Machine Learning service, see How to deploy an existing model.

Choose a compute target

The following compute targets, or compute resources, can be used to host your web service deployment.

  • Local web service: Testing/debug. GPU support: maybe; FPGA support: no. Good for limited testing and troubleshooting. Hardware acceleration depends on using libraries in the local system.
  • Notebook VM web service: Testing/debug. GPU support: maybe; FPGA support: no. Good for limited testing and troubleshooting.
  • Azure Kubernetes Service (AKS): Real-time inference. GPU support: yes; FPGA support: yes. Good for high-scale production deployments. Provides fast response time and autoscaling of the deployed service. Cluster autoscaling is not supported through the Azure Machine Learning SDK; to change the nodes in the AKS cluster, use the UI for your AKS cluster in the Azure portal. AKS is the only option available for the visual interface.
  • Azure Container Instances (ACI): Testing or dev. GPU support: no; FPGA support: no. Good for low-scale, CPU-based workloads that require less than 48 GB of RAM.
  • Azure Machine Learning Compute (preview): Batch inference. GPU support: yes; FPGA support: no. Run batch scoring on serverless compute. Supports normal and low-priority VMs.
  • Azure IoT Edge (preview): IoT module. GPU support: no; FPGA support: no. Deploy and serve ML models on IoT devices.
  • Azure Data Box Edge: Via IoT Edge. GPU support: no; FPGA support: yes. Deploy and serve ML models on IoT devices.

Prepare to deploy

Deploying the model requires several things:

  • An entry script. This script accepts requests, scores the request using the model, and returns the results.

    Important

    The entry script is specific to your model; it must understand the format of the incoming request data, the format of the data expected by your model, and the format of the data returned to clients.

    If the request data is in a format that is not usable by your model, the script can transform it into an acceptable format. It may also transform the response before returning it to the client.

    Important

    The Azure Machine Learning SDK does not provide a way for web service or IoT Edge deployments to access your datastore or data sets. If you need the deployed model to access data stored outside the deployment, such as in an Azure Storage account, you must develop a custom code solution using the relevant SDK. For example, the Azure Storage SDK for Python.

    An alternative that may work for your scenario is batch prediction, which does provide access to datastores when scoring.

  • Dependencies, such as helper scripts or Python/Conda packages required to run the entry script or model

  • The deployment configuration for the compute target that hosts the deployed model. This configuration describes things like memory and CPU requirements needed to run the model.

These entities are encapsulated into an inference configuration and a deployment configuration. The inference configuration references the entry script and other dependencies. These configurations are defined programmatically when you use the SDK, and as JSON files when you use the CLI to perform the deployment.

1. Define your entry script & dependencies

The entry script receives data submitted to a deployed web service, and passes it to the model. It then takes the response returned by the model and returns that to the client. The script is specific to your model; it must understand the data that the model expects and returns.

The script contains two functions that load and run the model:

  • init(): Typically this function loads the model into a global object. This function is run only once when the Docker container for your web service is started.

  • run(input_data): This function uses the model to predict a value based on the input data. Inputs and outputs to the run typically use JSON for serialization and de-serialization. You can also work with raw binary data. You can transform the data before sending to the model, or before returning to the client.

What is get_model_path?

When you register a model, you provide a model name used for managing the model in the registry. You use this name with the Model.get_model_path() to retrieve the path of the model file(s) on the local file system. If you register a folder or a collection of files, this API returns the path to the directory that contains those files.

When you register a model, you give it a name, which corresponds to where the model is placed, either locally or during service deployment.

The following example returns a path to a single file called sklearn_mnist_model.pkl (which was registered with the name sklearn_mnist):

model_path = Model.get_model_path('sklearn_mnist')

(Optional) Automatic schema generation

To automatically generate a schema for your web service, provide a sample of the input and/or output in the constructor for one of the defined type objects, and the type and sample are used to automatically create the schema. Azure Machine Learning service then creates an OpenAPI (Swagger) specification for the web service during deployment.

The following types are currently supported:

  • pandas
  • numpy
  • pyspark
  • standard Python object

To use schema generation, include the inference-schema package in your conda environment file.

Example dependencies file

The following YAML is an example of a Conda dependencies file for inference:

name: project_environment
dependencies:
  - python=3.6.2
  - pip:
    - azureml-defaults
    - scikit-learn==0.20.0
    - inference-schema[numpy-support]

If you want to use automatic schema generation, your entry script must import the inference-schema packages.

Define the input and output sample formats in the input_sample and output_sample variables, which represent the request and response formats for the web service. Use these samples in the input and output function decorators on the run() function. The scikit-learn example below uses schema generation.

Example entry script

The following example demonstrates how to accept and return JSON data:

#example: scikit-learn and Swagger
import json
import numpy as np
from sklearn.externals import joblib
from sklearn.linear_model import Ridge
from azureml.core.model import Model

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType


def init():
    global model
    # "sklearn_regression_model.pkl" is the name the model was registered under.
    # When the service runs in the cloud, get_model_path resolves this name to the
    # deployed model files, which differs from running the same code locally.
    model_path = Model.get_model_path('sklearn_regression_model.pkl')
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)


input_sample = np.array([[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]])
output_sample = np.array([3726.995])


@input_schema('data', NumpyParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        result = model.predict(data)
        # you can return any datatype as long as it is JSON-serializable
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

The following example demonstrates how to define the input data as a <key: value> dictionary by using a DataFrame. This method is supported for consuming the deployed web service from Power BI (Learn more about how to consume the web service from Power BI):

import json
import pickle
import numpy as np
import pandas as pd
import azureml.train.automl
from sklearn.externals import joblib
from azureml.core.model import Model

from inference_schema.schema_decorators import input_schema, output_schema
from inference_schema.parameter_types.numpy_parameter_type import NumpyParameterType
from inference_schema.parameter_types.pandas_parameter_type import PandasParameterType


def init():
    global model
    # replace model_name with your actual model name, if needed
    model_path = Model.get_model_path('model_name')
    # deserialize the model file back into a sklearn model
    model = joblib.load(model_path)


input_sample = pd.DataFrame(data=[{
    # This is a decimal type sample. Use the data type that reflects this column in your data
    "input_name_1": 5.1,
    # This is a string type sample. Use the data type that reflects this column in your data
    "input_name_2": "value2",
    # This is an integer type sample. Use the data type that reflects this column in your data
    "input_name_3": 3
}])

# This is an integer type sample. Use the data type that reflects the expected result
output_sample = np.array([0])


@input_schema('data', PandasParameterType(input_sample))
@output_schema(NumpyParameterType(output_sample))
def run(data):
    try:
        result = model.predict(data)
        # you can return any datatype as long as it is JSON-serializable
        return result.tolist()
    except Exception as e:
        error = str(e)
        return error

For more example scripts, see the following examples:

Binary data

If your model accepts binary data, such as an image, you must modify the score.py file used for your deployment to accept raw HTTP requests. To accept raw data, use the AMLRequest class in your entry script and add the @rawhttp decorator to the run() function.

Here's an example of a score.py that accepts binary data:

from azureml.contrib.services.aml_request import AMLRequest, rawhttp
from azureml.contrib.services.aml_response import AMLResponse


def init():
    print("This is init()")


@rawhttp
def run(request):
    print("This is run()")
    print("Request: [{0}]".format(request))
    if request.method == 'GET':
        # For this example, just return the URL for GETs
        respBody = str.encode(request.full_path)
        return AMLResponse(respBody, 200)
    elif request.method == 'POST':
        reqBody = request.get_data(False)
        # For a real world solution, you would load the data from reqBody
        # and send to the model. Then return the response.

        # For demonstration purposes, this example just returns the posted data as the response.
        return AMLResponse(reqBody, 200)
    else:
        return AMLResponse("bad request", 500)

Important

The AMLRequest class is in the azureml.contrib namespace. Things in this namespace change frequently as we work to improve the service. As such, anything in this namespace should be considered as a preview, and not fully supported by Microsoft.

If you need to test this on your local development environment, you can install the components by using the following command:

pip install azureml-contrib-services

Cross-origin resource sharing (CORS)

Cross-origin resource sharing is a way to allow resources on a web page to be requested from another domain. CORS works based on HTTP headers sent with the client request and returned with the service response. For more information on CORS and valid headers, see Cross-origin resource sharing on Wikipedia.

To configure your model deployment to support CORS, use the AMLResponse class in your entry script. This class allows you to set the headers on the response object.

The following example sets the Access-Control-Allow-Origin header for the response from the entry script:

from azureml.contrib.services.aml_response import AMLResponse

def init():
    print("This is init()")

def run(request):
    print("This is run()")
    print("Request: [{0}]".format(request))
    if request.method == 'GET':
        # For this example, just return the URL for GETs
        respBody = str.encode(request.full_path)
        return AMLResponse(respBody, 200)
    elif request.method == 'POST':
        reqBody = request.get_data(False)
        # For a real world solution, you would load the data from reqBody
        # and send to the model. Then return the response.

        # For demonstration purposes, this example
        # adds a header and returns the request body
        resp = AMLResponse(reqBody, 200)
        resp.headers['Access-Control-Allow-Origin'] = "http://www.example.com"
        return resp
    else:
        return AMLResponse("bad request", 500)

Important

The AMLResponse class is in the azureml.contrib namespace. Things in this namespace change frequently as we work to improve the service. As such, anything in this namespace should be considered as a preview, and not fully supported by Microsoft.

If you need to test this on your local development environment, you can install the components by using the following command:

pip install azureml-contrib-services

2. Define your InferenceConfig

The inference configuration describes how to configure the model to make predictions. This configuration is not part of your entry script; it references your entry script and is used to locate all the resources required by the deployment. It is used later when actually deploying the model.

The following example demonstrates how to create an inference configuration. This configuration specifies the runtime, the entry script, and (optionally) the conda environment file:

from azureml.core.model import InferenceConfig

inference_config = InferenceConfig(runtime="python",
                                   entry_script="x/y/score.py",
                                   conda_file="env/myenv.yml")

For more information, see the InferenceConfig class reference.

For information on using a custom Docker image with inference configuration, see How to deploy a model using a custom Docker image.

CLI example of InferenceConfig

The entries in the inferenceconfig.json document map to the parameters for the InferenceConfig class. The following list describes each JSON entity, the corresponding method parameter, and its meaning:

  • entryScript (entry_script): Path to a local file that contains the code to run for the image.
  • runtime (runtime): Which runtime to use for the image. Currently supported runtimes are 'spark-py' and 'python'.
  • condaFile (conda_file): Optional. Path to a local file that contains a conda environment definition to use for the image.
  • extraDockerFileSteps (extra_docker_file_steps): Optional. Path to a local file that contains additional Docker steps to run when setting up the image.
  • sourceDirectory (source_directory): Optional. Path to the folder that contains all files needed to create the image.
  • enableGpu (enable_gpu): Optional. Whether to enable GPU support in the image. The GPU image must be used on a Microsoft Azure service such as Azure Container Instances, Azure Machine Learning Compute, Azure Virtual Machines, or Azure Kubernetes Service. Defaults to False.
  • baseImage (base_image): Optional. A custom image to be used as the base image. If no base image is given, the base image is selected based on the runtime parameter.
  • baseImageRegistry (base_image_registry): Optional. Image registry that contains the base image.
  • cudaVersion (cuda_version): Optional. Version of CUDA to install for images that need GPU support. The GPU image must be used on a Microsoft Azure service such as Azure Container Instances, Azure Machine Learning Compute, Azure Virtual Machines, or Azure Kubernetes Service. Supported versions are 9.0, 9.1, and 10.0. If enable_gpu is set, defaults to 9.1.
  • description (description): A description for this image.

The following JSON is an example inference configuration for use with the CLI:

{
    "entryScript": "score.py",
    "runtime": "python",
    "condaFile": "myenv.yml",
    "extraDockerfileSteps": null,
    "sourceDirectory": null,
    "enableGpu": false,
    "baseImage": null,
    "baseImageRegistry": null
}

The following command demonstrates how to deploy a model using the CLI:

az ml model deploy -n myservice -m mymodel:1 --ic inferenceconfig.json

In this example, the configuration contains the following items:

  • That this model requires Python
  • The entry script, which is used to handle web requests sent to the deployed service
  • The conda file that describes the Python packages needed for inference

For information on using a custom Docker image with inference configuration, see How to deploy a model using a custom Docker image.

3. Define your Deployment configuration

Before deploying, you must define the deployment configuration. The deployment configuration is specific to the compute target that will host the web service. For example, when deploying locally you must specify the port where the service accepts requests. The deployment configuration is not part of your entry script. It is used to define the characteristics of the compute target that will host the model and entry script.

You may also need to create the compute resource. For example, if you do not already have an Azure Kubernetes Service associated with your workspace.
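
For example, the following SDK code is a minimal sketch of creating an AKS cluster and attaching it to the workspace (it assumes default cluster settings and the placeholder name myaks):

from azureml.core.compute import AksCompute, ComputeTarget

# Use the default provisioning configuration; you can also specify VM size, node count, and so on
prov_config = AksCompute.provisioning_configuration()

# Create the cluster. Provisioning can take several minutes.
aks_target = ComputeTarget.create(workspace=ws,
                                  name="myaks",
                                  provisioning_configuration=prov_config)
aks_target.wait_for_completion(show_output=True)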

The following examples show how to create a deployment configuration for each compute target:

  • Local: deployment_config = LocalWebservice.deploy_configuration(port=8890)
  • Azure Container Instances: deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
  • Azure Kubernetes Service: deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

Each of these classes for local, ACI, and AKS web services can be imported from azureml.core.webservice:

from azureml.core.webservice import AciWebservice, AksWebservice, LocalWebservice

Tip

Prior to deploying your model as a service, you may want to profile it to determine optimal CPU and memory requirements. You can profile your model using either the SDK or CLI. For more information, see the profile() and az ml model profile reference.

Model profiling results are emitted as a Run object. For more information, see the ModelProfile class reference.
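
The following is a rough sketch of profiling with the SDK. It assumes that a registered model, an inference configuration, and a JSON test sample (test_sample) are already defined; the profile name is a placeholder:

profile = Model.profile(ws, "sklearn-mnist-profile", [model], inference_config, test_sample)
profile.wait_for_profiling(True)
print(profile.get_results())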

Deploy to target

Deployment uses the inference configuration and the deployment configuration to deploy the models. The deployment process is similar regardless of the compute target. Deploying to AKS is slightly different, because you must provide a reference to the AKS cluster.

Local deployment

To deploy locally, you need to have Docker installed on your local machine.

Using the SDK

from azureml.core.model import Model
from azureml.core.webservice import LocalWebservice, Webservice

deployment_config = LocalWebservice.deploy_configuration(port=8890)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config)
service.wait_for_deployment(show_output = True)
print(service.state)

For more information, see the reference documentation for LocalWebservice, Model.deploy(), and Webservice.

Using the CLI

To deploy using the CLI, use the following command. Replace mymodel:1 with the name and version of the registered model:

az ml model deploy -m mymodel:1 -ic inferenceconfig.json -dc deploymentconfig.json

The entries in the deploymentconfig.json document map to the parameters for LocalWebservice.deploy_configuration. The following list describes each JSON entity, the corresponding method parameter, and its meaning:

  • computeType (no corresponding parameter): The compute target. For local deployment, the value must be local.
  • port (port): The local port on which to expose the service's HTTP endpoint.

The following JSON is an example deployment configuration for use with the CLI:

{
    "computeType": "local",
    "port": 32267
}

For more information, see the az ml model deploy reference.

NotebookVM web service (DEVTEST)

See Deploy a model to Notebook VMs.

Azure Container Instances (DEVTEST)

See Deploy to Azure Container Instances.

Azure Kubernetes Service (DEVTEST & PRODUCTION)

See Deploy to Azure Kubernetes Service.
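
As noted earlier, deploying to AKS requires a reference to the AKS cluster. The following is a minimal sketch with the SDK, assuming a cluster is already attached to the workspace under the placeholder name myaks:

from azureml.core.compute import AksCompute
from azureml.core.model import Model
from azureml.core.webservice import AksWebservice

# Reference the existing AKS cluster and deploy the model to it
aks_target = AksCompute(ws, "myaks")
deployment_config = AksWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
service = Model.deploy(ws, "myservice", [model], inference_config, deployment_config, aks_target)
service.wait_for_deployment(show_output=True)
print(service.state)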

Consume web services

Every deployed web service provides a REST API, so you can create client applications in a variety of programming languages. If you have enabled key authentication for your service, you need to provide a service key as a token in your request header. If you have enabled token authentication for your service, you need to provide an Azure Machine Learning JWT token as a bearer token in your request header.

Tip

You can retrieve the schema JSON document after deploying the service. Use the swagger_uri property from the deployed web service, such as service.swagger_uri, to get the URI to the local web service's Swagger file.

Request-response consumption

Here is an example of how to invoke your service in Python:

import requests
import json

headers = {'Content-Type': 'application/json'}

if service.auth_enabled:
    headers['Authorization'] = 'Bearer '+service.get_keys()[0]
elif service.token_auth_enabled:
    headers['Authorization'] = 'Bearer '+service.get_token()[0]

print(headers)

test_sample = json.dumps({'data': [
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
]})

response = requests.post(
    service.scoring_uri, data=test_sample, headers=headers)
print(response.status_code)
print(response.elapsed)
print(response.json())

For more information, see Create client applications to consume web services.

Web service schema (OpenAPI specification)

If you used the automatic schema generation with the deployment, you can get the address of the OpenAPI specification for the service by using the swagger_uri property. For example, print(service.swagger_uri). Use a GET request (or open the URI in a browser) to retrieve the specification.
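
For example, the following sketch retrieves the generated specification by using the requests library (depending on how authentication is configured, you may also need to send an Authorization header, as in the scoring example above):

import requests

# Fetch the OpenAPI (Swagger) document generated for the deployed service
swagger_spec = requests.get(service.swagger_uri).json()
print(swagger_spec["info"]["title"])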

The following JSON document is an example of a schema (OpenAPI specification) generated for a deployment:

{
    "swagger": "2.0",
    "info": {
        "title": "myservice",
        "description": "API specification for the Azure Machine Learning service myservice",
        "version": "1.0"
    },
    "schemes": [
        "https"
    ],
    "consumes": [
        "application/json"
    ],
    "produces": [
        "application/json"
    ],
    "securityDefinitions": {
        "Bearer": {
            "type": "apiKey",
            "name": "Authorization",
            "in": "header",
            "description": "For example: Bearer abc123"
        }
    },
    "paths": {
        "/": {
            "get": {
                "operationId": "ServiceHealthCheck",
                "description": "Simple health check endpoint to ensure the service is up at any given point.",
                "responses": {
                    "200": {
                        "description": "If service is up and running, this response will be returned with the content 'Healthy'",
                        "schema": {
                            "type": "string"
                        },
                        "examples": {
                            "application/json": "Healthy"
                        }
                    },
                    "default": {
                        "description": "The service failed to execute due to an error.",
                        "schema": {
                            "$ref": "#/definitions/ErrorResponse"
                        }
                    }
                }
            }
        },
        "/score": {
            "post": {
                "operationId": "RunMLService",
                "description": "Run web service's model and get the prediction output",
                "security": [
                    {
                        "Bearer": []
                    }
                ],
                "parameters": [
                    {
                        "name": "serviceInputPayload",
                        "in": "body",
                        "description": "The input payload for executing the real-time machine learning service.",
                        "schema": {
                            "$ref": "#/definitions/ServiceInput"
                        }
                    }
                ],
                "responses": {
                    "200": {
                        "description": "The service processed the input correctly and provided a result prediction, if applicable.",
                        "schema": {
                            "$ref": "#/definitions/ServiceOutput"
                        }
                    },
                    "default": {
                        "description": "The service failed to execute due to an error.",
                        "schema": {
                            "$ref": "#/definitions/ErrorResponse"
                        }
                    }
                }
            }
        }
    },
    "definitions": {
        "ServiceInput": {
            "type": "object",
            "properties": {
                "data": {
                    "type": "array",
                    "items": {
                        "type": "array",
                        "items": {
                            "type": "integer",
                            "format": "int64"
                        }
                    }
                }
            },
            "example": {
                "data": [
                    [ 10, 9, 8, 7, 6, 5, 4, 3, 2, 1 ]
                ]
            }
        },
        "ServiceOutput": {
            "type": "array",
            "items": {
                "type": "number",
                "format": "double"
            },
            "example": [
                3726.995
            ]
        },
        "ErrorResponse": {
            "type": "object",
            "properties": {
                "status_code": {
                    "type": "integer",
                    "format": "int32"
                },
                "message": {
                    "type": "string"
                }
            }
        }
    }
}

For more information on the specification, see the OpenAPI specification.

For a utility that can create client libraries from the specification, see swagger-codegen.

Batch inference

Azure Machine Learning Compute targets are created and managed by the Azure Machine Learning service. They can be used for batch prediction from Azure Machine Learning Pipelines.

For a walkthrough of batch inference with Azure Machine Learning Compute, read the How to Run Batch Predictions article.
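
The following is a minimal sketch of creating such a compute target with the SDK (the cluster name and VM size are placeholder choices):

from azureml.core.compute import AmlCompute, ComputeTarget

# Provision a small autoscaling cluster for batch scoring
compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D2_V2",
                                                       max_nodes=4)
compute_target = ComputeTarget.create(ws, "batch-cluster", compute_config)
compute_target.wait_for_completion(show_output=True)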

IoT Edge inference

Support for deploying to the edge is in preview. For more information, see the Deploy Azure Machine Learning as an IoT Edge module article.

Update web services

When you create a new model, you must manually update each service that should use it. To update the web service, use the update method. The following code demonstrates how to use the SDK to update the model for a web service:

from azureml.core.webservice import Webservice
from azureml.core.model import Model

# register new model
new_model = Model.register(model_path="outputs/sklearn_mnist_model.pkl",
                           model_name="sklearn_mnist",
                           tags={"key": "0.1"},
                           description="test",
                           workspace=ws)

service_name = 'myservice'
# Retrieve existing service
service = Webservice(name=service_name, workspace=ws)

# Update to new model(s)
service.update(models=[new_model])
print(service.state)
print(service.get_logs())

You can also update a web service by using the ML CLI. The following example demonstrates registering a new model and then updating the web service to use the new model:

az ml model register -n sklearn_mnist  --asset-path outputs/sklearn_mnist_model.pkl  --experiment-name myexperiment --output-metadata-file modelinfo.json
az ml service update -n myservice --model-metadata-file modelinfo.json

Tip

In this example, a JSON document is used to pass the model information from the registration command into the update command.

Continuous model deployment

You can continuously deploy models by using the Machine Learning extension for Azure DevOps. This extension can trigger a deployment pipeline when a new machine learning model is registered in the Azure Machine Learning service workspace.

  1. Sign up for Azure Pipelines, which makes continuous integration and delivery of your application to any platform or cloud possible. (Azure Pipelines is not the same as Machine Learning pipelines.)

  2. Create an Azure DevOps project.

  3. Install the Machine Learning extension for Azure Pipelines

  4. Use service connections to set up a service principal connection to your Azure Machine Learning service workspace so that you can access all your artifacts. Go to project settings, select Service connections, and then select Azure Resource Manager.

  5. Define AzureMLWorkspace as the scope level and fill in the subsequent parameters.

  6. Next, to continuously deploy your machine learning model by using Azure Pipelines, under Pipelines, select Release. Add a new artifact, and then select the AzureML Model artifact and the service connection that was created in the earlier step. Select the model and version to trigger a deployment.

  7. Enable the model trigger on your model artifact. When the trigger is turned on, every time the specified version (that is, the newest version) of that model is registered in your workspace, an Azure DevOps release pipeline is triggered.

For more sample projects and examples, see the following sample repos:

Clean up resources

To delete a deployed web service, use service.delete(). To delete a registered model, use model.delete().

For more information, see the reference documentation for Webservice.delete() and Model.delete().

Next steps