Train and deploy machine learning models

Azure Pipelines

You can use a pipeline to automatically train and deploy machine learning models with the Azure Machine Learning service. Here you'll learn how to build a machine learning model, and then deploy the model as a web service. You'll end up with a pipeline that you can use to train your model.

Prerequisites

Before you read this topic, you should understand how the Azure Machine Learning service works.

Follow the steps in Azure Machine Learning quickstart: portal to create a workspace.

Get the code

Fork this repo in GitHub:

https://github.com/MicrosoftDocs/pipelines-azureml

This sample includes an azure-pipelines.yml file at the root of the repository.

Sign in to Azure Pipelines

Sign in to Azure Pipelines. After you sign in, your browser goes to https://dev.azure.com/my-organization-name and displays your Azure DevOps dashboard.

Within your selected organization, create a project. If you don't have any projects in your organization, you see a Create a project to get started screen. Otherwise, select the Create Project button in the upper-right corner of the dashboard.

Create the pipeline

Use the following approach to create a new pipeline.

  1. Sign in to your Azure DevOps organization and navigate to your project.

  2. Go to Pipelines, and then select New Pipeline.

  3. Walk through the steps of the wizard by first selecting GitHub as the location of your source code.

    Note

    If this is not what you see, then make sure the Multi-stage pipelines experience is turned on.

  4. You might be redirected to GitHub to sign in. If so, enter your GitHub credentials.

  5. When the list of repositories appears, select your repository.

  6. You might be redirected to GitHub to install the Azure Pipelines app. If so, select Approve and install.

When your new pipeline appears:

  1. Replace myresourcegroup with the name of the Azure resource group that contains your Azure Machine Learning service workspace.

  2. Replace myworkspace with the name of your Azure Machine Learning service workspace.

  3. When you're ready, select Save and run.

  4. You're prompted to commit your changes to the azure-pipelines.yml file in your repository. After you're happy with the message, select Save and run again.

    If you want to watch your pipeline in action, select the build job.

You now have a YAML pipeline in your repository that's ready to train your model!

Azure Machine Learning service automation

There are two primary ways to use automation with the Azure Machine Learning service:

  • The Machine Learning CLI is an extension to the Azure CLI. It provides commands for working with the Azure Machine Learning service.
  • The Azure Machine Learning SDK is a Python package that provides programmatic access to the Azure Machine Learning service.
    • The Python SDK includes automated machine learning, which helps automate the time-consuming, iterative tasks of machine learning model development.

The example in this document uses the Machine Learning CLI.

Planning

Before you use Azure Pipelines to automate model training and deployment, you must understand the files needed by the model and what indicates a "good" trained model.

Machine learning files

In most cases, your data science team will provide the files and resources needed to train the machine learning model. The following files in the example project would be provided by the data scientists:

  • Training script (train.py): The training script contains logic specific to the model that you are training.
  • Scoring file (score.py): When the model is deployed as a web service, the scoring file receives data from clients and scores it against the model. The output is then returned to the client.
  • RunConfig settings (sklearn.runconfig): Defines how the training script is run on the compute target that is used for training.
  • Training environment (myenv.yml): Defines the packages needed to run the training script.
  • Deployment environment (deploymentConfig.yml): Defines the resources and compute needed for the deployment environment.
  • Deployment environment (inferenceConfig.yml): Defines the packages needed to run and score the model in the deployment environment.
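
As a rough illustration of the scoring file's role, the sketch below follows the init()/run() entry-script convention that Azure Machine Learning uses for web service deployments. The stand-in model (it sums each input row) and the "data" key in the request body are assumptions made for this example. The score.py in the sample repo loads the real trained model and may expect a different request format.

```python
import json

model = None  # populated once by init() when the service starts


def init():
    """Called once when the web service starts; load the model here."""
    global model
    # Assumption: a stand-in "model" that sums each input row.
    # A real scoring file would load the registered model from disk.
    model = lambda rows: [sum(row) for row in rows]


def run(raw_data):
    """Called for each request with the raw request body; returns the scores."""
    try:
        rows = json.loads(raw_data)["data"]  # hypothetical request shape
        return {"result": model(rows)}
    except Exception as exc:
        # Report the problem to the client instead of crashing the service.
        return {"error": str(exc)}


if __name__ == "__main__":
    init()
    print(run(json.dumps({"data": [[1, 2], [3, 4]]})))
```

Keeping model loading in init() means the model is loaded once per service start rather than once per request.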

Some of these files are used directly when developing a model, for example the train.py and score.py files. However, the data scientist may be creating the run configuration and environment settings programmatically. If so, they can create the .runconfig and training environment files by using RunConfiguration.save(). Alternatively, running the following command creates default run configuration files for all compute targets already in the workspace.

az ml folder attach --experiment-name myexp -w myws -g mygroup

The files created by this command are stored in the .azureml directory.
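
If you prefer to drive the attach command from a script rather than typing it, a minimal sketch might look like the following. It uses only the flags shown in the command above. The helper names build_attach_command and attach_folder are hypothetical, and the dry_run default is an assumption chosen so that nothing runs against Azure by accident.

```python
import shutil
import subprocess


def build_attach_command(experiment_name, workspace, resource_group):
    """Build the argument list for `az ml folder attach`."""
    return [
        "az", "ml", "folder", "attach",
        "--experiment-name", experiment_name,
        "-w", workspace,
        "-g", resource_group,
    ]


def attach_folder(experiment_name, workspace, resource_group, dry_run=True):
    """Print the command, and run it only when dry_run is False."""
    command = build_attach_command(experiment_name, workspace, resource_group)
    print(" ".join(command))
    if dry_run:
        return None
    az_path = shutil.which("az")
    if az_path is None:
        raise RuntimeError("The Azure CLI (az) was not found on PATH.")
    # check=True raises CalledProcessError if the CLI exits with an error.
    return subprocess.run([az_path] + command[1:], check=True)


if __name__ == "__main__":
    # Placeholder names taken from the command above; replace with your own.
    attach_folder("myexp", "myws", "mygroup")
```

Passing the arguments as a list avoids shell quoting problems if a workspace or resource group name contains unusual characters.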

Determine the best model

The example pipeline deploys the trained model without doing any performance checks. In a production scenario, you may want to log metrics so that you can determine the "best" model.

For example, suppose you have a deployed model with an accuracy of 90. You train a new model based on new check-ins to the repo, and its accuracy is only 80, so you don't want to deploy it. Accuracy is a metric that you can build automation logic around, because a simple comparison is enough to evaluate the model. In other cases, several metrics may together indicate the "best" model, and a human must evaluate them before deployment.
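
The simple comparison described here could be sketched as a small gate function. This is not part of the example pipeline, which deploys without checks. The function name should_deploy and the min_improvement parameter are hypothetical, and the numbers are the ones from the scenario above.

```python
def should_deploy(new_accuracy, deployed_accuracy, min_improvement=0):
    """Return True when the new model is at least min_improvement better
    than the model that is already deployed."""
    return new_accuracy >= deployed_accuracy + min_improvement


if __name__ == "__main__":
    # Scenario from the text: deployed model at 90, new model at 80.
    print(should_deploy(new_accuracy=80, deployed_accuracy=90))
    # A new model at 92 would pass the default gate.
    print(should_deploy(new_accuracy=92, deployed_accuracy=90))
```

A pipeline step could call a function like this after training and skip the deployment commands when it returns False.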

Depending on what "best" looks like for your scenario, you may need to create a release pipeline where someone must inspect the metrics to determine if the model should be deployed.

You should work with your data scientists to understand what metrics are important for your model.

To log metrics during training, use the Run class.
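
A minimal sketch of metric logging from a training script is shown below, assuming the azureml-core package is installed on the compute target. Run.get_context() and Run.log() are part of the SDK. The helper get_metric_logger, the local print fallback, and the accuracy value are assumptions added so that the script also runs on a machine without the SDK.

```python
def get_metric_logger():
    """Return a function log(name, value) for recording training metrics."""
    try:
        from azureml.core import Run  # provided by the azureml-core package
    except ImportError:
        # Assumption: without the SDK, fall back to printing the metric.
        def log_locally(name, value):
            print(f"{name}: {value}")
        return log_locally
    # Inside a submitted run this returns the active run, so logged
    # metrics are attached to that run in the workspace.
    run = Run.get_context()
    return run.log


if __name__ == "__main__":
    log = get_metric_logger()
    accuracy = 0.9  # placeholder; use the value computed by your training code
    log("accuracy", accuracy)
```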

Azure CLI Deploy task

The Azure CLI Deploy task is used to run Azure CLI commands. In the example, it installs the Azure Machine Learning CLI extension and then uses individual CLI commands to train and deploy the model.

Azure Service Connection

The Azure CLI Deploy task requires an Azure service connection. The Azure service connection stores the credentials needed to connect from Azure Pipelines to Azure.

The name of the connection used by the example is azmldemows.

To create a service connection, see Create an Azure service connection.

Machine Learning CLI

The following Azure Machine Learning service CLI commands are used in the example for this document:

Command                       Purpose
az ml folder attach           Associates the files in the project with your Azure Machine Learning service workspace.
az ml computetarget create    Creates a compute target that is used to train the model.
az ml experiment list         Lists experiments for your workspace.
az ml run submit-script       Submits the model for training.
az ml model register          Registers a trained model with your workspace.
az ml model deploy            Deploys the model as a web service.
az ml service list            Lists deployed services.
az ml service delete          Deletes a deployed service.
az ml pipeline list           Lists Azure Machine Learning pipelines.
az ml computetarget delete    Deletes a compute target.

For more information on these commands, see the CLI extension reference.

Next steps

Learn how you can further integrate machine learning into your pipelines with the Machine Learning extension.

For more examples of using Azure Pipelines with Azure Machine Learning service, see the following repos: