Deploy MLflow models to online endpoints

Article
02/01/2024

APPLIES TO: Azure CLI ml extension v2 (current)

In this article, learn how to deploy your MLflow model to an online endpoint for real-time inference. When you deploy your MLflow model to an online endpoint, you don't need to specify a scoring script or an environment—this functionality is known as no-code deployment.

For no-code deployment, Azure Machine Learning:

Dynamically installs the Python packages listed in the conda.yaml file. As a result, dependencies are installed at container runtime.
Provides an MLflow base image/curated environment that contains the following items:
- azureml-inference-server-http
- mlflow-skinny
- A scoring script for inferencing.

Tip

Workspaces without public network access: Before you can deploy MLflow models to online endpoints without egress connectivity, you have to package the models (preview). By using model packaging, you can avoid the need for an internet connection, which Azure Machine Learning would otherwise require to dynamically install necessary Python packages for the MLflow models.

About the example

The example shows how you can deploy an MLflow model to an online endpoint to perform predictions. The example uses an MLflow model that's based on the Diabetes dataset. This dataset contains 10 baseline variables: age, sex, body mass index, average blood pressure, and six blood serum measurements obtained from 442 diabetes patients. It also contains the response of interest, a quantitative measure of disease progression one year after baseline.

The model was trained using a scikit-learn regressor, and all the required preprocessing has been packaged as a pipeline, making this model an end-to-end pipeline that goes from raw data to predictions.

The information in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo, and then change directories to cli, if you're using the Azure CLI. If you're using the Azure Machine Learning SDK for Python, change directories to sdk/python/endpoints/online/mlflow.

git clone https://github.com/Azure/azureml-examples --depth 1
cd azureml-examples/cli

Follow along in Jupyter Notebook

You can follow the steps for using the Azure Machine Learning Python SDK by opening the Deploy MLflow model to online endpoints notebook in the cloned repository.

Prerequisites

Before following the steps in this article, make sure you have the following prerequisites:

An Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning.
Azure role-based access controls (Azure RBAC) are used to grant access to operations in Azure Machine Learning. To perform the steps in this article, your user account must be assigned the owner or contributor role for the Azure Machine Learning workspace, or a custom role allowing Microsoft.MachineLearningServices/workspaces/onlineEndpoints/*. For more information on roles, see Manage access to an Azure Machine Learning workspace.
You must have an MLflow model registered in your workspace. This article registers a model trained for the Diabetes dataset in the workspace.
Also, you need to:
- Install the Azure CLI and the ml extension to the Azure CLI. For more information on installing the CLI, see Install and set up the CLI (v2).
- Install the Azure Machine Learning SDK for Python.
```
pip install azure-ai-ml azure-identity
```
- Install the MLflow SDK package mlflow and the Azure Machine Learning plug-in for MLflow azureml-mlflow.
```
pip install mlflow azureml-mlflow
```
- If you're not running code in the Azure Machine Learning compute, configure the MLflow tracking URI or MLflow's registry URI to point to the Azure Machine Learning workspace you're working on. For more information on how to connect MLflow to the workspace, see Configure MLflow for Azure Machine Learning.
There are no additional prerequisites when you work in Azure Machine Learning studio.

Connect to your workspace

First, connect to the Azure Machine Learning workspace where you'll work.

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

The workspace is the top-level resource for Azure Machine Learning, providing a centralized place to work with all the artifacts you create when you use Azure Machine Learning. In this section, connect to the workspace in which you'll perform deployment tasks.

Import the required libraries:

from azure.ai.ml import MLClient, Input
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)
from azure.identity import DefaultAzureCredential
from azure.ai.ml.constants import AssetTypes

Configure workspace details and get a handle to the workspace:

subscription_id = "<subscription>"
resource_group = "<resource-group>"
workspace = "<workspace>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

Import the required libraries

import json
import mlflow
import requests
import pandas as pd
from mlflow.deployments import get_deploy_client
from mlflow.tracking import MlflowClient

Initialize the MLflow client
```
mlflow_client = MlflowClient()
```

Configure the deployment client

deployment_client = get_deploy_client(mlflow.get_tracking_uri())

Register the model

You can deploy only registered models to online endpoints. In this case, you already have a local copy of the model in the repository, so you only need to publish the model to the registry in the workspace. You can skip this step if the model you're trying to deploy is already registered.

MODEL_NAME='sklearn-diabetes'
az ml model create --name $MODEL_NAME --type "mlflow_model" --path "endpoints/online/ncd/sklearn-diabetes/model"

model_name = 'sklearn-diabetes'
model_local_path = "sklearn-diabetes/model"
model = ml_client.models.create_or_update(
        Model(name=model_name, path=model_local_path, type=AssetTypes.MLFLOW_MODEL)
)

model_name = 'sklearn-diabetes'
model_local_path = "sklearn-diabetes/model"

registered_model = mlflow_client.create_model_version(
    name=model_name, source=f"file://{model_local_path}"
)
version = registered_model.version

What if your model was logged inside of a run?

If your model was logged inside of a run, you can register it directly.

To register the model, you need to know the location where it is stored. If you're using MLflow's autolog feature, the path to the model depends on the model type and framework. Check the job's output to identify the name of the model's folder. This folder contains a file named MLmodel.

If you're using the log_model method to manually log your models, then pass the path to the model as the argument to the method. For example, if you log the model, using mlflow.sklearn.log_model(my_model, "classifier"), then the path where the model is stored is called classifier.
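If you have downloaded a job's artifacts locally, one way to identify the model folder is to look for the MLmodel file. The following is a minimal sketch using only the standard library; the helper name `find_mlflow_model_dirs` and the throwaway folder layout are hypothetical and only illustrate the idea.

```python
import tempfile
from pathlib import Path


def find_mlflow_model_dirs(root):
    """Return folders (relative to root) that contain a file named MLmodel."""
    root = Path(root)
    return sorted(
        p.parent.relative_to(root).as_posix()
        for p in root.rglob("MLmodel")
        if p.is_file()
    )


# Build a throwaway layout that mimics downloaded job artifacts.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp) / "classifier"
    model_dir.mkdir()
    (model_dir / "MLmodel").write_text("flavors: {}\n")
    (Path(tmp) / "metrics.json").write_text("{}")
    found = find_mlflow_model_dirs(tmp)

print(found)
```

In this layout, the folder name that the helper reports is the value you would use as the model path when registering the model.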

Use the Azure Machine Learning CLI v2 to create a model from a training job output. In the following example, a model named $MODEL_NAME is registered using the artifacts of a job with ID $RUN_ID. The path where the model is stored is $MODEL_PATH.

az ml model create --name $MODEL_NAME --path azureml://jobs/$RUN_ID/outputs/artifacts/$MODEL_PATH

Note

The path $MODEL_PATH is the location where the model has been stored in the run.

model_name = 'sklearn-diabetes'

ml_client.models.create_or_update(
    Model(
        path=f"azureml://jobs/{RUN_ID}/outputs/artifacts/{MODEL_PATH}",
        name=model_name,
        type=AssetTypes.MLFLOW_MODEL
    )
)

Note

The path MODEL_PATH is the location where the model has been stored in the run.

model_name = 'sklearn-diabetes'

registered_model = mlflow_client.create_model_version(
    name=model_name, source=f"runs:/{RUN_ID}/{MODEL_PATH}"
)
version = registered_model.version

Note

The path MODEL_PATH is the location where the model has been stored in the run.
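The job artifact path used by the CLI and SDK examples follows a fixed pattern, so it can be assembled with a small helper. This is a sketch only: the function name `job_artifact_uri` and the run ID `my_job_123` are hypothetical, and the helper does no more than normalize slashes and format the string.

```python
def job_artifact_uri(run_id, model_path):
    """Build the azureml://jobs/... URI for a model folder logged in a job."""
    run_id = run_id.strip()
    # Remove stray leading/trailing slashes so the URI has no empty segments.
    model_path = model_path.strip().strip("/")
    if not run_id or not model_path:
        raise ValueError("run_id and model_path must both be non-empty")
    return f"azureml://jobs/{run_id}/outputs/artifacts/{model_path}"


# Hypothetical job ID; "classifier" matches the log_model example above.
uri = job_artifact_uri("my_job_123", "/classifier/")
print(uri)
```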

Deploy an MLflow model to an online endpoint

Configure the endpoint where the model will be deployed. The following example configures the name and authentication mode of the endpoint:

Set an endpoint name by running the following command (replace YOUR_ENDPOINT_NAME with a unique name):

export ENDPOINT_NAME="<YOUR_ENDPOINT_NAME>"

Configure the endpoint:

create-endpoint.yaml

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-endpoint
auth_mode: key

# Creating a unique endpoint name with current datetime to avoid conflicts
import datetime

endpoint_name = "sklearn-diabetes-" + datetime.datetime.now().strftime("%m%d%H%M%f")

endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="An online endpoint to generate predictions for the diabetes dataset",
    auth_mode="key",
    tags={"foo": "bar"},
)

You can configure the properties of this endpoint using a configuration file. In this case, you're configuring the authentication mode of the endpoint to be "key".


# Creating a unique endpoint name with current datetime to avoid conflicts
import datetime

endpoint_name = "sklearn-diabetes-" + datetime.datetime.now().strftime("%m%d%H%M%f")

endpoint_config = {
    "auth_mode": "key",
    "identity": {
        "type": "system_assigned"
    }
}

Write this configuration into a JSON file:

endpoint_config_path = "endpoint_config.json"
with open(endpoint_config_path, "w") as outfile:
    outfile.write(json.dumps(endpoint_config))
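The timestamp-based naming pattern shown above can be wrapped in a helper that also checks the result before you send it to the service. The following sketch assumes that endpoint names must start with a letter, contain only letters, digits, and hyphens, and be 3 to 32 characters long; treat that rule as an assumption and confirm it against the current naming requirements. The helper name `make_endpoint_name` is hypothetical.

```python
import datetime
import re

# Assumed rule: a letter first, then letters, digits, or hyphens; 3-32 chars in total.
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix, now=None):
    """Append a month/day/hour/minute/microsecond stamp to the prefix."""
    now = now or datetime.datetime.now()
    stamp = now.strftime("%m%d%H%M%f")
    name = f"{prefix}-{stamp}"
    if not NAME_PATTERN.match(name):
        raise ValueError(f"'{name}' does not satisfy the assumed naming rule")
    return name


# A fixed timestamp keeps the example reproducible.
fixed = datetime.datetime(2024, 2, 1, 13, 45, 7, 123456)
example_name = make_endpoint_name("sklearn-diabetes", now=fixed)
print(example_name)  # sklearn-diabetes-02011345123456
```

With the prefix used in this article, the generated name is 31 characters long, so a longer prefix would exceed the assumed limit and raise an error.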

Create the endpoint:

az ml online-endpoint create --name $ENDPOINT_NAME -f endpoints/online/ncd/create-endpoint.yaml

ml_client.begin_create_or_update(endpoint)

endpoint = deployment_client.create_endpoint(
    name=endpoint_name,
    config={"endpoint-config-file": endpoint_config_path},
)

Configure the deployment. A deployment is a set of resources required for hosting the model that does the actual inferencing.

sklearn-deployment.yaml

$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: sklearn-deployment
endpoint_name: my-endpoint
model:
  name: mir-sample-sklearn-ncd-model
  version: 1
  path: sklearn-diabetes/model
  type: mlflow_model
instance_type: Standard_DS3_v2
instance_count: 1

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_F4s_v2",
    instance_count=1
)

Alternatively, if your endpoint doesn't have egress connectivity, use model packaging (preview) by including the argument with_package=True:

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=model,
    instance_type="Standard_F4s_v2",
    instance_count=1,
    with_package=True,
)

blue_deployment_name = "blue"

To configure the hardware requirements of your deployment, create a JSON file with the desired configuration:

deploy_config = {
    "instance_type": "Standard_F4s_v2",
    "instance_count": 1,
}

Note

For details about the full specification of this configuration, see Managed online deployment schema (v2).

Write the configuration to a file:

deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

Note

Autogeneration of the scoring script and environment is only supported for the pyfunc model flavor. To use a different model flavor, see Customize MLflow model deployments.

Create the deployment:
```
az ml online-deployment create --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
```
If your endpoint doesn't have egress connectivity, use model packaging (preview) by including the flag --with-package:
```
az ml online-deployment create --with-package --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/ncd/sklearn-deployment.yaml --all-traffic
```
```
ml_client.online_deployments.begin_create_or_update(blue_deployment)
```
```
blue_deployment = deployment_client.create_deployment(
    name=blue_deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)    
```
1. From the Endpoints page, select Create from the Real-time endpoints tab.
2. Choose the MLflow model that you registered previously, then select the Select button.
  
  Note
  
  The configuration page includes a note to inform you that the scoring script and environment are autogenerated for your selected MLflow model.
3. Select New to deploy to a new endpoint.
4. Provide a name for the endpoint and deployment or keep the default names.
5. Select Deploy to deploy the model to the endpoint.
Assign all the traffic to the deployment. So far, the endpoint has one deployment, but none of its traffic is assigned to it.
This step is not required in the Azure CLI, because you used the --all-traffic flag during creation. If you need to change traffic, you can use the command az ml online-endpoint update --traffic. For more information on how to update traffic, see Progressively update traffic.
```
endpoint.traffic = {"blue": 100}
```
```
traffic_config = {"traffic": {blue_deployment_name: 100}}
```
Write the configuration to a file:
```
traffic_config_path = "traffic_config.json"
with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))
```
This step is not required in the studio.
Update the endpoint configuration:
This step is not required in the Azure CLI, because you used the --all-traffic flag during creation. If you need to change traffic, you can use the command az ml online-endpoint update --traffic. For more information on how to update traffic, see Progressively update traffic.
```
ml_client.begin_create_or_update(endpoint).result()
```
```
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)
```
This step is not required in the studio.
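Before writing a traffic configuration, it can help to validate the allocation locally. The sketch below assumes that the percentages across deployments must total either 0 or 100; that constraint, and the helper name `build_traffic_config`, are assumptions for illustration rather than part of any SDK.

```python
def build_traffic_config(allocation):
    """Validate a {deployment_name: percent} mapping and wrap it for the endpoint."""
    for name, percent in allocation.items():
        if not isinstance(percent, int) or not 0 <= percent <= 100:
            raise ValueError(f"traffic for '{name}' must be an integer from 0 to 100")
    total = sum(allocation.values())
    # Assumed constraint: all traffic is assigned (100) or none is (0).
    if total not in (0, 100):
        raise ValueError(f"traffic percentages total {total}; expected 0 or 100")
    return {"traffic": dict(allocation)}


example_config = build_traffic_config({"blue": 100})
print(example_config)
```

The same check applies when you later split traffic between two deployments, for example 90 percent to one and 10 percent to another.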

Invoke the endpoint

Once your deployment is ready, you can use it to serve requests. One way to test the deployment is by using the built-in invocation capability in the deployment client you're using. The following JSON is a sample request for the deployment.

sample-request-sklearn.json

{"input_data": {
    "columns": [
      "age",
      "sex",
      "bmi",
      "bp",
      "s1",
      "s2",
      "s3",
      "s4",
      "s5",
      "s6"
    ],
    "data": [
      [ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ],
      [ 10.0,2.0,9.0,8.0,7.0,6.0,5.0,4.0,3.0,2.0]
    ],
    "index": [0,1]
  }}

Note

This example uses input_data instead of inputs, which is the key that MLflow serving uses. Azure Machine Learning requires a different input format so that it can automatically generate the swagger contracts for the endpoints. For more information about expected input formats, see Differences between models deployed in Azure Machine Learning and MLflow built-in server.
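If you build requests programmatically instead of editing the JSON file by hand, a payload with the same shape as the sample request can be assembled from column names and rows. This is a minimal sketch with a hypothetical helper name, `build_request`; it only mirrors the structure of the sample file shown above.

```python
import json


def build_request(columns, rows):
    """Wrap tabular rows in the input_data structure used by the sample request."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("every row must have one value per column")
    return {
        "input_data": {
            "columns": list(columns),
            "index": list(range(len(rows))),
            "data": [list(row) for row in rows],
        }
    }


columns = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]
rows = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    [10.0, 2.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0],
]
payload = build_request(columns, rows)
print(json.dumps(payload, indent=2))
```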

Submit a request to the endpoint as follows:

az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file endpoints/online/ncd/sample-request-sklearn.json

ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    request_file="sample-request-sklearn.json",
)

# Read the sample request that's in the json file to construct a pandas data frame
with open("sample-request-sklearn.json", "r") as f:
    sample_request = json.loads(f.read())
    samples = pd.DataFrame(**sample_request["input_data"])

deployment_client.predict(endpoint=endpoint_name, df=samples)

The response will be similar to the following text:

[ 
  11633.100167144921,
  8522.117402884991
]
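You can also call the endpoint over plain HTTP. The sketch below only constructs the request object and does not send it. The scoring URL and key are placeholders that you would replace with the values from your endpoint's details, and the helper name `build_scoring_request` is hypothetical. It assumes key authentication, where the key is passed as a bearer token.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, key, payload):
    """Prepare a POST request carrying a JSON payload and a key-based auth header."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {key}",
        },
        method="POST",
    )


# Placeholder values; replace them with your endpoint's scoring URI and key.
scoring_request = build_scoring_request(
    "https://example.invalid/score",
    "<endpoint-key>",
    {"input_data": {"columns": ["age"], "index": [0], "data": [[1.0]]}},
)
print(scoring_request.get_method(), scoring_request.full_url)
# To send the request: urllib.request.urlopen(scoring_request)
```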

Important

For MLflow no-code deployment, testing via local endpoints is currently not supported.

Customize MLflow model deployments

You don't have to specify a scoring script in the deployment definition of an MLflow model to an online endpoint. However, you can opt to do so and customize how inference gets executed.

You'll typically want to customize your MLflow model deployment when:

The model doesn't have a PyFunc flavor on it.
You need to customize the way the model is run, for instance, to use a specific flavor to load the model, using mlflow.<flavor>.load_model().
You need to do pre/post processing in your scoring routine when it's not done by the model itself.
The output of the model can't be nicely represented in tabular data. For instance, it's a tensor representing an image.

Important

If you choose to specify a scoring script for an MLflow model deployment, you'll also have to specify the environment where the deployment will run.

Steps

To deploy an MLflow model with a custom scoring script:

Identify the folder where your MLflow model is located.

a. Go to the Azure Machine Learning studio.

b. Go to the Models section.

c. Select the model you're trying to deploy and go to its Artifacts tab.

d. Take note of the folder that is displayed. This folder was specified when the model was registered.

Create a scoring script. Notice how the folder name model that you previously identified is included in the init() function.

Tip

The following scoring script is provided as an example about how to perform inference with an MLflow model. You can adapt this script to your needs or change any of its parts to reflect your scenario.

score.py

import logging
import os
import json
import mlflow
from io import StringIO
from mlflow.pyfunc.scoring_server import infer_and_parse_json_input, predictions_to_json


def init():
    global model
    global input_schema
    # "model" is the path of the mlflow artifacts when the model was registered. For automl
    # models, this is generally "mlflow-model".
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model")
    model = mlflow.pyfunc.load_model(model_path)
    input_schema = model.metadata.get_input_schema()


def run(raw_data):
    json_data = json.loads(raw_data)
    if "input_data" not in json_data.keys():
        raise Exception("Request must contain a top level key named 'input_data'")

    serving_input = json.dumps(json_data["input_data"])
    data = infer_and_parse_json_input(serving_input, input_schema)
    predictions = model.predict(data)

    result = StringIO()
    predictions_to_json(predictions, result)
    return result.getvalue()

Warning

MLflow 2.0 advisory: The provided scoring script will work with both MLflow 1.X and MLflow 2.X. However, be advised that the expected input/output formats on those versions might vary. Check the environment definition used to ensure you're using the expected MLflow version. Notice that MLflow 2.0 is only supported in Python 3.8+.
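To exercise the request-handling logic of a scoring script without installing MLflow or deploying anything, you can mirror the init() and run() contract with a stand-in model. The following sketch is not the scoring script above: `MeanModel` is a hypothetical stub that returns the mean of each row, and the response shape is chosen only for illustration.

```python
import json


class MeanModel:
    """Hypothetical stand-in for a loaded model: predicts the mean of each row."""

    def predict(self, rows):
        return [sum(row) / len(row) for row in rows]


model = None


def init():
    # In a real script, this is where the model is loaded from AZUREML_MODEL_DIR.
    global model
    model = MeanModel()


def run(raw_data):
    payload = json.loads(raw_data)
    if "input_data" not in payload:
        raise ValueError("Request must contain a top level key named 'input_data'")
    rows = payload["input_data"]["data"]
    return json.dumps({"predictions": model.predict(rows)})


init()
sample = json.dumps({
    "input_data": {
        "columns": ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"],
        "index": [0, 1],
        "data": [
            [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
            [10.0, 2.0, 9.0, 8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0],
        ],
    }
})
response = run(sample)
print(response)
```

A stub like this lets you check input validation and output serialization quickly; the real model's predictions will differ.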

Create an environment where the scoring script can be executed. Since the model is an MLflow model, the conda requirements are also specified in the model package. For more details about the files included in an MLflow model, see The MLmodel format. You then build the environment using the conda dependencies from that file. However, you also need to include the package azureml-inference-server-http, which is required for online deployments in Azure Machine Learning.

The conda definition file is as follows:

conda.yml
```
channels:
- conda-forge
dependencies:
- python=3.9
- pip
- pip:
  - mlflow
  - scikit-learn==1.2.2
  - cloudpickle==2.2.1
  - psutil==5.9.4
  - pandas==2.0.0
  - azureml-inference-server-http
name: mlflow-env
```
Note

The azureml-inference-server-http package has been added to the original conda dependencies file.
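If you prefer to add the package programmatically, the change amounts to appending one entry to the pip section of the conda definition. The sketch below works on a conda definition that has already been loaded into a Python dictionary (for example, with a YAML parser of your choice); the helper name `add_pip_dependency` is hypothetical.

```python
import copy


def add_pip_dependency(conda_env, package):
    """Return a copy of the conda definition with the package in its pip section."""
    env = copy.deepcopy(conda_env)
    deps = env.setdefault("dependencies", [])
    pip_section = next(
        (d for d in deps if isinstance(d, dict) and "pip" in d), None
    )
    if pip_section is None:
        # No pip section yet: make sure pip itself is installed, then add one.
        if "pip" not in deps:
            deps.append("pip")
        pip_section = {"pip": []}
        deps.append(pip_section)
    if package not in pip_section["pip"]:
        pip_section["pip"].append(package)
    return env


original = {
    "name": "mlflow-env",
    "channels": ["conda-forge"],
    "dependencies": ["python=3.9", "pip", {"pip": ["mlflow", "scikit-learn==1.2.2"]}],
}
updated = add_pip_dependency(original, "azureml-inference-server-http")
print(updated["dependencies"][-1]["pip"])
```

The helper leaves the input dictionary untouched and does nothing if the package is already listed, so it is safe to call more than once.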

You'll use this conda dependencies file to create the environment:
The environment will be created inline in the deployment configuration.
```
environment = Environment(
    conda_file="sklearn-diabetes/environment/conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04:latest",
)
```
This operation isn't supported in MLflow SDK
1. Go to the Environments tab on the side menu.
2. Select the tab Custom environments > Create.
3. Enter the name of the environment, in this case sklearn-mlflow-online-py37.
4. For Select environment source, choose Use existing docker image with optional conda file.
5. For Container registry image path, enter mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04.
6. Select Next to go to the Customize section.
7. Copy the content of the sklearn-diabetes/environment/conda.yml file and paste it in the text box.
8. Select Next to go to the Tags page, and then Next again.
9. On the Review page, select Create. The environment is ready for use.
Create the deployment:
Create a deployment configuration file deployment.yml:
```
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: sklearn-diabetes-custom
endpoint_name: my-endpoint
model: azureml:sklearn-diabetes@latest
environment: 
  image: mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu22.04
  conda_file: sklearn-diabetes/environment/conda.yml
code_configuration:
  code: sklearn-diabetes/src
  scoring_script: score.py
instance_type: Standard_F2s_v2
instance_count: 1
```
Create the deployment:
```
az ml online-deployment create -f deployment.yml
```
```
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=endpoint_name,
    model=model,
    environment=environment,
    code_configuration=CodeConfiguration(
        code="sklearn-diabetes/src",
        scoring_script="score.py"
    ),
    instance_type="Standard_F4s_v2",
    instance_count=1,
)
```
This operation isn't supported in MLflow SDK
1. From the Endpoints page, select +Create.
2. Select the MLflow model you registered previously.
3. Select More options in the endpoint creation wizard to open up advanced options.
4. Provide a name and authentication type for the endpoint, and then select Next to see that the model you selected is being used for your deployment.
5. Select Next to continue to the Deployment page.
6. Select Next to go to the Code + environment page. When you select a model registered in MLflow format, you don't need to specify a scoring script or an environment on this page. However, in this case you want to specify them.
7. Select the slider next to Customize environment and scoring script.
8. Browse to select the scoring script you created previously.
9. Select Custom environments for the environment type.
10. Select the custom environment you created previously, and select Next.
11. Complete the wizard to deploy the model to the endpoint.
Once your deployment completes, it is ready to serve requests. One way to test the deployment is by using a sample request file along with the invoke method.

sample-request-sklearn.json
```
{"input_data": {
    "columns": [
      "age",
      "sex",
      "bmi",
      "bp",
      "s1",
      "s2",
      "s3",
      "s4",
      "s5",
      "s6"
    ],
    "data": [
      [ 1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0 ],
      [ 10.0,2.0,9.0,8.0,7.0,6.0,5.0,4.0,3.0,2.0]
    ],
    "index": [0,1]
  }}
```
Submit a request to the endpoint as follows:
```
az ml online-endpoint invoke --name $ENDPOINT_NAME --request-file endpoints/online/ncd/sample-request-sklearn.json
```
```
ml_client.online_endpoints.invoke(
    endpoint_name=endpoint_name,
    deployment_name=blue_deployment.name,
    request_file="sample-request-sklearn.json",
)
```
This operation isn't supported in MLflow SDK
1. Go to the Endpoints tab and select the new endpoint created.
2. Go to the Test tab.
3. Paste the content of the sample-request-sklearn.json file into the Input data to test endpoint box.
4. Select Test.
5. The predictions will show up under "Test results" to the right-hand side of the box.
The response will be similar to the following text:
```
{
  "predictions": [ 
    11633.100167144921,
    8522.117402884991
  ]
}
```
Warning

MLflow 2.0 advisory: In MLflow 1.X, the predictions key will be missing.
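Because the response can be either a bare list or an object with a predictions key, client code may want to accept both shapes. The following is a minimal sketch with a hypothetical helper name, `extract_predictions`; it assumes the response body is a JSON string.

```python
import json


def extract_predictions(response_body):
    """Return the predictions list from either response shape."""
    parsed = json.loads(response_body)
    if isinstance(parsed, dict) and "predictions" in parsed:
        return parsed["predictions"]
    return parsed


with_key = '{"predictions": [11633.100167144921, 8522.117402884991]}'
bare_list = "[11633.100167144921, 8522.117402884991]"

print(extract_predictions(with_key))
print(extract_predictions(bare_list))
```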

Clean up resources

Once you're done using the endpoint, delete its associated resources:

az ml online-endpoint delete --name $ENDPOINT_NAME --yes

ml_client.online_endpoints.begin_delete(endpoint_name)

deployment_client.delete_endpoint(endpoint_name)
