Deploy MLflow models

APPLIES TO: Azure CLI ml extension v2 (current)

In this article, learn how to deploy your MLflow model to Azure ML for both real-time and batch inference. Azure ML supports no-code deployment of models created and logged with MLflow. This means that you don't have to provide a scoring script or an environment.

For no-code deployment, Azure Machine Learning:

  • Dynamically installs the Python packages listed in the conda.yaml file. This means the dependencies are installed at container runtime.
    • The base container image/curated environment used for dynamic installation is AzureML-mlflow-ubuntu18.04-py37-cpu-inference.
  • Provides an MLflow base image/curated environment that contains the following items:


If you are used to deploying models with scoring scripts and custom environments and want to achieve the same functionality with MLflow models, we recommend reading Using MLflow models for no-code deployment.


Consider the following limitations when deploying MLflow models to Azure Machine Learning:

  • The Spark flavor is currently not supported for deployment.
  • Data type mlflow.types.DataType.Binary is not supported as a column type in signatures. For models that work with images, we suggest using either (a) tensor inputs with the TensorSpec input type, or (b) Base64 encoding with a mlflow.types.DataType.String column type, which is commonly used when binary data needs to be stored and transferred over media.
  • Signatures with tensors of unspecified shape (-1) are currently supported only in the batch-size dimension. For instance, a signature with shape (-1, -1, -1, 3) is not supported, but (-1, 300, 300, 3) is.
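Two of the limitations above can be illustrated with a short sketch that uses only the Python standard library. The helper names (`encode_image_for_string_column`, `decode_image_from_string_column`, `has_supported_tensor_shape`) are hypothetical and are not part of MLflow or Azure Machine Learning. They only show the Base64 round trip for a string-typed column and a local check that -1 appears only in the batch dimension.

```python
import base64


def encode_image_for_string_column(image_bytes: bytes) -> str:
    """Encode raw image bytes as Base64 text, suitable for a string-typed column."""
    return base64.b64encode(image_bytes).decode("ascii")


def decode_image_from_string_column(encoded: str) -> bytes:
    """Recover the original bytes from the Base64 text (for example, inside the model)."""
    return base64.b64decode(encoded)


def has_supported_tensor_shape(shape: tuple) -> bool:
    """Return True if -1 (unspecified size) is used, at most, in the first (batch) dimension."""
    return all(dim != -1 for dim in shape[1:])


# Stand-in bytes for illustration; a real payload would hold the full image file content.
raw = b"\x89PNG\r\n\x1a\n"
encoded = encode_image_for_string_column(raw)
restored = decode_image_from_string_column(encoded)
```

With this check, `(-1, 300, 300, 3)` passes while `(-1, -1, -1, 3)` is rejected, matching the example given in the limitation.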

For more information about how to specify requests to online endpoints, see Considerations when deploying to real-time inference. For more information about the supported file types in batch endpoints, see Considerations when deploying to batch inference.

Deployment tools

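There are three workflows for deploying MLflow models to Azure Machine Learning:

  • Deploy using the MLflow plugin
  • Deploy using Azure ML CLI (v2)
  • Deploy using Azure Machine Learning studio

Each workflow has different capabilities, particularly around which type of compute it can target. The following table shows them: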

Scenario MLflow SDK Azure ML CLI/SDK v2 Azure ML studio
Deploy MLflow models to managed online endpoints
Deploy MLflow models to managed batch endpoints
Deploy MLflow models to ACI/AKS
Deploy MLflow models to ACI/AKS (with a scoring script) 1


  • 1 No-code deployment is not supported when deploying to ACI/AKS from Azure ML studio. We recommend switching to our managed online endpoints instead.

Which option to use?

If you are familiar with MLflow or your platform supports MLflow natively (like Azure Databricks) and you wish to continue using the same set of methods, use the azureml-mlflow plugin. On the other hand, if you are more familiar with the Azure ML CLI v2, you want to automate deployments using automation pipelines, or you want to keep deployment configurations in a git repository, we recommend using the Azure ML CLI v2. If you want to quickly deploy and test models trained with MLflow, you can use Azure Machine Learning studio UI deployment.

Deploy using the MLflow plugin

The MLflow plugin azureml-mlflow can deploy models to Azure ML, targeting Azure Kubernetes Service (AKS), Azure Container Instances (ACI), or managed endpoints for real-time serving.


Deploying to managed batch endpoints is not supported in the MLflow plugin at the moment.


  • Install the azureml-mlflow package.
  • If you are running outside an Azure ML compute, configure the MLflow tracking URI or MLflow's registry URI to point to the workspace you are working on. For more information about how to set up the tracking environment, see Track runs using MLflow with Azure Machine Learning.


  1. Ensure your model is registered in the Azure Machine Learning registry. Deployment of unregistered models is not supported in Azure Machine Learning. You can register a new model using the MLflow SDK:

    import mlflow
    mlflow.register_model(f"runs:/{run_id}/{artifact_path}", "sample-sklearn-mlflow-model")
  2. Deployments can be created using either the Python SDK for MLflow or the MLflow CLI. In both cases, you can specify a JSON configuration file with the details of the deployment you want. If you don't specify one, a default deployment is created using Azure Container Instances (ACI) and a minimal configuration.

    {
        "instance_type": "Standard_DS2_v2",
        "instance_count": 1
    }


    The full specification of this configuration can be found at Managed online deployment schema (v2).

  3. Save the deployment configuration to a file:

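    import json

    deploy_config = {
        "instance_type": "Standard_DS2_v2",
        "instance_count": 1,
    }

    deployment_config_path = "deployment_config.json"
    with open(deployment_config_path, "w") as outfile:
        # Serialize the configuration so the deployment client can read it from disk.
        json.dump(deploy_config, outfile)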
  4. Create a deployment client using the Azure Machine Learning Tracking URI.

    from mlflow.deployments import get_deploy_client
    # Set the tracking uri in the deployment client.
    client = get_deploy_client("<azureml-mlflow-tracking-url>")
  5. Run the deployment

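    model_name = "mymodel"
    model_version = 1
    # Define the model path; the "name" parameter is the service name.
    # If the model is not registered, it gets registered automatically and a name is autogenerated using the "name" parameter below.
    client.create_deployment(
        name="<deployment-name>",  # placeholder: replace with your service name
        model_uri=f"models:/{model_name}/{model_version}",
        config={"deploy-config-file": deployment_config_path},
    )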
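The last step can be wrapped in a small helper, sketched below under a few assumptions. `deploy_registered_model` and `DryRunClient` are hypothetical names introduced for illustration. The helper expects a client object exposing `create_deployment(name, model_uri, config)`, such as the one returned by `get_deploy_client`. The stand-in client only records its arguments so the sketch runs without an Azure workspace.

```python
def deploy_registered_model(client, deployment_name, model_name, model_version, config_path):
    """Build the registry URI for a registered model and ask the client to deploy it."""
    model_uri = f"models:/{model_name}/{model_version}"
    return client.create_deployment(
        name=deployment_name,
        model_uri=model_uri,
        config={"deploy-config-file": config_path},
    )


class DryRunClient:
    """Hypothetical stand-in for a deployment client; it only echoes what it was asked to do."""

    def create_deployment(self, name, model_uri, config=None):
        return {"name": name, "model_uri": model_uri, "config": config}


# Dry run: no deployment is created, the arguments are just returned for inspection.
preview = deploy_registered_model(
    DryRunClient(), "demo-deployment", "mymodel", 1, "deployment_config.json"
)
```

Swapping `DryRunClient()` for a real deployment client would submit the deployment. Whether that succeeds depends on the workspace and configuration, which this sketch does not verify.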

Deploy using Azure ML CLI (v2)

You can use Azure ML CLI v2 to deploy models trained and logged with MLflow to managed endpoints (online and batch). Deploying an MLflow model with the Azure ML CLI v2 is a no-code deployment, so you don't have to provide a scoring script or an environment, although you can if needed.


Before following the steps in this article, make sure you have the following prerequisites:

The information in this article is based on code samples contained in the azureml-examples repository. To run the commands locally without having to copy/paste YAML and other files, clone the repo and then change directories to the cli directory in the repo:

git clone --depth 1
cd azureml-examples
cd cli

If you haven't already set the defaults for the Azure CLI, save your default settings. To avoid passing in the values for your subscription, workspace, and resource group multiple times, use the following commands. Replace the following parameters with values for your specific configuration:

  • Replace <subscription> with your Azure subscription ID.
  • Replace <workspace> with your Azure Machine Learning workspace name.
  • Replace <resource-group> with the Azure resource group that contains your workspace.
  • Replace <location> with the Azure region that contains your workspace.


You can see what your current defaults are by using the az configure -l command.

az account set --subscription <subscription>
az configure --defaults workspace=<workspace> group=<resource-group> location=<location>

The code snippets in this article use the ENDPOINT_NAME environment variable, which contains the name of the endpoint to create and use. To set it, run the following command from the CLI. Replace <YOUR_ENDPOINT_NAME> with the name of your endpoint:

export ENDPOINT_NAME="<YOUR_ENDPOINT_NAME>"




This example shows how you can deploy an MLflow model to an online endpoint using CLI (v2).


For MLflow no-code-deployment, testing via local endpoints is currently not supported.

  1. Create a YAML configuration file for your endpoint. The following example configures the name and authentication mode of the endpoint:


    name: my-endpoint
    auth_mode: key
  2. To create a new endpoint using the YAML configuration, use the following command:

    az ml online-endpoint create --name $ENDPOINT_NAME -f endpoints/online/mlflow/create-endpoint.yaml
  3. Create a YAML configuration file for the deployment.

    The following example configures a deployment sklearn-diabetes to the endpoint created in the previous step. The model is registered from a job previously run:

    a. Get the job name of the training job. In this example we are assuming the job you want is the last one submitted to the platform.

    JOB_NAME=$(az ml job list --query "[0].name" | tr -d '"')

    b. Register the model in the registry.

    az ml model create --name "mir-sample-sklearn-mlflow-model" \
                       --type "mlflow_model" \
                       --path "azureml://jobs/$JOB_NAME/outputs/artifacts/model"

    c. Create the deployment YAML file:


    name: sklearn-deployment
    endpoint_name: my-endpoint
    model: azureml:mir-sample-sklearn-mlflow-model@latest
    instance_type: Standard_DS2_v2
    instance_count: 1


    For MLflow no-code deployment (NCD) to work, the model must be registered with its type set to mlflow_model (type: mlflow_model). For more information, see CLI (v2) model YAML schema.

  4. To create the deployment using the YAML configuration, use the following command:

    az ml online-deployment create --name sklearn-deployment --endpoint $ENDPOINT_NAME -f endpoints/online/mlflow/sklearn-deployment.yaml --all-traffic
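If you prefer to drive the CLI steps above from a Python automation script, the two create commands can be assembled as argument lists. This is a minimal sketch: the function name `build_online_deployment_commands` is hypothetical, and it uses only the flags that appear in the commands above. It builds and prints the commands without running them.

```python
import shlex


def build_online_deployment_commands(endpoint_name, deployment_name, endpoint_yaml, deployment_yaml):
    """Return the endpoint and deployment creation commands as argument lists."""
    create_endpoint = [
        "az", "ml", "online-endpoint", "create",
        "--name", endpoint_name,
        "-f", endpoint_yaml,
    ]
    create_deployment = [
        "az", "ml", "online-deployment", "create",
        "--name", deployment_name,
        "--endpoint", endpoint_name,
        "-f", deployment_yaml,
        "--all-traffic",
    ]
    return [create_endpoint, create_deployment]


commands = build_online_deployment_commands(
    "my-endpoint",
    "sklearn-deployment",
    "endpoints/online/mlflow/create-endpoint.yaml",
    "endpoints/online/mlflow/sklearn-deployment.yaml",
)
for command in commands:
    # Print a shell-safe rendering; nothing is executed here.
    print(shlex.join(command))
```

To execute the commands, each list could be passed to `subprocess.run(command, check=True)` on a machine where the Azure CLI and its ml extension are installed and configured.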

Deploy using Azure Machine Learning studio

You can use Azure Machine Learning studio to deploy models to Managed Online Endpoints.


Although deploying to ACI or AKS with Azure Machine Learning studio is possible, the no-code deployment feature is not available for these compute targets. We recommend using managed online endpoints, as they provide a superior set of features.

  1. Ensure your model is registered in the Azure Machine Learning registry. Deployment of unregistered models is not supported in Azure Machine Learning. You can register models from files in the local file system or from the output of a job:

    You can register the model directly from the job's output using Azure Machine Learning studio. To do so, navigate to the Outputs + logs tab in the run where your model was trained and select the option Create model.

    Animated gif that demonstrates how to register a model directly from outputs.

  2. From studio, select your workspace and then use either the endpoints or models page to create the endpoint deployment:

    1. From the Endpoints page, Select +Create.

      Screenshot showing create option on the Endpoints UI page.

    2. Provide a name and authentication type for the endpoint, and then select Next.

    3. When selecting a model, select the MLflow model registered previously. Select Next to continue.

    4. When you select a model registered in MLflow format, in the Environment step of the wizard, you don't need a scoring script or an environment.

      Screenshot showing no code and environment needed for MLflow models.

    5. Complete the wizard to deploy the model to the endpoint.

      Screenshot showing NCD review screen.

Considerations when deploying to real-time inference

The following input types are supported in Azure ML when deploying models with no-code deployment. See the notes below the table for additional considerations.

Input type Support in MLflow models (serve) Support in Azure ML
JSON-serialized pandas DataFrames in the split orientation
JSON-serialized pandas DataFrames in the records orientation 1
CSV-serialized pandas DataFrames 2
Tensor input format as JSON-serialized lists (tensors) and dictionary of lists (named tensors)
Tensor input formatted as in TF Serving’s API


  • 1 We suggest using the split orientation instead. The records orientation doesn't guarantee that column ordering is preserved.
  • 2 We suggest exploring batch inference for processing files.

Regardless of the input type used, Azure Machine Learning requires inputs to be provided in a JSON payload, within a dictionary key input_data. This key is not required when serving models with the mlflow models serve command, so payloads can't be used interchangeably.
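As an illustration of the recommendation to prefer the split orientation, the following sketch converts a list of record dictionaries into a split-oriented payload wrapped in the input_data key. The function name `records_to_split_payload` is hypothetical, and the column order is passed explicitly because records alone don't fix it. The zero-based index is an arbitrary choice for this sketch.

```python
import json


def records_to_split_payload(records, columns):
    """Convert record dictionaries to a split-oriented payload under the input_data key."""
    data = [[row[column] for column in columns] for row in records]
    return {
        "input_data": {
            "columns": list(columns),
            "index": list(range(len(data))),
            "data": data,
        }
    }


records = [
    {"age": 63, "sex": 1, "chol": 233},
    {"sex": 0, "chol": 250, "age": 41},  # keys in a different order on purpose
]
payload = records_to_split_payload(records, columns=["age", "sex", "chol"])
body = json.dumps(payload)
```

Because every row is built by walking the same `columns` list, the values line up with the column names even when the source dictionaries list their keys in different orders.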

Creating requests

Your inputs should be submitted inside a JSON payload containing a dictionary with key input_data.

Payload example for a JSON-serialized pandas DataFrame in the split orientation

    {
        "input_data": {
            "columns": [
                "age", "sex", "trestbps", "chol", "fbs", "restecg", "thalach", "exang", "oldpeak", "slope", "ca", "thal"
            ],
            "index": [1],
            "data": [
                [1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]
            ]
        }
    }

Payload example for a tensor input

    {
        "input_data": [
              [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
              [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
              [1, 1, 0, 233, 1, 2, 150, 0, 2.3, 3, 0, 2],
              [1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]
        ]
    }

Payload example for a named-tensor input

    {
        "input_data": {
            "tokens": [
              [0, 655, 85, 5, 23, 84, 23, 52, 856, 5, 23, 1]
            ],
            "mask": [
              [0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
            ]
        }
    }
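Any of the payloads above can be sent with a plain HTTP POST. The sketch below builds such a request with the standard library, assuming an endpoint that uses key authentication and accepts the key as a bearer token. The function name `build_scoring_request`, the scoring URI, and the key are placeholders, not real values. The request is only constructed, not sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, api_key, input_data):
    """Wrap the inputs under input_data and prepare a JSON POST request."""
    body = json.dumps({"input_data": input_data}).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumption: key-based auth
        },
        method="POST",
    )


request = build_scoring_request(
    "https://example.invalid/score",  # placeholder scoring URI
    "<endpoint-key>",                 # placeholder key
    [[1, 1, 145, 233, 1, 2, 150, 0, 2.3, 3, 0, 2]],
)
```

Passing `request` to `urllib.request.urlopen` would submit it. The shape of the response depends on the deployed model.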

Considerations when deploying to batch inference

Azure Machine Learning supports no-code deployment for batch inference in managed endpoints. This is a convenient way to deploy models that need to process large amounts of data in batches.

How work is distributed on workers

Work is distributed at the file level, for both structured and unstructured data. As a consequence, only file datasets or URI folders are supported for this feature. Each worker processes batches of Mini batch size files at a time. Further parallelism can be achieved if Max concurrency per instance is increased.
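To make the file-level distribution concrete, here is an illustrative sketch of how a list of files breaks down into mini batches of a given size. It is a mental model only, not the service's actual scheduler, and `split_into_mini_batches` is a hypothetical helper.

```python
def split_into_mini_batches(files, mini_batch_size):
    """Group files into consecutive mini batches of at most mini_batch_size files each."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    return [
        files[start:start + mini_batch_size]
        for start in range(0, len(files), mini_batch_size)
    ]


files = ["a.csv", "b.csv", "c.csv", "d.csv", "e.csv"]
mini_batches = split_into_mini_batches(files, mini_batch_size=2)
```

Each mini batch is the unit of work a worker picks up, so a single very large file can't be split across workers in this model.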


Nested folder structures are not explored during inference. If you are partitioning your data using folders, make sure to flatten the structure beforehand.
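One way to flatten a partitioned folder before submitting it is sketched below. `flatten_folder` is a hypothetical helper. It copies every file into a single target directory and joins the original path parts with a double underscore to avoid name collisions, which is an arbitrary naming choice. The target directory is assumed to be outside the source directory.

```python
import shutil
import tempfile
from pathlib import Path


def flatten_folder(source_dir, target_dir):
    """Copy all files under source_dir into target_dir, with no subfolders."""
    source, target = Path(source_dir), Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            # Encode the relative folder path in the file name to keep names unique.
            flat_name = "__".join(path.relative_to(source).parts)
            destination = target / flat_name
            shutil.copy2(path, destination)
            copied.append(destination)
    return copied


# Small self-contained demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as workdir:
    nested = Path(workdir) / "nested"
    (nested / "2023" / "01").mkdir(parents=True)
    (nested / "2023" / "01" / "a.csv").write_text("x\n1\n")
    (nested / "b.csv").write_text("x\n2\n")
    flat_names = [p.name for p in flatten_folder(nested, Path(workdir) / "flat")]
```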

File type support

The following data types are supported for batch inference.

File extension | Type returned as model's input | Signature requirement
.csv | pd.DataFrame | ColSpec. If not provided, column typing is not enforced.
.png, .jpg, .jpeg, .tiff, .bmp, .gif | np.ndarray | TensorSpec. Input is reshaped to match the tensor's shape if available. If no signature is available, tensors of type np.uint8 are inferred.
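Before submitting a folder, it can help to check locally which files fall under the supported extensions. The lookup below restates the table as a dictionary. `expected_model_input` is a hypothetical helper, and matching extensions case-insensitively is an assumption of this sketch, not documented behavior.

```python
from pathlib import Path

# Extension -> type the model receives, as listed in the table above.
SUPPORTED_INPUTS = {
    ".csv": "pd.DataFrame",
    ".png": "np.ndarray",
    ".jpg": "np.ndarray",
    ".jpeg": "np.ndarray",
    ".tiff": "np.ndarray",
    ".bmp": "np.ndarray",
    ".gif": "np.ndarray",
}


def expected_model_input(file_name):
    """Return the input type name for a file, or None if the extension is not listed."""
    return SUPPORTED_INPUTS.get(Path(file_name).suffix.lower())


unsupported = [
    name for name in ["scores.csv", "photo.jpeg", "notes.txt"]
    if expected_model_input(name) is None
]
```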
