Deploy machine learning models to AKS with Kubeflow

Services: Azure Blob Storage, Azure Container Registry, Azure Kubernetes Service

Solution Idea


This solution idea describes real-time inference of machine learning models on Azure Kubernetes Service (AKS).

Potential use cases

Use AKS when you need high-scale production deployments of your machine learning models. High-scale means capabilities such as fast response time, autoscaling of the deployed service, and logging. For more information, see Deploy a model to an Azure Kubernetes Service cluster.
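On AKS, autoscaling of a deployed service is typically handled by a Kubernetes HorizontalPodAutoscaler. The manifest below is a minimal sketch only; the deployment name `ml-model-server` and the scaling thresholds are illustrative assumptions, not part of this solution.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ml-model-server-hpa    # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ml-model-server      # assumed name of the model-serving deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU exceeds 70%
```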

In this solution, Kubeflow manages the deployment to AKS. Your ML models run on AKS clusters that are backed by GPU-enabled virtual machines (VMs).
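A distributed training job managed by Kubeflow can be expressed as a training-operator custom resource. The TFJob sketch below shows the parameter-server and worker split on GPU-enabled nodes; the job name and the container image reference are hypothetical, assuming an image pushed to your ACR instance.

```yaml
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: distributed-training    # illustrative name
spec:
  tfReplicaSpecs:
    PS:                         # parameter servers
      replicas: 1
      template:
        spec:
          containers:
            - name: tensorflow
              image: myacr.azurecr.io/train:latest   # hypothetical ACR image
    Worker:                     # worker nodes
      replicas: 2
      template:
        spec:
          containers:
            - name: tensorflow
              image: myacr.azurecr.io/train:latest   # hypothetical ACR image
              resources:
                limits:
                  nvidia.com/gpu: 1   # schedule onto GPU-enabled AKS nodes
```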


Architecture diagram: deploying machine learning models to Azure Kubernetes Service (AKS).

Data flow

  1. Package the machine learning (ML) model into a container and publish it to Azure Container Registry (ACR).
  2. Azure Blob Storage hosts the training data sets and the trained model.
  3. Use Kubeflow to deploy the training job to Azure Kubernetes Service (AKS); distributed training jobs on AKS include parameter servers and worker nodes.
  4. Serve the production model by using Kubeflow, which promotes a consistent environment across test, control, and production.
  5. AKS supports GPU-enabled VMs.
  6. Developers build features to query the model that runs in the AKS cluster.
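As an illustration of step 6, a client feature queries the model over HTTP. The sketch below builds and parses a TensorFlow Serving-style JSON prediction request; the endpoint URL and the payload schema are assumptions about how the model is served, not a prescribed part of this solution.

```python
import json
from urllib import request

# Hypothetical scoring endpoint exposed by the AKS cluster's ingress.
SCORING_URL = "http://<cluster-ingress>/v1/models/my-model:predict"


def build_prediction_request(instances):
    """Serialize feature rows into a TF Serving-style JSON request body."""
    return json.dumps({"instances": instances}).encode("utf-8")


def parse_prediction_response(body):
    """Extract the predictions list from a JSON response body."""
    return json.loads(body)["predictions"]


def query_model(instances, url=SCORING_URL):
    """POST feature rows to the model endpoint and return its predictions."""
    req = request.Request(
        url,
        data=build_prediction_request(instances),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:  # requires a reachable endpoint
        return parse_prediction_response(resp.read())
```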


Next steps

Read the product documentation, such as Deploy a model to an Azure Kubernetes Service cluster, and see other Architecture Center articles.