Solution Idea
This solution idea describes real-time machine learning inference on Azure Kubernetes Service (AKS).
Potential use cases
Use AKS when you need high-scale production deployments of your machine learning models. Here, high-scale means capabilities such as fast response times, autoscaling of the deployed service, and logging. For more information, see Deploy a model to an Azure Kubernetes Service cluster.
In this solution, Kubeflow manages the deployment to AKS. Your ML models run on AKS clusters backed by GPU-enabled VMs.
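To make the GPU-backed deployment more concrete, the following minimal sketch builds a plain Kubernetes Deployment manifest that requests a GPU for the model container. It is an illustration under stated assumptions, not the manifest that Kubeflow generates: the function name, the `model-server` name, and the `myregistry.azurecr.io/model-server:1.0` image are hypothetical, and the `nvidia.com/gpu` resource assumes the NVIDIA device plugin is available on the GPU node pool.

```python
import json


def gpu_deployment_manifest(name, image, gpus=1, replicas=1):
    """Build a Kubernetes Deployment manifest (as a dict) for a GPU-backed model server.

    All argument values are supplied by the caller; nothing here is specific
    to the article's actual deployment.
    """
    labels = {"app": name}
    container = {
        "name": name,
        "image": image,  # hypothetical image pulled from Azure Container Registry
        # GPUs are requested through resource limits. "nvidia.com/gpu" is the
        # resource name exposed by the NVIDIA device plugin (assumed installed).
        "resources": {"limits": {"nvidia.com/gpu": gpus}},
    }
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }


if __name__ == "__main__":
    manifest = gpu_deployment_manifest(
        "model-server", "myregistry.azurecr.io/model-server:1.0"
    )
    # kubectl accepts JSON as well as YAML manifests.
    print(json.dumps(manifest, indent=2))
```

Building the manifest as a dictionary keeps the sketch dependency-free. In practice, a Kubeflow serving resource would usually take the place of a hand-written Deployment.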
Architecture
Data flow
- Package the machine learning (ML) model into a container and publish it to Azure Container Registry (ACR).
- Azure Blob storage hosts the training datasets and the trained model.
- Use Kubeflow to deploy the training job to Azure Kubernetes Service (AKS). Distributed training jobs on AKS include parameter servers and worker nodes.
- Serve the production model by using Kubeflow, which promotes a consistent environment across test, control, and production.
- AKS supports GPU-enabled VMs.
- Developers build features that query the model running in the AKS cluster.
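The last step of the data flow, querying the served model, might look like the following minimal sketch. It assumes the model is exposed over a REST endpoint that follows the TensorFlow Serving style `:predict` convention (which KServe's V1 protocol also uses). The article doesn't specify the serving protocol, so treat the URL layout, the `instances` and `predictions` field names, and both function names as assumptions.

```python
import json
import urllib.request


def build_predict_request(base_url, model_name, instances):
    """Build (but do not send) a POST request for a ':predict' style endpoint."""
    url = f"{base_url.rstrip('/')}/v1/models/{model_name}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(base_url, model_name, instances, timeout=10.0):
    """Send the request and return the 'predictions' field of the JSON response."""
    request = build_predict_request(base_url, model_name, instances)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return payload["predictions"]
```

A feature would then call something like `predict("http://<ingress-address>", "my-model", [[1.0, 2.0, 3.0]])`, where the address and model name depend on how the service is exposed from the cluster. Separating request construction from sending keeps the payload format easy to check without a live cluster.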
Components
Next steps
Read product documentation:
- What is Azure Machine Learning?
- Azure Kubernetes Service (AKS)
- Deploy a model to an Azure Kubernetes Service cluster
- Kubeflow on Azure
Related resources
See other Architecture Center articles: