This reference architecture describes how Azure Arc extends Kubernetes cluster management and configuration across customer datacenters, edge locations, and multiple cloud environments.
Download a Visio file of this architecture.
The following workflow corresponds to the previous diagram:
Azure Arc-enabled Kubernetes: Attach and configure Kubernetes clusters inside or outside of Azure by using Azure Arc-enabled Kubernetes. When a Kubernetes cluster is attached to Azure Arc, it's assigned an Azure Resource Manager ID and a managed identity.
Azure Kubernetes Service (AKS): Host Kubernetes clusters in Azure to reduce the complexity and operational overhead of Kubernetes cluster management.
On-premises Kubernetes cluster: Attach Cloud Native Computing Foundation (CNCF)-certified Kubernetes clusters that are hosted in on-premises or non-Microsoft cloud environments.
Azure Policy: Deploy and manage policies for Azure Arc-enabled Kubernetes clusters.
Azure Monitor: Observe and monitor Azure Arc-enabled Kubernetes clusters.
Azure Arc extends the Azure platform, which makes it possible to build applications and services that can run across datacenters, at the edge, and in multicloud environments.
AKS is a managed service for deploying and scaling Kubernetes clusters.
Azure Policy makes it possible to achieve real-time cloud compliance at scale with consistent resource governance.
Azure Monitor provides end-to-end observability for your applications, infrastructure, and network.
You can use Azure Arc to register Kubernetes clusters that are hosted outside of Microsoft Azure. You can then use Azure tools to manage these clusters and AKS-hosted clusters.
Typical uses for this architecture include:
Managing inventory, grouping, and tagging for on-premises Kubernetes clusters and AKS-hosted clusters.
Using Azure Monitor to monitor Kubernetes clusters across hybrid environments.
Using Azure Policy to help deploy and enforce policies for Kubernetes clusters across hybrid environments.
Using Azure Policy to help deploy and enforce GitOps.
Maximizing your on-premises graphics processing unit (GPU) investment by using it to train and deploy Azure Machine Learning workflows.
Using Azure Monitor managed service for Prometheus and Managed Grafana to monitor and visualize Kubernetes workloads.
You can apply the following recommendations to most scenarios. Follow these recommendations unless you have a specific requirement that overrides them.
You can register any active CNCF-certified Kubernetes cluster. You need a `kubeconfig` file to access the cluster and a cluster-admin role on the cluster to deploy Azure Arc-enabled Kubernetes agents. Use the Azure CLI to perform cluster registration tasks. The user or service principal that you use for the `az login` and `az connectedk8s connect` commands requires Read and Write permissions on the `Microsoft.Kubernetes/connectedClusters` resource type. The Kubernetes Cluster - Azure Arc Onboarding role has these permissions and can be assigned to either the user principal or the service principal. Helm 3 is required to onboard the cluster by using the `connectedk8s` extension. Azure CLI version 2.3 or later is required to install the Azure Arc-enabled Kubernetes CLI extensions.
Azure Arc-enabled Kubernetes consists of a few agents (or operators) that are deployed to the `azure-arc` namespace of the cluster:

- `deployment.apps/config-agent` watches the connected cluster for source control configuration resources that are applied on the cluster and updates the compliance state.
- `deployment.apps/controller-manager` is an operator of operators that orchestrates interactions between Azure Arc components.
- `deployment.apps/metrics-agent` collects metrics from other Azure Arc agents to ensure that these agents perform optimally.
- `deployment.apps/cluster-metadata-operator` gathers cluster metadata, including the cluster version, node count, and Azure Arc agent version.
- `deployment.apps/resource-sync-agent` synchronizes the previously mentioned cluster metadata to Azure.
- `deployment.apps/clusteridentityoperator` maintains the Managed Service Identity certificate that other agents use to communicate with Azure.
- `deployment.apps/flux-logs-agent` collects logs from the Flux operators that are deployed as part of source control configuration.
- `deployment.apps/extension-manager` installs and manages the lifecycle of extension Helm charts.
- `deployment.apps/kube-aad-proxy` handles authentication for requests sent to the cluster via the AKS cluster connect feature.
- `deployment.apps/clusterconnect-agent` is a reverse proxy agent that enables the cluster connect feature to provide access to the cluster's API server. It's an optional component that's deployed only if the cluster connect feature is enabled on the cluster.
- `deployment.apps/guard` is an authentication and authorization webhook server that's used for Microsoft Entra role-based access control (RBAC). It's an optional component that's deployed only if Azure RBAC is enabled on the cluster.
- `deployment.apps/extension-events-collector` collects logs related to extension lifecycle management and aggregates them into events that correspond to each operation, such as Create, Upgrade, and Delete.
- `deployment.apps/logcollector` collects platform telemetry to help ensure the operational health of the platform.
For more information, see Connect an existing Kubernetes cluster to Azure Arc.
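As a quick check after onboarding, you can confirm that these agents are running. This is a generic kubectl query, not a required step:

```bash
# List the Azure Arc agent deployments and pods in the azure-arc namespace.
kubectl get deployments,pods --namespace azure-arc
```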
Monitoring your containers is crucial. Azure Monitor container insights provides robust monitoring capabilities for AKS and AKS engine clusters. You can also configure Azure Monitor container insights to monitor Azure Arc-enabled Kubernetes clusters that are hosted outside of Azure. This configuration provides comprehensive monitoring of your Kubernetes clusters across Azure, on-premises, and in non-Microsoft cloud environments.
Azure Monitor container insights provides performance visibility by collecting memory and processor metrics from controllers, nodes, and containers. These metrics are available in Kubernetes through the Metrics API. Container logs are also collected. After you enable monitoring from Kubernetes clusters, a containerized version of the Log Analytics agent automatically collects metrics and logs. Metrics are written to the metrics store, and log data is written to the logs store that's associated with your Log Analytics workspace. For more information, see Azure Monitor features for Kubernetes monitoring.
You can enable Azure Monitor container insights for one or more deployments of Kubernetes by using a PowerShell script or a Bash script.
For more information, see Enable monitoring for Kubernetes clusters.
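One way to enable container insights on an Azure Arc-enabled cluster is through the cluster extension mechanism in the Azure CLI. The following sketch assumes an existing Log Analytics workspace; the cluster name, resource group, and workspace resource ID are placeholders.

```bash
# Enable container insights on an Azure Arc-enabled cluster by deploying the
# Azure Monitor containers extension. Names and the workspace ID are placeholders.
az k8s-extension create \
  --name azuremonitor-containers \
  --cluster-name my-arc-cluster \
  --resource-group my-resource-group \
  --cluster-type connectedClusters \
  --extension-type Microsoft.AzureMonitor.Containers \
  --configuration-settings logAnalyticsWorkspaceResourceID=<log-analytics-workspace-resource-id>
```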
Use Azure Policy to make sure that each GitOps-enabled `Microsoft.Kubernetes/connectedClusters` resource or `Microsoft.ContainerService/managedClusters` resource has specific `Microsoft.KubernetesConfiguration/fluxConfigurations` applied on it. For example, you can apply a baseline configuration to one or more clusters, or deploy specific applications to multiple clusters. To use Azure Policy, choose a definition from the Azure Policy built-in definitions for Azure Arc-enabled Kubernetes and then create a policy assignment. When you create the policy assignment, set the scope to an Azure resource group or subscription. Also set the parameters for the `fluxConfiguration` that's created. When the assignment is created, the Azure Policy engine identifies all `connectedCluster` or `managedCluster` resources that are in scope and then applies the `fluxConfiguration` to each resource.
If you use multiple source repositories for each cluster, such as one repository for central IT or the cluster operator and other repositories for application teams, create multiple policy assignments and configure each assignment to use a different source repository.
For more information, see Deploy applications consistently at scale by using Flux v2 configurations and Azure Policy.
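A policy assignment can also be created from the Azure CLI. The following is a minimal sketch: the assignment name, scope, parameter file, and built-in definition ID are placeholders, and you look up the actual definition under the Azure Arc-enabled Kubernetes built-in definitions.

```bash
# Assign a built-in GitOps policy definition at resource-group scope.
# The definition ID and parameter file are placeholders. A DeployIfNotExists
# policy needs a managed identity, so a system-assigned identity and location are set.
az policy assignment create \
  --name deploy-baseline-flux-config \
  --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group> \
  --policy <built-in-definition-id> \
  --params @flux-config-params.json \
  --mi-system-assigned \
  --location <region>
```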
GitOps is the practice of defining the desired state of Kubernetes configurations, such as deployments and namespaces, in a source repository. This repository can be a Git or Helm repository, a Bucket, or Azure Blob Storage. An operator then polls the repository and pulls these configurations into the cluster.

The connection between your cluster and one or more source repositories is enabled by deploying the `microsoft.flux` extension to your cluster. The `fluxConfiguration` resource properties represent where and how Kubernetes resources should flow from the source repository to your cluster. The `fluxConfiguration` data is stored encrypted at rest in an Azure Cosmos DB database to help ensure data confidentiality.

The `flux-config` agent that runs in your cluster monitors for new or updated `fluxConfiguration` extension resources on the Azure Arc-enabled Kubernetes resource, deploys applications from the source repository, and propagates all updates that are made to the `fluxConfiguration`. You can create multiple `fluxConfiguration` resources with the `namespace` scope on the same Azure Arc-enabled Kubernetes cluster to achieve multitenancy.
The source repository can contain any valid Kubernetes resources, including Namespaces, ConfigMaps, Deployments, and DaemonSets. It can also contain Helm charts for deploying applications. Common source repository scenarios include defining a baseline configuration for your organization that can include common RBAC roles and bindings, monitoring agents, logging agents, and cluster-wide services.
You can also manage a larger collection of clusters that are deployed across heterogeneous environments. For example, you can have one repository that defines the baseline configuration for your organization, and then apply that configuration to multiple Kubernetes clusters simultaneously. You can also deploy applications to a cluster from multiple source repositories.
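The following Azure CLI sketch shows one way to attach a Git repository as a Flux v2 configuration. The repository URL, cluster name, namespace, and kustomization path are placeholder values.

```bash
# Create a Flux v2 configuration that syncs a baseline from a Git repository.
# All names, URLs, and paths are placeholders; replace them with your own.
az k8s-configuration flux create \
  --resource-group my-resource-group \
  --cluster-name my-arc-cluster \
  --cluster-type connectedClusters \
  --name baseline-config \
  --namespace cluster-config \
  --scope cluster \
  --url https://github.com/<org>/<baseline-repo> \
  --branch main \
  --kustomization name=infra path=./infrastructure prune=true
```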
For more information, see Deploy applications by using GitOps with Flux v2.
In Machine Learning, you can choose an AKS or Azure Arc-enabled Kubernetes cluster as a compute target for your machine learning processes. This capability enables you to train or deploy machine learning models in your own self-hosted or multicloud infrastructure. This approach allows you to combine your on-premises GPU investments with the ease of management that Machine Learning provides in the cloud.
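As a rough sketch, attaching an Azure Arc-enabled cluster as a compute target involves deploying the Machine Learning cluster extension and then attaching the cluster to a workspace. The extension settings, workspace name, and resource names below are placeholders and assume the `ml` and `k8s-extension` Azure CLI extensions are installed.

```bash
# Deploy the Azure Machine Learning extension to the Arc-enabled cluster
# (training-only configuration shown), then attach it to a workspace as a
# Kubernetes compute target. All names and IDs are placeholders.
az k8s-extension create \
  --name azureml \
  --extension-type Microsoft.AzureML.Kubernetes \
  --cluster-type connectedClusters \
  --cluster-name my-arc-cluster \
  --resource-group my-resource-group \
  --configuration-settings enableTraining=True

az ml compute attach \
  --resource-group my-resource-group \
  --workspace-name my-ml-workspace \
  --type Kubernetes \
  --name arc-gpu-compute \
  --resource-id /subscriptions/<subscription-id>/resourceGroups/my-resource-group/providers/Microsoft.Kubernetes/connectedClusters/my-arc-cluster
```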
Azure Monitor provides a managed service for both Prometheus and Grafana, so you can take advantage of these popular Kubernetes monitoring tools without having to manage and update the deployments yourself. To analyze Prometheus metrics, use the Azure Monitor metrics explorer with PromQL.
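For Azure Arc-enabled clusters, managed Prometheus collection is typically enabled through a cluster extension. The following is a minimal sketch that assumes an existing Azure Monitor workspace; the extension type shown is the Azure Monitor metrics extension, and all names and IDs are placeholders.

```bash
# Deploy the Azure Monitor metrics extension so that the cluster sends Prometheus
# metrics to an Azure Monitor workspace. Resource names and IDs are placeholders.
az k8s-extension create \
  --name azuremonitor-metrics \
  --cluster-name my-arc-cluster \
  --resource-group my-resource-group \
  --cluster-type connectedClusters \
  --extension-type Microsoft.AzureMonitor.Containers.Metrics \
  --configuration-settings azure-monitor-workspace-resource-id=<azure-monitor-workspace-resource-id>
```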
Azure Arc agents require the following protocols, ports, and outbound URLs to function.
| Endpoint (DNS) | Description |
|---|---|
| `https://management.azure.com:443` | Required for the agent to connect to Azure and register the cluster. |
| `https://[region].dp.kubernetesconfiguration.azure.com:443` | Data plane endpoint for the agent to push status and fetch configuration information, where `[region]` represents the Azure region that hosts the AKS instance. |
| `https://docker.io:443` | Required to pull container images. |
| `https://github.com:443`, `git://github.com:9418` | Example GitOps repositories are hosted on GitHub. The configuration agent requires connectivity to the Git endpoint that you specify. |
| `https://login.microsoftonline.com:443`, `https://<region>.login.microsoft.com`, `login.windows.net` | Required to fetch and update Azure Resource Manager tokens. |
| `https://mcr.microsoft.com:443`, `https://*.data.mcr.microsoft.com:443` | Required to pull container images for Azure Arc agents. |
For a complete list of URLs across Azure Arc services, see Azure Arc network requirements.
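Before onboarding, you can spot-check outbound HTTPS reachability from a node or jump box. The following is a generic shell check, not part of the Azure Arc tooling, and it covers only a few of the required endpoints.

```bash
# Spot-check outbound HTTPS connectivity to a few required endpoints.
for endpoint in \
  https://management.azure.com \
  https://login.microsoftonline.com \
  https://mcr.microsoft.com; do
  curl --silent --output /dev/null --max-time 10 \
    --write-out "$endpoint -> HTTP %{http_code}\n" "$endpoint" \
    || echo "$endpoint -> unreachable"
done
```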
These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that you can use to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.
Reliability helps ensure that your application can meet the commitments that you make to your customers. For more information, see Design review checklist for Reliability.
In most scenarios, the location that you choose when you create the installation script should be the Azure region that's geographically closest to your on-premises resources. The rest of the data is stored within the Azure geography that contains the region you specify. This detail might affect your choice of region if you have data residency requirements. If an outage affects the Azure region that your machine is connected to, the outage doesn't affect the connected machine, but management operations that use Azure might not complete. If you have multiple locations that provide a geographically redundant service, connect the machines in each location to a different Azure region. This practice improves resiliency if a regional outage occurs. For more information, see Supported regions for Azure Arc-enabled Kubernetes.
You should ensure that the services in your solution are supported in the region where Azure Arc is deployed.
Security provides assurances against deliberate attacks and the misuse of your valuable data and systems. For more information, see Design review checklist for Security.
You can use Azure RBAC to manage access to Azure Arc-enabled Kubernetes across Azure and on-premises environments that use Microsoft Entra identities. For more information, see Use Azure RBAC for Kubernetes Authorization.
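On an existing connected cluster, this feature is typically enabled from the Azure CLI. The following is a minimal sketch with placeholder names.

```bash
# Enable Azure RBAC on an existing Azure Arc-enabled cluster (placeholder names).
az connectedk8s enable-features \
  --name my-arc-cluster \
  --resource-group my-resource-group \
  --features azure-rbac
```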
Microsoft recommends that you use a service principal that has limited privileges to onboard Kubernetes clusters to Azure Arc. This practice is useful in continuous integration and continuous delivery pipelines such as Azure Pipelines and GitHub Actions. For more information, see Create an Azure Arc-enabled onboarding service principal.
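A minimal sketch of creating such a limited-privilege service principal follows; the display name, subscription ID, and resource group are placeholders.

```bash
# Create a service principal that can only onboard clusters to Azure Arc.
# The display name and scope are placeholders.
az ad sp create-for-rbac \
  --name "arc-k8s-onboarding" \
  --role "Kubernetes Cluster - Azure Arc Onboarding" \
  --scopes /subscriptions/<subscription-id>/resourceGroups/<resource-group>
```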
To simplify service principal management, you can use managed identities in AKS. However, clusters must be created with a managed identity; existing clusters, whether in Azure or on-premises, can't be migrated to managed identities. For more information, see Use a managed identity in AKS.
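For example, a new AKS cluster can be created with a system-assigned managed identity from the Azure CLI. The resource group and cluster name below are placeholders.

```bash
# Create an AKS cluster with a system-assigned managed identity (placeholder names).
az aks create \
  --resource-group my-resource-group \
  --name my-aks-cluster \
  --enable-managed-identity \
  --generate-ssh-keys
```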
Cost Optimization focuses on ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.
For general cost considerations, see Cost Optimization design principles.
Operational Excellence covers the operations processes that deploy an application and keep it running in production. For more information, see Design review checklist for Operational Excellence.
Before you configure your Azure Arc-enabled Kubernetes clusters, review the Azure Resource Manager subscription limits and resource group limits to plan for the number of clusters.
Use Helm, an open-source packaging tool, to install and manage Kubernetes application lifecycles. Similar to Linux package managers such as APT and Yum, Helm manages Kubernetes charts, which are packages of preconfigured Kubernetes resources.
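A typical chart installation looks like the following sketch; the repository URL, chart, release, and namespace names are placeholders.

```bash
# Add a chart repository and install a chart into its own namespace (placeholder names).
helm repo add example-charts https://charts.example.com
helm repo update
helm install my-release example-charts/my-app --namespace my-app --create-namespace
```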
Microsoft maintains this article. The following contributors wrote this article.
Principal author:
- Pieter de Bruin | Senior Program Manager
- Azure Arc documentation
- Azure Arc-enabled Kubernetes documentation
- AKS documentation
- Azure Policy documentation
- Azure Monitor documentation
- Connect an existing Kubernetes cluster to Azure Arc
Related hybrid guidance:
Related architectures: