AKS cluster created in a single zone in centralus - Want to add second cluster in another zone or another region

Ima Sysadmin 40 Reputation points
2023-01-13T22:09:08.9566667+00:00

Hello,

I have an AKS cluster in the centralus region which was created in a single zone. I'm investigating what it will take to add a second cluster with HA in the same region or a second cluster in a different region (using Traffic Manager in both cases).

The goal is a setup that improves disaster recovery redundancy and lets me use Traffic Manager to take a cluster offline if, for example, a Kubernetes update on it has an issue or something else happens to it (so all traffic would go to the unaffected cluster).

If anyone has info regarding my question below, I would appreciate the advice:

Would we need to deploy two entirely new AKS clusters to achieve the above, redeploy our applications to them, make them the replacement AKS environment, and then retire the existing AKS Prod cluster?

Or is there a way to add that second cluster and configure Traffic Manager to distribute traffic to the original cluster + the new cluster?

I read this info in the docs: https://learn.microsoft.com/en-us/azure/aks/availability-zones

     You can only define availability zones when the cluster or node pool is created.

     Availability zone settings can't be updated after the cluster is created. 

     You also can't update an existing, non-availability zone cluster to use availability zones.

A Microsoft source provided me this info:

    >You just create a new nodepool in the existing AKS cluster and define the availability zones.

I think this would be for adding the second cluster in a second zone in that centralus region.

Another Microsoft source provided this info on a related question:

   >The key is just to plan the deployment as you can’t make some changes after the initial deployment.

Thank you in advance for information.

-Max

Azure Traffic Manager
An Azure service that is used to route incoming network traffic for high performance and availability.
Azure Kubernetes Service (AKS)
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.

Accepted answer
  1. Andrei Barbu 2,576 Reputation points Microsoft Employee
    2023-01-14T12:06:21.01+00:00

    Hello Ima,

    Before trying to answer your question, I would like to clarify a few aspects about Availability Zones.

    (1) Availability Zones are physically and logically separated datacenters with their own independent power source, network, and cooling, but they are in the same region. So they provide resiliency against a datacenter failure, but not against a regional failure. More details: https://learn.microsoft.com/en-us/azure/reliability/availability-zones-overview

    (2) A single AKS cluster is created in exactly one region. Availability zones can be defined for an AKS cluster. More details: https://learn.microsoft.com/en-us/azure/aks/availability-zones

    However, I understand you would like a second AKS cluster so that Traffic Manager can distribute traffic between the two clusters in case one goes down. If you want protection from a regional outage, (1) and (2) imply that a new AKS cluster in a different region than the current one is required. If you only want availability when one cluster is down (and a regional outage is not a concern), you still need to create a new AKS cluster, in the same region or in another region.

    Now, to answer your question: I would say you need to deploy a new AKS cluster and deploy your application there. To help with this task, you can use Velero (note that this is an open-source tool, not a Microsoft product) and AKS node pool snapshots. Please read the documentation for both before using them, so that you understand the use cases and limitations (for example, a node pool snapshot can only be used in the same region as the source node pool).
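As a rough illustration of the "deploy a new AKS cluster" step, the sketch below only builds the argument list for an `az aks create` call with availability zones set at creation time; it does not run anything. The resource group, cluster name, region, and node count are hypothetical placeholders, not values from this thread, and zone support should be confirmed for whichever region is chosen.

```python
from typing import List


def aks_create_command(resource_group: str, name: str, location: str,
                       zones: List[int], node_count: int = 3) -> List[str]:
    """Build (but do not run) an `az aks create` argument list."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    return [
        "az", "aks", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--location", location,
        "--node-count", str(node_count),
        # Zones can only be chosen at creation time, so they are passed here.
        "--zones", *[str(z) for z in zones],
        "--generate-ssh-keys",
    ]


# Hypothetical names for a second production cluster in another region.
cmd = aks_create_command("rg-prod-eastus2", "aks-prod-eastus2", "eastus2", [1, 2, 3])
print(" ".join(cmd))
```

The printed line can be reviewed and pasted into a shell, or the list can be handed to `subprocess.run` once the names match your environment.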

    For the Azure Traffic Manager scenario, I am sure you will find this documentation helpful: https://learn.microsoft.com/en-us/azure/aks/operator-best-practices-multi-region#use-azure-traffic-manager-to-route-traffic
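To make the failover idea concrete, here is a small simulation of priority-style routing, where traffic goes to the most preferred endpoint that is both enabled and healthy. It is a toy model of the behavior, not the Traffic Manager API; the `Endpoint` class, `pick_endpoint` function, and cluster names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Endpoint:
    name: str
    priority: int          # lower number means more preferred
    enabled: bool = True   # operator switch, e.g. during an upgrade
    healthy: bool = True   # result of the health probe


def pick_endpoint(endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the most preferred endpoint that is enabled and healthy."""
    candidates = [e for e in endpoints if e.enabled and e.healthy]
    return min(candidates, key=lambda e: e.priority, default=None)


clusters = [Endpoint("aks-centralus", 1), Endpoint("aks-eastus2", 2)]
print(pick_endpoint(clusters).name)  # aks-centralus

# Take the primary cluster out of rotation, e.g. before a Kubernetes upgrade.
clusters[0].enabled = False
print(pick_endpoint(clusters).name)  # aks-eastus2
```

Disabling the primary endpoint moves all traffic to the other cluster, which is the maintenance scenario described in the question. A weighted routing method would spread traffic across both clusters instead.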

    I hope this answers your question.

    Please "Accept as Answer" and Upvote if it helped, so that it can help others in the community looking for help on similar topics.

    Thank you!


1 additional answer

  1. Adrian Dobrescu 261 Reputation points Microsoft Employee
    2023-01-14T09:10:03.8333333+00:00

    Hello Ima,
    As you already read in the official doc, you can make use of Availability Zones when you create an AKS cluster. If the nodes are spread across, say, 3 zones, this gives you redundancy in case of a datacenter outage.

    So the information you gathered is correct: you need to plan for this before deploying the cluster.
    You can find all of those under limitations section: https://learn.microsoft.com/en-us/azure/aks/availability-zones#limitations-and-region-availability
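Since zones are fixed when a cluster or node pool is created, one option mentioned in the question is adding a new zone-spanning node pool. The sketch below builds (without running) an `az aks nodepool add` argument list; the resource group, cluster, and pool names are hypothetical, and whether this suits a cluster originally created without zones should be checked against the limitations section linked above.

```python
from typing import List


def nodepool_add_command(resource_group: str, cluster_name: str, pool_name: str,
                         zones: List[int], node_count: int = 3) -> List[str]:
    """Build (but do not run) an `az aks nodepool add` argument list."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    return [
        "az", "aks", "nodepool", "add",
        "--resource-group", resource_group,
        "--cluster-name", cluster_name,
        "--name", pool_name,
        "--node-count", str(node_count),
        # Zones are set once, when the node pool is created.
        "--zones", *[str(z) for z in zones],
    ]


# Hypothetical names for an existing cluster and a new zonal pool.
pool_cmd = nodepool_add_command("rg-prod-centralus", "aks-prod", "zonalpool", [1, 2, 3])
print(" ".join(pool_cmd))
```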

    Please "Accept as Answer" and Upvote if it helped, so that it can help others in the community looking for help on similar topics.

    Thank you!
