Stop and Start an Azure Kubernetes Service (AKS) cluster

Your AKS workloads may not need to run continuously. A development cluster, for example, might be used only during business hours. At other times your Azure Kubernetes Service (AKS) cluster might sit idle, running nothing more than the system components. You can reduce the cluster footprint by scaling all the User node pools to 0, but your System pool is still required to run the system components while the cluster is running. To reduce your costs further during these periods, you can completely turn off (stop) your cluster. Stopping the cluster stops the control plane and the agent nodes altogether, so you save on all the compute costs, while all your objects and the cluster state are preserved for when you start it again. You can then pick up right where you left off after a weekend, or keep your cluster running only while you run your batch jobs.
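As a rough illustration of the partial option mentioned above (scaling User node pools to 0 while the cluster keeps running), the following sketch lists the User-mode node pools and scales each one down. The function name scale_user_pools_to_zero is hypothetical, and the sketch assumes the pools are scaled manually, that is, the cluster autoscaler is not enabled on them.

```shell
# Scale every User-mode node pool of a cluster to zero nodes.
# Arguments: resource group, cluster name.
# Assumption: the pools are scaled manually (cluster autoscaler not enabled on them).
scale_user_pools_to_zero() {
  local rg="$1" cluster="$2" pool
  # List only the names of node pools whose mode is "User".
  for pool in $(az aks nodepool list \
      --resource-group "$rg" --cluster-name "$cluster" \
      --query "[?mode=='User'].name" --output tsv); do
    echo "Scaling node pool $pool to 0 nodes"
    az aks nodepool scale \
      --resource-group "$rg" --cluster-name "$cluster" \
      --name "$pool" --node-count 0
  done
}

# Example call (commented out so that sourcing this file changes nothing):
# scale_user_pools_to_zero myResourceGroup myAKSCluster
```

The System pool is left untouched, so the control plane and system components keep running and keep incurring cost, which is the gap that stopping the cluster closes.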

Before you begin

This article assumes that you have an existing AKS cluster. If you need an AKS cluster, see the AKS quickstart using the Azure CLI or using the Azure portal.

Limitations

When using the cluster start/stop feature, the following restrictions apply:

  • This feature is only supported for clusters backed by Virtual Machine Scale Sets.
  • The cluster state of a stopped AKS cluster is preserved for up to 12 months. If your cluster is stopped for more than 12 months, the cluster state cannot be recovered. For more information, see the AKS Support Policies.
  • You can only start or delete a stopped AKS cluster. To perform any other operation, such as scale or upgrade, start your cluster first.
  • Customer-provisioned PrivateEndpoints linked to a private cluster must be deleted and recreated when you start a stopped AKS cluster.

Stop an AKS Cluster

You can use the az aks stop command to stop a running AKS cluster's nodes and control plane. The following example stops a cluster named myAKSCluster:

az aks stop --name myAKSCluster --resource-group myResourceGroup

You can verify that your cluster is stopped by using the az aks show command and confirming that the powerState shows Stopped, as in the following output:

{
[...]
  "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_westus2",
  "powerState":{
    "code":"Stopped"
  },
  "privateFqdn": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "myResourceGroup",
[...]
}

If the provisioningState shows Stopping, your cluster hasn't fully stopped yet.
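Instead of reading the full JSON output by hand, a script can ask az aks show for the powerState code alone and poll until it reaches the expected value. The sketch below is one possible way to do that. The function names get_power_state and wait_for_power_state, the 30-second interval, and the 40-attempt limit are all illustrative assumptions, not part of the Azure CLI.

```shell
# Print only the power state code of a cluster ("Running" or "Stopped").
# Arguments: resource group, cluster name.
get_power_state() {
  az aks show --resource-group "$1" --name "$2" \
    --query "powerState.code" --output tsv
}

# Poll until the cluster reports the wanted power state, or give up.
# Arguments: resource group, cluster name, wanted state,
#            seconds between polls (default 30), max attempts (default 40).
wait_for_power_state() {
  local rg="$1" name="$2" wanted="$3" interval="${4:-30}" attempts="${5:-40}"
  local i=1 state=""
  while [ "$i" -le "$attempts" ]; do
    state="$(get_power_state "$rg" "$name")"
    if [ "$state" = "$wanted" ]; then
      echo "Cluster $name is $state"
      return 0
    fi
    sleep "$interval"
    i=$((i + 1))
  done
  echo "Timed out waiting for $name to reach $wanted (last state: $state)" >&2
  return 1
}

# Example call (commented out so that sourcing this file changes nothing):
# wait_for_power_state myResourceGroup myAKSCluster Stopped
```

The same function can wait for Running after a start. A stricter check could additionally query provisioningState in the same way and require Succeeded.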

Important

If you are using Pod Disruption Budgets, the stop operation can take longer because the drain process takes more time to complete.

Start an AKS Cluster

You can use the az aks start command to start a stopped AKS cluster's nodes and control plane. The cluster is restarted with the previous control plane state and number of agent nodes.
The following example starts a cluster named myAKSCluster:

az aks start --name myAKSCluster --resource-group myResourceGroup

You can verify that your cluster has started by using the az aks show command and confirming that the powerState shows Running, as in the following output:

{
[...]
  "nodeResourceGroup": "MC_myResourceGroup_myAKSCluster_westus2",
  "powerState":{
    "code":"Running"
  },
  "privateFqdn": null,
  "provisioningState": "Succeeded",
  "resourceGroup": "myResourceGroup",
[...]
}

If the provisioningState shows Starting, your cluster hasn't fully started yet.
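For scheduled use, such as starting a cluster just before a batch job, it can help to start the cluster only when it is actually stopped. The following is a minimal sketch of such a guard. The function name start_if_stopped is hypothetical; the only Azure CLI commands it relies on are az aks show and az aks start.

```shell
# Start the cluster only when it currently reports powerState "Stopped".
# Arguments: resource group, cluster name.
start_if_stopped() {
  local rg="$1" name="$2" state
  # Ask for the power state code alone, as plain text.
  state="$(az aks show --resource-group "$rg" --name "$name" \
    --query "powerState.code" --output tsv)"
  if [ "$state" = "Stopped" ]; then
    echo "Starting $name"
    az aks start --resource-group "$rg" --name "$name"
  else
    echo "Not starting $name: power state is ${state:-unknown}"
  fi
}

# Example call (commented out so that sourcing this file changes nothing):
# start_if_stopped myResourceGroup myAKSCluster
```

Checking the state first keeps a scheduled script from issuing a start request against a cluster that is already running.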

Note

If you are using the cluster autoscaler, your current node count may not be between the min and max range values you set when you start your cluster back up. This behavior is expected. The cluster starts with the number of nodes it needs to run its workloads, which isn't impacted by your autoscaler settings. When your cluster performs scaling operations, the min and max values will impact your current node count, and your cluster will eventually enter and remain in that desired range until you stop your cluster.
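To see whether a restarted cluster is currently inside its autoscaler range, you can compare each node pool's count with its minCount and maxCount. The sketch below assumes those field names, along with enableAutoScaling, appear in the az aks nodepool list output. The function names count_in_range and report_autoscaler_ranges are hypothetical.

```shell
# Succeed (exit 0) when count lies within [min, max], fail otherwise.
# Arguments: count, min, max.
count_in_range() {
  [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]
}

# Print each autoscaled node pool with its current count and min/max bounds.
# Arguments: resource group, cluster name.
report_autoscaler_ranges() {
  local rg="$1" cluster="$2" name count min max
  # One tab-separated row per pool that has the autoscaler enabled.
  az aks nodepool list --resource-group "$rg" --cluster-name "$cluster" \
    --query "[?enableAutoScaling].[name, count, minCount, maxCount]" \
    --output tsv |
  while read -r name count min max; do
    if count_in_range "$count" "$min" "$max"; then
      echo "$name: $count nodes, within $min-$max"
    else
      echo "$name: $count nodes, outside $min-$max"
    fi
  done
}

# Example call (commented out so that sourcing this file changes nothing):
# report_autoscaler_ranges myResourceGroup myAKSCluster
```

A pool reported as outside its range right after a start is consistent with the behavior described in the note and needs no action.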

Next steps