Tutorial: Scale applications in Azure Kubernetes Service (AKS)

If you've followed the tutorials, you have a working Kubernetes cluster in AKS and you deployed the sample Azure Voting app. In this tutorial, part five of seven, you scale out the pods in the app and try pod autoscaling. You also learn how to scale the number of Azure VM nodes to change the cluster's capacity for hosting workloads. You learn how to:

  • Scale the Kubernetes nodes
  • Manually scale Kubernetes pods that run your application
  • Configure autoscaling pods that run the app front-end

In later tutorials, the Azure Vote application is updated to a new version.

Before you begin

In previous tutorials, an application was packaged into a container image. This image was uploaded to Azure Container Registry, and you created an AKS cluster. The application was then deployed to the AKS cluster. If you haven't done these steps, and would like to follow along, start with Tutorial 1 – Create container images.

This tutorial requires that you're running the Azure CLI version 2.0.53 or later. Run az --version to find the version. If you need to install or upgrade, see Install Azure CLI.

Manually scale pods

When the Azure Vote front-end and Redis instance were deployed in previous tutorials, a single replica was created. To see the number and state of pods in your cluster, use the kubectl get command as follows:

kubectl get pods

The following example output shows one front-end pod and one back-end pod:

NAME                               READY     STATUS    RESTARTS   AGE
azure-vote-back-2549686872-4d2r5   1/1       Running   0          31m
azure-vote-front-848767080-tf34m   1/1       Running   0          31m

To manually change the number of pods in the azure-vote-front deployment, use the kubectl scale command. The following example increases the number of front-end pods to 5:

kubectl scale --replicas=5 deployment/azure-vote-front

Run kubectl get pods again to verify that AKS creates the additional pods. After a minute or so, the additional pods are available in your cluster:

$ kubectl get pods

NAME                                READY     STATUS    RESTARTS   AGE
azure-vote-back-2606967446-nmpcf    1/1       Running   0          15m
azure-vote-front-3309479140-2hfh0   1/1       Running   0          3m
azure-vote-front-3309479140-bzt05   1/1       Running   0          3m
azure-vote-front-3309479140-fvcvm   1/1       Running   0          3m
azure-vote-front-3309479140-hrbf2   1/1       Running   0          15m
azure-vote-front-3309479140-qphz8   1/1       Running   0          3m
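
Instead of rerunning kubectl get pods by hand, you could poll the deployment until the expected number of pods is ready. The following is a minimal sketch, not part of the original tutorial: the helper names ready_replicas and scaled_to are hypothetical, and it assumes kubectl is already configured for your cluster.

```shell
# ready_replicas prints how many pods of a deployment are currently ready.
# Hypothetical helper; it reads the deployment's status.readyReplicas field.
ready_replicas() {
    count=$(kubectl get deployment "$1" -o jsonpath='{.status.readyReplicas}')
    # The field is absent (empty output) while no pod is ready yet.
    echo "${count:-0}"
}

# scaled_to succeeds once the deployment reports the expected number of ready pods.
scaled_to() {
    [ "$(ready_replicas "$1")" -eq "$2" ]
}

# Usage (requires access to the cluster):
#   kubectl scale --replicas=5 deployment/azure-vote-front
#   until scaled_to azure-vote-front 5; do sleep 5; done
```

For a one-off wait, kubectl rollout status deployment/azure-vote-front blocks until the rollout completes and may be all you need.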

Autoscale pods

Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on CPU utilization or other select metrics. The Metrics Server provides resource utilization data to Kubernetes and is automatically deployed in AKS clusters running version 1.10 and higher. To see the version of your AKS cluster, use the az aks show command, as shown in the following example:

az aks show --resource-group myResourceGroup --name myAKSCluster --query kubernetesVersion --output table
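
If you want to script the version check, the comparison against 1.10 can be done with plain shell string handling. This is a sketch under assumptions: needs_metrics_server is a hypothetical helper name, and it expects a version string in the major.minor.patch form that az aks show returns (for example 1.9.6).

```shell
# Succeeds (exit 0) when the given Kubernetes version is older than 1.10,
# that is, when the Metrics Server is not deployed automatically.
# needs_metrics_server is a hypothetical helper, not part of az or kubectl.
needs_metrics_server() {
    version="$1"
    major="${version%%.*}"    # text before the first dot
    rest="${version#*.}"      # text after the first dot
    minor="${rest%%.*}"       # text before the next dot
    if [ "$major" -lt 1 ]; then
        return 0
    fi
    if [ "$major" -eq 1 ] && [ "$minor" -lt 10 ]; then
        return 0
    fi
    return 1
}

# Usage (requires a logged-in Azure CLI):
#   version=$(az aks show --resource-group myResourceGroup --name myAKSCluster \
#       --query kubernetesVersion --output tsv)
#   if needs_metrics_server "$version"; then echo "install metrics-server"; fi
```

Comparing the numeric parts separately avoids the trap of a plain string comparison, which would sort 1.9 after 1.10.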


If your AKS cluster runs a version earlier than 1.10, the Metrics Server is not installed automatically. To install it, clone the metrics-server GitHub repo and install the example resource definitions. To view the contents of these YAML definitions, see Metrics Server for Kubernetes 1.8+.

git clone https://github.com/kubernetes-incubator/metrics-server.git
kubectl create -f metrics-server/deploy/1.8+/

To use the autoscaler, all containers in your pods must have CPU requests and limits defined. In the azure-vote-front deployment, the front-end container already requests 0.25 CPU, with a limit of 0.5 CPU. These resource requests and limits are defined as shown in the following example snippet:

resources:
  requests:
     cpu: 250m
  limits:
     cpu: 500m

The following example uses the kubectl autoscale command to autoscale the number of pods in the azure-vote-front deployment. If average CPU utilization across all pods exceeds 50% of their requested CPU, the autoscaler increases the number of pods, up to a maximum of 10 instances. The deployment never runs fewer than the defined minimum of 3 instances:

kubectl autoscale deployment azure-vote-front --cpu-percent=50 --min=3 --max=10
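
The same autoscaler can also be described declaratively, which is convenient if you keep cluster configuration in source control. The sketch below is an assumption-laden equivalent of the command above, not something the original tutorial provides: hpa_manifest is a hypothetical helper, and the manifest assumes the autoscaling/v1 API, an apps/v1 Deployment, and the object name that kubectl autoscale would choose (the deployment name).

```shell
# hpa_manifest prints a HorizontalPodAutoscaler definition to standard output.
# Assumptions: autoscaling/v1 API, apps/v1 Deployment, autoscaler named after
# the deployment. Adjust apiVersion values to match your cluster.
hpa_manifest() {
    cat <<'EOF'
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: azure-vote-front
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: azure-vote-front
  minReplicas: 3
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
EOF
}

# Usage (requires access to the cluster):
#   hpa_manifest | kubectl apply -f -
```

Use either the command or the manifest, not both, because they would define the same autoscaler object.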

To see the status of the autoscaler, use the kubectl get hpa command as follows:

$ kubectl get hpa

NAME               REFERENCE                     TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
azure-vote-front   Deployment/azure-vote-front   0% / 50%   3         10        3          2m

After a few minutes, with minimal load on the Azure Vote app, the number of pod replicas decreases automatically to three. You can use kubectl get pods again to see the unneeded pods being removed.
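
The scale-down to three follows from the replica calculation described in the Kubernetes documentation: the desired count is the current count multiplied by the ratio of observed to target utilization, rounded up, then clamped to the configured minimum and maximum. The following sketch reproduces only that arithmetic. The helper name desired_replicas is hypothetical, and the sketch ignores the tolerance band and stabilization delays that the real autoscaler applies.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_CPU_PERCENT TARGET_CPU_PERCENT MIN MAX
# Simplified model of the autoscaler's replica calculation (hypothetical helper).
desired_replicas() {
    current="$1"; usage="$2"; target="$3"; min="$4"; max="$5"
    # Integer ceiling of current * usage / target.
    desired=$(( (current * usage + target - 1) / target ))
    if [ "$desired" -lt "$min" ]; then desired="$min"; fi
    if [ "$desired" -gt "$max" ]; then desired="$max"; fi
    echo "$desired"
}

# Five idle pods, 50% target, limits 3..10: clamped up to the minimum.
desired_replicas 5 0 50 3 10     # prints 3
# Three pods at 100% of their requested CPU: doubled.
desired_replicas 3 100 50 3 10   # prints 6
# Five pods at 200%: the raw result of 20 is clamped to the maximum.
desired_replicas 5 200 50 3 10   # prints 10
```

With the 250m CPU request shown earlier, a 50% target corresponds to an average of 125m CPU per pod.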

Manually scale AKS nodes

If you created your Kubernetes cluster using the commands in the previous tutorial, it has two nodes. You can adjust the number of nodes manually if you plan more or fewer container workloads on your cluster.
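
To check how many nodes the cluster currently has before you change the count, you could query the agentPoolProfiles section of the az aks show output. This is a small sketch, assuming a cluster with a single node pool: node_count is a hypothetical helper name, and the query reads the count of the first pool only.

```shell
# node_count RESOURCE_GROUP CLUSTER_NAME
# Prints the node count of the first agent pool (hypothetical helper).
node_count() {
    az aks show --resource-group "$1" --name "$2" \
        --query 'agentPoolProfiles[0].count' --output tsv
}

# Usage (requires a logged-in Azure CLI):
#   node_count myResourceGroup myAKSCluster
```

Running kubectl get nodes gives the same information from the Kubernetes side, along with the status of each node.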

The following example increases the number of nodes to three in the Kubernetes cluster named myAKSCluster. The command takes a couple of minutes to complete.

az aks scale --resource-group myResourceGroup --name myAKSCluster --node-count 3

When the cluster has successfully scaled, the output is similar to the following example:

"agentPoolProfiles": [
  {
    "count": 3,
    "dnsPrefix": null,
    "fqdn": null,
    "name": "myAKSCluster",
    "osDiskSizeGb": null,
    "osType": "Linux",
    "ports": null,
    "storageProfile": "ManagedDisks",
    "vmSize": "Standard_D2_v2",
    "vnetSubnetId": null
  }
]

Next steps

In this tutorial, you used different scaling features in your Kubernetes cluster. You learned how to:

  • Manually scale Kubernetes pods that run your application
  • Configure autoscaling pods that run the app front-end
  • Manually scale the Kubernetes nodes

Advance to the next tutorial to learn how to update an application in Kubernetes.