Best practices for basic scheduler features in Azure Kubernetes Service (AKS)
As you manage clusters in Azure Kubernetes Service (AKS), you often need to isolate teams and workloads. The Kubernetes scheduler provides features that let you control the distribution of compute resources, or limit the impact of maintenance events.
This best practices article focuses on basic Kubernetes scheduling features for cluster operators. In this article, you learn how to:
- Use resource quotas to provide a fixed amount of resources to teams or workloads
- Limit the impact of scheduled maintenance using pod disruption budgets
- Check for missing pod resource requests and limits using the
Enforce resource quotas
Best practice guidance - Plan and apply resource quotas at the namespace level. If pods don't define resource requests and limits, reject the deployment. Monitor resource usage and adjust quotas as needed.
Resource requests and limits are placed in the pod specification. These limits are used by the Kubernetes scheduler at deployment time to find an available node in the cluster. These limits and requests work at the individual pod level. For more information about how to define these values, see Define pod resource requests and limits
To provide a way to reserve and limit resources across a development team or project, you should use resource quotas. These quotas are defined on a namespace, and can be used to set quotas on the following basis:
- Compute resources, such as CPU and memory, or GPUs.
- Storage resources, includes the total number of volumes or amount of disk space for a given storage class.
- Object count, such as maximum number of secrets, services, or jobs can be created.
Kubernetes doesn't overcommit resources. Once the cumulative total of resource requests or limits passes the assigned quota, no further deployments are successful.
When you define resource quotas, all pods created in the namespace must provide limits or requests in their pod specifications. If they don't provide these values, you can reject the deployment. Instead, you can configure default requests and limits for a namespace.
The following example YAML manifest named dev-app-team-quotas.yaml sets a hard limit of a total of 10 CPUs, 20Gi of memory, and 10 pods:
apiVersion: v1 kind: ResourceQuota metadata: name: dev-app-team spec: hard: cpu: "10" memory: 20Gi pods: "10"
This resource quota can be applied by specifying the namespace, such as dev-apps:
kubectl apply -f dev-app-team-quotas.yaml --namespace dev-apps
Work with your application developers and owners to understand their needs and apply the appropriate resource quotas.
For more information about available resource objects, scopes, and priorities, see Resource quotas in Kubernetes.
Plan for availability using pod disruption budgets
Best practice guidance - To maintain the availability of applications, define Pod Disruption Budgets (PDBs) to make sure that a minimum number of pods are available in the cluster.
There are two disruptive events that cause pods to be removed:
- Involuntary disruptions are events beyond the typical control of the cluster operator or application owner.
- These involuntary disruptions include a hardware failure on the physical machine, a kernel panic, or the deletion of a node VM
- Voluntary disruptions are events requested by the cluster operator or application owner.
- These voluntary disruptions include cluster upgrades, an updated deployment template, or accidentally deleting a pod.
The involuntary disruptions can be mitigated by using multiple replicas of your pods in a deployment. Running multiple nodes in the AKS cluster also helps with these involuntary disruptions. For voluntary disruptions, Kubernetes provides pod disruption budgets that let the cluster operator define a minimum available or maximum unavailable resource count. These pod disruption budgets let you plan for how deployments or replica sets respond when a voluntary disruption event occurs.
If a cluster is to be upgraded or a deployment template updated, the Kubernetes scheduler makes sure additional pods are scheduled on other nodes before the voluntary disruption events can continue. The scheduler waits before a node is rebooted until the defined number of pods are successfully scheduled on other nodes in the cluster.
Let's look at an example of a replica set with five pods that run NGINX. The pods in the replica set are assigned the label
app: nginx-frontend. During a voluntary disruption event, such as a cluster upgrade, you want to make sure at least three pods continue to run. The following YAML manifest for a PodDisruptionBudget object defines these requirements:
apiVersion: policy/v1beta1 kind: PodDisruptionBudget metadata: name: nginx-pdb spec: minAvailable: 3 selector: matchLabels: app: nginx-frontend
You can also define a percentage, such as 60%, which allows you to automatically compensate for the replica set scaling up the number of pods.
You can define a maximum number of unavailable instances in a replica set. Again, a percentage for the maximum unavailable pods can also be defined. The following pod disruption budget YAML manifest defines that no more than two pods in the replica set be unavailable:
apiVersion: policy/v1beta1 kind: PodDisruptionBudget metadata: name: nginx-pdb spec: maxUnavailable: 2 selector: matchLabels: app: nginx-frontend
Once your pod disruption budget is defined, you create it in your AKS cluster as with any other Kubernetes object:
kubectl apply -f nginx-pdb.yaml
Work with your application developers and owners to understand their needs and apply the appropriate pod disruption budgets.
For more information about using pod disruption budgets, see Specify a disruption budget for your application.
Regularly check for cluster issues with kube-advisor
Best practice guidance - Regularly run the latest version of
kube-advisor open source tool to detect issues in your cluster. If you apply resource quotas on an existing AKS cluster, run
kube-advisor first to find pods that don't have resource requests and limits defined.
The kube-advisor tool is an associated AKS open source project that scans a Kubernetes cluster and reports on issues that it finds. One useful check is to identify pods that don't have resource requests and limits in place.
The kube-advisor tool can report on resource request and limits missing in PodSpecs for Windows applications as well as Linux applications, but the kube-advisor tool itself must be scheduled on a Linux pod. You can schedule a pod to run on a node pool with a specific OS using a node selector in the pod's configuration.
In an AKS cluster that hosts multiple development teams and applications, it can be hard to track pods without these resource requests and limits set. As a best practice, regularly run
kube-advisor on your AKS clusters, especially if you don't assign resource quotas to namespaces.
This article focused on basic Kubernetes scheduler features. For more information about cluster operations in AKS, see the following best practices: