CraigJones-2944 asked srbose-msft edited

CoreDNS high CPU limits set

We are trying to run a small-scale AKS cluster with minimal resources. We're running into an issue where the total CPU limits far exceed the CPU allocated to the cluster. After reviewing our deployments, we noticed that the CoreDNS deployment has a CPU limit of 3 and runs two replicas (6 in total). Editing the YAML to set a different limit works temporarily, but the value reverts after a few minutes.

If anyone could point us to some documentation or help us it would be greatly appreciated.

azure-kubernetes-service

srbose-msft answered srbose-msft edited

@CraigJones-2944 , Thank you for your question.

To address the first concern, that spec.template.spec.containers[0].resources.limits.cpu is set to 3 in the coredns deployment manifest: if the node where a Pod is running has enough of a resource available, a container is allowed to use more of that resource than its request specifies. However, a container is not allowed to use more than its limit. A limit is therefore a ceiling, not a reservation, so each coredns container is not necessarily allocated 3 vCPU at any given time. To see how much CPU the coredns container is actually using at a given moment, run kubectl top pod -n kube-system <coredns_pod_name> --containers. For CPU, the limit is enforced by throttling: a container that tries to exceed it is slowed down rather than evicted or killed.
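To make the difference between requests and limits concrete, here is a rough arithmetic sketch. The replica count and the limit of 3 come from the question; the request value and the node size are assumptions chosen only for illustration, so substitute the numbers from your own cluster.

```python
# Toy arithmetic: summed CPU limits can exceed what a node offers (overcommit),
# while the scheduler only counts requests when placing pods.

def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity ("3", "100m", "0.5") to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return round(float(quantity) * 1000)

replicas = 2                 # from the question: two coredns replicas
cpu_limit = "3"              # from the question: limit of 3 CPU per container
cpu_request = "100m"         # ASSUMPTION: illustrative request, check your manifest
node_allocatable = "1900m"   # ASSUMPTION: hypothetical small node

total_limit = replicas * parse_cpu(cpu_limit)
total_request = replicas * parse_cpu(cpu_request)
allocatable = parse_cpu(node_allocatable)

# Limits are an upper bound on usage, so their sum may be far above 100%.
print(f"summed limits:   {total_limit}m "
      f"({100 * total_limit / allocatable:.0f}% of allocatable)")
# Requests are what scheduling is based on.
print(f"summed requests: {total_request}m "
      f"({100 * total_request / allocatable:.0f}% of allocatable)")
```

A dashboard that adds up limits will report a large overcommit here, even though the amount counted for scheduling is only the summed requests.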


The coredns deployment runs a system-critical workload: AKS uses the CoreDNS project for cluster DNS management and resolution on all clusters running 1.12.x and higher. [Reference].

If you run kubectl describe deployment -n kube-system coredns, you will find an interesting label: addonmanager.kubernetes.io/mode=Reconcile

Now, addons with label addonmanager.kubernetes.io/mode=Reconcile will be periodically reconciled. Direct manipulation to these addons through apiserver is discouraged because addon-manager will bring them back to the original state. In particular:

  • Addon will be re-created if it is deleted.

  • Addon will be reconfigured to the state given by the supplied fields in the template file periodically.

  • Addon will be deleted when its manifest file is deleted from the $ADDON_PATH.
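The effect of reconciliation on a manual edit can be illustrated with a toy model. This is not the addon-manager's actual code; the dictionary layout and the reconcile function are hypothetical and only mimic the behaviour described in the list above.

```python
import copy
from typing import Optional

# Desired state as read from the manifest (simplified, hypothetical layout;
# the values mirror the question).
MANIFEST = {"name": "coredns", "replicas": 2, "cpu_limit": "3"}

def reconcile(live: Optional[dict], manifest: dict) -> dict:
    """Return the live object after one reconcile pass (toy model)."""
    if live is None:
        # A deleted addon is re-created from the manifest.
        return copy.deepcopy(manifest)
    reconciled = dict(live)
    # Fields supplied in the manifest overwrite whatever is live;
    # fields the manifest does not mention are left alone.
    reconciled.update(manifest)
    return reconciled

live = copy.deepcopy(MANIFEST)
live["cpu_limit"] = "500m"         # manual edit, e.g. through the API server
live = reconcile(live, MANIFEST)   # next periodic pass
print(live["cpu_limit"])           # 3
```

This is why the edited limit holds for a short while and then returns to 3: the change lives only in the running object, not in the manifest that is reapplied.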

The $ADDON_PATH by default is set to /etc/kubernetes/addons/ on the control plane node(s).

For more information please check this document.

Since AKS is a managed Kubernetes Service you will not be able to access $ADDON_PATH. We strongly recommend against forcing changes to kube-system resources as these are critical for the proper functioning of the cluster.


Hope this helps.

Please "Accept as Answer" if it helped, so that it can help others in the community looking for help on similar topics.

RalfWenzel-7149 answered srbose-msft edited

@srbose-msft, your answer is certainly technically correct, but the problem remains that most monitoring setups (e.g. Prometheus with Alertmanager) will produce warnings or errors for this. So there should be a way to change the limit.


@RalfWenzel-7149, this is valuable feedback and we have shared it internally. We hope to see this as an option if such a feature is added to AKS in the future.

Thanks for bringing that up; this is exactly why I posed the question.

If Microsoft were to change this, I think it would be nice to set the limits for CoreDNS through some kind of algorithm that takes the number of nodes in the cluster into account, since the node count affects how much load the CoreDNS pods will probably need to handle. A single-node cluster will typically have less DNS resolving to do than a cluster with, for example, 50 nodes.

@RoderickBant74, thank you for sharing that insight. Yes, the node count does have an impact. However, if you run kubectl get deploy -n kube-system, you will find a Deployment named coredns-autoscaler. Running kubectl describe deploy coredns-autoscaler -n kube-system shows that the default parameters of the cluster-proportional-autoscaler application are set to {"ladder":{"coresToReplicas":[[1,2],[512,3],[1024,4],[2048,5]],"nodesToReplicas":[[1,2],[8,3],[16,4],[32,5]]}}

These define the scaling rules (that is, the number of replicas) for the coredns Deployment in the kube-system namespace, based on the coresToReplicas and nodesToReplicas tuples.
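As a rough illustration of how such ladder tuples translate into a replica count, here is a small sketch. It assumes the lookup documented for the cluster-proportional-autoscaler's ladder mode (take the highest step whose threshold is not above the current value, then use the larger of the two results); the function names are hypothetical and this is not the autoscaler's own code.

```python
import json

# Default parameters as quoted in this thread.
LADDER_JSON = (
    '{"ladder":{"coresToReplicas":[[1,2],[512,3],[1024,4],[2048,5]],'
    '"nodesToReplicas":[[1,2],[8,3],[16,4],[32,5]]}}'
)

def lookup(steps, value):
    """Replica count of the highest step whose threshold is <= value.

    ASSUMPTION: a value below the first threshold yields 0 in this sketch.
    """
    replicas = 0
    for threshold, count in sorted(steps):
        if value >= threshold:
            replicas = count
    return replicas

def desired_replicas(ladder, cores, nodes):
    """Larger of the core-based and node-based lookups."""
    return max(
        lookup(ladder["coresToReplicas"], cores),
        lookup(ladder["nodesToReplicas"], nodes),
    )

ladder = json.loads(LADDER_JSON)["ladder"]
print(desired_replicas(ladder, cores=2, nodes=1))     # 2
print(desired_replicas(ladder, cores=40, nodes=10))   # 3
print(desired_replicas(ladder, cores=200, nodes=50))  # 5
```

Under these assumptions, a single-node cluster gets two replicas and a 50-node cluster gets five, so the node count already changes how many coredns pods share the DNS load, although the per-container limit stays the same.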

These parameters can also be modified in the ConfigMap coredns-autoscaler in the kube-system namespace. You then only have to delete the pods controlled by the coredns-autoscaler Deployment in the kube-system namespace; the new pods that are created will mount the updated ConfigMap.
