CoreDNS high CPU limits set

Craig Jones 26 Reputation points
2021-08-27T14:31:25.453+00:00

We are trying to run a small-scale AKS cluster with minimal resources. We're running into an issue where our CPU limits far exceed the CPU we have allocated. After reviewing our deployments, we noticed that the CoreDNS deployment has a CPU limit of 3 and runs two replicas (6 in total). Editing the YAML to set a different limit works temporarily, but the value reverts after a few minutes.


If anyone could point us to some documentation or help us it would be greatly appreciated.

Azure Kubernetes Service (AKS)

Accepted answer
  1. SRIJIT-BOSE-MSFT 4,326 Reputation points Microsoft Employee
    2021-08-27T15:34:33.207+00:00

    @Craig Jones, thank you for your question.

    To address the first concern, spec.template.spec.containers[0].resources.limits.cpu being set to 3 in the coredns deployment manifest: a limit is a ceiling, not a reservation. If the node where a Pod is running has enough of a resource available, a container may use more of that resource than its request specifies, but it is never allowed to use more than its limit. So each coredns container is not necessarily allocated 3 vCPU at any given time. To see the CPU actually being used by the coredns container at any instant, run kubectl top pod -n kube-system <coredns_pod_name> --containers. If a container tries to exceed its CPU limit, it is throttled; the pod is not evicted for that.
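
The difference between requests and limits can be made concrete with a little arithmetic. The sketch below is illustrative only: the 100m request and the 2-core allocatable capacity are assumed values, not taken from the cluster in the question; only the limit of 3 and the two replicas come from the post. It shows how limits can add up to far more than a node offers while the requests, which are what the scheduler reserves, stay small.

```python
def parse_cpu(quantity: str) -> int:
    """Convert a Kubernetes CPU quantity such as "100m" or "3" to millicores."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(round(float(quantity) * 1000))


def cpu_summary(containers, allocatable):
    """Sum CPU requests and limits and express each as a percentage of allocatable."""
    alloc = parse_cpu(allocatable)
    requests = sum(parse_cpu(c["requests"]) for c in containers)
    limits = sum(parse_cpu(c["limits"]) for c in containers)
    return {
        "requests_millicores": requests,
        "limits_millicores": limits,
        "requests_pct": round(100 * requests / alloc),
        "limits_pct": round(100 * limits / alloc),
    }


# Two coredns replicas with the limit of 3 from the question.
# ASSUMPTIONS: the 100m request and the 2-core allocatable capacity are
# hypothetical values chosen only to make the arithmetic visible.
coredns = [{"requests": "100m", "limits": "3"}] * 2
summary = cpu_summary(coredns, allocatable="2")
print(summary)
```

Under these assumed numbers the limits come to 300% of the node's capacity while the requests reserve only 10%. A limits total above 100% means the node is overcommitted on paper, not that the CPU is actually reserved or in use.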

    ----
    The coredns deployment runs a system-critical workload: AKS uses the CoreDNS project for cluster DNS management and resolution on all clusters running 1.12.x and higher. [Reference].

    If you run kubectl describe deployment -n kube-system coredns, you will see the label addonmanager.kubernetes.io/mode=Reconcile, which explains why your change reverts.

    Addons with the label addonmanager.kubernetes.io/mode=Reconcile are periodically reconciled. Direct manipulation of these addons through the API server is discouraged because addon-manager will bring them back to their original state. In particular:

    • Addon will be re-created if it is deleted.
    • Addon will be reconfigured to the state given by the supplied fields in the template file periodically.
    • Addon will be deleted when its manifest file is deleted from the $ADDON_PATH.

    The $ADDON_PATH by default is set to /etc/kubernetes/addons/ on the control plane node(s).

    For more information please check this document.
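
Before editing an object in kube-system, it can help to check whether it carries this label, since that predicts whether the edit will be reverted. The following is a minimal sketch, not an official tool: the helper names are made up for illustration, and the sample manifest is a hypothetical, trimmed-down stand-in for what kubectl get deployment coredns -n kube-system -o json would print.

```python
import json
from typing import Optional

MODE_LABEL = "addonmanager.kubernetes.io/mode"


def addon_mode(manifest: dict) -> Optional[str]:
    """Return the addon-manager mode label of a manifest, or None if it is unset."""
    labels = manifest.get("metadata", {}).get("labels") or {}
    return labels.get(MODE_LABEL)


def is_reconciled(manifest: dict) -> bool:
    """True if direct edits to this object are expected to be reverted."""
    return addon_mode(manifest) == "Reconcile"


# ASSUMPTION: hypothetical sample input; a real manifest has many more fields.
sample = json.loads("""
{
  "kind": "Deployment",
  "metadata": {
    "name": "coredns",
    "namespace": "kube-system",
    "labels": {"addonmanager.kubernetes.io/mode": "Reconcile"}
  }
}
""")

print(is_reconciled(sample))  # prints True
```

In practice the JSON would be read from the kubectl output rather than embedded in the script.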

    Since AKS is a managed Kubernetes Service you will not be able to access $ADDON_PATH. We strongly recommend against forcing changes to kube-system resources as these are critical for the proper functioning of the cluster.

    ----------

    Hope this helps.

    Please "Accept as Answer" if it helped, so that it can help others in the community looking for help on similar topics.


1 additional answer

  1. Ralf Wenzel 1 Reputation point
    2021-09-05T06:39:04.55+00:00

    @srbose-msfts, your answer is certainly technically correct, but the problem remains that most monitoring setups (e.g. Prometheus Alertmanager) will still produce warnings or errors. So there should be a way to change the limit.