My team runs a Kubernetes cluster, and I have been struggling to understand how to maximize the number of pods we can run per node. When I set the CPU and memory requests and limits so that we have 2 pods per node, things run well. With 3 pods per node, pods start failing (CrashLoopBackOff) about 10% of the time; with 4 pods per node, it's about 25% of the time. Unfortunately, I have not found any useful error messaging. The Events of a failing pod just say "Back-off restarting failed container."

My assumption is that as I increase the pod count, the pods are hitting the node's CPU capacity, but playing around with the requests and limits has not worked as I had hoped. Is there any way to see an actual error report like "pod requests X cpu, greater than available Y cpu" so that we can understand the real issue? I have searched online and there aren't many answers explaining what a CrashLoopBackOff actually means or how to debug the underlying error, so any advice or avenues to explore would be greatly appreciated.
We are using D4s_v3 nodes (4 vCPU, 16 GiB memory).
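For reference, this is the shape of the per-container resources block I have been tuning; the numbers below are illustrative, not our real values:

```yaml
# Illustrative values only, not our actual configuration.
resources:
  requests:
    cpu: "1"        # scheduler reserves 1 vCPU per pod
    memory: "4Gi"
  limits:
    cpu: "1500m"    # hard CPU cap; the container is throttled above this
    memory: "6Gi"   # exceeding this gets the container OOMKilled
```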
I understand that some resources on each node are reserved by AKS (https://docs.microsoft.com/en-us/azure/aks/concepts-clusters-workloads).
Is there a way to see why a pod failed, and whether it was due to a CPU or memory shortage?
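In case it helps: these are the commands I have been using so far to inspect the failures (nothing beyond the generic "Back-off" message shows up); the pod, node, and namespace names are placeholders:

```shell
# Events and last-state (exit code, reason) of the failing pod
kubectl describe pod <pod-name> -n <namespace>

# Logs from the previous (crashed) container instance
kubectl logs <pod-name> -n <namespace> --previous

# Recent events in the namespace, oldest first
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp

# Actual usage vs. allocatable (kubectl top requires metrics-server)
kubectl top pods -n <namespace>
kubectl describe node <node-name>   # shows Allocatable and Allocated resources
```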

