Monitor performance of your cluster

Once you enable Container insights on a cluster, you can monitor the performance and health status of the cluster components and their workloads. Start with a summary view of all of your monitored clusters and then drill into the details of a particular cluster using built-in workbooks.

Screenshot of a list of all containers and their status in Container insights.

The Monitored clusters tab presents the following information for each of your monitored clusters:

  • Cluster status summary, showing a count of clusters for each status
  • Whether all of the AKS deployments are healthy
  • How many nodes, user pods, and system pods are deployed per cluster
  • How much disk space is available and whether there's a capacity issue

The health statuses included are:

  • Healthy: No issues are detected for the VM, and it's functioning as required.
  • Critical: One or more critical issues are detected that must be addressed to restore the expected operational state.
  • Warning: One or more issues are detected that must be addressed or the health condition could become critical.
  • Unknown: The service wasn't able to connect to the node or pod.
  • Not found: Either the workspace, the resource group, or subscription that contains the workspace for this solution was deleted.
  • Unauthorized: User doesn't have required permissions to read the data in the workspace.
  • Error: An error occurred while attempting to read data from the workspace.
  • Misconfigured: Container insights wasn't configured correctly in the specified workspace.
  • No data: No data has been reported to the workspace in the last 30 minutes.

The overall cluster status is the worst of the three component states (user pod, system pod, and node), with one exception: if any of the three states is Unknown, the overall cluster status shows Unknown.
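The rollup rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not the service's actual implementation; the function name `overall_status` and the `SEVERITY` ranking are hypothetical, and the three inputs are assumed to be the user pod, system pod, and node states.

```python
# Hypothetical severity ranking used only for this illustration.
SEVERITY = {"Healthy": 0, "Warning": 1, "Critical": 2}


def overall_status(user_pod: str, system_pod: str, node: str) -> str:
    """Roll three component states up into one cluster status."""
    states = (user_pod, system_pod, node)
    # Exception to the "worst state wins" rule: any Unknown makes
    # the overall status Unknown.
    if "Unknown" in states:
        return "Unknown"
    # Otherwise the worst (highest-severity) state wins.
    return max(states, key=lambda s: SEVERITY[s])
```

For example, `overall_status("Healthy", "Warning", "Critical")` yields `"Critical"`, while replacing any one of the arguments with `"Unknown"` yields `"Unknown"`.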

The following table provides a breakdown of the calculation that controls the health states for a monitored cluster on the multi-cluster view.

Monitored cluster    Status      Availability
-----------------    --------    ----------------------------------
User pod             Healthy     100%
                     Warning     90 - 99%
                     Critical    <90%
                     Unknown     If not reported in last 30 minutes

System pod           Healthy     100%
                     Warning     N/A
                     Critical    <100%
                     Unknown     If not reported in last 30 minutes

Node                 Healthy     >85%
                     Warning     60 - 84%
                     Critical    <60%
                     Unknown     If not reported in last 30 minutes
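
As a rough sketch, the availability thresholds in the table could be expressed as follows. The function name `classify` and the component keys are hypothetical. The table leaves some boundary values undefined (for example, a node at exactly 85%, or a user pod between 99% and 100%), so the handling of those values below is an assumption, not documented behavior. A value of `None` stands for "not reported in the last 30 minutes".

```python
from typing import Optional


def classify(component: str, availability: Optional[float]) -> str:
    """Map an availability percentage to a health status.

    component: "user_pod", "system_pod", or "node" (hypothetical keys).
    availability: percentage, or None if nothing was reported in the
    last 30 minutes.
    """
    if availability is None:
        return "Unknown"
    if component == "user_pod":
        if availability >= 100:
            return "Healthy"
        # Assumption: anything from 90% up to (not including) 100% is Warning.
        return "Warning" if availability >= 90 else "Critical"
    if component == "system_pod":
        # System pods have no Warning band: anything below 100% is Critical.
        return "Healthy" if availability >= 100 else "Critical"
    if component == "node":
        if availability > 85:
            return "Healthy"
        # Assumption: 60% up to and including 85% is Warning.
        return "Warning" if availability >= 60 else "Critical"
    raise ValueError(f"unknown component: {component}")
```

For instance, `classify("system_pod", 99)` returns `"Critical"`, whereas `classify("user_pod", 99)` returns `"Warning"`, which reflects the stricter requirement for system pods.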