How to query logs from Container insights
Container insights collects performance metrics, inventory data, and health state information from container hosts and containers. The data is collected every three minutes and forwarded to the Log Analytics workspace in Azure Monitor where it's available for log queries using Log Analytics in Azure Monitor. You can apply this data to scenarios that include migration planning, capacity analysis, discovery, and on-demand performance troubleshooting. Azure Monitor Logs can help you look for trends, diagnose bottlenecks, forecast, or correlate data that can help you determine whether the current cluster configuration is performing optimally.
See Using queries in Azure Monitor Log Analytics for information on using these queries and Log Analytics tutorial for a complete tutorial on using Log Analytics to run queries and work with their results.
Open Log Analytics
There are multiple options for starting Log Analytics, each starting with a different scope. For access to all data in the workspace, select Logs from the Monitor menu. To limit the data to a single Kubernetes cluster, select Logs from that cluster's menu.
Existing log queries
You don't necessarily need to understand how to write a log query to use Log Analytics. There are multiple prebuilt queries that you can select and either run without modification or use as a start to a custom query. Click Queries at the top of the Log Analytics screen and view queries with a Resource type of Kubernetes Services.
Container tables
See Azure Monitor table reference for a list of tables and their detailed descriptions used by Container insights. All of these tables are available for log queries.
Example log queries
It's often useful to build queries that start with an example or two and then modify them to fit your requirements. To help build more advanced queries, you can experiment with the following sample queries:
List all of a container's lifecycle information
ContainerInventory
| project Computer, Name, Image, ImageTag, ContainerState, CreatedTime, StartedTime, FinishedTime
| render table
Kubernetes events
KubeEvents
| where not(isempty(Namespace))
| sort by TimeGenerated desc
| render table
Container CPU
Perf
| where ObjectName == "K8SContainer" and CounterName == "cpuUsageNanoCores"
| summarize AvgCPUUsageNanoCores = avg(CounterValue) by bin(TimeGenerated, 30m), InstanceName
Container memory
Perf
| where ObjectName == "K8SContainer" and CounterName == "memoryRssBytes"
| summarize AvgUsedRssMemoryBytes = avg(CounterValue) by bin(TimeGenerated, 30m), InstanceName
Requests Per Minute with Custom Metrics
InsightsMetrics
| where Name == "requests_count"
| summarize Val=any(Val) by TimeGenerated=bin(TimeGenerated, 1m)
| sort by TimeGenerated asc
| project RequestsPerMinute = Val - prev(Val), TimeGenerated
| render barchart
Pods by name and namespace
let startTimestamp = ago(1h);
KubePodInventory
| where TimeGenerated > startTimestamp
| project ContainerID, PodName=Name, Namespace
| where PodName contains "name" and Namespace startswith "namespace"
| distinct ContainerID, PodName
| join
(
ContainerLog
| where TimeGenerated > startTimestamp
)
on ContainerID
// at this point before the next pipe, columns from both tables are available to be "projected". Due to both
// tables having a "Name" column, we assign an alias as PodName to one column which we actually want
| project TimeGenerated, PodName, LogEntry, LogEntrySource
| summarize by TimeGenerated, LogEntry
| order by TimeGenerated desc
Pod scale-out (HPA)
Returns the number of scaled out replicas in each deployment. Calculates the scale-out percentage with the maximum number of replicas configured in HPA.
let _minthreshold = 70; // minimum threshold goes here if you want to setup as an alert
let _maxthreshold = 90; // maximum threshold goes here if you want to setup as an alert
let startDateTime = ago(60m);
KubePodInventory
| where TimeGenerated >= startDateTime
| where Namespace !in('default', 'kube-system') // List of non system namespace filter goes here.
| extend labels = todynamic(PodLabel)
| extend deployment_hpa = reverse(substring(reverse(ControllerName), indexof(reverse(ControllerName), "-") + 1))
| distinct tostring(deployment_hpa)
| join kind=inner (InsightsMetrics
| where TimeGenerated > startDateTime
| where Name == 'kube_hpa_status_current_replicas'
| extend pTags = todynamic(Tags) //parse the tags for values
| extend ns = todynamic(pTags.k8sNamespace) //parse namespace value from tags
| extend deployment_hpa = todynamic(pTags.targetName) //parse HPA target name from tags
| extend max_reps = todynamic(pTags.spec_max_replicas) // Parse maximum replica settings from HPA deployment
| extend desired_reps = todynamic(pTags.status_desired_replicas) // Parse desired replica settings from HPA deployment
| summarize arg_max(TimeGenerated, *) by tostring(ns), tostring(deployment_hpa), Cluster=toupper(tostring(split(_ResourceId, '/')[8])), toint(desired_reps), toint(max_reps), scale_out_percentage=(desired_reps * 100 / max_reps)
//| where scale_out_percentage > _minthreshold and scale_out_percentage <= _maxthreshold
)
on deployment_hpa
Nodepool scale-outs
Returns the number of active nodes in each node pool. Calculates the number of available active nodes and the max node configuration in the auto-scaler settings to determine the scale-out percentage. See commented lines in query to use it for a number of results alert rule.
let nodepoolMaxnodeCount = 10; // the maximum number of nodes in your auto scale setting goes here.
let _minthreshold = 20;
let _maxthreshold = 90;
let startDateTime = 60m;
KubeNodeInventory
| where TimeGenerated >= ago(startDateTime)
| extend nodepoolType = todynamic(Labels) //Parse the labels to get the list of node pool types
| extend nodepoolName = todynamic(nodepoolType[0].agentpool) // parse the label to get the nodepool name or set the specific nodepool name (like nodepoolName = 'agentpool)'
| summarize nodeCount = count(Computer) by ClusterName, tostring(nodepoolName), TimeGenerated
//(Uncomment the below two lines to set this as an log search alert)
//| extend scaledpercent = iff(((nodeCount * 100 / nodepoolMaxnodeCount) >= _minthreshold and (nodeCount * 100 / nodepoolMaxnodeCount) < _maxthreshold), "warn", "normal")
//| where scaledpercent == 'warn'
| summarize arg_max(TimeGenerated, *) by nodeCount, ClusterName, tostring(nodepoolName)
| project ClusterName,
TotalNodeCount= strcat("Total Node Count: ", nodeCount),
ScaledOutPercentage = (nodeCount * 100 / nodepoolMaxnodeCount),
TimeGenerated,
nodepoolName
System containers (replicaset) availability
Returns the system containers (replicasets) and report the unavailable percentage. See commented lines in query to use it for a number of results alert rule.
let startDateTime = 5m; // the minimum time interval goes here
let _minalertThreshold = 50; //Threshold for minimum and maximum unavailable or not running containers
let _maxalertThreshold = 70;
KubePodInventory
| where TimeGenerated >= ago(startDateTime)
| distinct ClusterName, TimeGenerated
| summarize Clustersnapshot = count() by ClusterName
| join kind=inner (
KubePodInventory
| where TimeGenerated >= ago(startDateTime)
| where Namespace in('default', 'kube-system') and ControllerKind == 'ReplicaSet' // the system namespace filter goes here
| distinct ClusterName, Computer, PodUid, TimeGenerated, PodStatus, ServiceName, PodLabel, Namespace, ContainerStatus
| summarize arg_max(TimeGenerated, *), TotalPODCount = count(), podCount = sumif(1, PodStatus == 'Running' or PodStatus != 'Running'), containerNotrunning = sumif(1, ContainerStatus != 'running')
by ClusterName, TimeGenerated, ServiceName, PodLabel, Namespace
)
on ClusterName
| project ClusterName, ServiceName, podCount, containerNotrunning, containerNotrunningPercent = (containerNotrunning * 100 / podCount), TimeGenerated, PodStatus, PodLabel, Namespace, Environment = tostring(split(ClusterName, '-')[3]), Location = tostring(split(ClusterName, '-')[4]), ContainerStatus
//Uncomment the below line to set for automated alert
//| where PodStatus == "Running" and containerNotrunningPercent > _minalertThreshold and containerNotrunningPercent < _maxalertThreshold
| summarize arg_max(TimeGenerated, *), c_entry=count() by PodLabel, ServiceName, ClusterName
//Below lines are to parse the labels to identify the impacted service/component name
| extend parseLabel = replace(@'k8s-app', @'k8sapp', PodLabel)
| extend parseLabel = replace(@'app.kubernetes.io/component', @'appkubernetesiocomponent', parseLabel)
| extend parseLabel = replace(@'app.kubernetes.io/instance', @'appkubernetesioinstance', parseLabel)
| extend tags = todynamic(parseLabel)
| extend tag01 = todynamic(tags[0].app)
| extend tag02 = todynamic(tags[0].k8sapp)
| extend tag03 = todynamic(tags[0].appkubernetesiocomponent)
| extend tag04 = todynamic(tags[0].aadpodidbinding)
| extend tag05 = todynamic(tags[0].appkubernetesioinstance)
| extend tag06 = todynamic(tags[0].component)
| project ClusterName, TimeGenerated,
ServiceName = strcat( ServiceName, tag01, tag02, tag03, tag04, tag05, tag06),
ContainerUnavailable = strcat("Unavailable Percentage: ", containerNotrunningPercent),
PodStatus = strcat("PodStatus: ", PodStatus),
ContainerStatus = strcat("Container Status: ", ContainerStatus)
System containers (daemonsets) availability
Returns the system containers (daemonsets) and report the unavailable percentage. See commented lines in query to use it for a number of results alert rule.
let startDateTime = 5m; // the minimum time interval goes here
let _minalertThreshold = 50; //Threshold for minimum and maximum unavailable or not running containers
let _maxalertThreshold = 70;
KubePodInventory
| where TimeGenerated >= ago(startDateTime)
| distinct ClusterName, TimeGenerated
| summarize Clustersnapshot = count() by ClusterName
| join kind=inner (
KubePodInventory
| where TimeGenerated >= ago(startDateTime)
| where Namespace in('default', 'kube-system') and ControllerKind == 'DaemonSet' // the system namespace filter goes here
| distinct ClusterName, Computer, PodUid, TimeGenerated, PodStatus, ServiceName, PodLabel, Namespace, ContainerStatus
| summarize arg_max(TimeGenerated, *), TotalPODCount = count(), podCount = sumif(1, PodStatus == 'Running' or PodStatus != 'Running'), containerNotrunning = sumif(1, ContainerStatus != 'running')
by ClusterName, TimeGenerated, ServiceName, PodLabel, Namespace
)
on ClusterName
| project ClusterName, ServiceName, podCount, containerNotrunning, containerNotrunningPercent = (containerNotrunning * 100 / podCount), TimeGenerated, PodStatus, PodLabel, Namespace, Environment = tostring(split(ClusterName, '-')[3]), Location = tostring(split(ClusterName, '-')[4]), ContainerStatus
//Uncomment the below line to set for automated alert
//| where PodStatus == "Running" and containerNotrunningPercent > _minalertThreshold and containerNotrunningPercent < _maxalertThreshold
| summarize arg_max(TimeGenerated, *), c_entry=count() by PodLabel, ServiceName, ClusterName
//Below lines are to parse the labels to identify the impacted service/component name
| extend parseLabel = replace(@'k8s-app', @'k8sapp', PodLabel)
| extend parseLabel = replace(@'app.kubernetes.io/component', @'appkubernetesiocomponent', parseLabel)
| extend parseLabel = replace(@'app.kubernetes.io/instance', @'appkubernetesioinstance', parseLabel)
| extend tags = todynamic(parseLabel)
| extend tag01 = todynamic(tags[0].app)
| extend tag02 = todynamic(tags[0].k8sapp)
| extend tag03 = todynamic(tags[0].appkubernetesiocomponent)
| extend tag04 = todynamic(tags[0].aadpodidbinding)
| extend tag05 = todynamic(tags[0].appkubernetesioinstance)
| extend tag06 = todynamic(tags[0].component)
| project ClusterName, TimeGenerated,
ServiceName = strcat( ServiceName, tag01, tag02, tag03, tag04, tag05, tag06),
ContainerUnavailable = strcat("Unavailable Percentage: ", containerNotrunningPercent),
PodStatus = strcat("PodStatus: ", PodStatus),
ContainerStatus = strcat("Container Status: ", ContainerStatus)
Resource logs
Resource logs for AKS are stored in the AzureDiagnostics table You can distinguish different logs with the Category column. See AKS reference resource logs for a description of each category. The following examples require a diagnostic extension to send resource logs for an AKS cluster to a Log Analytics workspace. See Configure monitoring for details.
API server logs
AzureDiagnostics
| where Category == "kube-apiserver"
Count logs for each category
AzureDiagnostics
| where ResourceType == "MANAGEDCLUSTERS"
| summarize count() by Category
Query Prometheus metrics data
The following example is a Prometheus metrics query showing disk reads per second per disk per node.
InsightsMetrics
| where Namespace == 'container.azm.ms/diskio'
| where TimeGenerated > ago(1h)
| where Name == 'reads'
| extend Tags = todynamic(Tags)
| extend HostName = tostring(Tags.hostName), Device = Tags.name
| extend NodeDisk = strcat(Device, "/", HostName)
| order by NodeDisk asc, TimeGenerated asc
| serialize
| extend PrevVal = iif(prev(NodeDisk) != NodeDisk, 0.0, prev(Val)), PrevTimeGenerated = iif(prev(NodeDisk) != NodeDisk, datetime(null), prev(TimeGenerated))
| where isnotnull(PrevTimeGenerated) and PrevTimeGenerated != TimeGenerated
| extend Rate = iif(PrevVal > Val, Val / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1), iif(PrevVal == Val, 0.0, (Val - PrevVal) / (datetime_diff('Second', TimeGenerated, PrevTimeGenerated) * 1)))
| where isnotnull(Rate)
| project TimeGenerated, NodeDisk, Rate
| render timechart
To view Prometheus metrics scraped by Azure Monitor filtered by Namespace, specify "prometheus". Here's a sample query to view Prometheus metrics from the default kubernetes namespace.
InsightsMetrics
| where Namespace == "prometheus"
| extend tags=parse_json(Tags)
| summarize count() by Name
Prometheus data can also be directly queried by name.
InsightsMetrics
| where Namespace == "prometheus"
| where Name contains "some_prometheus_metric"
Query config or scraping errors
To investigate any configuration or scraping errors, the following example query returns informational events from the KubeMonAgentEvents table.
KubeMonAgentEvents | where Level != "Info"
The output shows results similar to the following example:

Next steps
Container insights does not include a predefined set of alerts. Review the Create performance alerts with Container insights to learn how to create recommended alerts for high CPU and memory utilization to support your DevOps or operational processes and procedures.
Povratne informacije
Pošalјite i prikažite povratne informacije za