Azure HDInsight monitoring data reference

03/27/2024

This article contains all the monitoring reference information for this service.

See Monitor HDInsight for details on the data you can collect for Azure HDInsight and how to use it.

Metrics

This section lists all the automatically collected platform metrics for this service. These metrics are also part of the global list of all platform metrics supported in Azure Monitor.

For information on metric retention, see Azure Monitor Metrics overview.

Supported metrics for Microsoft.HDInsight/clusters

The following table lists the metrics available for the Microsoft.HDInsight/clusters resource type.

Not every column is present in every table.

Table headings

Category - The metrics group or classification.
Metric - The metric display name as it appears in the Azure portal.
Name in REST API - The metric name as referred to in the REST API.
Unit - Unit of measure.
Aggregation - The default aggregation type. Valid values: Average (Avg), Minimum (Min), Maximum (Max), Total (Sum), Count.
Dimensions - Dimensions available for the metric.
Time Grains - Intervals at which the metric is sampled. For example, PT1M indicates that the metric is sampled every minute, PT30M every 30 minutes, PT1H every hour, and so on.
DS Export - Whether the metric is exportable to Azure Monitor Logs via diagnostic settings. For information on exporting metrics, see Create diagnostic settings in Azure Monitor.
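
The time grain values are ISO 8601 durations. As a minimal sketch, assuming only the day, hour, and minute forms that appear in this article (PT1M, PT30M, PT1H, P1D), they could be converted to Python `timedelta` values as follows. The function name `parse_time_grain` is illustrative and not part of any Azure SDK.

```python
import re
from datetime import timedelta

# Matches only the subset of ISO 8601 durations used in this article
# (days, hours, minutes), for example PT1M, PT30M, PT1H, and P1D.
_GRAIN = re.compile(
    r"^P(?:(?P<days>\d+)D)?(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?)?$"
)


def parse_time_grain(grain: str) -> timedelta:
    """Convert a time grain such as 'PT1M' into a timedelta."""
    match = _GRAIN.match(grain)
    if match is None:
        raise ValueError(f"unsupported time grain: {grain!r}")
    # Keep only the components that were actually present in the string.
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    if not parts:
        raise ValueError(f"time grain has no components: {grain!r}")
    return timedelta(**parts)


if __name__ == "__main__":
    for grain in ("PT1M", "PT1H", "P1D"):
        print(grain, "->", parse_time_grain(grain))
```

A full ISO 8601 parser would also handle weeks, months, years, and seconds; this sketch deliberately rejects them.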

Category	Metric	Name in REST API	Unit	Aggregation	Dimensions	Time Grains	DS Export
Availability	Categorized Gateway Requests: Number of gateway requests by categories (1xx/2xx/3xx/4xx/5xx)	`CategorizedGatewayRequests`	Count	Count, Total	`HttpStatus`	PT1M, PT1H, P1D	Yes
Availability	Gateway Requests: Number of gateway requests	`GatewayRequests`	Count	Count, Total	`HttpStatus`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Consumer RequestThroughput: Number of consumer requests to Kafka REST proxy	`KafkaRestProxy.ConsumerRequest.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Consumer Unsuccessful Requests: Consumer request exceptions	`KafkaRestProxy.ConsumerRequestFail.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Consumer RequestLatency: Message latency in a consumer request through Kafka REST proxy	`KafkaRestProxy.ConsumerRequestTime.p95`	Milliseconds	Average	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Consumer Request Backlog: Consumer REST proxy queue length	`KafkaRestProxy.ConsumerRequestWaitingInQueueTime.p95`	Milliseconds	Average	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Producer MessageThroughput: Number of producer messages through Kafka REST proxy	`KafkaRestProxy.MessagesIn.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Consumer MessageThroughput: Number of consumer messages through Kafka REST proxy	`KafkaRestProxy.MessagesOut.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy ConcurrentConnections: Number of concurrent connections through Kafka REST proxy	`KafkaRestProxy.OpenConnections`	Count	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Producer RequestThroughput: Number of producer requests to Kafka REST proxy	`KafkaRestProxy.ProducerRequest.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Producer Unsuccessful Requests: Producer request exceptions	`KafkaRestProxy.ProducerRequestFail.m1_delta`	CountPerSecond	Total	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Producer RequestLatency: Message latency in a producer request through Kafka REST proxy	`KafkaRestProxy.ProducerRequestTime.p95`	Milliseconds	Average	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	REST proxy Producer Request Backlog: Producer REST proxy queue length	`KafkaRestProxy.ProducerRequestWaitingInQueueTime.p95`	Milliseconds	Average	`Machine`, `Topic`	PT1M, PT1H, P1D	Yes
Availability	Number of Active Workers: Number of Active Workers	`NumActiveWorkers`	Count	Average, Maximum, Minimum	`MetricName`	PT1M, PT1H, P1D	Yes
Availability	Pending CPU: Pending CPU Requests in YARN	`PendingCPU`	Count	Average, Maximum, Minimum	<none>	PT1M, PT1H, P1D	Yes
Availability	Pending Memory: Pending Memory Requests in YARN	`PendingMemory`	Count	Average, Maximum, Minimum	<none>	PT1M, PT1H, P1D	Yes
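
To illustrate how the Name in REST API, Aggregation, and Time Grains columns fit together, the following sketch builds a request URL for the Azure Monitor metrics REST API and checks the inputs against a few rows copied from the table above. It makes no network call. The helper name `metrics_query_url`, the API version `2018-01-01`, and the resource ID shown in the usage lines are assumptions for illustration; substitute the values that apply to your environment.

```python
from urllib.parse import urlencode

# A few rows copied from the table above: REST API name -> listed aggregations.
SUPPORTED_AGGREGATIONS = {
    "CategorizedGatewayRequests": {"Count", "Total"},
    "GatewayRequests": {"Count", "Total"},
    "KafkaRestProxy.OpenConnections": {"Total"},
    "NumActiveWorkers": {"Average", "Maximum", "Minimum"},
    "PendingCPU": {"Average", "Maximum", "Minimum"},
    "PendingMemory": {"Average", "Maximum", "Minimum"},
}

# Time grains listed for every metric in the table.
TIME_GRAINS = {"PT1M", "PT1H", "P1D"}


def metrics_query_url(resource_id, metric, aggregation, interval="PT1M"):
    """Build (but do not send) a metrics query URL for one HDInsight metric."""
    if metric not in SUPPORTED_AGGREGATIONS:
        raise ValueError(f"metric not in this sketch's catalog: {metric}")
    if aggregation not in SUPPORTED_AGGREGATIONS[metric]:
        raise ValueError(f"{aggregation} is not listed for {metric}")
    if interval not in TIME_GRAINS:
        raise ValueError(f"{interval} is not a listed time grain")
    query = urlencode(
        {
            "api-version": "2018-01-01",  # assumed API version
            "metricnames": metric,
            "aggregation": aggregation,
            "interval": interval,
        }
    )
    return (
        f"https://management.azure.com{resource_id}"
        f"/providers/Microsoft.Insights/metrics?{query}"
    )


if __name__ == "__main__":
    # Hypothetical resource ID; replace every <placeholder> with real values.
    cluster_id = (
        "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
        "/providers/Microsoft.HDInsight/clusters/<cluster-name>"
    )
    print(metrics_query_url(cluster_id, "PendingCPU", "Average", "PT1H"))
```

Validating against the table before sending a request surfaces mistakes such as asking for a Total of `PendingCPU`, which the table doesn't list.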

Metric dimensions

For information about what metric dimensions are, see Multi-dimensional metrics.

This service has the following dimensions associated with its metrics.

Dimensions for the Microsoft.HDInsight/clusters table include:

HttpStatus
Machine
Topic
MetricName
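
When a metric is queried by dimension, the dimension names above typically appear in a filter expression. The sketch below assembles such an expression in the `Name eq 'value'` form used by the Azure Monitor metrics `$filter` parameter, with `*` standing for all values of a dimension. The helper `dimension_filter` and the topic name `orders` are hypothetical; the metric-to-dimension pairs are copied from the supported-metrics table in this article.

```python
# REST API metric name -> dimensions, copied from the supported-metrics table.
METRIC_DIMENSIONS = {
    "CategorizedGatewayRequests": ("HttpStatus",),
    "GatewayRequests": ("HttpStatus",),
    "KafkaRestProxy.MessagesIn.m1_delta": ("Machine", "Topic"),
    "KafkaRestProxy.MessagesOut.m1_delta": ("Machine", "Topic"),
    "NumActiveWorkers": ("MetricName",),
    "PendingCPU": (),
    "PendingMemory": (),
}


def dimension_filter(metric, **values):
    """Return a filter string that splits `metric` by all of its dimensions.

    Dimensions without an explicit value are matched with '*'.
    """
    allowed = METRIC_DIMENSIONS[metric]
    unknown = set(values) - set(allowed)
    if unknown:
        raise ValueError(f"{metric} has no dimension(s): {sorted(unknown)}")
    clauses = []
    for name in allowed:
        # Double any single quote so the value stays inside its literal.
        value = str(values.get(name, "*")).replace("'", "''")
        clauses.append(f"{name} eq '{value}'")
    return " and ".join(clauses)


if __name__ == "__main__":
    # 'orders' is a made-up Kafka topic name.
    print(dimension_filter("KafkaRestProxy.MessagesIn.m1_delta", Topic="orders"))
    print(dimension_filter("GatewayRequests"))
```

Metrics with no dimensions, such as `PendingCPU`, produce an empty string, which signals that no filter should be sent.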

Resource logs

This section lists the types of resource logs you can collect for this service. The section pulls from the list of all resource logs category types supported in Azure Monitor.

HDInsight doesn't use Azure Monitor resource logs or diagnostic settings. Logs are collected by other methods, including the use of the Log Analytics agent.

Azure Monitor Logs tables

This section lists the Azure Monitor Logs tables relevant to this service, which are available for query by Log Analytics using Kusto queries. The tables contain resource log data and possibly more depending on what is collected and routed to them.
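
As a rough illustration of a Kusto query against one of these tables, the following sketch assembles a query string that counts records per hour over a recent window. It relies only on the `TimeGenerated` column, which is standard in Log Analytics tables; the helper name `hourly_record_count_query` is hypothetical, and the resulting string would still need to be run in Log Analytics or through a client of your choice.

```python
import re

# Table names are plain identifiers; reject anything else so that the
# generated query cannot be altered by unexpected input.
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def hourly_record_count_query(table: str, days: int = 1) -> str:
    """Build a Kusto query that counts records per hour in `table`."""
    if not _TABLE_NAME.match(table):
        raise ValueError(f"not a valid table name: {table!r}")
    if days < 1:
        raise ValueError("days must be at least 1")
    return "\n".join(
        [
            table,
            f"| where TimeGenerated > ago({days}d)",
            "| summarize Records = count() by bin(TimeGenerated, 1h)",
            "| order by TimeGenerated asc",
        ]
    )


if __name__ == "__main__":
    # HDInsightKafkaLogs is one of the tables listed later in this article.
    print(hourly_record_count_query("HDInsightKafkaLogs", days=7))
```

A count per hour is a quick way to confirm that a table is receiving data before writing queries against its workload-specific columns.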

HDInsight Clusters

Microsoft.HDInsight/Clusters

The available logs and metrics vary depending on your HDInsight cluster type.

Log table mapping

The new Azure Monitor integration implements new tables in the Log Analytics workspace. The following tables show the log table mappings from the classic Azure Monitor integration to the new one.

The New table column shows the name of the new table. The Description column describes the type of logs or metrics that are available in this table. The Classic table column lists all the tables from the classic Azure Monitor integration whose data is now present in the new table.

Note

Some tables are completely new and not based on previous tables.

General workload tables

New table	Description	Classic table
HDInsightAmbariSystemMetrics	System metrics collected from Ambari. The metrics now come from each node in the cluster (except for edge nodes) instead of just the two headnodes. Each metric is now a column and each metric is reported once per record.	metrics_cpu_nice_cl, metrics_cpu_system_cl, metrics_cpu_user_cl, metrics_memory_cache_CL, metrics_memory_swap_CL, metrics_memory_total_CL, metrics_memory_buffer_CL, metrics_load_1min_CL, metrics_load_cpu_CL, metrics_load_nodes_CL, metrics_load_procs_CL, metrics_network_in_CL, metrics_network_out_CL
HDInsightAmbariClusterAlerts	Ambari Cluster Alerts from each node in the cluster (except for edge nodes). Each alert is a record in this table.	metrics_cluster_alerts_CL
HDInsightSecurityLogs	Records from the Ambari Audit and Auth Logs.	log_ambari_audit_CL, log_auth_CL
HDInsightRangerAuditLogs	All records from the Ranger Audit log for ESP clusters.	ranger_audit_logs_CL
HDInsightGatewayAuditLogs_CL	The Gateway nodes audit information. Same format as the classic table, and still located in the Custom Logs section.	log_gateway_Audit_CL

Spark workload

Note

Spark application-related tables have been replaced with 11 new Spark tables that give more in-depth information about your Spark workloads.

New table	Description	Classic table
HDInsightSparkLogs	All logs related to Spark and its related components: Livy and Jupyter.	log_livy_CL, log_jupyter_CL, log_spark_CL, log_sparkappsexecutors_CL, log_sparkappsdrivers_CL
HDInsightSparkApplicationEvents	Event information for Spark Applications including Submission and Completion time, App ID, and AppName. Useful for keeping track of when applications started and completed.
HDInsightSparkBlockManagerEvents	Event information related to Spark's Block Manager. Includes information such as executor memory usage.
HDInsightSparkEnvironmentEvents	Event information related to the Environment an application executes in, including Spark Deploy Mode, Master, and information about the Executor.
HDInsightSparkExecutorEvents	Event information about Spark Executor usage by an Application.
HDInsightSparkExtraEvents	Event information that doesn't fit into any other Spark table.
HDInsightSparkJobEvents	Information about Spark Jobs including their start and end times, result, and associated stages.
HDInsightSparkSqlExecutionEvents	Event information on Spark SQL Queries including their plan info, description, and start and end times.
HDInsightSparkStageEvents	Event information for Spark Stages including their start and completion times, failure status, and detailed execution information.
HDInsightSparkStageTaskAccumulables	Performance metrics for stages and tasks.
HDInsightTaskEvents	Event information for Spark Tasks including start and completion time, associated stages, execution status, and task type.
HDInsightJupyterNotebookEvents	Event information for Jupyter Notebooks.

Hadoop/YARN workload

New table	Description	Classic table
HDInsightHadoopAndYarnMetrics	JMX metrics from the Hadoop and YARN frameworks. Contains all the same JMX metrics as the previous Custom Logs tables, plus other important metrics from the Timeline Server, Node Manager, and Job History Server. Contains one metric per record.	metrics_resourcemanager_clustermetrics_CL, metrics_resourcemanager_jvm_CL, metrics_resourcemanager_queue_root_CL, metrics_resourcemanager_queue_root_joblauncher_CL, metrics_resourcemanager_queue_root_default_CL, metrics_resourcemanager_queue_root_thriftsvr_CL
HDInsightHadoopAndYarnLogs	All logs generated from the Hadoop and YARN frameworks.	log_mrjobsummary_CL, log_resourcemanager_CL, log_timelineserver_CL, log_nodemanager_CL

Hive/LLAP workload

New table	Description	Classic table
HDInsightHiveAndLLAPMetrics	JMX metrics from the Hive and LLAP frameworks. Contains all the same JMX metrics as the previous Custom Logs tables, one metric per record.	llap_metrics_hiveserver2_CL, llap_metrics_hs2_metrics_subsystem, llap_metrics_jvm_CL, llap_metrics_llap_daemon_info_CL, llap_metrics_buddy_allocator_info_CL, llap_metrics_deamon_jvm_CL, llap_metrics_io_CL, llap_metrics_executor_metrics_CL, llap_metrics_metricssystem_stats_CL, llap_metrics_cache_CL
HDInsightHiveAndLLAPLogs	Logs generated from Hive, LLAP, and their related components: WebHCat and Zeppelin.	log_hivemetastore_CL, log_hiveserver2_CL, log_hiveserve2interactive_CL, log_webhcat_CL, log_zeppelin_zeppelin_CL

Kafka workload

New table	Description	Classic table
HDInsightKafkaMetrics	JMX metrics from Kafka. Contains all the same JMX metrics as the old Custom Logs tables, plus other important metrics. One metric per record.	metrics_kafka_CL
HDInsightKafkaLogs	All logs generated from the Kafka Brokers.	log_kafkaserver_CL, log_kafkacontroller_CL

HBase workload

New table	Description	Classic table
HDInsightHBaseMetrics	JMX metrics from HBase. Contains all the same JMX metrics as the previous tables. In contrast with the previous tables, each row contains one metric.	metrics_regionserver_CL, metrics_regionserver_wal_CL, metrics_regionserver_ipc_CL, metrics_regionserver_os_CL, metrics_regionserver_replication_CL, metrics_restserver_CL, metrics_restserver_jvm_CL, metrics_hmaster_assignmentmanager_CL, metrics_hmaster_ipc_CL, metrics_hmaser_os_CL, metrics_hmaster_balancer_CL, metrics_hmaster_jvm_CL, metrics_hmaster_CL, metrics_hmaster_fs_CL
HDInsightHBaseLogs	Logs from HBase and its related components: Phoenix and HDFS.	log_regionserver_CL, log_restserver_CL, log_phoenixserver_CL, log_hmaster_CL, log_hdfsnamenode_CL, log_garbage_collector_CL

Oozie workload

New table	Description	Classic table
HDInsightOozieLogs	All logs generated from the Oozie framework.	Log_oozie_CL
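
When migrating saved queries from the classic integration, the mapping tables above can be treated as a lookup from classic table name to new table name. The following sketch encodes a small subset of the rows for illustration. The names `NEW_TO_CLASSIC` and `new_table_for` are hypothetical, and the case-insensitive matching is an assumption made because the classic names in this article use mixed casing (for example `Log_oozie_CL` and `metrics_cpu_nice_cl`).

```python
# Subset of the mapping tables above: new table -> classic tables.
NEW_TO_CLASSIC = {
    "HDInsightAmbariClusterAlerts": ["metrics_cluster_alerts_CL"],
    "HDInsightSecurityLogs": ["log_ambari_audit_CL", "log_auth_CL"],
    "HDInsightRangerAuditLogs": ["ranger_audit_logs_CL"],
    "HDInsightKafkaMetrics": ["metrics_kafka_CL"],
    "HDInsightKafkaLogs": ["log_kafkaserver_CL", "log_kafkacontroller_CL"],
    "HDInsightOozieLogs": ["Log_oozie_CL"],
}

# Invert the mapping; keys are lowercased so lookups ignore casing.
CLASSIC_TO_NEW = {
    classic.lower(): new
    for new, classics in NEW_TO_CLASSIC.items()
    for classic in classics
}


def new_table_for(classic_table):
    """Return the new table that holds a classic table's data, or None."""
    return CLASSIC_TO_NEW.get(classic_table.lower())


if __name__ == "__main__":
    for name in ("log_auth_CL", "log_kafkacontroller_CL", "not_listed_CL"):
        print(name, "->", new_table_for(name))
```

Because several classic tables fold into one new table, a migrated query usually also needs a filter on the new table's columns; the lookup only identifies where the data now lives.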

Activity log

The linked table lists the operations that can be recorded in the activity log for this service. These operations are a subset of all the possible resource provider operations in the activity log.

For more information on the schema of activity log entries, see Activity Log schema.

Microsoft.HDInsight resource provider operations

Related content

See Monitor HDInsight for a description of monitoring HDInsight.
See Monitor Azure resources with Azure Monitor for details on monitoring Azure resources.
