你当前正在访问 Microsoft Azure Global Edition 技术文档网站。 如果需要访问由世纪互联运营的 Microsoft Azure 中国技术文档网站,请访问 https://docs.azure.cn

分析数据库观察程序监视数据(预览版)

适用于:Azure SQL 数据库Azure SQL 托管实例

除了在 Azure 门户中使用仪表板,或者生成可视化效果来查看和分析 Power BIGrafanaAzure 数据资源管理器Microsoft Fabric 中的实时分析中的 SQL 监视数据,还可以直接查询监视数据存储。

本文包含 KQLT-SQL 查询的示例,可帮助你开始分析收集的监视数据。

使用 KQL 分析监视数据

若要分析收集的监视数据,建议的方法是使用 Kusto 查询语言 (KQL)。 KQL 最适合用于查询遥测、指标和日志。 它为文本搜索和分析、时序运算符和函数、分析和聚合以及许多其他有助于数据分析的语言构造提供广泛的支持。

KQL 在概念上类似于 SQL。 它针对表和列等架构实体进行操作,并支持关系运算,例如 project、restrict、join 和 summarize,对应于 SQL 中的 SELECTJOINWHEREGROUP BY 子句。

若要编写和执行 KQL 查询,可以使用 Kusto ExplorerAzure 数据资源管理器web UI。 Kusto Explorer 是一种功能齐全的 Windows 桌面软件,而 Azure 数据资源管理器 Web UI 可让你在任何平台上的浏览器中执行 KQL 查询并对结果进行可视化。

还可以使用这些工具在 Microsoft Fabric 中的实时分析内查询数据库。 若要连接,请使用实时分析数据库的“查询 URI”添加新连接。 此外,如果使用实时分析,则可以使用 KQL 查询集分析监视数据。 KQL 查询集可以保存为可共享 Fabric 工件,并用于创建 Power BI 报表。

如果你不熟悉 KOL,下列资源可帮你入门:

以下示例可帮助你编写自己的 KQL 查询来查看和分析收集到的 SQL 监视数据。 还可以将这些示例作为构建自己的数据可视化效果和仪表板的起点。

使用 KQL 查询一段时间内的资源消耗

在此示例中,查询返回过去一小时内数据库、弹性池或 SQL 托管实例的主要副本的资源消耗指标(CPU、辅助角色、日志写入吞吐量等)。 除了返回结果集外,它还将其可视化为时间图表。

在此示例和其他示例中,更改 let 语句中的变量以匹配服务器、数据库、弹性池或 SQL 托管实例的名称。 若要使用不同的时间间隔,请更改 duration 变量。 有关详细信息,请参阅时间跨度字面量

let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let duration = 1h;
sqldb_database_resource_utilization
| where sample_time_utc > ago(duration)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| project sample_time_utc,
          avg_cpu_percent,
          avg_instance_cpu_percent,
          avg_data_io_percent,
          avg_log_write_percent,
          max_worker_percent
| sort by sample_time_utc desc
| render timechart;

使用 KQL 查看数据库、弹性池或 SQL 托管实例属性

在此示例中,查询将返回所有数据库、弹性池或 SQL 托管实例的集,在过去一天内至少从这些对象中收集了相应“属性”数据集中的一个样本。 换言之,每行表示一个监视目标,包含其最近观察到的属性。

arg_max() 函数聚合数据,以返回用于标识目标的指定列集的最新行。 例如,对于 Azure SQL 数据库,此集为 logical_server_namedatabase_namereplica_type

let duration = 1d;
sqldb_database_properties
| where sample_time_utc > ago(duration)
| summarize arg_max(sample_time_utc, *) by logical_server_name, database_name, replica_type
| project-rename last_sample_time_utc = sample_time_utc
| sort by tolower(logical_server_name) asc,
          tolower(database_name) asc,
          case(
              replica_type == "Primary", 0,
              replica_type == "Geo-replication forwarder", 1,
              replica_type == "Named secondary", 2,
              replica_type == "HA secondary", 3,
              4) asc;

使用 KQL 查询运行时统计信息

此查询返回 Azure SQL 资产中消耗资源最多的查询。 更改变量从而按任何查询存储指标(包括 CPU 时间、已用时间、执行计数等)对查询进行排名。还可以设置变量以按时间间隔、查询执行类型和查询文本进行筛选。 设置变量以专注于特定的逻辑服务器、弹性池、SQL 托管实例或数据库。

该查询使用“查询运行时统计信息”数据集返回指定的排序靠前查询数,并包括其他每个资源消耗指标的排名。

let topQueriesBy = "cpu_time"; // Set to one of the following metrics to return the top resource consuming queries:
// count_executions, duration, cpu_time, logical_io_reads, logical_io_writes, physical_io_reads, 
// num_physical_io_reads, clr_time, dop, query_max_used_memory, rowcount, log_bytes_used, tempdb_space_used 
let topQueries = 10; // Set the number of top queries to return
let endTime = now();
let startTime = endTime - 1d;
let logicalServerName = @""; // Optionally filter by logical server name
let elasticPoolName = @""; // Optionally filter by elastic pool name, if any databases are in elastic pools
let databaseName = @""; // Optionally filter by database name
let executionType = ""; // Optionally filter by execution type. Use Regular, Aborted, Exception.
let queryHash = ""; // Optionally filter by query hash (example: 0xBAAA461A6C93EA88)
let queryTextFragment = ""; // Optionally filter by a query text fragment
sqldb_database_query_runtime_stats
| where interval_start_time >= startTime and interval_end_time <= endTime
| where isempty(executionType) or execution_type_desc =~ executionType
| where isempty(logicalServerName) or logical_server_name =~ logicalServerName
| where isempty(elasticPoolName) or elastic_pool_name =~ elasticPoolName
| where isempty(databaseName) or database_name =~ databaseName
| summarize dcount_logical_servers = dcount(logical_server_name),
            any_logical_server_name = take_any(logical_server_name),
            dcount_elastic_pools = dcount(strcat(logical_server_name, "|", elastic_pool_name)),
            any_elastic_pool_name = take_any(elastic_pool_name),
            dcount_databases = dcount(strcat(logical_server_name, "|", database_name)),
            any_database_name = take_any(database_name),
            dcount_sql_module_name = dcount(sql_module_name),
            any_sql_module_name = take_any(sql_module_name),
            dcount_context_settings_id = dcount(context_settings_id),
            any_context_settings_id = take_any(context_settings_id),
            query_sql_text = take_any(query_sql_text),
            count_executions = sum(toreal(count_executions)),
            count_successful_executions = sumif(toreal(count_executions), execution_type_desc == "Regular"),
            count_aborted_executions = sumif(toreal(count_executions), execution_type_desc == "Aborted"),
            count_exception_executions = sumif(toreal(count_executions), execution_type_desc == "Exception"),
            duration_us = sum(avg_duration_us * count_executions),
            cpu_time_us = sum(avg_cpu_time_us * count_executions),
            logical_io_reads = sum(avg_logical_io_reads * count_executions),
            logical_io_writes = sum(avg_logical_io_writes * count_executions),
            physical_io_reads = sum(avg_physical_io_reads * count_executions),
            num_physical_io_reads = sum(avg_num_physical_io_reads * count_executions),
            clr_time_us = sum(avg_clr_time_us * count_executions),
            dop = sumif(avg_dop * count_executions, is_parallel_plan),
            query_max_used_memory = sum(avg_query_max_used_memory * count_executions),
            rowcount = sum(avg_rowcount * count_executions),
            log_bytes_used = sum(avg_log_bytes_used * count_executions),
            tempdb_space_used = sum(avg_tempdb_space_used * count_executions)
            by query_hash
| project logical_server_name = iif(dcount_logical_servers == 1, any_logical_server_name, strcat(any_logical_server_name, " (+", tostring(dcount_logical_servers - 1), ")")),
          elastic_pool_name = iif(dcount_elastic_pools == 1, any_elastic_pool_name, strcat(any_elastic_pool_name, " (+", tostring(dcount_elastic_pools - 1), ")")),
          database_name = iif(dcount_databases == 1, any_database_name, strcat(any_database_name, " (+", tostring(dcount_databases - 1), ")")),
          query_sql_text,
          count_executions,
          count_successful_executions,
          count_aborted_executions,
          count_exception_executions,
          duration_us,
          cpu_time_us,
          logical_io_reads,
          logical_io_writes,
          physical_io_reads,
          num_physical_io_reads,
          clr_time_us,
          dop,
          query_max_used_memory_kb = query_max_used_memory * 8,
          rowcount,
          log_bytes_used,
          tempdb_space_used_kb = tempdb_space_used * 8,
          sql_module_name = iif(dcount_sql_module_name == 1, any_sql_module_name, strcat(any_sql_module_name, " (+", tostring(dcount_sql_module_name - 1), ")")),
          context_settings_id = iif(dcount_context_settings_id == 1, tostring(any_context_settings_id), strcat(any_context_settings_id, " (+", tostring(dcount_context_settings_id - 1), ")")),
          query_hash
| sort by count_executions desc | extend count_executions_rank = row_rank_dense(count_executions)
| sort by duration_us desc | extend duration_rank = row_rank_dense(duration_us)
| sort by cpu_time_us desc | extend cpu_time_rank = row_rank_dense(cpu_time_us)
| sort by logical_io_reads desc | extend logical_io_reads_rank = row_rank_dense(logical_io_reads)
| sort by logical_io_writes desc | extend logical_io_writes_rank = row_rank_dense(logical_io_writes)
| sort by physical_io_reads desc | extend physical_io_reads_rank = row_rank_dense(physical_io_reads)
| sort by num_physical_io_reads desc | extend num_physical_io_reads_rank = row_rank_dense(num_physical_io_reads)
| sort by clr_time_us desc | extend clr_time_rank = row_rank_dense(clr_time_us)
| sort by dop desc | extend dop_rank = row_rank_dense(dop)
| sort by query_max_used_memory_kb desc | extend query_max_used_memory_rank = row_rank_dense(query_max_used_memory_kb)
| sort by rowcount desc | extend rowcount_rank = row_rank_dense(rowcount)
| sort by log_bytes_used desc | extend log_bytes_used_rank = row_rank_dense(log_bytes_used)
| sort by tempdb_space_used_kb desc | extend tempdb_space_used_rank = row_rank_dense(tempdb_space_used_kb)
| sort by case(
              topQueriesBy =~ "count_executions", toreal(count_executions),
              topQueriesBy =~ "duration", toreal(duration_us),
              topQueriesBy =~ "cpu_time", toreal(cpu_time_us),
              topQueriesBy =~ "logical_io_reads", toreal(logical_io_reads),
              topQueriesBy =~ "logical_io_writes", toreal(logical_io_writes),
              topQueriesBy =~ "physical_io_reads", toreal(physical_io_reads),
              topQueriesBy =~ "num_physical_io_reads", toreal(num_physical_io_reads),
              topQueriesBy =~ "clr_time", toreal(clr_time_us),
              topQueriesBy =~ "dop", toreal(dop),
              topQueriesBy =~ "query_max_used_memory", toreal(query_max_used_memory_kb),
              topQueriesBy =~ "rowcount", toreal(rowcount),
              topQueriesBy =~ "log_bytes_used", toreal(log_bytes_used),
              topQueriesBy =~ "tempdb_space_used", toreal(tempdb_space_used_kb),
              real(null)
              ) desc,
          count_executions desc
| project-away count_executions
| where isempty(queryHash) or query_hash == queryHash
| where isempty(queryTextFragment) or query_sql_text contains queryTextFragment
| take topQueries;

使用 KQL 分析随时间变化的性能计数器

在此示例中,查询返回一段时间间隔的性能计数器值,该间隔在指定的结束时间前 30 分钟开始。

此示例使用累积性能计数器,例如 Total request countQuery optimizations/sec。 累积表示当 SQL 查询活动发生时,计数器值会不断增加。 此示例中查询计算每个样本中的计数器值与其上一样本中的值之间的差异或增量,以获取自上一个样本以来发生的请求数和优化数,然后将这些指标在时间图表中可视化。

let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let endTime = datetime("2023-12-19 22:10:00");
let startTime = endTime - 30m;
sqldb_database_performance_counters_common
| where sample_time_utc between (startTime .. endTime)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| where cntr_type == 272696576 // restrict to cumulative counters
| where object_name =~ "Workload Group Stats" and counter_name in ("Total request count","Query optimizations/sec")
| project replica_id, sample_time_utc, object_name, counter_name, cntr_value
| sort by replica_id asc, counter_name asc, sample_time_utc asc
| extend delta_cntr_value = iif(cntr_value >= prev(cntr_value) and counter_name == prev(counter_name) and replica_id == prev(replica_id), cntr_value - prev(cntr_value), real(null)),
         delta_sample_time_utc = iif(sample_time_utc >= prev(sample_time_utc), datetime_diff("Millisecond", sample_time_utc, prev(sample_time_utc)), long(null))
| where isnotempty(delta_sample_time_utc)
| extend value = delta_cntr_value / delta_sample_time_utc * 1000
| summarize requests_per_sec = take_anyif(value, counter_name =~ "Total request count"),
            query_optimizations_per_sec = take_anyif(value, counter_name =~ "Query optimizations/sec")
            by sample_time_utc
| sort by sample_time_utc desc
| project sample_time_utc, requests_per_sec, query_optimizations_per_sec
| render timechart;

以下示例适用于报告最近观察到的值(例如 Active memory grants countPending memory grants countProcesses blocked)的时间点性能计数器。 时间间隔为过去 30 分钟。

let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let duration = 30m;
sqldb_database_performance_counters_common
| where sample_time_utc > ago(duration)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| where cntr_type == 65792 // restrict to point-in-time counters
| where (object_name =~ "General Statistics" and counter_name in ("Processes blocked"))
        or
        (object_name =~ "Resource Pool Stats" and counter_name in ("Active memory grants count","Pending memory grants count"))
| project sample_time_utc, counter_name, cntr_value
| render timechart;

以下示例使用“性能计数器(详细)”数据集来绘制 Azure SQL 数据库中用户和内部资源池以及工作负荷组的 CPU 利用率。 有关详细信息,请参阅用户工作负荷和内部进程的资源消耗量

用户工作负荷在 SloSharedPool1UserPool 资源池中运行,而所有其他资源池用于各种系统工作负荷。

同样,用户工作负荷在名称开头为 UserPrimaryGroup.DBId 的工作负荷组中运行,而所有其他工作负荷组都用于各种系统工作负荷。 例如,数据库观察程序监视查询在 SQLExternalMonitoringGroup 工作负荷组中运行。

let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let endTime = datetime("2023-12-19 22:10:00");
let startTime = endTime - 30m;
sqldb_database_performance_counters_detailed
| where sample_time_utc between (startTime .. endTime)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| where cntr_type == 537003264 // restrict to ratio percentage counters
| where object_name =~ "Resource Pool Stats" and counter_name in ("CPU usage %")
| project sample_time_utc, resource_pool = instance_name, cpu_percentage = cntr_value
| render timechart;

let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let endTime = datetime("2023-12-19 22:10:00");
let startTime = endTime - 30m;
sqldb_database_performance_counters_detailed
| where sample_time_utc between (startTime .. endTime)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| where cntr_type == 537003264 // restrict to ratio percentage counters
| where object_name =~ "Workload Group Stats" and counter_name in ("CPU usage %")
| project sample_time_utc, workload_group = instance_name, cpu_percentage = cntr_value
| render timechart;

使用 KQL 分析随时间变化的累积等待

此示例演示如何绘制一段时间间隔内排名靠前的 SQL 等待类型。 该查询计算每个等待类型的累积等待时间,以(毫秒/秒)已用时间为单位。 可以调整查询变量来设置间隔的开始和结束时间、要包括的排名靠前等待类型数,以及图表上数据点之间的步长。

查询使用两种方法来提高性能:

  • partition KQL 运算符与 shuffle 策略结合使用,以将查询处理分散到多个群集节点(如果存在)。
  • 使用 materialize() 函数保存一个中间结果集,该结果集会复用于计算排名靠前等待并可生成要绘制的时序。
let logicalServer = @"your-server-name";
let databaseName = @"your-database-name";
let replicaType = "Primary";
let endTime = datetime("2023-12-19 22:10:00");
let startTime = endTime - 30m;
let top_wait_types = 10;
let chart_step = 30s;
let wait_type_sample = materialize (
sqldb_database_wait_stats
| where sample_time_utc between (startTime .. endTime)
| where logical_server_name =~ logicalServer
| where database_name =~ databaseName
| where replica_type =~ replicaType
| project replica_id, sample_time_utc, wait_type, wait_time_ms
| partition hint.strategy=shuffle by wait_type
(
sort by replica_id asc, sample_time_utc asc
| extend delta_wait_time_ms = iif(wait_time_ms >= prev(wait_time_ms) and replica_id == prev(replica_id), wait_time_ms - prev(wait_time_ms), long(null)),
         delta_sample_time_utc = iif(sample_time_utc >= prev(sample_time_utc), datetime_diff("Millisecond", sample_time_utc, prev(sample_time_utc)), long(null))
| where isnotempty(delta_sample_time_utc)
| extend wait_ms_per_s = toreal(delta_wait_time_ms) / delta_sample_time_utc * 1000
| project sample_time_utc, wait_type, wait_ms_per_s
)
);
let top_wait = (
wait_type_sample
| summarize total_wait_ms_per_s = sum(wait_ms_per_s) by wait_type
| top top_wait_types by total_wait_ms_per_s desc
| project-away total_wait_ms_per_s
);
wait_type_sample
| join kind=inner top_wait on wait_type
| project-away wait_type1
| make-series wait_ms_per_s = avgif(wait_ms_per_s, isfinite(wait_ms_per_s)) default = long(null) on sample_time_utc from startTime to endTime step chart_step by wait_type
| project wait_type, sample_time_utc, wait_ms_per_s
| render timechart;

用 T-SQL 分析监视数据

如果已熟悉 T-SQL,可以立即开始查询和分析 SQL 监视数据,而无需学习 KQL。 但是,KQL 是用于查询 Azure 数据资源管理器或实时分析中数据的建议语言,因为它为查询遥测数据提供了无与伦比的支持。

可以从 SQL Server Management Studio (SSMS)Azure Data Studio 和其他常用工具连接到 Azure 数据资源管理器或实时分析数据库。 可以查询 Azure 数据资源管理器或 KQL 数据库,如同 SQL Server 或 Azure SQL 数据库一样。 有关详细信息,请参阅使用 SQL Server 仿真在 Azure 数据资源管理器中查询数据

注意

Azure 数据资源管理器和实时分析中并不支持所有的 T-SQL 构造。 有关详细信息,请参阅使用 T-SQL 查询数据

如果发现 T-SQL 支持不足以满足需求,或者要将 T-SQL 查询转换为 KQL 以使用其高级分析功能,则 SQL 到 Kusto 查询语言速查表有助于将 T-SQL 查询转换为 KQL。

以下示例演示如何使用 T-SQL 查询数据库观察程序数据存储中的监视数据。

使用 T-SQL 分析一段时间内的资源消耗

在此示例中,查询返回过去一小时内数据库、弹性池或 SQL 托管实例的主要副本的资源消耗指标(CPU、辅助角色、日志写入吞吐量等)。

在此和其他示例中,更改 DECLARE 语句中的变量以匹配服务器、数据库、弹性池或 SQL 托管实例的名称。

DECLARE @LogicalServerName sysname = 'your-server-name',
        @DatabaseName sysname = 'your-database-name',
        @ReplicaType sysname = 'Primary',
        @DurationMinutes int = 60;

SELECT sample_time_utc,
       avg_cpu_percent,
       avg_instance_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       max_worker_percent
FROM sqldb_database_resource_utilization
WHERE sample_time_utc > DATEADD(minute, -@DurationMinutes, SYSUTCDATETIME())
      AND
      logical_server_name = @LogicalServerName
      AND
      database_name = @DatabaseName
      AND
      replica_type = @ReplicaType
ORDER BY sample_time_utc DESC;

使用 T-SQL 查看数据库、弹性池或 SQL 托管实例属性

在此示例中,查询将返回所有数据库、弹性池或 SQL 托管实例的集,在过去 24 小时内至少从这些对象中收集了相应“属性”数据集中的一个样本。 换言之,每行表示一个监视目标,包含其最近观察到的属性。

DECLARE @DurationHours int = 24;

SELECT p.sample_time_utc,
       p.logical_server_name,
       p.database_name,
       p.replica_type,
       p.database_id,
       p.elastic_pool_name,
       p.service_tier,
       p.service_level_objective,
       p.logical_cpu_count,
       p.database_engine_memory_mb,
       p.compatibility_level,
       p.updateability,
       p.database_engine_build_time,
       p.database_engine_start_time_utc
FROM sqldb_database_properties AS p
INNER JOIN (
           SELECT logical_server_name,
                  database_name,
                  replica_type,
                  MAX(sample_time_utc) AS last_sample_time_utc
           FROM sqldb_database_properties
           WHERE sample_time_utc > DATEADD(hour, -@DurationHours, SYSUTCDATETIME())
           GROUP BY logical_server_name,
                    database_name,
                    replica_type
           ) AS ls
ON p.logical_server_name = ls.logical_server_name
   AND
   p.database_name = ls.database_name
   AND
   p.replica_type = ls.replica_type
   AND
   p.sample_time_utc = ls.last_sample_time_utc
WHERE p.sample_time_utc > DATEADD(hour, -@DurationHours, SYSUTCDATETIME())
ORDER BY LOWER(logical_server_name) ASC,
         LOWER(database_name) ASC,
         CASE replica_type
              WHEN 'Primary' THEN 0
              WHEN 'Geo-replication forwarder' THEN 1
              WHEN 'Named secondary' THEN 2
              WHEN 'HA secondary' THEN 3
         END ASC;

使用 T-SQL 查询运行时统计信息

此查询返回 Azure SQL 资产中消耗资源最多的查询。 更改 @TopQueriesBy 变量以按任何查询存储指标查找排名靠前的查询,包括 CPU 时间、已用时间、执行计数等。还可以设置变量,以便按特定查询的时间间隔、查询执行类型和查询哈希进行筛选,或者专注于特定逻辑服务器、弹性池或 SQL 托管实例中的数据库。

查询使用“查询运行时统计信息”数据集返回指定的排名靠前的查询。 它还按其他每个资源消耗指标返回其排名。

DECLARE @EndTime datetime2 = SYSUTCDATETIME(),
        @StartTime datetime2 = DATEADD(hour, -24, SYSUTCDATETIME()),
        /* 
        Set the next variable to one of the following metrics to return the top resource consuming queries:
        executions, cpu_time, duration, logical_io_reads, physical_io_reads, num_physical_io_reads, 
        clr_time, query_max_used_memory, log_bytes_used, tempdb_space_used, row_count, dop
        */
        @TopQueriesBy varchar(30) = 'cpu_time',
        @TopQueries int = 10,
        @LogicalServerName sysname = '', -- Optionally filter by logical server name
        @ElasticPoolName sysname = '', -- Optionally filter by elastic pool name, if any databases are in elastic pools
        @DatabaseName sysname = '', -- Optionally filter by database name
        @ExecutionType varchar(30) = '', -- Optionally filter by execution type. Use Regular, Aborted, Exception.
        @QueryHash varchar(18) = ''; -- Optionally filter by query hash (example: 0xBAAA461A6C93EA88)

SELECT TOP (@TopQueries) 
       CONCAT(logical_server_name, IIF(count_logical_servers > 1, CONCAT(' (+', CAST(count_logical_servers - 1 AS varchar(11)), ')'), '')) AS logical_server_name,
       CONCAT(database_name, IIF(count_databases > 1, CONCAT(' (+', CAST(count_databases - 1 AS varchar(11)), ')'), '')) AS database_name,
       query_sql_text,
       CONCAT(CAST(query_id AS varchar(11)), IIF(count_queries > 1, CONCAT(' (+', CAST(count_queries - 1 AS varchar(11)), ')'), '')) AS query_id,
       CONCAT(CAST(plan_id AS varchar(11)), IIF(count_plans > 1, CONCAT(' (+', CAST(count_plans - 1 AS varchar(11)), ')'), '')) AS plan_id,
       regular_executions,
       aborted_executions,
       exception_executions,
       cpu_time_us,
       duration_us,
       logical_io_reads,
       physical_io_reads,
       num_physical_io_reads,
       clr_time_us,
       query_max_used_memory_kb,
       log_bytes_used,
       tempdb_space_used_kb,
       row_count,
       dop,
       query_hash,
       executions_rank,
       cpu_time_rank,
       duration_rank,
       logical_io_reads_rank,
       physical_io_reads_rank,
       num_physical_io_reads_rank,
       clr_time_rank,
       query_max_used_memory_rank,
       log_bytes_used_rank,
       tempdb_space_used_rank,
       row_count_rank,
       dop_rank
FROM (
     SELECT *,
            DENSE_RANK() OVER (ORDER BY executions DESC) AS executions_rank,
            DENSE_RANK() OVER (ORDER BY cpu_time_us DESC) AS cpu_time_rank,
            DENSE_RANK() OVER (ORDER BY duration_us DESC) AS duration_rank,
            DENSE_RANK() OVER (ORDER BY logical_io_reads DESC) AS logical_io_reads_rank,
            DENSE_RANK() OVER (ORDER BY physical_io_reads DESC) AS physical_io_reads_rank,
            DENSE_RANK() OVER (ORDER BY num_physical_io_reads DESC) AS num_physical_io_reads_rank,
            DENSE_RANK() OVER (ORDER BY clr_time_us DESC) AS clr_time_rank,
            DENSE_RANK() OVER (ORDER BY query_max_used_memory_kb DESC) AS query_max_used_memory_rank,
            DENSE_RANK() OVER (ORDER BY log_bytes_used DESC) AS log_bytes_used_rank,
            DENSE_RANK() OVER (ORDER BY tempdb_space_used_kb DESC) AS tempdb_space_used_rank,
            DENSE_RANK() OVER (ORDER BY row_count DESC) AS row_count_rank,
            DENSE_RANK() OVER (ORDER BY dop DESC) AS dop_rank
     FROM (
          SELECT query_hash,
                 COUNT(DISTINCT(logical_server_name)) AS count_logical_servers,
                 MAX(logical_server_name) AS logical_server_name,
                 COUNT(DISTINCT(database_name)) AS count_databases,
                 MAX(database_name) AS database_name,
                 COUNT(DISTINCT(query_id)) AS count_queries,
                 MAX(query_id) AS query_id,
                 COUNT(DISTINCT(plan_id)) AS count_plans,
                 MAX(plan_id) AS plan_id,
                 MAX(query_sql_text) AS query_sql_text,
                 SUM(IIF(execution_type_desc = 'Regular', count_executions, 0)) AS regular_executions,
                 SUM(IIF(execution_type_desc = 'Aborted', count_executions, 0)) AS aborted_executions,
                 SUM(IIF(execution_type_desc = 'Exception', count_executions, 0)) AS exception_executions,
                 SUM(count_executions) AS executions,
                 SUM(avg_cpu_time_us * count_executions) AS cpu_time_us,
                 SUM(avg_duration_us * count_executions) AS duration_us,
                 SUM(avg_logical_io_reads * count_executions) AS logical_io_reads,
                 SUM(avg_physical_io_reads * count_executions) AS physical_io_reads,
                 SUM(avg_num_physical_io_reads * count_executions) AS num_physical_io_reads,
                 SUM(avg_clr_time_us * count_executions) AS clr_time_us,
                 SUM(avg_query_max_used_memory * count_executions) * 8 AS query_max_used_memory_kb,
                 SUM(avg_log_bytes_used * count_executions) AS log_bytes_used,
                 SUM(avg_tempdb_space_used * count_executions) * 8 AS tempdb_space_used_kb,
                 SUM(avg_rowcount * count_executions) AS row_count,
                 SUM(IIF(is_parallel_plan = 1, avg_dop * count_executions, NULL)) AS dop
          FROM sqldb_database_query_runtime_stats
          WHERE interval_start_time >= @StartTime AND interval_end_time <= @EndTime
                AND
                (@ExecutionType = '' OR LOWER(execution_type_desc) = LOWER(@ExecutionType))
                AND
                (@LogicalServerName = '' OR LOWER(logical_server_name) = LOWER(@LogicalServerName))
                AND
                (@ElasticPoolName = '' OR LOWER(elastic_pool_name) = LOWER(@ElasticPoolName))
                AND
                (@DatabaseName = '' OR LOWER(database_name) = LOWER(@DatabaseName))
          GROUP BY query_hash
          ) AS rsa
     ) AS rsar
WHERE @QueryHash = '' OR LOWER(query_hash) = LOWER(@QueryHash)
ORDER BY CASE @TopQueriesBy
              WHEN 'executions' THEN executions_rank
              WHEN 'cpu_time' THEN cpu_time_rank
              WHEN 'duration' THEN duration_rank
              WHEN 'logical_io_reads' THEN logical_io_reads_rank
              WHEN 'physical_io_reads' THEN physical_io_reads_rank
              WHEN 'num_physical_io_reads' THEN num_physical_io_reads_rank
              WHEN 'clr_time' THEN clr_time_rank
              WHEN 'query_max_used_memory' THEN query_max_used_memory_rank
              WHEN 'log_bytes_used' THEN log_bytes_used_rank
              WHEN 'tempdb_space_used' THEN tempdb_space_used_rank
              WHEN 'row_count' THEN row_count_rank
              WHEN 'dop' THEN dop_rank
         END ASC;

使用 T-SQL 分析一段时间内的性能计数器

在此示例中,查询会返回过去 30 分钟的性能计数器值。

此示例使用累积性能计数器,例如 Total request countQuery optimizations/sec。 累积表示计数器值在查询活动发生时会不断增加。 查询使用 LAG() 分析函数计算每个样本中的计数器值与其上一样本中的值之间的差异或增量,以获取自上一个样本以来发生的请求数和优化数。

DECLARE @LogicalServerName sysname = 'your-server-name',
        @DatabaseName sysname = 'your-database-name',
        @ReplicaType sysname = 'Primary',
        @DurationMinutes int = 30;

SELECT sample_time_utc,
       SUM(IIF(
              counter_name = 'Total request count',
              CAST((cntr_value - prev_cntr_value) AS decimal) / DATEDIFF(millisecond, prev_sample_time_utc, sample_time_utc) * 1000,
              NULL
              )) AS requests_per_second,
       SUM(IIF(
              counter_name = 'Query optimizations/sec',
              CAST((cntr_value - prev_cntr_value) AS decimal) / DATEDIFF(millisecond, prev_sample_time_utc, sample_time_utc) * 1000,
              NULL
              )) AS query_optimizations_per_second
FROM (
     SELECT sample_time_utc,
            LAG(sample_time_utc) OVER (PARTITION BY replica_id, object_name, counter_name ORDER BY sample_time_utc ASC) AS prev_sample_time_utc,
            counter_name,
            cntr_value,
            LAG(cntr_value) OVER (PARTITION BY replica_id, object_name, counter_name ORDER BY sample_time_utc ASC) AS prev_cntr_value
     FROM sqldb_database_performance_counters_common
     WHERE sample_time_utc > DATEADD(minute, -@DurationMinutes, SYSUTCDATETIME())
           AND
           logical_server_name = @LogicalServerName
           AND
           database_name = @DatabaseName
           AND
           replica_type = @ReplicaType
           AND
           cntr_type = 272696576 /* restrict to cumulative counters */
           AND
           object_name = 'Workload Group Stats'
           AND
           counter_name IN ('Total request count','Query optimizations/sec')
     ) AS pc
WHERE cntr_value >= prev_cntr_value
      AND
      sample_time_utc >= prev_sample_time_utc
GROUP BY sample_time_utc
ORDER BY sample_time_utc DESC;

使用 T-SQL 分析时间点性能计数器

下一个示例适用于报告最近观察到的值(例如 Active memory grants countPending memory grants countProcesses blocked)的时间点性能计数器。

DECLARE @LogicalServerName sysname = 'your-server-name',
        @DatabaseName sysname = 'your-database-name',
        @ReplicaType sysname = 'Primary',
        @DurationMinutes int = 30;

SELECT sample_time_utc,
       SUM(IIF(
              counter_name = 'Processes blocked',
              cntr_value,
              NULL
              )) AS processes_blocked,
       SUM(IIF(
              counter_name = 'Active memory grants count',
              cntr_value,
              NULL
              )) AS active_memory_grants,
       SUM(IIF(
              counter_name = 'Pending memory grants count',
              cntr_value,
              NULL
              )) AS pending_memory_grants
FROM (
     SELECT sample_time_utc,
            counter_name,
            cntr_value
     FROM sqldb_database_performance_counters_common
     WHERE sample_time_utc > DATEADD(minute, -@DurationMinutes, SYSUTCDATETIME())
         AND
         logical_server_name = @LogicalServerName
         AND
         database_name = @DatabaseName
         AND
         replica_type = @ReplicaType
         AND
         cntr_type = 65792 /* restrict to point-in-time counters */
         AND
         (
         (object_name = 'General Statistics' AND counter_name IN ('Processes blocked'))
         OR
         (object_name = 'Resource Pool Stats' AND counter_name IN ('Active memory grants count','Pending memory grants count'))
         )
     ) AS pc
GROUP BY sample_time_utc
ORDER BY sample_time_utc DESC;

使用 T-SQL 分析随时间变化的累积等待

在此示例中,查询按 30 分钟间隔内的平均累积等待时间返回前 10 个等待类型。 累积是指查询计算每秒所有请求在每个等待类型下等待的总时间(以毫秒为单位)。 由于多个请求可以并发执行(并等待),因此每秒的累积等待时间会超过一秒。

DECLARE @LogicalServerName sysname = 'your-server-name',
        @DatabaseName sysname = 'your-database-name',
        @ReplicaType sysname = 'Primary',
        @DurationMinutes int = 30;

SELECT TOP (10) wait_type,
                SUM(CAST((wait_time_ms - prev_wait_time_ms) AS decimal)) * 1000
                /
                SUM(DATEDIFF(millisecond, prev_sample_time_utc, sample_time_utc))
                AS wait_time_ms_per_sec
FROM (
     SELECT sample_time_utc,
            LAG(sample_time_utc) OVER (PARTITION BY replica_id, wait_type ORDER BY sample_time_utc ASC) AS prev_sample_time_utc,
            wait_type,
            wait_time_ms,
            LAG(wait_time_ms) OVER (PARTITION BY replica_id, wait_type ORDER BY sample_time_utc ASC) AS prev_wait_time_ms
     FROM sqldb_database_wait_stats
     WHERE sample_time_utc > DATEADD(minute, -@DurationMinutes, SYSUTCDATETIME())
         AND
         logical_server_name = @LogicalServerName
         AND
         database_name = @DatabaseName
         AND
         replica_type = @ReplicaType
     ) AS w
WHERE sample_time_utc >= prev_sample_time_utc
      AND
      wait_time_ms >= prev_wait_time_ms
GROUP BY wait_type
ORDER BY wait_time_ms_per_sec DESC;