您现在访问的是微软AZURE全球版技术文档网站,若需要访问由世纪互联运营的MICROSOFT AZURE中国区技术文档网站,请访问 https://docs.azure.cn.

查询限制Query limits

Kusto 是一个即席查询引擎,它承载着大型数据集,并尝试通过在内存中保留所有相关数据来满足查询。Kusto is an ad-hoc query engine that hosts large data sets and attempts to satisfy queries by holding all relevant data in-memory. 存在一个固有的风险,即查询会无限地独占服务资源。There's an inherent risk that queries will monopolize the service resources without bounds. Kusto 以默认查询限制的形式提供若干内置保护。Kusto provides several built-in protections in the form of default query limits. 如果要考虑消除这些限制,请首先确定这样做实际上是否会带来任何价值。If you're considering removing these limits, first determine whether you actually gain any value by doing so.

对请求并发的限制Limit on request concurrency

“请求并发”是指群集对同时运行的若干请求施加的限制。Request concurrency is a limit that a cluster imposes on several requests running at the same time.

  • 此限制的默认值取决于运行群集的 SKU,并按以下方式计算:Cores-Per-Node x 10The default value of the limit depends on the SKU the cluster is running on, and is calculated as: Cores-Per-Node x 10.
    • 例如,对于在 D14v2 SKU 上设置的群集(其中每台计算机都有 16 Vcore),默认限制为 16 cores x10 = 160For example, for a cluster that's set-up on D14v2 SKU, where each machine has 16 vCores, the default limit is 16 cores x10 = 160.
  • 可通过配置 default 工作负荷组的请求速率限制策略来更改默认值。The default value can be changed by configuring the request rate limit policy of the default workload group.
    • 可以在群集上并发运行的实际请求数取决于不同的因素。The actual number of requests that can run concurrently on a cluster depends on various factors. 最主要的因素是群集 SKU、群集的可用资源和使用模式。The most dominant factors are cluster SKU, cluster's available resources, and usage patterns. 可以根据对类生产使用模式执行的负载测试来配置策略。The policy can be configured based on load tests performed on production-like usage patterns.

超过请求并发限制会导致以下行为:Exceeding the request concurrency limit will result in the following behavior:

  • 由于请求速率限制策略而被拒绝的命令将引发 ControlCommandThrottledException(错误代码 = 429)。Commands that are denied because of the request rate limit policy will give a ControlCommandThrottledException (error code = 429).
  • 由于请求速率限制策略而被拒绝的查询将引发 QueryThrottledException(错误代码 = 429)。Queries that are denied because of the request rate limit policy will give a QueryThrottledException (error code = 429).

有关结果集大小的限制(结果截断)Limit on result set size (result truncation)

结果截断 是针对查询返回的结果集默认设置的限制。Result truncation is a limit set by default on the result set returned by the query. Kusto 将返回给客户端的记录数限制为 500,000,将这些记录的总数据大小限制为 64 MB。Kusto limits the number of records returned to the client to 500,000, and the overall data size for those records to 64 MB. 当超出其中任一限制时,查询将失败并显示“部分查询失败”。When either of these limits is exceeded, the query fails with a "partial query failure". 超出总数据大小将生成包含以下消息的异常:Exceeding overall data size will generate an exception with the message:

The Kusto DataEngine has failed to execute a query: 'Query result set has exceeded the internal data size limit 67108864 (E_QUERY_RESULT_SET_TOO_LARGE).'

超出记录数则会失败,并会显示以下异常:Exceeding the number of records will fail with an exception that says:

The Kusto DataEngine has failed to execute a query: 'Query result set has exceeded the internal record count limit 500000 (E_QUERY_RESULT_SET_TOO_LARGE).'

有多个策略可用于处理此错误。There are several strategies for dealing with this error.

  • 将查询修改为仅返回令人感兴趣的数据,从而减小结果集的大小。Reduce the result set size by modifying the query to only return interesting data. 当初始的失败查询范围太广时,此策略非常有用。This strategy is useful when the initial failing query is too "wide". 例如,查询未抛弃不需要的数据列。For example, the query doesn't project away data columns that aren't needed.
  • 通过将查询后处理(例如聚合)切换到查询本身中来减小结果集的大小。Reduce the result set size by shifting post-query processing, such as aggregations, into the query itself. 如果要将查询输出馈送到另一个处理系统,再由该系统执行其他聚合,此策略很有用。The strategy is useful in scenarios where the output of the query is fed to another processing system, and that then does other aggregations.
  • 要从服务中导出大量数据时,请从查询切换到使用数据导出Switch from queries to using data export when you want to export large sets of data from the service.
  • 使用下面列出的 set 语句或客户端请求属性中的标志来指示服务取消此查询限制。Instruct the service to suppress this query limit using set statements listed below or flags in client request properties.

用于减小查询生成的结果集大小的方法包括:Methods for reducing the result set size produced by the query include:

你可以通过使用 notruncation 请求选项来禁用结果截断。You can disable result truncation by using the notruncation request option. 建议你仍然实施某种形式的限制。We recommend that some form of limitation is still put in place.

例如:For example:

set notruncation;
MyTable | take 1000000

还可以通过设置 truncationmaxsize(以字节为单位的最大数据大小,默认为 64 MB)和 truncationmaxrecords(最大记录数,默认为 500,000)的值,对结果截断进行更精确的控制。It's also possible to have more refined control over result truncation by setting the value of truncationmaxsize (maximum data size in bytes, defaults to 64 MB) and truncationmaxrecords (maximum number of records, defaults to 500,000). 例如,以下查询设置在达到 1,105 条记录或 1 MB 时截断超出限制的结果。For example, the following query sets result truncation to happen at either 1,105 records or 1 MB, whichever is exceeded.

set truncationmaxsize=1048576;
set truncationmaxrecords=1105;
MyTable | where User=="UserId1"

去除结果截断限制意味着你打算将大量数据移出 Kusto。Removing the result truncation limit means that you intend to move bulk data out of Kusto.

你可以去除结果截断限制,以方便使用 .export 命令进行导出,或方便以后进行聚合。You can remove the result truncation limit either for export purposes by using the .export command or for later aggregation. 如果选择稍后进行聚合,请考虑使用 Kusto 进行聚合。If you choose later aggregation, consider aggregating by using Kusto.

Kusto 提供的许多客户端库能够以将“无限大的”结果流式传输给调用方的方式来处理这些结果。Kusto provides a number of client libraries that can handle "infinitely large" results by streaming them to the caller. 请使用这些库中的一个,并将其配置为流式传输模式。Use one of these libraries, and configure it to streaming mode. 例如,使用 .NET Framework 客户端 (Microsoft.Azure.Kusto.Data),并将连接字符串的 streaming 属性设置为 true,或使用始终会流式传输结果的 ExecuteQueryV2Async() 调用。For example, use the .NET Framework client (Microsoft.Azure.Kusto.Data) and either set the streaming property of the connection string to true, or use the ExecuteQueryV2Async() call that always streams results.

默认情况下,结果截断不仅仅应用于返回给客户端的结果流。Result truncation is applied by default, not just to the result stream returned to the client. 默认情况下,它还应用于跨群集查询中一个群集向另一个群集发出的任何子查询,并具有类似的效果。It's also applied by default to any subquery that one cluster issues to another cluster in a cross-cluster query, with similar effects.

设置多个结果截断属性Setting multiple result truncation properties

使用 set 语句时,并且/或者当在客户端请求属性中指定标志时,以下情况适用。The following apply when using set statements, and/or when specifying flags in client request properties.

  • 如果设置了 notruncation,并且还设置了 truncationmaxsizetruncationmaxrecordsquery_take_max_records,则会忽略 notruncationIf notruncation is set, and any of truncationmaxsize, truncationmaxrecords, or query_take_max_records are also set - notruncation is ignored.
  • 如果多次设置了 truncationmaxsizetruncationmaxrecords 和/或 query_take_max_records,则采用每个属性的下限值。If truncationmaxsize, truncationmaxrecords and/or query_take_max_records are set multiple times - the lower value for each property applies.

每个迭代器的内存限制Limit on memory per iterator

每个结果集迭代器的最大内存 是 Kusto 用于防范“失控”查询的另一个限制。Max memory per result set iterator is another limit used by Kusto to protect against "runaway" queries. 此限制由请求选项 maxmemoryconsumptionperiterator 表示,用于设置单个查询计划结果集迭代器可以容纳的内存量的上限。This limit, represented by the request option maxmemoryconsumptionperiterator, sets an upper bound on the amount of memory that a single query plan result set iterator can hold. 此限制适用于本来就不能进行流式传输的特定迭代器,例如 join。下面是发生这种情况时将返回的一些错误消息:This limit applies to the specific iterators that aren't streaming by nature, such as join.) Here are a few error messages that will return when this situation happens:

The ClusterBy operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete E_RUNAWAY_QUERY.

The DemultiplexedResultSetCache operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The ExecuteAndCache operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The HashJoin operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The Sort operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The Summarize operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The TopNestedAggregator operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

The TopNested operator has exceeded the memory budget during evaluation. Results may be incorrect or incomplete (E_RUNAWAY_QUERY).

默认情况下,此值设置为 5 GB。By default, this value is set to 5 GB. 最多可将此值增加到计算机的物理内存的一半:You may increase this value by up to half the physical memory of the machine:

set maxmemoryconsumptionperiterator=68719476736;
MyTable | ...

如果查询使用 summarizejoinmake-series 运算符,可以使用随机执行查询策略来减轻单台计算机的内存压力。If the query uses summarize, join, or make-series operators, you can use the shuffle query strategy to reduce memory pressure on a single machine.

在其他情况下,可以对数据集采样,以避免超出此限制。In other cases, you can sample the data set to avoid exceeding this limit. 下面的两个查询展示了如何进行采样。The two queries below show how to do the sampling. 第一个查询是使用随机数生成器的统计性采样。The first query is a statistical sampling, using a random number generator. 第二个查询是确定性采样,通过对数据集中的某些列(通常是某个 ID)进行哈希处理来完成。The second query is deterministic sampling, done by hashing some column from the data set, usually some ID.

T | where rand() < 0.1 | ...

T | where hash(UserId, 10) == 1 | ...

如果多次设置了 maxmemoryconsumptionperiterator(例如在两个客户端请求属性中使用 set 语句设置),将应用较小的值。If maxmemoryconsumptionperiterator is set multiple times, for example in both client request properties and using a set statement, the lower value applies.

每个节点的内存限制Limit on memory per node

每个节点每个查询的最大内存 是用于防范“失控”查询的另一个限制。Max memory per query per node is another limit used to protect against "runaway" queries. 此限制通过请求选项 max_memory_consumption_per_query_per_node 表示,用于设置可以在单个节点上用于特定查询的内存量的上限。This limit, represented by the request option max_memory_consumption_per_query_per_node, sets an upper bound on the amount of memory that can be used on a single node for a specific query.

set max_memory_consumption_per_query_per_node=68719476736;
MyTable | ...

如果多次设置了 max_memory_consumption_per_query_per_node(例如在两个客户端请求属性中使用 set 语句设置),将应用较小的值。If max_memory_consumption_per_query_per_node is set multiple times, for example in both client request properties and using a set statement, the lower value applies.

如果查询使用 summarizejoinmake-series 运算符,可以使用随机执行查询策略来减轻单台计算机的内存压力。If the query uses summarize, join, or make-series operators, you can use the shuffle query strategy to reduce memory pressure on a single machine.

限制执行超时Limit execution timeout

服务器超时 是应用于所有请求的服务端超时。Server timeout is a service-side timeout that is applied to all requests. Kusto 中的多个点针对运行中的请求(查询和控制命令)强制实施了超时限制:Timeout on running requests (queries and control commands) is enforced at multiple points in the Kusto:

  • 客户端库(如果使用)client library (if used)
  • 接受请求的服务终结点service endpoint that accepts the request
  • 处理请求的服务引擎service engine that processes the request

默认情况下,查询超时设置为 4 分钟,控制命令超时设置为 10 分钟。By default, timeout is set to four minutes for queries, and 10 minutes for control commands. 如果需要,可以增大该值(上限为 1 小时)。This value can be increased if needed (capped at one hour).

  • 如果你使用 Kusto.Explorer 进行查询,请使用“工具”>“选项* ”>“连接”>“查询服务器超时”。If you query using Kusto.Explorer, use Tools > Options* > Connections > Query Server Timeout.
  • 以编程方式将 servertimeout 客户端请求属性设置为 System.TimeSpan 类型的值,最多为 1 小时。Programmatically, set the servertimeout client request property, a value of type System.TimeSpan, up to an hour.

有关超时的说明Notes about timeouts

  • 在客户端,超时将从创建请求时开始算起,直到响应开始到达客户端为止。On the client side, the timeout is applied from the request being created until the time that the response starts arriving to the client. 在客户端上读回有效负载所花费的时间不计入超时。The time it takes to read the payload back at the client isn't treated as part of the timeout. 这取决于调用方从流中拉取数据的速度。It depends on how quickly the caller pulls the data from the stream.
  • 此外,在客户端使用的实际超时值略高于用户所请求的服务器超时值。Also on the client side, the actual timeout value used is slightly higher than the server timeout value requested by the user. 此差异考虑了网络延迟。This difference, is to allow for network latencies.
  • 若要自动使用允许的最大请求超时,请将客户端请求属性 norequesttimeout 设置为 trueTo automatically use the maximum allowed request timeout, set the client request property norequesttimeout to true.

对查询 CPU 资源使用量的限制Limit on query CPU resource usage

Kusto 允许你在运行查询时使用群集拥有的全部 CPU 资源。Kusto lets you run queries and use as much CPU resources as the cluster has. 如果有多个查询正在运行,它会尝试在查询之间进行公平的轮循。It attempts to do a fair round-robin between queries if more than one is running. 对于即席查询,此方法可以实现最佳性能。This method yields the best performance for ad-hoc queries. 在其他时候,你可能希望限制用于特定查询的 CPU 资源。At other times, you may want to limit the CPU resources used for a particular query. 例如,如果你运行“后台作业”,系统可能会容忍更高的延迟,以便向并发即席查询授予高优先级。If you run a "background job", for example, the system might tolerate higher latencies to give concurrent ad-hoc queries high priority.

Kusto 支持在运行查询时指定两个客户端请求属性Kusto supports specifying two client request properties when running a query. 属性为 query_fanout_threads_percent 和 query_fanout_nodes_percent 。The properties are query_fanout_threads_percent and query_fanout_nodes_percent. 这两个属性都是默认值为最大值 (100) 的整数,但对于特定查询,可以将其降低为某个其他值。Both properties are integers that default to the maximum value (100), but may be reduced for a specific query to some other value.

第一个属性 query_fanout_threads_percent 控制线程使用的扇出因子。The first, query_fanout_threads_percent, controls the fanout factor for thread use. 如果将此属性设置为 100%,则群集将分配每个节点上的所有 CPU。When this property is set 100%, the cluster will assign all CPUs on each node. 例如,在 Azure D14 节点上部署的群集上的 16 个 CPU。For example, 16 CPUs on a cluster deployed on Azure D14 nodes. 如果将此属性设置为 50%,则将使用一半的 CPU,依此类推。When this property is set to 50%, then half of the CPUs will be used, and so on. 数值向上舍入到整数个 CPU,因此将属性值设置为 0 是安全的。The numbers are rounded up to a whole CPU, so it's safe to set the property value to 0.

第二个属性 query_fanout_nodes_percent 控制每个子查询分发操作将使用群集中的多少个查询节点。The second, query_fanout_nodes_percent, controls how many of the query nodes in the cluster to use per subquery distribution operation. 它以类似的方式工作。It functions in a similar manner.

如果多次设置 query_fanout_nodes_percentquery_fanout_threads_percent(例如在两个客户端请求属性中使用 set 语句设置),将应用每个属性的较小值。If query_fanout_nodes_percent or query_fanout_threads_percent are set multiple times, for example, in both client request properties and using a set statement - the lower value for each property applies.

有关查询复杂性的限制Limit on query complexity

在查询执行期间,查询文本将转换为表示查询的关系运算符的树。During query execution, the query text is transformed into a tree of relational operators representing the query. 如果树深度超出了数千层的内部阈值,则查询会被视为太复杂,无法处理,因此会失败并显示错误代码。If the tree depth exceeds an internal threshold of several thousand levels, the query is considered too complex for processing, and will fail with an error code. 此失败表示关系运算符树超出了其限制。The failure indicates that the relational operators tree exceeds its limits. 超出了限制是因为查询中有很长的二元运算符列表链接在一起。Limits are exceeded because of queries with long lists of binary operators that are chained together. 例如:For example:

T 
| where Column == "value1" or 
        Column == "value2" or 
        .... or
        Column == "valueN"

对于此特定案例,请使用 in() 运算符来重新编写查询。For this specific case, rewrite the query using the in() operator.

T 
| where Column in ("value1", "value2".... "valueN")