Query results cache

Kusto includes a query results cache. You can choose to get cached results when issuing a query. You'll experience better query performance and lower resource consumption if your query's results can be returned by the cache. However, this performance comes at the expense of some "staleness" in the results.

Use the cache

Set the query_results_cache_max_age option as part of the query to use the query results cache. You can set this option in the query text or as a client request property. For example:

set query_results_cache_max_age = time(5m);
GithubEvent
| where CreatedAt > ago(180d)
| summarize arg_max(CreatedAt, Type) by Id

The option value is a timespan that indicates the maximum "age" of the results cache, measured from the query start time. Beyond the set timespan, the cache entry is obsolete and won't be used again. Setting a value of 0 is equivalent to not setting the option.
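The max-age rule above can be sketched as a small freshness check. This is only a conceptual model, not the service's implementation: the function name and parameters are hypothetical, and treating the boundary (age exactly equal to the max age) as usable is an assumption.

```python
from datetime import datetime, timedelta


def is_cache_entry_usable(cached_started_on, query_started_on, max_age):
    """Return True if a cached result is young enough to be reused.

    Hypothetical helper: models the rule that a cached entry is obsolete
    once its age, measured at the new query's start time, exceeds max_age.
    """
    # A max age of 0 behaves like not setting the option: the cache is not used.
    if max_age <= timedelta(0):
        return False
    age = query_started_on - cached_started_on
    # Assumption: an entry exactly max_age old is still considered usable.
    return timedelta(0) <= age <= max_age


now = datetime(2024, 1, 1, 12, 0, 0)
five_minutes = timedelta(minutes=5)
print(is_cache_entry_usable(now - timedelta(minutes=3), now, five_minutes))  # True
print(is_cache_entry_usable(now - timedelta(minutes=7), now, five_minutes))  # False
print(is_cache_entry_usable(now - timedelta(minutes=3), now, timedelta(0)))  # False
```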

Compatibility between queries

Identical queries

The query results cache returns results only for queries that are considered "identical" to a previous cached query. Two queries are considered identical if all of the following conditions are met:

Incompatible queries

The query results will not be cached if any of the following conditions is true:

No valid cache entry

If a cached result satisfying the time constraints couldn't be found, or there isn't a cached result from an "identical" query in the cache, the query will be executed and its results cached, as long as:

  • The query execution completes successfully, and
  • The query results size doesn't exceed 16 MB.
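The two conditions above amount to a simple predicate. The sketch below is illustrative only: the names are hypothetical, and reading "16 MB" as 16 × 1024 × 1024 bytes is an assumption, since the text does not say how the size is measured.

```python
# Assumption: "16 MB" is interpreted as 16 MiB; the source does not specify.
MAX_CACHED_RESULT_BYTES = 16 * 1024 * 1024


def should_store_result(completed_successfully, result_size_bytes):
    """Hypothetical check: cache a result only if the query succeeded
    and the result does not exceed the size limit."""
    return completed_successfully and result_size_bytes <= MAX_CACHED_RESULT_BYTES


print(should_store_result(True, 2 * 1024 * 1024))    # True
print(should_store_result(True, 20 * 1024 * 1024))   # False
print(should_store_result(False, 2 * 1024 * 1024))   # False
```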

Results from the cache

How does the service indicate that the query results are being served from the cache? When responding to a query, Kusto sends an additional ExtendedProperties response table that includes a Key column and a Value column. When results are served from the cache, an extra row is appended to that table:

  • The row's Key column will contain the string ServerCache
  • The row's Value column will contain a property bag with two fields:
    • OriginalClientRequestId - Specifies the original request's ClientRequestId.
    • OriginalStartedOn - Specifies the original request's execution start time.
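On the client side, detecting a cache hit comes down to scanning the ExtendedProperties rows for the ServerCache key. The following sketch assumes the rows have already been read into a list of dictionaries with Key and Value entries, and that Value may arrive either as a JSON string or as an already-parsed object; both are assumptions about the client, not documented behavior. The sample values are made up.

```python
import json


def find_server_cache_info(extended_properties_rows):
    """Return the ServerCache property bag as a dict, or None on a cache miss.

    Hypothetical helper: `extended_properties_rows` is assumed to be a list
    of {"Key": ..., "Value": ...} dictionaries.
    """
    for row in extended_properties_rows:
        if row.get("Key") == "ServerCache":
            value = row.get("Value")
            # Assumption: the property bag may be serialized as a JSON string.
            if isinstance(value, str):
                value = json.loads(value)
            return value
    return None


# Illustrative rows only; the identifiers and timestamps are invented.
rows = [
    {"Key": "Visualization", "Value": "{}"},
    {
        "Key": "ServerCache",
        "Value": json.dumps({
            "OriginalClientRequestId": "example-request-id",
            "OriginalStartedOn": "2024-01-01T12:00:00Z",
        }),
    },
]

info = find_server_cache_info(rows)
if info is not None:
    print("Served from cache; original request:", info["OriginalClientRequestId"])
else:
    print("Executed fresh")
```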

Distribution

The cache is not shared between cluster nodes. Every node has a dedicated cache in its own private storage. If two identical queries land on different nodes, the query is executed and cached on both nodes. This can happen when weak consistency is used. By setting query consistency to affinitizedweakconsistency, you can have identical weakly consistent queries land on the same query head, and thus increase the cache hit rate.
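A toy model can illustrate why affinity raises the hit rate when each node keeps a private cache. Everything here is an assumption made for illustration: the round-robin routing, the hash-based affinity, and the function name are not how the service is documented to route queries.

```python
import zlib


def count_cache_hits(query_text, runs, node_count, affinitized):
    """Count cache hits for `runs` identical queries over per-node caches.

    Toy model: each node has its own private set of cached query texts.
    """
    node_caches = [set() for _ in range(node_count)]
    hits = 0
    for i in range(runs):
        if affinitized:
            # Assumption: identical query text always maps to the same node.
            node = zlib.crc32(query_text.encode("utf-8")) % node_count
        else:
            # Assumption: without affinity, requests rotate across nodes.
            node = i % node_count
        if query_text in node_caches[node]:
            hits += 1
        else:
            node_caches[node].add(query_text)  # miss: execute and cache on this node
    return hits


query = "GithubEvent | count"
print(count_cache_hits(query, runs=6, node_count=3, affinitized=False))  # 3
print(count_cache_hits(query, runs=6, node_count=3, affinitized=True))   # 5
```

In this model, the non-affinitized runs pay one miss per node before any hit occurs, while the affinitized runs pay a single miss.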

Management

The following management and observability commands are supported:

Capacity

The cache capacity is currently fixed at 1 GB per cluster node. The eviction policy is LRU.
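For readers unfamiliar with LRU (least recently used) eviction, here is a minimal size-bounded sketch. It is a generic illustration of the policy, not the service's cache: the class name, the byte accounting, and the tiny capacity in the example are all hypothetical.

```python
from collections import OrderedDict


class LruResultsCache:
    """Generic size-bounded LRU cache (illustrative only)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> (result, size_bytes), oldest first

    def get(self, key):
        if key not in self._entries:
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return self._entries[key][0]

    def put(self, key, result, size_bytes):
        if size_bytes > self.capacity_bytes:
            return  # can never fit, so it is not cached
        if key in self._entries:
            self.used_bytes -= self._entries.pop(key)[1]
        # Evict least recently used entries until the new one fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, (_, evicted_size) = self._entries.popitem(last=False)
            self.used_bytes -= evicted_size
        self._entries[key] = (result, size_bytes)
        self.used_bytes += size_bytes


cache = LruResultsCache(capacity_bytes=100)
cache.put("a", "result-a", 40)
cache.put("b", "result-b", 40)
cache.get("a")                  # "a" is now more recently used than "b"
cache.put("c", "result-c", 40)  # does not fit, so "b" is evicted
print(cache.get("b"))  # None
print(cache.get("a"))  # result-a
```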

Shard level query results cache

The query results cache is effective when the exact same query is run multiple times in rapid succession and can tolerate returning slightly old data. However, some scenarios, like a live dashboard, require the most up-to-date results.

For example, a query that runs every 10 seconds and spans the last 1 hour can benefit from caching intermediate query results at the storage (shard) level.
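To get a rough sense of why this helps, consider how much of the time window two consecutive runs share. The calculation below is a back-of-the-envelope estimate only: shards are not necessarily aligned to the query's time window, so the overlap in time is just an indication of how much intermediate work could be reused. The function name is hypothetical.

```python
def overlap_fraction(window_seconds, interval_seconds):
    """Fraction of the time window shared by two consecutive runs."""
    if interval_seconds >= window_seconds:
        return 0.0
    return (window_seconds - interval_seconds) / window_seconds


# A query spanning the last 1 hour, re-run every 10 seconds.
print(f"{overlap_fraction(3600, 10):.1%}")  # 99.7%
```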

Note

This feature is only available on EngineV3 clusters.

The shard level query results cache is automatically enabled when the query results cache is in use. Because it shares the same cache as the query results cache, the same capacity and eviction policies apply.

Syntax

set query_results_cache_per_shard; Query

Note

This option can be set in the query text or as a client request property.

Example

set query_results_cache_per_shard;
GithubEvent
| where CreatedAt > ago(180d)
| summarize arg_max(CreatedAt, Type) by Id