基數估計 (SQL Server)Cardinality Estimation (SQL Server)

適用於: 是SQL Server 否Azure SQL Database 否Azure Synapse Analytics (SQL DW) 否平行處理資料倉儲 APPLIES TO: yesSQL Server yesAzure SQL Database noAzure Synapse Analytics (SQL DW) noParallel Data Warehouse

SQL ServerSQL Server 查詢最佳化工具是以成本為基礎的查詢最佳化工具。The SQL ServerSQL Server Query Optimizer is a cost-based Query Optimizer. 這表示它會選取估計處理成本最低的查詢計畫來執行。This means that it selects query plans that have the lowest estimated processing cost to execute. 查詢最佳化工具根據兩個主要因素來判斷執行查詢計劃的成本:The Query Optimizer determines the cost of executing a query plan based on two main factors:

  • 在查詢計畫的每一個層級進行處理的資料列總數,此稱為計畫的基數。The total number of rows processed at each level of a query plan, referred to as the cardinality of the plan.
  • 查詢中使用的運算子所指定的演算法成本模型。The cost model of the algorithm dictated by the operators used in the query.

第一個因數 (基數) 會作為第二個因數 (成本模型) 的輸入參數。The first factor, cardinality, is used as an input parameter of the second factor, the cost model. 因此,如果改善基數,便能產生更好的估計成本,進而可有更快的執行計畫。Therefore, improved cardinality leads to better estimated costs and, in turn, faster execution plans.

SQL ServerSQL Server 中的基數估計主要衍生自於建立索引或統計資料時,手動或自動建立的長條圖。Cardinality estimation (CE) in SQL ServerSQL Server is derived primarily from histograms that are created when indexes or statistics are created, either manually or automatically. 有時候,SQL ServerSQL Server 也會使用條件約束資訊及查詢的邏輯重寫來判斷基數。Sometimes, SQL ServerSQL Server also uses constraint information and logical rewrites of queries to determine cardinality.

在下列情況中,SQL ServerSQL Server 無法精確地計算基數。In the following cases, SQL ServerSQL Server cannot accurately calculate cardinalities. 這會導致不精確的成本計算,使得產生並非最佳的查詢計畫。This causes inaccurate cost calculations that may cause suboptimal query plans. 避免在查詢中使用這些建構可以提升查詢效能。Avoiding these constructs in queries may improve query performance. 有時也可以使用替代的查詢公式或其他方法,它們是:Sometimes, alternative query formulations or other measures are possible and these are pointed out:

  • 查詢的述詞,在相同資料表的不同資料行之間使用比較運算子。Queries with predicates that use comparison operators between different columns of the same table.
  • 查詢的述詞使用運算子,且下列任一情況為真:Queries with predicates that use operators, and any one of the following are true:
    • 運算子任一邊所關聯的資料行中,沒有任何統計資料。There are no statistics on the columns involved on either side of the operators.
    • 統計資料中的值分佈並不平均,但查詢會搜尋具有高度選擇性的值集。The distribution of values in the statistics is not uniform, but the query seeks a highly selective value set. 如果運算子不是等號 (=) 運算子,此情況會特別明顯。This situation can be especially true if the operator is anything other than the equality (=) operator.
    • 述詞使用不等於 (!=) 比較運算子或 NOT 邏輯運算子。The predicate uses the not equal to (!=) comparison operator or the NOT logical operator.
  • 查詢,其使用任一個 SQL Server 內建函式,或引數不是常數值之純量值的使用者定義函式。Queries that use any of the SQL Server built-in functions or a scalar-valued, user-defined function whose argument is not a constant value.
  • 查詢透過算術或字串串連運算子,與聯結資料行相關聯。Queries that involve joining columns through arithmetic or string concatenation operators.
  • 查詢所比較的變數,在編譯及最佳化查詢時其值不明。Queries that compare variables whose values are not known when the query is compiled and optimized.

本文說明如何評估及選擇最適合您系統的 CE 設定。This article illustrates how you can assess and choose the best CE configuration for your system. 大多數系統皆可從最新的 CE 中受益,因為它的精確度最高。Most systems benefit from the latest CE because it is the most accurate. CE 會預測您的查詢可能傳回的資料列數目。The CE predicts how many rows your query will likely return. 查詢最佳化工具使用基數預測,來產生最佳查詢計劃。The cardinality prediction is used by the Query Optimizer to generate the optimal query plan. 由於估計較精確,因此查詢最佳化工具通常較有機會產生較佳的查詢計劃。With more accurate estimations, the Query Optimizer can usually do a better job of producing a more optimal query plan.

您的應用程式系統可能有項重要的查詢,其計劃因此新的 CE 而變更為速度較慢的計劃。Your application system could possibly have an important query whose plan is changed to a slower plan due to the new CE. 這類查詢可能類似下列其中一項:Such a query might be like one of the following:

  • 執行頻繁很高而經常同時執行多個執行個體的 OLTP (線上交易處理) 查詢。An OLTP (online transaction processing) query that runs so frequently that multiple instance of it often run concurrently.
  • 在您的 OLTP 營業期間執行大量彙總的 SELECT。A SELECT with substantial aggregation that runs during your OLTP business hours.

您有幾個方法來指出使用新的 CE 執行較慢的查詢。You have techniques for identifying a query that performs slower with the new CE. 您也可以選擇要如何解決此效能問題。And you have options for how to address the performance issue.

CE 的版本Versions of the CE

在組建 1998 中,SQL ServerSQL Server 7.0 隨附提供 CE 的一項重大更新,其相容性層級為 70。In 1998, a major update of the CE was part of SQL ServerSQL Server 7.0, for which the compatibility level was 70. 這個版本的 CE 模型會根據四個基本假設設定:This version of the CE model is set on four basic assumptions:

  • 獨立性: 在不同資料行散發的資料會假設為各自獨立,除非提供可用的相互關聯資訊。Independence: Data distributions on different columns are assumed to be independent of each other, unless correlation information is available and usable.
  • 一致性: 相異值會平均分佈,使其全都具有相同頻率。Uniformity: Distinct values are evenly spaced and that they all have the same frequency. 更明確地說,在每個長條圖步驟中,相異值會平均分布且各值都具有相同頻率。More precisely, within each histogram step, distinct values are evenly spread and each value has same frequency.
  • 內含項目 (簡單): 使用者查詢存在的資料。Containment (Simple): Users query for data that exists. 舉例來說,就兩個資料表間的相等聯結而言,會在聯結長條圖以預估聯結選擇性前,將各輸入長條圖中的述詞選擇性1 納入考量。For example, for an equality join between two tables, factor in the predicates selectivity1 in each input histogram, before joining histograms to estimate the join selectivity.
  • 包含詞/句: 對於 Column = Constant 的篩選述詞而言,會假設相關聯資料行的常數實際存在。Inclusion: For filter predicates where Column = Constant, the constant is assumed to actually exist for the associated column. 若對應的長條圖步驟為非空白,則其中一個步驟的相異值會假設為符合來自述詞的值。If a corresponding histogram step is non-empty, one of the step's distinct values is assumed to match the value from the predicate.

1 滿足述詞的資料列計數。1 Row count that satisfies the predicate.

後續更新從 SQL Server 2014 (12.x)SQL Server 2014 (12.x) 開始,表示相容性層級為 120 及以上。Subsequent updates started with SQL Server 2014 (12.x)SQL Server 2014 (12.x), meaning compatibility levels 120 and above. 層級 120 及以上的 CE 更新併入已更新假設和演算法,適用於新式資料倉儲和 OLTP 工作負載。The CE updates for levels 120 and above incorporate updated assumptions and algorithms that work well on modern data warehousing and on OLTP workloads. 根據 CE 70 假設,下列模型假設已從 CE 120 起變更:From the CE 70 assumptions, the following model assumptions were changed starting with CE 120:

  • 獨立性變成相互關聯:不同資料行值的結合不需要獨立。Independence becomes Correlation: The combination of the different column values are not necessarily independent. 這可能會更類似實際資料查詢。This may resemble more real-life data querying.
  • 簡單內含項目變成基本內含項目:使用者可查詢不存在的資料。Simple Containment becomes Base Containment: Users might query for data that does not exist. 舉例來說,就兩個資料表間的相等聯結而言,我們會使用基底資料表長條圖來預估聯結選擇性,然後將述詞選擇性納入考量。For example, for an equality join between two tables, we use the base tables histograms to estimate the join selectivity, and then factor in the predicates selectivity.

相容性層級: 您可為 COMPATIBILITY_LEVEL 使用下列 Transact-SQLTransact-SQL 程式碼,來確認資料庫處於特定層級。Compatibility level: You can ensure your database is at a particular level by using the following Transact-SQLTransact-SQL code for COMPATIBILITY_LEVEL.

SELECT ServerProperty('ProductVersion');  
ALTER DATABASE <yourDatabase>  
SELECT d.name, d.compatibility_level  
FROM sys.databases AS d  
WHERE d.name = 'yourDatabase';  

若是相容性層級設定為 120 或以上的 SQL ServerSQL Server 資料庫,啟用追蹤旗標 9481 會強制系統使用 CE 版本 70。For a SQL ServerSQL Server database set at compatibility level 120 or above, activation of the trace flag 9481 forces the system to use the CE version 70.

舊版 CE: 若是相容性層級設定為 120 以上的 SQL ServerSQL Server 資料庫,可使用 ALTER DATABASE SCOPED CONFIGURATION 在資料庫層級啟用 CE 版本 70。Legacy CE: For a SQL ServerSQL Server database set at compatibility level 120 and above, the CE version 70 can be can be activated by using the at the database level by using the ALTER DATABASE SCOPED CONFIGURATION.

SELECT name, value  
FROM sys.database_scoped_configurations  

或者從 SQL Server 2016 (13.x)SQL Server 2016 (13.x) SP1 開始,使用查詢提示 USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION')Or starting with SQL Server 2016 (13.x)SQL Server 2016 (13.x) SP1, the Query Hint USE HINT ('FORCE_LEGACY_CARDINALITY_ESTIMATION').

SELECT CustomerId, OrderAddedDate  
FROM OrderTable  
WHERE OrderAddedDate >= '2016-05-01'

查詢存放區:SQL Server 2016 (13.x)SQL Server 2016 (13.x) 開始,提供查詢存放區工具,方便您檢查查詢的效能。Query store: Starting with SQL Server 2016 (13.x)SQL Server 2016 (13.x), the query store is a handy tool for examining the performance of your queries. Management StudioManagement Studio 中,啟用查詢存放區的情況下,物件總管中的資料庫節點下會顯示查詢存放區節點。In Management StudioManagement Studio, in the Object Explorer under your database node, a Query Store node is displayed when the query store is enabled.

ALTER DATABASE <yourDatabase>  
SELECT q.actual_state_desc AS [actual_state_desc_of_QueryStore],  
FROM sys.database_query_store_options AS q;  
ALTER DATABASE <yourDatabase>  


建議您安裝最新版本的 Management Studio,並經常進行更新。We recommend that you install the latest release of Management Studio and update it often.

追蹤基數估計處理序的另一個做法,是使用名為 query_optimizer_estimate_cardinality 的擴充事件。Another option for tracking the cardinality estimation process is to use the extended event named query_optimizer_estimate_cardinality. 下列 Transact-SQLTransact-SQL 程式碼範例會在 SQL ServerSQL Server 上執行。The following Transact-SQLTransact-SQL code sample runs on SQL ServerSQL Server. 其會將 .xel 檔案寫入 C:\Temp\ (不過您可變更路徑)。It writes a .xel file to C:\Temp\ (although you can change the path). 當您在 Management StudioManagement Studio 中開啟 .xel 檔案時,會以使用方便的方式來顯示其詳細資訊。When you open the .xel file in Management StudioManagement Studio, its detailed information is displayed in a user friendly manner.

ADD EVENT sqlserver.query_optimizer_estimate_cardinality  
        ACTION (sqlserver.sql_text)  
            WHERE (  
                sql_text LIKE '%yourTable%'  
                and sql_text LIKE '%SUM(%'  
ADD TARGET package0.asynchronous_file_target   
            filename = 'c:\temp\xe_qoec_1.xel',  
            metadatafile = 'c:\temp\xe_qoec_1.xem'  
ALTER EVENT SESSION Test_the_CE_qoec_1  

如需為 SQL DatabaseSQL Database 量身訂做之擴充事件的相關資訊,請參閱 SQL Database 中的擴充事件For information about extended events as tailored for SQL DatabaseSQL Database, see Extended events in SQL Database.

評估 CE 版本的步驟Steps to assess the CE version

接下來的步驟可讓您用來評估是否有任何最重要查詢在最新 CE 下的執行效能不佳。Next are steps you can use to assess whether any of your most important queries perform less well under the latest CE. 其中一些步驟是透過執行上一節所示的程式碼範例來進行。Some of the steps are performed by running a code sample presented in a preceding section.

  1. 開啟 Management StudioManagement StudioOpen Management StudioManagement Studio. 確定您的 SQL ServerSQL Server 資料庫已設定為最高可用的相容性層級。Ensure your SQL ServerSQL Serverdatabase is set to the highest available compatibility level.

  2. 執行下列預備步驟:Perform the following preliminary steps:

    1. 開啟 Management StudioManagement StudioOpen Management StudioManagement Studio.

    2. 執行 T-SQL,確定您的 SQL ServerSQL Server 資料庫已設定為最高可用的相容性層級。Run the T-SQL to ensure that your SQL ServerSQL Server database is set to the highest available compatibility level.

    3. 確定您的資料庫已關閉其 LEGACY_CARDINALITY_ESTIMATION 組態。Ensure that your database has its LEGACY_CARDINALITY_ESTIMATION configuration turned OFF.

    4. 「清除」您的查詢存放區。CLEAR your query store. 當然,請確定您的查詢存放區處於開啟狀態。Of course, ensure your query store is ON.

    5. 執行陳述式:SET NOCOUNT OFF;Run the statement: SET NOCOUNT OFF;

  3. 執行陳述式:SET STATISTICS XML ON;Run the statement: SET STATISTICS XML ON;

  4. 執行您的重要查詢。Run your important query.

  5. 記下結果窗格中 [訊息] 索引標籤上實際受影響的資料列數目。In the results pane, on the Messages tab, note the actual number of rows affected.

  6. 在結果窗格的 [結果] 索引標籤上,按兩下包含 XML 格式統計資料的資料格。In the results pane on the Results tab, double-click the cell that contains the statistics in XML format. 圖形查詢計劃隨即顯示。A graphic query plan is displayed.

  7. 以滑鼠右鍵按一下圖形查詢計劃中的第一個方塊,然後按一下 [屬性] 。Right-click the first box in the graphic query plan, and then click Properties.

  8. 為了在稍後比較不同的組態,請記下下列屬性的值:For later comparison with a different configuration, note the values for the following properties:

    • [CardinalityEstimationModelVersion] 。CardinalityEstimationModelVersion.

    • [估計的資料列數目] 。Estimated Number of Rows.

    • [估計的 I/O 成本] ,以及涉及實際效能 (而不是資料列計數預測) 的幾個類似 [估計] 屬性。Estimated I/O Cost, and several similar Estimated properties that involve actual performance rather than row count predictions.

    • [邏輯作業] 和 [實體作業] 。Logical Operation and Physical Operation. [平行處理原則] 是正確值。Parallelism is a good value.

    • [實際的執行模式] 。Actual Execution Mode. [批次] 是正確值,優於 [資料列] 。Batch is a good value, better than Row.

  9. 比較估計的資料列數目與實際的資料列數目。Compare the estimated number of rows to the actual number of rows. CE 誤差是 1% (高或低),或是 10%?Is the CE inaccurate by 1% (high or low), or by 10%?


  11. 執行 T-SQL 將您的資料庫相容性層級降低一個層級 (例如從 130 降低到 120)。Run the T-SQL to decrease the compatibility level of your database by one level (such as from 130 down to 120).

  12. 重新執行所有非預備步驟。Rerun all the non-preliminary steps.

  13. 比較兩次執行的 CE 屬性值。Compare the CE property values from the two runs.

    • 最新 CE 的誤差百分比是否小於舊版 CE?Is the inaccuracy percentage under the newest CE less than under the older CE?
  14. 最後,比較這兩次執行的各種效能屬性值。Finally, compare the various performance property values from the two runs.

    • 您的查詢是否在這兩個不同的 CE 估計下使用不同的計劃?Did your query use a different plan under the two differing CE estimations?

    • 您的查詢在最新的 CE 下是否執行得較慢?Did your query run slower under the latest CE?

    • 除非您的查詢在舊版 CE 下的執行效果更佳且使用不同的計劃,否則幾乎可以確定您需要最新的 CE。Unless your query runs better and with a different plan under the older CE, you almost certainly want the latest CE.

    • 不過,如果您的查詢在舊版 CE 下執行更快速的計劃,請考慮強制系統使用更快速的計劃並忽略此 CE。However, if your query runs with a faster plan under the older CE, consider forcing the system to use the faster plan and to ignore the CE. 如此一來,您不但可以針對所有項目使用最新的 CE,同時也可以在偶爾的情況下保留更快速的計劃。This way you can have the latest CE on for everything, while keeping the faster plan in the one odd case.

如何啟用最佳查詢計劃How to activate the best query plan

假設 CE120 或以上版本為您的查詢產生了效率較低的查詢計劃。Suppose that with CE 120 or above, a less efficient query plan is generated for your query. 以下選項可讓您啟用更佳的計劃:Here are some options you have to activate the better plan:

  1. 您可以將整個資料庫的相容性層級設定成比最新可用的值更低。You could set the compatibility level to a value lower than the latest available, for your whole database.

    • 例如,將相容性層級設為 110 或更低來啟用 CE 70,不過這會使所有查詢受限於先前的 CE 模型。For example, setting the compatibility level 110 or lower activates CE 70, but it makes all queries subject to the previous CE model.

    • 此外,設定較低的相容性層級,也會錯過最新版本中查詢最佳化工具的一些改善。Further, setting a lower compatibility level also misses a number of improvements in the query optimizer for latest versions.

  2. 您可以使用 LEGACY_CARDINALITY_ESTIMATION 資料庫選項讓整個資料庫使用舊版 CE,同時保留查詢最佳化工具的改善。You could use LEGACY_CARDINALITY_ESTIMATION database option, to have the whole database use the older CE, while retaining other improvements in the query optimizer.

  3. 您可以使用 LEGACY_CARDINALITY_ESTIMATION 查詢提示,讓單一資料庫使用舊版 CE,同時保留查詢最佳化工具的改善。You could use LEGACY_CARDINALITY_ESTIMATION query hint, to have a single query use the older CE, while retaining other improvements in the query optimizer.

若要進行最精細的控制,您可以「強制」 系統在測試期間使用透過 CE 70 所產生的計劃。For the finest control, you could force the system to use the plan that was generated with CE 70 during your testing. 「固定」 您慣用的計劃之後,您可以將整個資料庫設定為使用最新的相容性層級和 CE。After you pin your preferred plan, you can set your whole database to use the latest compatibility level and CE. 接下來將會詳細說明這個選項。The option is elaborated next.

如何強制執行特定查詢計劃How to force a particular query plan

查詢存放區提供不同的方式,讓您強制系統使用特定查詢計劃:The query store gives you different ways that you can force the system to use a particular query plan:

  • 執行 sp_query_store_force_planExecute sp_query_store_force_plan.

  • Management StudioManagement Studio 中,展開您的 [查詢存放區] 節點,以滑鼠右鍵按一下 [Top Resource Consuming Nodes] (資源耗用量排名在前的節點) ,然後按一下 [View Top Resource Consuming Nodes] (檢視資源耗用量排名在前的節點) 。In Management StudioManagement Studio, expand your Query Store node, right-click Top Resource Consuming Nodes, and then click View Top Resource Consuming Nodes. 這會顯示標示為 [強制執行計畫][取消強制執行計畫] 的按鈕。The display shows buttons labeled Force Plan and Unforce Plan.

如需查詢存放區的詳細資訊,請參閱 使用查詢存放區監視效能For more information about the query store, see Monitoring Performance By Using the Query Store.

CE 改善範例Examples of CE improvements

本節描述因實作最新版 CE 的增強功能而受益的範例查詢。This section describes example queries that benefit from the enhancements implemented in the CE in recent releases. 這是背景資訊,不需要您執行特定動作。This is background information that does not call for specific action on your part.

範例 A:CE 認為最大值可能比上次收集統計資料時還要高Example A. CE understands maximum value might be higher than when statistics were last gathered

假設統計資料是在 OrderAddedDate 上限為 2016-04-30時,於 2016-04-30OrderTable 收集的狀況。Suppose statistics were last gathered for OrderTable on 2016-04-30, when the maximum OrderAddedDate was 2016-04-30. CE 120 (和更新版本) 會認知具有「遞增」 資料之 OrderTable 中的資料行,可能會具有大於統計資料所記錄之最大值的值。The CE 120 (and above version) understands that columns in OrderTable which have ascending data might have values larger than the maximum recorded by the statistics. 此認知會改善 Transact-SQLTransact-SQL SELECT 陳述式的查詢計劃,如下所示。This understanding improves the query plan for Transact-SQLTransact-SQL SELECT statements such as the following.

SELECT CustomerId, OrderAddedDate  
FROM OrderTable  
WHERE OrderAddedDate >= '2016-05-01';  

範例 B:CE 認為相同資料表上的篩選述詞通常相互關聯Example B. CE understands that filtered predicates on the same table are often correlated

在下列 SELECT 中,我們可看到 ModelModelVariant 的篩選述詞。In the following SELECT we see filtered predicates on Model and ModelVariant. 我們直覺認為當 Model 為 'Xbox' 時,ModelVariant 有可能為 'One' (有鑑於 Xbox 有一個名為 One 的變數)。We intuitively understand that when Model is 'Xbox' there is a chance the ModelVariant is 'One', given that Xbox has a variant called One.

CE 120 起,SQL ServerSQL Server 可認知相同資料表上的兩個資料行 [Model] 和 [ModelVariant] 之間可能會相互關聯。Starting with CE 120, SQL ServerSQL Server understands there might be a correlation between the two columns on the same table, Model and ModelVariant. CE 可更準確地的估計查詢所要傳回的資料列數目,而且查詢最佳化工具能產生更佳的計劃。The CE makes a more accurate estimation of how many rows will be returned by the query, and the query optimizer generates a more optimal plan.

SELECT Model, Purchase_Price  
FROM dbo.Hardware  
WHERE Model = 'Xbox' AND  
      ModelVariant = 'One';  

範例 C:CE 不會再假設來自不同資料表的篩選述詞之間有任何相互關聯Example C. CE no longer assumes any correlation between filtered predicates from different tables

對新式工作負載和實際商務資料的最新詳細研究顯示,來自不同資料表的述詞篩選條件通常彼此沒有關聯。With extense new research on modern workloads and actual business data reveal that predicate filters from different tables usually do not correlate with each other. 在下列查詢中,CE 假設 s.typer.date 之間沒有相互關聯。In the following query, the CE assumes there is no correlation between s.type and r.date. 因此,CE 所估計的傳回資料列數目比較低。Therefore the CE makes a lower estimate of the number of rows returned.

SELECT s.ticket, s.customer, r.store  
FROM dbo.Sales    AS s  
CROSS JOIN dbo.Returns  AS r  
WHERE s.ticket = r.ticket AND  
      s.type = 'toy' AND  
      r.date = '2016-05-11';  

另請參閱See Also

效能的監視與微調 Monitor and Tune for Performance
Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator (使用 SQL Server 2014 基數估算程式最佳化您的查詢計劃)Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator
查詢提示 Query Hints
USE HINT 查詢提示 USE HINT Query Hints
使用查詢調整小幫手來升級資料庫 Upgrading Databases by using the Query Tuning Assistant
相關檢視、函數與程序 Monitoring Performance By Using the Query Store
查詢處理架構指南Query Processing Architecture Guide