APPROX_COUNT_DISTINCT (Transact-SQL)APPROX_COUNT_DISTINCT (Transact-SQL)

適用於: 否SQL Server 是Azure SQL Database 是Azure SQL 資料倉儲 是平行處理資料倉儲 APPLIES TO: noSQL Server yesAzure SQL Database yesAzure SQL Data Warehouse yesParallel Data Warehouse

此函式會傳回群組中非 Null 的唯一值的近似數目。This function returns the approximate number of unique non-null values in a group.

主題連結圖示 Transact-SQL 語法慣例Topic link icon Transact-SQL Syntax Conventions

注意

APPROX_COUNT_DISTINCT 是公開預覽功能。APPROX_COUNT_DISTINCT is a public preview feature.

語法Syntax

-- Syntax for Azure SQL Database, Azure SQL Data Warehouse and Parallel Data Warehouse  

APPROX_COUNT_DISTINCT ( expression )   

引數Arguments

expressionexpression
任何類型的運算式但除了 imagesql_variantntexttext之外。An expression of any type, except image, sql_variant, ntext, or text.

傳回類型Return types

bigintbigint

RemarksRemarks

APPROX_COUNT_DISTINCT( expression ) 會針對群組中的每個資料列評估運算式,然後傳回群組中非 Null 的唯一值的近似數目。APPROX_COUNT_DISTINCT( expression ) evaluates an expression for each row in a group, and returns the approximate number of unique non-null values in a group. 設計此函式的目的是提供大型資料集之間的彙總,在此情況下回應性比絕對精確度更重要。This function is designed to provide aggregations across large data sets where responsiveness is more critical than absolute precision.

APPROX_COUNT_DISTINCT 是針對用於大型資料的案例而設計,且已針對以下條件最佳化:APPROX_COUNT_DISTINCT is designed for use in big data scenarios and is optimized for the following conditions:

  • 存取有數百萬資料行或更多 and 的資料集Access of data sets that are millions of rows or higher and
  • 資料行的彙總,或是有需多不同值的資料行的彙總Aggregation of a column or columns that have many distinct values

函式實作保證在 97% 機率內最多 2% 的錯誤率。The function implementation guarantees up to a 2% error rate within a 97% probability.

APPROX_COUNT_DISTINCT 所需的記憶體比徹底的 COUNT DISTINCT 作業更少。APPROX_COUNT_DISTINCT requires less memory than an exhaustive COUNT DISTINCT operation. 因為磁碟使用量較小,所以相較於精確的 COUNT DISTINCT 作業,APPROX_COUNT_DISTINCT 較不會將記憶體溢出到磁碟。Given the smaller memory footprint, APPROX_COUNT_DISTINCT is less likely to spill memory to disk compared to a precise COUNT DISTINCT operation. 若要深入了解用於達成此目的的演算法,請參閱 HyperLogLog (英文)。To learn more about the algorithm used to achieve this, see HyperLogLog.

注意

對於定序區分大小寫的字串,公開預覽版的 APPROX_COUNT_DISTINCT 是使用二進位比對,且提供的結果是採用 BIN 定序而不是 BIN2。With collation sensitive strings, the Public Preview version of APPROX_COUNT_DISTINCT uses a binary match and provides results that would have been generated in the presence of BIN collations and not BIN2.

範例Examples

A.A. 使用 APPROX_COUNT_DISTINCTUsing APPROX_COUNT_DISTINCT

此範例會從 orders 資料表傳回不同順序索引鍵的近似數目。This example returns the approximate number of different order keys from the orders table.

SELECT APPROX_COUNT_DISTINCT(O_OrderKey) AS Approx_Distinct_OrderKey
FROM dbo.Orders;

以下為結果集:Here is the result set.

Approx_Distinct_OrderKey
------------------------
15164704

B.B. 搭配使用 APPROX_COUNT_DISTINCT 和 GROUP BYUsing APPROX_COUNT_DISTINCT with GROUP BY

此範例會從 orders 資料表根據順序狀態傳回不同順序索引鍵的近似數目。This example returns the approximate number of different order keys by order status from the orders table.

SELECT O_OrderStatus, APPROX_COUNT_DISTINCT(O_OrderKey) AS Approx_Distinct_OrderKey
FROM dbo.Orders
GROUP BY O_OrderStatus
ORDER BY O_OrderStatus; 

以下為結果集:Here is the result set.

O_OrderStatus                                                    Approx_Distinct_OrderKey
---------------------------------------------------------------- ------------------------
F                                                                7397838
O                                                                7387803
P                                                                388036

另請參閱See also

彙總函式 (Transact-SQL)Aggregate Functions (Transact-SQL)
COUNT (Transact-SQL)COUNT (Transact-SQL)