列存储索引 - 新增功能Columnstore indexes - what's new

适用对象: yesSQL ServeryesAzure SQL 数据库yesAzure SQL 数据仓库yes并行数据仓库APPLIES TO: yesSQL Server yesAzure SQL Database yesAzure SQL Data Warehouse yesParallel Data Warehouse

列存储功能摘要可用于各个版本的 SQL ServerSQL Server、最新版 SQL 数据库SQL DatabaseSQL 数据仓库SQL Data Warehouse并行数据仓库Parallel Data WarehouseSummary of columnstore features available for each version of SQL ServerSQL Server, and the latest releases of SQL 数据库SQL Database, SQL 数据仓库SQL Data Warehouse, and 并行数据仓库Parallel Data Warehouse.

备注

对于 SQL 数据库SQL Database,列存储索引可用于 Azure SQL DatabaseAzure SQL Database 高级层、标准层(S3 及更高)以及所有 vCore 层。For SQL 数据库SQL Database, columnstore indexes are available in Azure SQL DatabaseAzure SQL Database Premium tiers, Standard tiers - S3 and above, and all vCore tiers. 对于 SQL Server 2016 (13.x)SQL Server 2016 (13.x) SP1 及更高版本,列存储索引可用于所有版本。For SQL Server 2016 (13.x)SQL Server 2016 (13.x) SP1 and above, columnstore indexes are available in all editions. 对于 SQL Server 2016 (13.x)SQL Server 2016 (13.x)(早于 SP1)及更早版本,列存储索引仅可用于企业版。For SQL Server 2016 (13.x)SQL Server 2016 (13.x) (before SP1) and earlier versions, columnstore indexes are only available in Enterprise Edition.

产品版本的功能摘要Feature Summary for Product Releases

此表概述了列存储索引的主要功能以及提供这些功能的产品。This table summarizes key features for columnstore indexes and the products in which they are available.

列存储索引功能Columnstore Index Feature SQL Server 2012 (11.x)SQL Server 2012 (11.x) SQL Server 2014 (12.x)SQL Server 2014 (12.x) SQL Server 2016 (13.x)SQL Server 2016 (13.x) SQL Server 2017 (14.x)SQL Server 2017 (14.x) SQL 数据库SQL Database SQL 数据仓库SQL Data Warehouse
多线程查询的批模式执行Batch mode execution for multi-threaded queries yes yes yes yes yes yes
单线程查询的批模式执行Batch mode execution for single-threaded queries yes yes yes yes
存档压缩选项Archival compression option yes yes yes yes yes
快照隔离和读提交快照隔离Snapshot isolation and read-committed snapshot isolation yes yes yes yes
创建表时,请指定列存储索引Specify columnstore index when creating a table yes yes yes yes
Always On 支持列存储索引Always On supports columnstore indexes yes yes yes yes yes yes
Always On 可读次要副本支持只读非聚集列存储索引Always On readable secondary supports read-only nonclustered columnstore index yes yes yes yes yes yes
Always On 可读次要副本支持可更新列存储索引Always On readable secondary supports updateable columnstore indexes yes yes
堆或 B 树上的只读非聚集列存储索引Read-only nonclustered columnstore index on heap or B-tree yes yes 1yes 1 1yes 1 1yes 1 1yes 1
堆或 B 树上的可更新非聚集列存储索引Updateable nonclustered columnstore index on heap or B-tree yes yes yes yes
允许在使用非聚集列存储索引的堆或 B 树上实施额外的 B 树索引Additional B-tree indexes allowed on a heap or B-tree that has a nonclustered columnstore index yes yes yes yes yes yes
可更新的聚集列存储索引Updateable clustered columnstore index yes yes yes yes yes
基于聚集列存储索引的 B 树索引B-tree index on a clustered columnstore index yes yes yes yes
基于内存优化表的列存储索引Columnstore index on a memory-optimized table yes yes yes yes
非聚集列存储索引定义支持使用筛选的条件Nonclustered columnstore index definition supports using a filtered condition yes yes yes yes
CREATE TABLEALTER TABLE 中的列存储索引的压缩延迟选项Compression delay option for columnstore indexes in CREATE TABLE and ALTER TABLE yes yes yes yes
列存储索引具有一个非持久化计算列Columnstore index can have a non-persisted computed column yes

1 要创建只读非聚集列存储索引,请将索引存储在只读文件组内。1 To create a read-only nonclustered columnstore index, store the index on a read-only filegroup.

SQL Server 2017 (14.x)SQL Server 2017 (14.x)

SQL Server 2017 (14.x)SQL Server 2017 (14.x) 将添加这些新功能。adds these new features.

功能Functional

  • SQL Server 2017 (14.x)SQL Server 2017 (14.x) 支持聚集列存储索引中的非持久化计算列。supports non-persisted computed columns in clustered columnstore indexes. 聚集列存储索引不支持持久化计算列。Persisted computed columns are not supported in clustered columnstore indexes. 无法在具有计算列的列存储索引上创建非聚集索引。You cannot create a nonclustered index on a columnstore index that has a computed column.

SQL Server 2016 (13.x)SQL Server 2016 (13.x)

SQL Server 2016 (13.x)SQL Server 2016 (13.x) 添加了重要的增强功能,以此来改善列存储索引的性能和灵活性。adds key enhancements to improve the performance and flexibility of columnstore indexes. 这些改进功能可以增强数据仓库方案的效果,并实现实时运营分析。These improvements enhance data warehousing scenarios and enable real-time operational analytics.

功能Functional

  • 一个行存储表可以有一个可更新的非聚集列存储索引。A rowstore table can have one updateable nonclustered columnstore index. 以前,非聚集列存储索引是只读的。Previously, the nonclustered columnstore index was read-only.

  • 非聚集列存储索引定义支持使用筛选的条件。The nonclustered columnstore index definition supports using a filtered condition. 若要尽量减少在 OLTP 表中添加列存储索引的性能影响,请使用筛选条件,以便创建仅关于运行工作负荷冷数据的非聚集列存储索引。To minimize the performance impact of adding a columnstore index on an OLTP table, use a filtered condition to create a nonclustered columnstore index on only the cold data of your operational workload.

  • 一个内存中表可以有一个列存储索引。An in-memory table can have one columnstore index. 你可以在创建表时创建它,也可以稍后使用 ALTER TABLE (Transact-SQL) 来添加。You can create it when the table is created or add it later with ALTER TABLE (Transact-SQL). 以前,仅基于磁盘的表可以有列存储索引。Previously, only a disk-based table could have a columnstore index.

  • 聚集列存储索引可以有一个或多个非聚集行存储索引。A clustered columnstore index can have one or more nonclustered rowstore indexes. 以前,列存储索引不支持非聚集索引。Previously, the columnstore index did not support nonclustered indexes. SQL ServerSQL Server 自动维护 DML 操作的非聚集索引。automatically maintains the nonclustered indexes for DML operations.

  • 支持主键和外键,即可通过使用 B 树索引在聚集列存储索引上强制实施这些约束。Support for primary keys and foreign keys by using a B-tree index to enforce these constraints on a clustered columnstore index.

  • 列存储索引有一个压缩延迟选项,该选项可以最大限度地减少事务工作负荷对实时运营分析的影响。Columnstore indexes have a compression delay option that minimizes the impact of the transactional workload on real-time operational analytics. 此选项允许通过频繁地更改行来保持稳定,然后再将这些行压缩到列存储中。This option allows for frequently changing rows to stabilize before compressing them into the columnstore. 有关详细信息,请参阅创建列存储索引 (Transact SQL)开始使用列存储索引进行实时运营分析For details, see CREATE COLUMNSTORE INDEX (Transact-SQL) and Get started with Columnstore for real-time operational analytics.

数据库兼容级别 120 或 130 的性能Performance for database compatibility level 120 or 130

  • 列存储索引支持读提交快照隔离级别 (RCSI) 和快照隔离 (SI)。Columnstore indexes support read committed snapshot isolation level (RCSI) and snapshot isolation (SI). 这样可以在无锁的情况下进行事务一致性分析查询。This enables transactional consistent analytics queries with no locks.

  • 列存储支持索引碎片整理,即可以移除已删除的行而无需显式重新生成索引。Columnstore supports index defragmentation by removing deleted rows without the need to explicitly rebuild the index. ALTER INDEX ... REORGANIZE 语句将根据内部定义的策略,以联机操作的方式从列存储移除已删除的行The ALTER INDEX ... REORGANIZE statement removes deleted rows, based on an internally defined policy, from the columnstore as an online operation

  • 可以在 AlwaysOn 可读次要副本上访问列存储索引。Columnstore indexes can be access on an AlwaysOn readable secondary replica. 你可以将分析查询卸载到 AlwaysOn 次要副本,从而改进运行分析的性能。You can improve performance for operational analytics by offloading analytics queries to an AlwaysOn secondary replica.

  • 为了改进性能,SQL ServerSQL Server 会在表扫描期间计算聚合函数 MINMAXSUMCOUNTAVG,前提是数据类型不超过 8 个字节的长度,且不是字符串类型。To improve performance, SQL ServerSQL Server computes the aggregate functions MIN, MAX, SUM, COUNT, and AVG during table scans when the data type uses no more than 8 bytes, and is not of a string type. 聚合列存储索引和非聚合列存储索引都支持聚合下推,不管有没有 Group By 子句。Aggregate pushdown is supported with or without Group By clause for both clustered columnstore indexes and nonclustered columnstore indexes.

  • 谓词下推可加快查询在比较 [v]archar 或 n[v]archar 类型的字符串时的速度。Predicate pushdown speeds up queries that compare strings of type [v]archar or n[v]archar. 这适用于常用的比较运算符,包括 LIKE 这样的使用位图筛选器的运算符。This applies to the common comparison operators and includes operators such as LIKE that use bitmap filters. 这适用于 SQL Server 支持的所有排序规则。This works with all collations that SQL Server supports.

数据库兼容级别 130 的性能Performance for database compatibility level 130

  • 对于使用任何下述操作的查询,新提供了批处理模式执行支持:New batch mode execution support for queries using any of these operations:
    • SORTSORT
    • 使用多个不同函数的聚合函数。Aggregates with multiple distinct functions. 一些示例:COUNT/COUNTAVG/SUMCHECKSUM_AGGSTDEV/STDEVPSome examples: COUNT/COUNT, AVG/SUM, CHECKSUM_AGG, STDEV/STDEVP.
    • 窗口聚合函数:COUNTCOUNT_BIGSUMAVGMINMAXCLRWindow aggregate functions: COUNT, COUNT_BIG, SUM, AVG, MIN, MAX, and CLR.
    • 窗口用户定义聚合:CHECKSUM_AGGSTDEVSTDEVPVARVARPGROUPINGWindow user-defined aggregates: CHECKSUM_AGG, STDEV, STDEVP, VAR, VARP, and GROUPING.
    • 窗口聚合分析函数:LAGLEADFIRST_VALUELAST_VALUEPERCENTILE_CONTPERCENTILE_DISCCUME_DISTPERCENT_RANKWindow aggregate analytic functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE, PERCENTILE_CONT, PERCENTILE_DISC, CUME_DIST, and PERCENT_RANK.
  • MAXDOP 1 下运行或使用串行查询计划以批处理模式执行的单线程查询。Single-threaded queries running under MAXDOP 1 or with a serial query plan execute in batch mode. 以前,仅多线程查询以批处理执行的方式运行。Previously-only multi-threaded queries ran with batch execution.
  • 内存优化表查询可以有 SQL 互操作模式的并行计划,不管是按行存储索引方式还是按列存储索引方式访问数据Memory optimized table queries can have parallel plans in SQL InterOp mode both when accessing data in rowstore or in columnstore index

可支持性Supportability

以下系统视图是针对列存储的新视图:These system views are new for columnstore:

sys.column_store_row_groups (Transact-SQL)sys.column_store_row_groups (Transact-SQL) sys.dm_column_store_object_pool (Transact-SQL)sys.dm_column_store_object_pool (Transact-SQL)
sys.dm_db_column_store_row_group_operational_stats (Transact-SQL)sys.dm_db_column_store_row_group_operational_stats (Transact-SQL) sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
sys.dm_db_index_operational_stats (Transact-SQL)sys.dm_db_index_operational_stats (Transact-SQL) sys.dm_db_index_physical_stats (Transact-SQL)sys.dm_db_index_physical_stats (Transact-SQL)
sys.internal_partitions (Transact-SQL)sys.internal_partitions (Transact-SQL)

以下内存中 OLTP 式 DMV 包含列存储的更新:These in-memory OLTP-based DMVs contain updates for columnstore:

sys.dm_db_xtp_hash_index_stats (Transact-SQL)sys.dm_db_xtp_hash_index_stats (Transact-SQL) sys.dm_db_xtp_index_stats (Transact-SQL)sys.dm_db_xtp_index_stats (Transact-SQL)
sys.dm_db_xtp_memory_consumers (Transact-SQL)sys.dm_db_xtp_memory_consumers (Transact-SQL) sys.dm_db_xtp_nonclustered_index_stats (Transact-SQL)sys.dm_db_xtp_nonclustered_index_stats (Transact-SQL)
sys.dm_db_xtp_object_stats (Transact-SQL)sys.dm_db_xtp_object_stats (Transact-SQL) sys.dm_db_xtp_table_memory_stats (Transact-SQL)sys.dm_db_xtp_table_memory_stats (Transact-SQL)

限制Limitations

  • 对于内存中表,列存储索引必须包括所有列;列存储索引不能有经过筛选的条件。For in-memory tables, a columnstore index must include all the columns; the columnstore index cannot have a filtered condition.
  • 对于内存中表,基于列存储索引的查询仅在互操作模式下运行,不在内存中本机模式下运行。For in-memory tables, queries on columnstore indexes run only in InterOP mode, and not in the in-memory native mode. 支持并行执行。Parallel execution is supported.

SQL Server 2014 (12.x)SQL Server 2014 (12.x)

SQL Server 2014 (12.x)SQL Server 2014 (12.x) 引入了聚集列存储索引作为主存储格式。introduced the clustered column store index as the primary storage format. 这样就可以进行常规加载以及更新、删除和插入操作。This allowed regular loads as well as update, delete, and insert operations.

  • 表可以使用聚集列存储索引作为主表存储。The table can use a clustered column store index as the primary table storage. 不允许在表上使用其他索引,但可对聚集列存储索引进行更新,因此可执行常规加载并对各个行进行更改。No other indexes are allowed on the table, but the clustered column store index is updateable so you can perform regular loads and make changes to individual rows.
  • 非聚集列存储索引的功能仍与 SQL Server 2012 (11.x)SQL Server 2012 (11.x) 中的一样,区别是增加了能够以批处理模式执行的运算符。The nonclustered column store index continues to have the same functionality as in SQL Server 2012 (11.x)SQL Server 2012 (11.x) except for additional operators that can now be executed in batch mode. 该索引仍然不能进行更新,只能重新生成和使用分区切换。It is still not updateable except by rebuilding, and by using partition switching. 非聚集列存储索引只能用于基于磁盘的表,不能用于内存中表。The nonclustered columnstore index is supported on disk-based tables only, and not on in-memory tables.
  • 聚集和非聚集列存储索引有一个存档压缩选项,允许进一步压缩数据。The clustered and nonclustered column store index has an archival compression option that further compresses the data. 存档选项用于减少内存中数据和磁盘上数据的大小,但会降低查询执行速度。The archival option is useful for reducing the data size both in memory and on disk, but does slow query performance. 它适用于访问不频繁的数据。It works well for data that is accessed infrequently.
  • 聚集列存储索引和非聚集列存储索引的作用方式非常类似:使用相同的列存储格式、相同的查询处理引擎,以及相同的动态管理视图集。The clustered columnstore index and the nonclustered columnstore index function in a very similar way; they use the same columnar storage format, same query processing engine, and the same set of dynamic management views. 不同之处在于,一个是主要索引类型,一个是次要索引类型,而非聚集列存储索引为只读。The difference is primary versus secondary index types, and the nonclustered columnstore index is read-only.
  • 对于多线程查询来说,以下运算符以批处理模式运行:scan、filter、project、join、group by 和 union all。These operators run in batch mode for multi-threaded queries: scan, filter, project, join, group by, and union all.

SQL Server 2012 (11.x)SQL Server 2012 (11.x)

SQL Server 2012 (11.x)SQL Server 2012 (11.x) 引入了非聚集列存储索引作为另一基于行存储表的索引类型,并为基于列存储数据的查询引入了批处理。introduced the nonclustered columnstore index as another index type on rowstore tables and batch processing for queries on columnstore data.

  • 一个行存储表可以有一个非聚集列存储索引。A rowstore table can have one nonclustered columnstore index.
  • 列存储索引是只读的。The columnstore index is read-only. 创建列存储索引以后,不能通过INSERTDELETEUPDATE操作来更新表;要执行这些操作,必须在删除索引后更新表,然后重新生成列存储索引。After you create the columnstore index, you cannot update the table by INSERT, DELETE, and UPDATE operations; to perform these operations you must drop the index, update the table and rebuild the columnstore index. 可以使用分区切换将其他数据加载到表中。You can load additional data into the table by using partition switching. 分区切换的优点是,你可以在不删除和重新生成列存储索引的情况下加载数据。The advantage of partition switching is you can load data without dropping and rebuilding the columnstore index.
  • 列存储索引始终需要额外的存储空间,通常需要在行存储的基础上再多出 10%,因为它会存储数据的副本。The column store index always requires extra storage, typically an additional 10% over rowstore, because it stores a copy of the data.
  • 批处理的查询性能会翻倍,或者说批处理会改善查询性能,但这仅适用于并行执行查询的情况。Batch processing provides 2x or better query performance, but it is only available for parallel query execution.

另请参阅See Also

列存储索引设计指南 Columnstore Indexes Design Guidance
列存储索引数据加载指南 Columnstore Indexes Data Loading Guidance
Columnstore Indexes Query Performance Columnstore Indexes Query Performance
开始使用列存储进行实时运营分析 Get started with Columnstore for real-time operational analytics
针对数据仓库的列存储索引 Columnstore Indexes for Data Warehousing
列存储索引碎片整理Columnstore Indexes Defragmentation