What are data pools in a SQL Server big data cluster?

THIS TOPIC APPLIES TO: SQL Server (not Azure SQL Database, Azure Synapse Analytics (SQL DW), or Parallel Data Warehouse)

This article describes the role of SQL Server data pools in a SQL Server 2019 Big Data Cluster. The following sections describe the architecture and functionality of a SQL data pool.

Data pool architecture

A data pool consists of one or more SQL Server data pool instances. SQL data pool instances provide persistent SQL Server storage for the cluster. A data pool is used to ingest data from SQL queries or Spark jobs. To provide better performance across large data sets, data in a data pool is distributed into shards across the member SQL data pool instances.
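The sharding idea can be illustrated with a small, self-contained Python sketch. It is not how SQL Server implements data pools: the round-robin placement policy, the function name `distribute_round_robin`, and the sample rows are all assumptions made for illustration, since this article does not state how rows are assigned to instances.

```python
from typing import Any


def distribute_round_robin(rows: list[Any], instance_count: int) -> dict[int, list[Any]]:
    """Assign rows to shards in turn (hypothetical placement policy)."""
    if instance_count < 1:
        raise ValueError("a data pool needs at least one instance")
    # One shard (a plain list here) per data pool instance.
    shards: dict[int, list[Any]] = {i: [] for i in range(instance_count)}
    for position, row in enumerate(rows):
        shards[position % instance_count].append(row)
    return shards


# Hypothetical ingested rows, spread over three instances.
rows = [("click", n) for n in range(7)]
shards = distribute_round_robin(rows, instance_count=3)
for instance, shard in shards.items():
    print(instance, len(shard))
```

Each instance ends up holding roughly the same number of rows, which is what lets later queries split their work evenly.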

Scale-out data marts

Data pools enable the creation of scale-out data marts, where external data from multiple sources is ingested into the data pool. Because data is distributed across data pool instances, parallel queries against the curated data are more efficient.

[Figure: Scale-out data mart]
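To see why distributing data can make parallel queries more efficient, consider the following rough scatter-gather sketch in Python. It assumes each instance can compute a partial aggregate over its own shard and that a coordinator then combines the partial results. The names `partial_aggregate` and `scatter_gather_average`, the thread pool, and the sample values are hypothetical and do not reflect the actual query engine.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_aggregate(shard: list[float]) -> tuple[float, int]:
    """Work done independently per instance: a local sum and row count."""
    return sum(shard), len(shard)


def scatter_gather_average(shards: list[list[float]]) -> float:
    """Run the per-shard work in parallel, then combine the partial results."""
    if not shards:
        raise ValueError("no shards to query")
    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        partials = list(pool.map(partial_aggregate, shards))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    if count == 0:
        raise ValueError("no rows to aggregate")
    return total / count


# Hypothetical sales amounts already distributed across three instances.
shards = [[10.0, 20.0], [30.0], [40.0, 50.0, 60.0]]
average = scatter_gather_average(shards)
print(average)
```

Each worker touches only its own shard, so the amount of data any single instance scans shrinks as instances are added.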

Next steps

To learn more about SQL Server Big Data Clusters, see the following resources: