Azure SQL Data Warehouse release notes
This article summarizes the new features and improvements in the recent releases of Azure SQL Data Warehouse. The article also lists notable content updates that aren't directly related to the release but published in the same time frame. For improvements to other Azure services, see Service updates.
Check your Azure SQL Data Warehouse version
As new features are being rolled out to all regions, check the version deployed to your instance and the latest Azure SQL DW release notes for the feature availability. To check your Azure SQL DW version, connect to your data warehouse via SQL Server Management Studio (SSMS) and run
SELECT @@VERSION AS 'SQL Data Warehouse'; to return the current version of Azure SQL DW.
Use the date identified to confirm which release has been applied to your Azure SQL DW.
|Materialized View (Preview)||A Materialized View persists the data returned from the view definition query and automatically gets updated as data changes in the underlying tables. It improves the performance of complex queries (typically queries with joins and aggregations) while offering simple maintenance operations. For more information, see:
- CREATE MATERIALIZED VIEW AS SELECT (Transact-SQL) - ALTER MATERIALIZED VIEW (Transact-SQL)
- T-SQL statements supported in Azure SQL Data Warehouse
|Additional T-SQL support||The T-SQL language surface area for SQL Data Warehouse has been extended to include support for:
- AT TIME ZONE - STRING_AGG
|Result set caching (Preview)||DBCC commands added to manage the previously announced result set cache. For more information, see:
- DBCC DROPRESULTSETCACHE (Transact-SQL) - DBCC SHOWRESULTCACHESPACEUSED (Transact-SQL)
Also see the new result_set_cache column in sys.dm_pdw_exec_requests that shows when an executed query used the result set cache.
|Ordered clustered columnstore index (Preview)||New column, column_store_order_ordinal, added to sys.index_columns to identify the order of columns in an ordered clustered columnstore index.|
|Dynamic data masking (Preview)||Dynamic Data Masking (DDM) prevents unauthorized access to your sensitive data in your data warehouse by obfuscating it on-the-fly in the query results, based on the masking rules you define. For more information, see SQL Database dynamic data masking.|
|Workload importance now Generally Available||Workload Management Classification and Importance provide the ability to influence the run order of queries. For more information on workload importance, see the Classification and Importance overview articles in the documentation. Check out the CREATE WORKLOAD CLASSIFIER doc as well.
See workload importance in action in the below videos:
-Workload Management concepts
-Workload Management scenarios
|Additional T-SQL support||The T-SQL language surface area for SQL Data Warehouse has been extended to include support for:
|JSON functions||Business analysts can now use familiar T-SQL language to query and manipulate documents that are formatted as JSON data using the following new JSON functions in Azure Data Warehouse:
- ISJSON - JSON_VALUE
- JSON_QUERY - JSON_MODIFY
|Result set caching (Preview)||Result-set caching enables instant query response times while reducing time-to-insight for business analysts and reporting users. For more information, see:
- ALTER DATABASE (Transact-SQL) - ALTER DATABASE SET Options (Transact SQL)
- SET RESULT SET CACHING (Transact-SQL) - SET Statement (Transact-SQL)
- sys.databases (Transact-SQL)
|Ordered clustered columnstore index (Preview)||Columnstore is a key enabler for storing and efficiently querying large amounts of data. For each table, it divides the incoming data into Row Groups and each column of a Row Group forms a Segment on a disk. Ordered clustered columnstore indexes further optimize query execution by enabling efficient segment elimination. For more information, see:
- CREATE TABLE (Azure SQL Data Warehouse) - CREATE COLUMNSTORE INDEX (Transact-SQL).
|Data Discovery & Classification||Data Discovery & Classification is now available in public preview for Azure SQL Data Warehouse. It’s critical to protect sensitive data and the privacy of your customers. As your business and customer data assets grow, it becomes unmanageable to discover, classify, and protect your data. The data discovery and classification feature that we’re introducing natively with Azure SQL Data Warehouse helps make protecting your data more manageable. The overall benefits of this capability are:
• Meeting data privacy standards and regulatory compliance requirements.
• Restricting access to and hardening the security of data warehouses containing highly sensitive data.
• Monitoring and alerting on anomalous access to sensitive data.
• Visualization of sensitive data in a central dashboard on the Azure portal. Data Discovery & Classification is available for Azure SQL Data Warehouse in all Azure regions, It's part of Advanced Data Security including Vulnerability Assessment and Threat Detection. For more information about Data Discovery & Classification, see the blog post and our online documentation.
|GROUP BY ROLLUP||ROLLUP is now a supported GROUP BY option in Azure Data Warehouse. GROUP BY ROLLUP creates a group for each combination of column expressions. GROUP BY also "rolls up" the results into subtotals and grand totals. The GROUP BY function processes from right to left, decreasing the number of column expressions over which it creates groups and aggregation(s). The column order affects the ROLLUP output and can affect the number of rows in the result set.
For more information on GROUP BY ROLLUP, see GROUP BY (Transact-SQL)
|Improved accuracy for DWU used and CPU portal metrics||SQL Data Warehouse significantly enhances metric accuracy in the Azure portal. This release includes a fix to the CPU and DWU Used metric definition to properly reflect your workload across all compute nodes. Before this fix, metric values were being undereported. Expect to see an increase in the DWU used and CPU metrics in the Azure portal.|
|Row Level Security||We introduced Row-level Security capability back in Nov 2017. We’ve now extended this support to external tables as well. Additionally, we’ve added support for calling non-deterministic functions in the inline table-valued functions (inline TVFs) required for defining a security filter predicate. This addition allows you to specify IS_ROLEMEMBER(), USER_NAME() etc. in the security filter predicate. For more information, please see the examples in the Row-level Security documentation.|
|Additional T-SQL Support||The T-SQL language surface area for SQL Data Warehouse has been extended to include support for STRING_SPLIT (Transact-SQL).|
|Query Optimizer enhancements||Query optimization is a critical component of any database. Making optimal choices on how to best execute a query can yield significant improvements. When executing complex analytical queries in a distributed environment, the number of operations executed matters. Query performance has been enhanced by producing better quality plans. These plans minimize expensive data transfer operations and redundant computations such as, repeated subqueries. For more information, see this Azure SQL Data Warehouse blog post.|
|Return Order By Optimization||SELECT…ORDER BY queries get a performance boost in this release. Now, all compute nodes send their results to a single compute node. This node merges and sorts the results and returns them to the user. Merging through a single compute node results in a significant performance gain when the query result set contains a large number of rows. Previously, the query execution engine would order results on each compute node. The results would them be streamed to the control node. The control node would then merge the results.|
|Data Movement Enhancements for PartitionMove and BroadcastMove||In Azure SQL Data Warehouse Gen2, data movement steps of type ShuffleMove, use instant data movement techniques. For more information, see performance enhancements blog. With this release, PartitionMove and BroadcastMove are now powered by the same instant data movement techniques. User queries that use these types of data movement steps will run with improved performance. No code change is required to take advantage of these performance improvements.|
|Notable Bugs||Incorrect Azure SQL Data Warehouse version -
|Virtual Network Service Endpoints Generally Available||This release includes general availability of Virtual Network (VNet) Service Endpoints for Azure SQL Data Warehouse in all Azure regions. VNet Service Endpoints enable you to isolate connectivity to your logical server from a given subnet or set of subnets within your virtual network. The traffic to Azure SQL Data Warehouse from your VNet will always stay within the Azure backbone network. This direct route will be preferred over any specific routes that take Internet traffic through virtual appliances or on-premises. No additional billing is charged for virtual network access through service endpoints. Current pricing model for Azure SQL Data Warehouse applies as is.
With this release, we also enabled PolyBase connectivity to Azure Data Lake Storage Gen2 (ADLS) via Azure Blob File System (ABFS) driver. Azure Data Lake Storage Gen2 brings all the qualities that are required for the complete lifecycle of analytics data to Azure Storage. Features of the two existing Azure storage services, Azure Blob Storage and Azure Data Lake Storage Gen1 are converged. Features from Azure Data Lake Storage Gen1, such as file system semantics, file-level security, and scale are combined with low-cost, tiered storage, and high availability/disaster recovery capabilities from Azure Blob Storage.
Using Polybase you can also import data into Azure SQL Data Warehouse from Azure Storage secured to VNet. Similarly, exporting data from Azure SQL Data Warehouse to Azure Storage secured to VNet is also supported via Polybase.
For more information on VNet Service Endpoints in Azure SQL Data Warehouse, refer to the blog post or the documentation.
|Automatic Performance Monitoring (Preview)||Query Store is now available in Preview for Azure SQL Data Warehouse. Query Store is designed to help you with query performance troubleshooting by tracking queries, query plans, runtime statistics, and query history to help you monitor the activity and performance of your data warehouse. Query Store is a set of internal stores and Dynamic Management Views (DMVs) that allow you to:
• Identify and tune top resource consuming queries
• Identify and improve unplanned workloads
• Evaluate query performance and impact to the plan by changes in statistics, indexes, or system size (DWU setting)
• See full query text for all queries executed
The Query Store contains three actual stores:
• A plan store for persisting the execution plan information
• A runtime stats store for persisting the execution statistics information
• A wait stats store for persisting wait stats information.
SQL Data Warehouse manages these stores automatically and provides an unlimited number of queries storied over the last seven days at no additional charge. Enabling Query Store is as simple as running an ALTER DATABASE T-SQL statement:
sql ----ALTER DATABASE [DatabaseName] SET QUERY_STORE = ON;-------For more information on Query Store in Azure SQL Data Warehouse, see the article, Monitoring performance by using the Query Store, and the Query Store DMVs, such as sys.query_store_query. Here is the blog post announcing the release.
|Lower Compute Tiers for Azure SQL Data Warehouse Gen2||Azure SQL Data Warehouse Gen2 now supports lower compute tiers. Customers can experience Azure SQL Data Warehouse’s leading performance, flexibility, and security features starting with 100 cDWU (Data Warehouse Units) and scale to 30,000 cDWU in minutes. Starting mid-December 2018, customers can benefit from Gen2 performance and flexibility with lower compute tiers in regions, with the rest of the regions available during 2019.
By dropping the entry point for next-generation data warehousing, Microsoft opens the doors to value-driven customers who want to evaluate all the benefits of a secure, high-performance data warehouse without guessing which trial environment is best for them. Customers may start as low as 100 cDWU, down from the current 500 cDWU entry point. SQL Data Warehouse Gen2 continues to support pause and resume operations and goes beyond just the flexibility in compute. Gen2 also supports unlimited column-store storage capacity along with 2.5 times more memory per query, up to 128 concurrent queries and adaptive caching features. These features on average bring five times more performance compared to the same Data Warehouse Unit on Gen1 at the same price. Geo-redundant backups are standard for Gen2 with built-in guaranteed data protection. Azure SQL Data Warehouse Gen2 is ready to scale when you are.
|Columnstore Background Merge||By default, Azure SQL Data Warehouse (Azure SQL DW) stores data in columnar format, with micro-partitions called rowgroups. Sometimes, due to memory constrains at index build or data load time, the rowgroups may be compressed with less than the optimal size of one million rows. Rowgroups may also become fragmented due to deletes. Small or fragmented rowgroups result in higher memory consumption, as well as inefficient query execution. With this release of Azure SQL DW, the columnstore background maintenance task merges small compressed rowgroups to create larger rowgroups to better utilize memory and speed up query execution.|
|DevOps for Data Warehousing||The highly requested feature for SQL Data Warehouse (SQL DW) is now in preview with the support for SQL Server Data Tool (SSDT) in Visual Studio! Teams of developers can now collaborate over a single, version-controlled codebase and quickly deploy changes to any instance in the world. Interested in joining? This feature is available for preview today! You can register by visiting the SQL Data Warehouse Visual Studio SQL Server Data Tools (SSDT) - Preview Enrollment form. Given the high demand, we are managing acceptance into preview to ensure the best experience for our customers. Once you sign up, our goal is to confirm your status within seven business days.|
|Row Level Security Generally Available||Azure SQL Data Warehouse (SQL DW) now supports row level security (RLS) adding a powerful capability to secure your sensitive data. With the introduction of RLS, you can implement security policies to control access to rows in your tables, as in who can access what rows. RLS enables this fine-grained access control without having to redesign your data warehouse. RLS simplifies the overall security model as the access restriction logic is located in the database tier itself rather than away from the data in another application. RLS also eliminates the need to introduce views to filter out rows for access control management. There is no additional cost for this enterprise-grade security feature for all our customers.|
|Advanced Advisors||Advanced tuning for Azure SQL Data Warehouse (SQL DW) just got simpler with additional data warehouse recommendations and metrics. There are additional advanced performance recommendations through Azure Advisor at your disposal, including:
1. Adaptive cache – Be advised when to scale to optimize cache utilization.
2. Table distribution – Determine when to replicate tables to reduce data movement and increase workload performance.
3. Tempdb – Understand when to scale and configure resource classes to reduce tempdb contention.
There is a deeper integration of data warehouse metrics with Azure Monitor including an enhanced customizable monitoring chart for near real-time metrics in the overview blade. You no longer must leave the data warehouse overview blade to access Azure Monitor metrics when monitoring usage, or validating and applying data warehouse recommendations. In addition, there are new metrics available, such as tempdb and adaptive cache utilization to complement your performance recommendations.
|Advanced tuning with integrated advisors||Advanced tuning for Azure SQL Data Warehouse (SQL DW) just got simpler with additional data warehouse recommendations and metrics and a redesign of the portal overview blade that provides an integrated experience with Azure Advisor and Azure Monitor.|
|Accelerated Database Recovery (ADR)||Azure SQL Data Warehouse Accelerated Database Recovery (ADR) is now in Public Preview. ADR is a new SQL Server Engine that greatly improves database availability, especially in the presence of long running transactions, by completely redesigning the current recovery process from the ground up. The primary benefits of ADR are fast and consistent database recovery and instantaneous transaction rollback.|
|Azure Monitor diagnostics logs||SQL Data Warehouse (SQL DW) now enables enhanced insights into analytical workloads by integrating directly with Azure Monitor diagnostic logs. This new capability enables developers to analyze workload behavior over an extended time period and make informed decisions on query optimization or capacity management. We have now introduced an external logging process through Azure Monitor diagnostic logs that provide additional insights into your data warehouse workload. With a single click of a button, you are now able to configure diagnostic logs for historical query performance troubleshooting capabilities using Log Analytics. Azure Monitor diagnostic logs support customizable retention periods by saving the logs to a storage account for auditing purposes, the capability to stream logs to event hubs near real-time telemetry insights, and the ability to analyze logs using Log Analytics with log queries. Diagnostic logs consist of telemetry views of your data warehouse equivalent to the most commonly used performance troubleshooting DMVs for SQL Data Warehouse. For this initial release, we have enabled views for the following system dynamic management views:
|Columnstore memory management||As the number of compressed column store row groups increases, the memory required to manage the internal column segment metadata for those rowgroups increases. As a result, query performance and queries executed against some of the Columnstore Dynamic Management Views (DMVs) can degrade. Improvements have made in this release to optimize the size of the internal metadata for these cases, leading to improved experience and performance for such queries.|
|Azure Data Lake Storage Gen2 integration (GA||Azure SQL Data Warehouse (SQL DW) now has native integration with Azure Data Lake Storage Gen2. Customers can now load data using external tables from ABFS into SQL DW. This functionality enables customers to integrate with their data lakes in Data Lake Storage Gen2.|
|Notable Bugs||CETAS to Parquet failures in small resource classes on Data warehouses of DW2000 and more - This fix correctly identifies a null reference in the Create External Table As to Parquet code path.
Identity column value might lose in some CTAS operation - The value of an identify column may not be preserved when CTASed to another table. Reported in a blog.
Internal failure in some cases when a session is terminated while a query is still running - This fix triggers an InvalidOperationException if a session is terminated when the query is still running.
(Deployed in November 2018) Customers were experiencing a suboptimal performance when attempting to load multiple small files from ADLS (Gen1) using Polybase. - System performance was bottlenecked during AAD security token validation. Performance problems were mitigated by enabling caching of security tokens.