When migrating your code from another database to SQL Data Warehouse, you will most likely need to make changes to your code base. Some SQL Data Warehouse features can significantly improve performance as they are designed to work in a distributed fashion. However, to maintain performance and scale, some features are also not available.
Common T-SQL Limitations
The following list summarizes the most common feature which are not supported in Azure SQL Data Warehouse. The links take you to workarounds for the unsupported feature:
- ANSI joins on updates
- ANSI joins on deletes
- merge statement
- cross-database joins
- output clause
- inline user-defined functions
- multi-statement functions
- common table expressions
- [recursive common table expressions (CTE)](#Recursive-common-table-expressions-(CTE)
- CLR functions and procedures
- $partition function
- table variables
- table value parameters
- distributed transactions
- commit / rollback work
- save transaction
- execution contexts (EXECUTE AS)
- group by clause with rollup / cube / grouping sets options
- nesting levels beyond 8
- updating through views
- use of select for variable assignment
- no MAX data type for dynamic SQL strings
Fortunately most of these limitations can be worked around. Explanations are provided in the relevant development articles referenced above.
Supported CTE features
Common table expressions (CTEs) are partially supported in SQL Data Warehouse. The following CTE features are currently supported:
- A CTE can be specified in a SELECT statement.
- A CTE can be specified in a CREATE VIEW statement.
- A CTE can be specified in a CREATE TABLE AS SELECT (CTAS) statement.
- A CTE can be specified in a CREATE REMOTE TABLE AS SELECT (CRTAS) statement.
- A CTE can be specified in a CREATE EXTERNAL TABLE AS SELECT (CETAS) statement.
- A remote table can be referenced from a CTE.
- An external table can be referenced from a CTE.
- Multiple CTE query definitions can be defined in a CTE.
Common table expressions have some limitations in SQL Data Warehouse including:
- A CTE must be followed by a single SELECT statement. INSERT, UPDATE, DELETE, and MERGE statements are not supported.
- A common table expression that includes references to itself (a recursive common table expression) is not supported (see below section).
- Specifying more than one WITH clause in a CTE is not allowed. For example, if a CTE_query_definition contains a subquery, that subquery cannot contain a nested WITH clause that defines another CTE.
- An ORDER BY clause cannot be used in the CTE_query_definition, except when a TOP clause is specified.
- When a CTE is used in a statement that is part of a batch, the statement before it must be followed by a semicolon.
- When used in statements prepared by sp_prepare, CTEs will behave the same way as other SELECT statements in PDW. However, if CTEs are used as part of CETAS prepared by sp_prepare, the behavior can defer from SQL Server and other PDW statements because of the way binding is implemented for sp_prepare. If SELECT that references CTE is using a wrong column that does not exist in CTE, the sp_prepare will pass without detecting the error, but the error will be thrown during sp_execute instead.
Recursive CTEs are not supported in SQL Data Warehouse. The migraion of recursive CTE can be somewhat complete and the best process is to break down the into multiple steps. You can typically use a loop and populate a temporary table as you iterate over the recursive interim queries. Once the temporary table is populated you can then return the data as a single result set. A similar approach has been used to solve
GROUP BY WITH CUBE in the group by clause with rollup / cube / grouping sets options article.
Unsupported system functions
There are also some system functions that are not supported. Some of the main ones you might typically find used in data warehousing are:
Some of these issues can be worked around.
To work around lack of support for @@ROWCOUNT, create a stored procedure that will retrieve the last row count from sys.dm_pdw_request_steps and then execute
EXEC LastRowCount after a DML statement.
CREATE PROCEDURE LastRowCount AS WITH LastRequest as ( SELECT TOP 1 request_id FROM sys.dm_pdw_exec_requests WHERE session_id = SESSION_ID() AND resource_class IS NOT NULL ORDER BY end_time DESC ), LastRequestRowCounts as ( SELECT step_index, row_count FROM sys.dm_pdw_request_steps WHERE row_count >= 0 AND request_id IN (SELECT request_id from LastRequest) ) SELECT TOP 1 row_count FROM LastRequestRowCounts ORDER BY step_index DESC ;
For a complete list of all supported T-SQL statements, see Transact-SQL topics.