CREATE MATERIALIZED VIEW AS SELECT (Transact-SQL)

Applies to: yesAzure Synapse Analytics

This article explains the CREATE MATERIALIZED VIEW AS SELECT T-SQL statement in Azure Synapse Analytics (SQL Data Warehouse) for developing solutions. The article also provides code examples.

A Materialized View persists the data returned from the view definition query and automatically gets updated as data changes in the underlying tables. It improves the performance of complex queries (typically queries with joins and aggregations) while offering simple maintenance operations. With its execution plan automatching capability, a materialized view does not have to be referenced in the query for the optimizer to consider the view for substitution. This capability allows data engineers to implement materialized views as a mechanism for improving query response time, without having to change queries.

Topic link icon Transact-SQL Syntax Conventions

Syntax

CREATE MATERIALIZED VIEW [ schema_name. ] materialized_view_name
    WITH (  
      <distribution_option>
    )
    AS <select_statement>
[;]

<distribution_option> ::=
    {  
        DISTRIBUTION = HASH ( distribution_column_name )  
      | DISTRIBUTION = ROUND_ROBIN  
    }

<select_statement> ::=
    SELECT select_criteria

Note

This syntax is not supported by SQL on-demand (preview) in Azure Synapse Analytics.

Arguments

schema_name
Is the name of the schema to which the view belongs.

materialized_view_name
Is the name of the view. View names must follow the rules for identifiers. Specifying the view owner name is optional.

distribution option
Only HASH and ROUND_ROBIN distributions are supported.

select_statement
The SELECT list in the materialized view definition needs to meet at least one of these two criteria:

  • The SELECT list contains an aggregate function.
  • GROUP BY is used in the Materialized view definition and all columns in GROUP BY are included in the SELECT list. Up to 32 columns can be used in the GROUP BY clause.

Aggregate functions are required in the SELECT list of the materialized view definition. Supported aggregations include MAX, MIN, AVG, COUNT, COUNT_BIG, SUM, VAR, STDEV.

When MIN/MAX aggregates are used in the SELECT list of materialized view definition, following requirements apply:

  • FOR_APPEND is required. For example:

    CREATE MATERIALIZED VIEW mv_test2  
    WITH (distribution = hash(i_category_id), FOR_APPEND)  
    AS
    SELECT MAX(i.i_rec_start_date) as max_i_rec_start_date, MIN(i.i_rec_end_date) as min_i_rec_end_date, i.i_item_sk, i.i_item_id, i.i_category_id
    FROM syntheticworkload.item i  
    GROUP BY i.i_item_sk, i.i_item_id, i.i_category_id
    
  • The materialized view will be disabled when an UPDATE or DELETE occurs in the referenced base tables.  This restriction doesn't apply to INSERTs.  To re-enable the materialized view, run ALTER MATERIALIZED VIEW with REBUILD.

Remarks

A materialized view in Azure data warehouse is similar to an indexed view in SQL Server.  It shares almost the same restrictions as indexed view (see Create Indexed Views for details) except that a materialized view supports aggregate functions.  

Note

Although CREATE MATERIALIZED VIEW does not support COUNT, DISTINCT, COUNT(DISTINCT expression), or COUNT_BIG (DISTINCT expression), SELECT queries with these functions can still benefit from materialized views for faster performance as the Synapse SQL optimizer can automatically re-write those aggregations in the user query to match existing materialized views. For details, check this article's example section.

APPROX_COUNT_DISTINCT is not supported in CREATE MATERIALIZED VIEW AS SELECT.

Only CLUSTERED COLUMNSTORE INDEX is supported by materialized view.

A materialized view cannot reference other views.

A materialized view can't be created on a table with dynamic data masking (DDM), even if the DDM column is not part of the materialized view. If a table column is part of an active materialized view or a disabled materialized view, DDM can't be added to this column.

A materialized view can't be created on a table with row level security enabled.

Materialized Views can be created on partitioned tables.  Partition SPLIT/MERGE are supported on materialized views base tables, partition SWITCH isn't supported.

ALTER TABLE SWITCH is not supported on tables that are referenced in materialized views. Disable or drop the materialized views before using ALTER TABLE SWITCH. In the following scenarios, the materialized view creation requires new columns to be added to the materialized view:

Scenario New columns to add to materialized view Comment
COUNT_BIG() is missing in the SELECT list of a materialized view definition COUNT_BIG (*) Automatically added by materialized view creation. No user action is required.
SUM(a) is specified by users in the SELECT list of a materialized view definition AND 'a' is a nullable expression COUNT_BIG (a) Users need to add the expression 'a' manually in the materialized view definition.
AVG(a) is specified by users in the SELECT list of a materialized view definition where 'a' is an expression. SUM(a), COUNT_BIG(a) Automatically added by materialized view creation. No user action is required.
STDEV(a) is specified by users in the SELECT list of a materialized view definition where 'a' is an expression. SUM(a), COUNT_BIG(a), SUM(square(a)) Automatically added by materialized view creation. No user action is required.

Once created, materialized views are visible within SQL Server Management Studio under the views folder of the Azure Synapse Analytics (SQL Data Warehouse) instance.

Users can run SP_SPACEUSED and DBCC PDW_SHOWSPACEUSED to determine the space being consumed by a materialized view.

A materialized view can be dropped via DROP VIEW. You can use ALTER MATERIALIZED VIEW to disable or rebuild a materialized view.

EXPLAIN plan and the graphical Estimated Execution Plan in SQL Server Management Studio can show whether a materialized view is considered by the query optimizer for query execution. and the graphical Estimated Execution Plan in SQL Server Management Studio can show whether a materialized view is considered by the query optimizer for query execution.

To find out if a SQL statement can benefit from a new materialized view, run the EXPLAIN command with WITH_RECOMMENDATIONS. For details, see EXPLAIN (Transact-SQL).

Permissions

Requires 1) REFERENCES and CREATE VIEW permission OR 2) CONTROL permission on the schema in which the view is being created.

Example

A. This example shows how Synapse SQL optimizer automatically uses materialized views to execute a query for better performance even when the query uses functions un-supported in CREATE MATERIALIZED VIEW, such as COUNT(DISTINCT expression). A query used to take multiple seconds to complete now finishes in sub-second without any change in the user query.


-- Create a table with ~536 million rows
create table t(a int not null, b int not null, c int not null) with (distribution=hash(a), clustered columnstore index);

insert into t values(1,1,1);

declare @p int =1;
while (@P < 30)
    begin
    insert into t select a+1,b+2,c+3 from t;  
    select @p +=1;
end

-- A SELECT query with COUNT_BIG (DISTINCT expression) took multiple seconds to complete and it reads data directly from the base table a. 
select a, count_big(distinct b) from t group by a;

-- Create two materialized views, not using COUNT_BIG(DISTINCT expression).
create materialized view V1 with(distribution=hash(a)) as select a, b from dbo.t group by a, b;

-- Clear all cache.

DBCC DROPCLEANBUFFERS;
DBCC freeproccache;

-- Check the estimated execution plan in SQL Server Management Studio.  It shows the SELECT query is first step (GET operator) is to read data from the materialized view V1, not from base table a.
select a, count_big(distinct b) from t group by a;

-- Now execute this SELECT query.  This time it took sub-second to complete because Synapse SQL engine automatically matches the query with materialized view V1 and uses it for faster query execution.  There was no change in the user query.

DECLARE @timerstart datetime2, @timerend datetime2;
SET @timerstart = sysdatetime();

select a, count_big(distinct b) from t group by a;

SET @timerend = sysdatetime()
select DATEDIFF(ms,@timerstart,@timerend);

See also

Performance tuning with Materialized View
ALTER MATERIALIZED VIEW (Transact-SQL)
DROP VIEW
EXPLAIN (Transact-SQL)
sys.pdw_materialized_view_column_distribution_properties (Transact-SQL)
sys.pdw_materialized_view_distribution_properties (Transact-SQL)
sys.pdw_materialized_view_mappings (Transact-SQL)
DBCC PDW_SHOWMATERIALIZEDVIEWOVERHEAD (Transact-SQL)
Azure Synapse Analytics (SQL Data Warehouse) and Parallel Data Warehouse Catalog Views
System views supported in Azure Azure Synapse Analytics (SQL Data Warehouse)
T-SQL statements supported in Azure Azure Synapse Analytics (SQL Data Warehouse)