Optimize performance by upgrading SQL Data Warehouse

Upgrade Azure SQL Data Warehouse to latest generation of Azure hardware and storage architecture.

Why upgrade?

You can now seamlessly upgrade to the SQL Data Warehouse Compute Optimized Gen2 tier in the Azure portal. If you have a Compute Optimized Gen1 tier data warehouse, upgrading is recommended. By upgrading, you can use the latest generation of Azure hardware and enhanced storage architecture. You can take advantage of faster performance, higher scalability, and unlimited columnar storage.

Applies to

This upgrade applies to Compute Optimized Gen1 tier data warehouses.

Sign in to the Azure portal

Sign in to the Azure portal.

Before you begin

Note

If your existing Compute Optimized Gen1 tier data warehouse is not in a region where the Compute Optimized Gen2 tier is available, you can geo-restore through PowerShell to a supported region.

  1. If the Compute Optimized Gen1 tier data warehouse to be upgraded is paused, resume the data warehouse.

  2. Be prepared for a few minutes of downtime.

  3. Identify any code references to Compute Optimized Gen1 performance levels and modify them to their equivalent Compute Optimized Gen2 performance level. Below are two examples of where you should update code references before upgrading:

    Original Gen1 PowerShell command:

    Set-AzureRmSqlDatabase -ResourceGroupName "myResourceGroup" -DatabaseName "mySampleDataWarehouse" -ServerName "mynewserver-20171113" -RequestedServiceObjectiveName "DW300"
    

    Modified to:

    Set-AzureRmSqlDatabase -ResourceGroupName "myResourceGroup" -DatabaseName "mySampleDataWarehouse" -ServerName "mynewserver-20171113" -RequestedServiceObjectiveName "DW300c"
    

    Note

    -RequestedServiceObjectiveName "DW300" is changed to - RequestedServiceObjectiveName "DW300c"

    Original Gen1 T-SQL command:

    ALTER DATABASE mySampleDataWarehouse MODIFY (SERVICE_OBJECTIVE = 'DW300') ;
    

    Modified to:

    ALTER DATABASE mySampleDataWarehouse MODIFY (SERVICE_OBJECTIVE = 'DW300c') ; 
    

    Note

    SERVICE_OBJECTIVE = 'DW300' is changed to SERVICE_OBJECTIVE = 'DW300c'

Start the upgrade

  1. Go to your Compute Optimized Gen1 tier data warehouse in the Azure portal and click on the Upgrade to Gen2 card under the Tasks tab: Upgrade_1

    Note

    If you do not see the Upgrade to Gen2 card under the Tasks tab, your subscription type is limited in the current region. Submit a support ticket to get your subscription whitelisted.

  2. By default, select the suggested performance level for the data warehouse based on your current performance level on Compute Optimized Gen1 tier by using the mapping below:

    Compute Optimized Gen1 tier Compute Optimized Gen2 tier
    DW100 DW100c
    DW200 DW200c
    DW300 DW300c
    DW400 DW400c
    DW500 DW500c
    DW600 DW500c
    DW1000 DW1000c
    DW1200 DW1000c
    DW1500 DW1500c
    DW2000 DW2000c
    DW3000 DW3000c
    DW6000 DW6000c
  3. Ensure your workload has completed running and quiesced before upgrading. You will experience downtime for a few minutes before your data warehouse is back online as a Compute Optimized Gen2 tier data warehouse. Click Upgrade:

    Upgrade_2

  4. Monitor your upgrade by checking the status in the Azure portal:

    Upgrade3

    The first step of the upgrade process goes through the scale operation ("Upgrading - Offline") where all sessions will be killed, and connections will be dropped.

    The second step of the upgrade process is data migration ("Upgrading - Online"). Data migration is an online trickle background process, which slowly moves columnar data from the old storage architecture to the new storage architecture leveraging a local SSD cache. During this time, your data warehouse will be online for querying and loading. All your data will be available to query regardless of whether it has been migrated or not. The data migration happens at a varying rate depending on your data size, your performance level, and the number of your columnstore segments.

  5. Optional Recommendation: To expedite the data migration background process, you can immediately force data movement by running Alter Index rebuild on all primary columnstore tables you'd be querying at a larger SLO and resource class. This operation is offline compared to the trickle background process which can take hours to complete depending on the number and sizes of your tables; however, data migration will be much quicker where you can then take full advantage of the new enhanced storage architecture once complete with high-quality rowgroups.

The following query generates the required Alter Index Rebuild commands to expedite the data migration process:

SELECT 'ALTER INDEX [' + idx.NAME + '] ON [' 
       + Schema_name(tbl.schema_id) + '].[' 
       + Object_name(idx.object_id) + '] REBUILD ' + ( CASE 
                                                         WHEN ( 
                                                     (SELECT Count(*) 
                                                      FROM   sys.partitions 
                                                             part2 
                                                      WHERE  part2.index_id 
                                                             = idx.index_id 
                                                             AND 
                                                     idx.object_id = 
                                                     part2.object_id) 
                                                     > 1 ) THEN 
              ' PARTITION = ' 
              + Cast(part.partition_number AS NVARCHAR(256)) 
              ELSE '' 
                                                       END ) + '; SELECT ''[' + 
              idx.NAME + '] ON [' + Schema_name(tbl.schema_id) + '].[' + 
              Object_name(idx.object_id) + '] ' + ( 
              CASE 
                WHEN ( (SELECT Count(*) 
                        FROM   sys.partitions 
                               part2 
                        WHERE 
                     part2.index_id = 
                     idx.index_id 
                     AND idx.object_id 
                         = part2.object_id) > 1 ) THEN 
              ' PARTITION = ' 
              + Cast(part.partition_number AS NVARCHAR(256)) 
              + ' completed'';' 
              ELSE ' completed'';' 
                                                    END ) 
FROM   sys.indexes idx 
       INNER JOIN sys.tables tbl 
               ON idx.object_id = tbl.object_id 
       LEFT OUTER JOIN sys.partitions part 
                    ON idx.index_id = part.index_id 
                       AND idx.object_id = part.object_id 
WHERE  idx.type_desc = 'CLUSTERED COLUMNSTORE'; 

Next steps

Your upgraded data warehouse is online. To take advantage of the enhanced architecture, see Resource classes for Workload Management.