Backup and restore in Azure SQL Data Warehouse
Learn how to use backup and restore in Azure SQL Data Warehouse. Use data warehouse restore points to recover or copy your data warehouse to a previous state in the primary region. Use data warehouse geo-redundant backups to restore to a different geographical region.
What is a data warehouse snapshot
A data warehouse snapshot creates a restore point you can leverage to recover or copy your data warehouse to a previous state. Since SQL Data Warehouse is a distributed system, a data warehouse snapshot consists of many files that are located in Azure storage. Snapshots capture incremental changes from the data stored in your data warehouse.
A data warehouse restore is a new data warehouse that is created from a restore point of an existing or deleted data warehouse. Restoring your data warehouse is an essential part of any business continuity and disaster recovery strategy because it re-creates your data after accidental corruption or deletion. Data warehouse is also a powerful mechanism to create copies of your data warehouse for test or development purposes. SQL Data Warehouse uses fast restore mechanisms within the same region which has been measured to take less than 20 minutes for any data size.
Automatic Restore Points
Snapshots are a built-in feature of the service that creates restore points. You do not have to enable this capability. Automatic restore points currently cannot be deleted by users where the service uses these restore points to maintain SLAs for recovery.
SQL Data Warehouse takes snapshots of your data warehouse throughout the day creating restore points that are available for seven days. This retention period cannot be changed. SQL Data Warehouse supports an eight-hour recovery point objective (RPO). You can restore your data warehouse in the primary region from any one of the snapshots taken in the past seven days.
To see when the last snapshot started, run this query on your online SQL Data Warehouse.
select top 1 * from sys.pdw_loader_backup_runs order by run_id desc ;
User-Defined Restore Points
This feature enables you to manually trigger snapshots to create restore points of your data warehouse before and after large modifications. This capability ensures that restore points are logically consistent, which provides additional data protection in case of any workload interruptions or user errors for quick recovery time. User-defined restore points are available for seven days and are automatically deleted on your behalf. You cannot change the retention period of user-defined restore points. 42 user-defined restore points are guaranteed at any point in time so they must be deleted before creating another restore point. You can trigger snapshots to create user-defined restore points through PowerShell or the Azure portal.
If you require restore points longer than 7 days, please vote for this capability here. You can also create a user-defined restore point and restore from the newly created restore point to a new data warehouse. Once you have restored, you have the data warehouse online and can pause it indefinitely to save compute costs. The paused database incurs storage charges at the Azure Premium Storage rate. If you need an active copy of the restored data warehouse, you can resume which should take only a few minutes.
Restore point retention
The following lists details for restore point retention periods:
- SQL Data Warehouse deletes a restore point when it hits the 7-day retention period and when there are at least 42 total restore points (including both user-defined and automatic)
- Snapshots are not taken when a data warehouse is paused
- The age of a restore point is measured by the absolute calendar days from the time the restore point is taken including when the data warehouse is paused
- At any point in time, a data warehouse is guaranteed to be able to store up to 42 user-defined restore points and 42 automatic restore points as long as these restore points have not reached the 7-day retention period
- If a snapshot is taken, the data warehouse is then paused for greater than 7 days, and then resumes, it is possible for restore point to persist until there are 42 total restore points (including both user-defined and automatic)
Snapshot retention when a data warehouse is dropped
When you drop a data warehouse, SQL Data Warehouse creates a final snapshot and saves it for seven days. You can restore the data warehouse to the final restore point created at deletion.
If you delete a logical SQL server instance, all databases that belong to the instance are also deleted and cannot be recovered. You cannot restore a deleted server.
Geo-backups and disaster recovery
SQL Data Warehouse performs a geo-backup once per day to a paired data center. The RPO for a geo-restore is 24 hours. You can restore the geo-backup to a server in any other region where SQL Data Warehouse is supported. A geo-backup ensures you can restore data warehouse in case you cannot access the restore points in your primary region.
Geo-backups are on by default. If your data warehouse is Gen1, you can opt out if you wish. You cannot opt out of geo-backups for Gen2 as data protection is a built-in guaranteed.
If you require a shorter RPO for geo-backups, vote for this capability here. You can also create a user-defined restore point and restore from the newly created restore point to a new data warehouse in a different region. Once you have restored, you have the data warehouse online and can pause it indefinitely to save compute costs. The paused database incurs storage charges at the Azure Premium Storage rate. Should you need an active copy of the data warehouse, you can resume which should take only a few minutes.
Backup and restore costs
You will notice the Azure bill has a line item for Storage and a line item for Disaster Recovery Storage. The Storage charge is the total cost for storing your data in the primary region along with the incremental changes captured by snapshots. For a more detailed explanation of how snapshots are charged, refer to Understanding how Snapshots Accrue Charges. The geo-redundant charge covers the cost for storing the geo-backups.
The total cost for your primary data warehouse and seven days of snapshot changes is rounded to the nearest TB. For example, if your data warehouse is 1.5 TB and the snapshots captures 100 GB, you are billed for 2 TB of data at Azure Premium Storage rates.
If you are using geo-redundant storage, you receive a separate storage charge. The geo-redundant storage is billed at the standard Read-Access Geographically Redundant Storage (RA-GRS) rate.
Restoring from restore points
Each snapshot creates a restore point that represents the time the snapshot started. To restore a data warehouse, you choose a restore point and issue a restore command.
You can either keep the restored data warehouse and the current one, or delete one of them. If you want to replace the current data warehouse with the restored data warehouse, you can rename it using ALTER DATABASE (Azure SQL Data Warehouse) with the MODIFY NAME option.
To restore a deleted or paused data warehouse, you can create a support ticket.
Cross subscription restore
If you need to directly restore across subscription, vote for this capability here. Restore to a different logical server and 'Move' the server across subscriptions to perform a cross subscription restore.
You can restore your data warehouse to any region supporting SQL Data Warehouse at your chosen performance level.
To perform a geo-redundant restore you must not have opted out of this feature.
For more information about disaster planning, see Business continuity overview
We'd love to hear your thoughts. Choose the type you'd like to provide:
Our feedback system is built on GitHub Issues. Read more on our blog.