Performing disaster recovery drills
Validate your application's readiness for the recovery workflow periodically. Verifying how the application behaves, and what data loss or disruption a failover involves, is good engineering practice. Most industry standards also require it as part of business continuity certification.
A disaster recovery drill consists of:
- Simulating a data tier outage
- Recovering the database
- Validating application integrity after recovery
Depending on how you designed your application for business continuity, the workflow to execute the drill can vary. This article describes the best practices for conducting a disaster recovery drill in the context of Azure SQL Database.
When the drill relies on geo-restore, run it in a test environment to avoid potential data loss: create a copy of the production environment and use it to verify the application’s failover workflow.
To simulate the outage, you can rename the source database. This name change causes application connectivity failures.
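As a rough illustration of the rename step, the sketch below builds an Azure CLI `az sql db rename` invocation from Python. The resource group, server, and database names and the `-drill` suffix are hypothetical placeholders, and the helper only executes the command when `dry_run` is set to `False`. Treat it as one possible approach under those assumptions, intended for the test copy of the environment, rather than a prescribed procedure.

```python
import subprocess
from typing import List


def build_rename_command(resource_group: str, server: str,
                         database: str, new_name: str) -> List[str]:
    """Return the Azure CLI argument list that renames a database."""
    return [
        "az", "sql", "db", "rename",
        "--resource-group", resource_group,
        "--server", server,
        "--name", database,
        "--new-name", new_name,
    ]


def simulate_outage(resource_group: str, server: str, database: str,
                    suffix: str = "-drill", dry_run: bool = True) -> List[str]:
    """Rename the source database so the application can no longer find it.

    With dry_run=True (the default) the command is only built and returned,
    which makes it easy to review before running it against a test server.
    """
    cmd = build_rename_command(resource_group, server, database,
                               database + suffix)
    if not dry_run:
        # Requires the Azure CLI to be installed and logged in.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical names; prints the command without executing it.
    print(" ".join(simulate_outage("rg-drill", "sql-drill-test", "appdb")))
```

Renaming the database back to its original name with the same helper ends the simulated outage once the drill is complete.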
To recover:
- Perform a geo-restore of the database to a different server.
- Change the application configuration to connect to the recovered database and follow the Configure a database after recovery guide to complete the recovery.
Complete the drill by verifying application integrity after recovery, including connection strings, logins, basic functionality testing, and any other validations that are part of the standard application sign-off procedures.
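The validation pass can be organized as a small checklist runner. The sketch below is a minimal example, not part of any Azure tooling: `run_drill_checks` and `drill_passed` are hypothetical helper names, and each check is a caller-supplied function that returns `True` on success (for example, one that opens a connection with the updated connection string, or one that signs in with an application login).

```python
from typing import Callable, Dict


def run_drill_checks(checks: Dict[str, Callable[[], bool]]) -> Dict[str, str]:
    """Run every named check and record 'pass', 'fail', or 'error: ...'.

    A check that raises is recorded as an error instead of aborting the run,
    so one broken validation does not hide the results of the others.
    """
    results: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            results[name] = "pass" if check() else "fail"
        except Exception as exc:  # report the failure and keep going
            results[name] = f"error: {exc}"
    return results


def drill_passed(results: Dict[str, str]) -> bool:
    """The drill is signed off only when every check passed."""
    return all(outcome == "pass" for outcome in results.values())


if __name__ == "__main__":
    # Placeholder checks; replace the lambdas with real validations.
    report = run_drill_checks({
        "connection string updated": lambda: True,
        "application login works": lambda: True,
        "basic functionality": lambda: True,
    })
    for name, outcome in report.items():
        print(f"{name}: {outcome}")
    print("drill passed" if drill_passed(report) else "drill failed")
```

Keeping the checks as plain functions lets the same list be reused for both the geo-restore and the failover group variants of the drill.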
For a database that is protected using failover groups, the drill involves a planned failover to the secondary server. A planned failover ensures that the primary and secondary databases in the failover group remain in sync when the roles are switched. Unlike an unplanned failover, this operation does not result in data loss, so the drill can be performed in the production environment.
To simulate the outage, you can disable the web application or virtual machine connected to the database. This outage simulation results in connectivity failures for the web clients.
To recover:
- Make sure the application configuration in the DR region points to the former secondary, which becomes the fully accessible new primary.
- Initiate a planned failover of the failover group from the secondary server.
- Follow the Configure a database after recovery guide to complete the recovery.
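The planned failover in the steps above can be scripted. The following sketch assumes the Azure CLI is installed and uses `az sql failover-group set-primary`, issued against the secondary server. The resource group, server, and failover group names are hypothetical. The `--allow-data-loss` flag is deliberately left out so that the operation stays a planned failover.

```python
import subprocess
from typing import List


def build_planned_failover_command(resource_group: str, secondary_server: str,
                                   failover_group: str) -> List[str]:
    """Return the Azure CLI arguments that make the secondary the new primary.

    --allow-data-loss is intentionally omitted: without it the failover is
    planned and waits for the databases to be in sync before switching roles.
    """
    return [
        "az", "sql", "failover-group", "set-primary",
        "--resource-group", resource_group,
        "--server", secondary_server,
        "--name", failover_group,
    ]


def planned_failover(resource_group: str, secondary_server: str,
                     failover_group: str, dry_run: bool = True) -> List[str]:
    """Build the command and, unless dry_run is set, execute it."""
    cmd = build_planned_failover_command(resource_group, secondary_server,
                                         failover_group)
    if not dry_run:
        # Requires the Azure CLI to be installed and logged in.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Hypothetical names; prints the command without executing it.
    print(" ".join(planned_failover("rg-prod", "sql-secondary", "fg-app")))
```

Running the same helper against the original primary server after the drill switches the roles back.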
Complete the drill by verifying application integrity after recovery, including connectivity, basic functionality testing, and any other validations required for the drill sign-off.
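For the connectivity part of that validation, a basic TCP probe is often a useful first check. The sketch below is a minimal example using only the Python standard library. `can_reach` is a hypothetical helper name, and 1433 is the default port for SQL Database endpoints. It confirms only that the endpoint accepts TCP connections; it does not verify logins or query behavior.

```python
import socket


def can_reach(host: str, port: int = 1433, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Any socket-level failure (DNS error, refused connection, timeout) is
    treated as "not reachable" rather than raised to the caller.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A drill script might call `can_reach` with the server's fully qualified name (or the failover group listener name) before and after the failover to confirm that clients lose and then regain connectivity.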
- To learn about business continuity scenarios, see Continuity scenarios.
- To learn about Azure SQL Database automated backups, see SQL Database automated backups.
- To learn about using automated backups for recovery, see restore a database from the service-initiated backups.
- To learn about faster recovery options, see Active geo-replication and Auto-failover groups.