Microsoft Windows Azure Disaster Recovery Options for On-Premises SQL Server

One of the use-cases for a cloud solution is to serve as a Disaster Recovery option for your on-premises servers. I’ll explain one particular use-case in this entry, specifically using Windows Azure “IaaS” or Virtual Machines as a Recovery Solution for SQL Server (more detail here: In future installments I’ll explain options for other workloads such as Linux and Windows Servers, SharePoint and other solutions. Some architectures also allow for using Windows Azure SQL Database (Formerly SQL Azure) in recovery scenarios; I’ll cover that separately.

Using Azure as a Disaster Recovery site gives you a range of options, uses world-wide datacenters that you can pick from, and does not require traditional licensing and maintenance paths. You can also integrate the offsite data into other uses, such as reporting (in some cases) or to leverage within other applications.  However, the cost-model is different, so make sure you do your homework to ensure that it makes sense to use a cloud provider for safety. You may find that it is cheaper, more expensive, or that you require a mix of technologies and options to get the best solution.

NOTE: The Microsoft Windows Azure platform evolves constantly. That means new features and capabilities, as well as security, optimizations and more improve on a frequent basis. As with any cloud provider, ensure that you check the date of this post to ensure you are within six months or so. If the date is longer than that, then check each of the “Details” links to ensure you are working with the latest information.

The options you have range from simple off-site storage for database backups to systems that your users can access when your primary options are offline.  To select which options to use, evaluate the databases you want to protect, and then create your Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO) for each workload. Those two vectors will provide the starting point for each choice you make.

NOTE: If you’re not familiar with RPO and RTO on a database system, learn those terms carefully before designing a recovery solution – on any platform. RPO and RTO are business/technology terms, and are not vendor or platform-specific.  

The range of protection you have is very similar to the on-premises options for SQL Server (on-premises details here:, with the primary limitation being bandwidth. While Microsoft has the largest connections we can get into our datacenters, depending on where your systems are and their connection to the Internet, you will need to consider how much data you transfer, and how often.  For backup files, a single, larger transfer is acceptable, using Log Shipping or Database Mirroring, smaller, more frequent transfers are preferable.

Another limitation is controlling the hardware on the Windows Azure Virtual Machine. That means hardware-based clustering isn’t possible, as of this writing. You’re also limited to the size of the Virtual Machines that Windows Azure (or any other cloud provider) offers. It’s important to keep in mind that you’re building a Disaster Recovery solution, not necessarily a full Highly-Available system. The difference is that in this case DR provides a means to recover and operate at a more limited fashion than a full on-premises HA (with matching hardware and licenses) involves. Storage, however, isn’t as affected. You can mount large amounts of storage on a Windows Azure Virtual Machine, so it’s more memory and CPU that you need to consider for your solution.

The final consideration is security. There are two aspects in security that you need to consider: data security and authentication and access. For the first consideration, the Windows Azure system does hold multiple certifications and attestations that you can find here:  . In some cases those certifications are agreements on the part of security each party will hold liability for; so it’s important to carefully read and understand what the agreement states. There are also methods of encrypting data (such as the backups) using your own certificates or hardware devices and then storing them externally. This means no one can easily un-encrypt your data.

For the authentication portion, you can create a secure “tunnel”  between your network and Windows Azure. This involves a certificate that is installed on your hardware firewall at your facility, and an agent that is enabled with the same certificate on Windows Azure. This gives you a “point to point” connection, encrypted but over a public connection. From there you can use Active Directory to connect the authentication for the systems involved in the DR solution.


The First and most simple DR solution using Windows Azure is to store your backup files ( *.bak) in Windows Azure storage. Windows Azure Storage is triple-redundant across multiple fault-domains within a single datacenter, and then all three copies are replicated to a geographically separate (although data-sovereignty same) location. That translates to six copies of data stored remotely. In case of a disaster, you connect to storage, download the images, and restore them to a new server. The server can have the same name or different, and unless you’re using contained databases, you’ll need to re-create and re-authorize the security accounts needed for the database.

Note that you also have the option of using an “appliance”, which is a piece of hardware you install at your facility which will act as a backup device or share location (or both). The device handles the encryption, de-duplication and compression for the files, and then stores those files on Windows Azure. More information on that option is here:

RPO: As of last backup

RTO: (Time of transfer from Windows Azure + Time of Restore to New System + Bringing System Online with User Accounts) - Time of Backup


More detail on storing files on Windows Azure: 

Free Client:

Database Mirroring

Database Mirroring is a deprecated feature in SQL Server, which means it will be removed in a future release. It is, however, still supported in SQL Server 2012, and it can be used between on-premises SQL Server Instances and Windows Azure VM’s.  Using connection strings and .NET languages, clients can actually point to the partner server automatically.

The granularity of this solution is at the individual database level.  Machines can retain their individual identities. You can use certificates to connect the systems, or you can use the point-to-point solution and Active Directory.

There are limitations, however. You won’t use a Listener in this configuration, and you’ll be using Asynchronous mode. If you are not running in the same Active Directory, you’ll also need to factor in the time to re-create and tie out those accounts when calculating the RTO value.


RPO: As of last good synchronization

RTO: (Time of failure + Time of client redirect to New System ) - Time of last good synchronization


A complete tutorial on setting up this configuration is here:

Log Shipping

Another feature available for DR in a Hybrid fashion is using Log Shipping, which also protects your system at a database level. Log shipping involves an automated log backup of your database, and the log is copied and then applied at the secondary server. Because the log file is copied to a Windows share, this solution requires both networking access and an Active Directory integration.


RPO: As of last good log backup application to the secondary system

RTO: (Time of failure + Time of manual client redirect to New System + Time of Manual Failover ) - Time of last good log backup


Log Shipping information is here:

AlwaysOn Availability Groups

SQL Server 2012 introduces a new set of features called “AlwaysOn” that encompass many of the HA/DR features in previous releases. One feature within that set is called “Availability Groups”, and with certain caveats that feature is available for a Hybrid on-premises to Windows Azure VM solution.


AlwaysOn requires a Windows Cluster (WFSC), which is where the caveats come into play. You’re able to set up a  multi-subnet WSFC cluster, but you won’t have access to the Availability Group Listener function, so you need to consider the client reconnection.

RPO: As of last good synchronization

RTO: (Time of failure + Time of manual client redirect to New System + Time of Manual Failover ) - Time of last good log backup


A complete tutorial on setting up this configuration is here:

Real-world notes and testing here:

Other Solution Options

Taking an overview approach, you can use other data transfer mechanisms. While these involve more manual coding and architecture, you do have more control. For instance, you could copy the data to multiple locations, platforms and more, and allow reading and manipulations of the data at the destination. You can use code options or even SQL Server Replication (blog on this process is here:

RPO: Varies

RTO: Varies


A whitepaper on the information I've discussed throughout this article and other options is available here:

The “SQL AlwaysOn” Team Blog (where you may find more current information) is here: