AlwaysOn: Value Proposition of Readable Secondary
This series of blogs is targeting AlwaysON Readable Secondary. Please refer to http://blogs.msdn.com/b/sqlalwayson/archive/tags/availability+group/for other blogs in the related area
Readable Secondary is part of AlwaysOn suite of functionality available in SQL12 release. It allows customers to leverage their investment in High Availability hardware for offloading read workload such as reporting to one or more secondary replicas. Offloading reporting workload to secondary replica frees up resources on the primary node for primary application workload to achieve higher throughput and at the same time allowing resources on secondary replica to reporting workload to deliver higher performance. So it is a win-win situation both for primary and reporting workload.
Before we proceed further, it will be useful to look at the technology choices available to SQL customers today to offload read workload. This will provide us a good insight into the unique value proposition of Readable Secondary.
SQL Server 2008 offers following four High Availability choices to customers
- Failover Clustering: This is shared disk environment and since only one node can have access to the database at one time, the second node cannot be used offloading read workload. The customers can increase the utilization of the HA hardware by deploying Active-Active cluster but not for offloading the read workload.
- Database Mirroring: It is used for HA but it also provides read access to the data by creating a database snapshot on the mirror. The read workload can then run against the database snapshot. While many customers have used database snapshot for running reporting workload but it has some challenges as follows
- The name of the database in database snapshot is different so if your application has hard coded database name, it will require modifications.
- It is a static snapshot of the database. If an application needs to access the data that has changed since, a new database snapshot needs to be created with a new name unless you drop the old database snapshot. In other words, customers want accessing data in near real-time but it is hard, if not impossible, to achieve using database snapshot. However, if a reporting workload processes, for example, last week’s data, you can create a database snapshot on the weekly boundary and run your reporting workload.
- Performance: Database snapshot employs copy on-write technology. When a database page is first modified after creating a database snapshot, a copy of the page made. This can significant overhead if there are multiple database snapshot. Also, during query processing it randomizes IO
- Log Shipping: Log shipping can be thought of as manual database mirroring with the fail over time dependent on how frequently log backups are done and restored on the log shipped node. DBAs can open the Log Shipping destination for read workload with the added limitation that log backup cannot be restored while the Log Shipping destination is open for read workload. This can put DBAs into a fix as they need to choose between HA and the read workload. If they reduce the frequency of the log backup restores, it will increase the RTO (recovery time objective) and if the increase the frequency, they will need kill all currently running queries. This does not work for many customers especially if they have long running queries as they may never run to completion due to HA constraints. One can get around this by adding one more log shipping destination (i.e. more hardware and software license cost) where log backups can be restored at a lesser frequency assuming that we will be able to keep up with the log.
- Replication: This is commonly deployed solution for HA and for offloading read workload. One of the key benefit of replication is that customer can create reporting workload specific indexes on the subscriber node. The challenges here are (a) may require schema changes (e.g. unique key) (b) not very suitable for high transaction throughput.
Readable Secondary addresses the challenges outlined with previous High Availability solutions with the exception of the ability to create reporting workload specific index(s) similar to the ones allowed by transaction replication. Readable Secondary allows read workload to run concurrently with the recovery thread that is applying the transactional logs from the primary. This allows reporting workload to access the data changes “live” as they occur on the primary replica and subsequently applied by the REDO thread on the secondary replica. The reporting workload is run without any changes as it is the same database on secondary replica as it was on the primary.
In the subsequent blogs I will describe how to setup readable secondary, configure application connectivity and ensuring predictable performance both for primary and secondary workloads