High Availability (Windows Server AppFabric Caching)

09/20/2010

When high availability is enabled, a copy of each cached object or region is maintained on a separate cache host. The cache cluster manages maintenance of these copies and supplies them to your application if the primary copies are not available. No code changes are required to make your cache-enabled applications highly available. The following figure illustrates how copies of objects and regions are stored on separate hosts when the high availability feature is enabled.

"Velocity" high availability overview

High Availability Configuration

High availability is configured at the cache level in the cluster configuration settings. Because it is a property of the cache, you can enable it when you first create the cache by running the New-Cache command with the Secondaries parameter set to 1. This tells the cache administration Windows PowerShell cmdlets that you want one secondary copy of each cached object or region. Setting the Secondaries parameter to 0 disables the high availability feature. By default, high availability is disabled when you create a new cache. For more information about editing cache configuration settings, see Edit Cache Configuration Settings with Windows PowerShell (Windows Server AppFabric Caching).

The high availability feature of Windows Server AppFabric caching requires all nodes in the cache cluster to run the Enterprise Edition (or higher) of Windows Server 2008 or Windows Server 2008 R2. Confirm that every cache node that uses high availability is running a supported operating system. For more information about supported operating systems, see the “Software Requirements” section of the AppFabric Installation Guide (https://go.microsoft.com/fwlink/?LinkId=169172).

Secondary Copy Storage

The cache cluster chooses where the secondary copies of objects and regions are stored. Just as AppFabric distributes cached objects across all cache hosts in the cluster, it also distributes the secondary copies of those objects across all cache hosts in the cluster.
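The placement rule described above (primary and secondary copies spread across all hosts, never on the same host) can be illustrated with a minimal sketch. The hashing scheme and the `place_copies` name are assumptions made for illustration only; the source does not describe how AppFabric actually selects hosts.

```python
import hashlib


def place_copies(key, hosts):
    """Return (primary_host, secondary_host) for a key.

    Hypothetical placement rule: AppFabric's real algorithm is not
    documented here. The only property modeled is that the two copies
    always land on different cache hosts.
    """
    if len(hosts) < 2:
        raise ValueError("need at least two cache hosts to hold two copies")
    # Stable hash so the same key always maps to the same pair of hosts.
    digest = int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)
    primary_index = digest % len(hosts)
    # An offset between 1 and len(hosts) - 1 guarantees a different host.
    offset = 1 + (digest // len(hosts)) % (len(hosts) - 1)
    secondary_index = (primary_index + offset) % len(hosts)
    return hosts[primary_index], hosts[secondary_index]
```

Calling `place_copies("customer:42", ["host1", "host2", "host3"])` returns a pair of distinct host names, which is the one guarantee the text above relies on.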

How Consistency Is Maintained

Regardless of whether high availability is enabled, cache-enabled applications work as if only the primary copy of the cached object exists. All Add, Put, and Remove method calls are first initiated on the primary object, on whichever cache host it resides. After the call is initiated to the cache host that maintains the primary object or region, the resulting action differs depending on whether high availability is enabled.

If high availability is enabled, there is an added step of notifying the host maintaining the secondary copy that a change is about to occur. Then, the cache host with the primary copy of the object waits for an acknowledgment from the other host before acknowledging back to the client that the operation is complete.

For an example, see cache hosts A and B in the following diagram. As soon as cache host A receives a request, it starts processing the request and notifies cache host B of the change. Then cache host B sends an acknowledgment back to cache host A. When cache host A receives the acknowledgment, it finishes the change and sends an acknowledgment back to the cache-enabled application. This process makes sure that the secondary copy of the object or region is always in the same state as the primary copy. This process is referred to as strong consistency.
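The exchange between cache hosts A and B can be sketched as a small in-memory model. The `CacheHost` class and the `put` function are hypothetical names used only to show the ordering of steps; they are not part of the AppFabric API, and real hosts communicate over the network rather than through direct method calls.

```python
class CacheHost:
    """Toy stand-in for a cache host; holds objects in a plain dict."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply_as_secondary(self, key, value):
        """Apply a change forwarded by the primary host and acknowledge it."""
        self.store[key] = value
        return True  # acknowledgment sent back to the primary host


def put(primary, secondary, key, value):
    """Write through the primary host.

    When a secondary host is given (high availability enabled), the
    client is acknowledged only after the secondary has acknowledged.
    """
    if secondary is not None:
        acked = secondary.apply_as_secondary(key, value)
        if not acked:
            raise RuntimeError("secondary did not acknowledge the change")
    primary.store[key] = value  # the primary finishes the change
    return "ok"  # acknowledgment to the cache-enabled application


host_a = CacheHost("A")
host_b = CacheHost("B")
put(host_a, host_b, "order:1", {"total": 10})
```

After the call returns, both hosts hold the same value for `order:1`, which is the strong consistency property described above. Passing `None` as the secondary models a cache without high availability, where the extra round trip is skipped.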

"Velocity" high availability consistency

Performance Considerations

Because the cache host maintaining the secondary copy of the object or region must acknowledge all changes that pertain to the primary copy, there is a small performance cost in the response time of writes from the cache-enabled application. Note that this performance impact does not affect reads of items already in the cache. You should also consider the time required to reload the cache with objects if the cache host that maintains the primary copies of those objects is lost.

What Happens When a Cache Host Fails

If a cache host fails, and enough cache hosts remain available to keep the cluster running, nothing changes for the cache-enabled application. The cache cluster re-routes requests for each affected object to the cache host that maintained the secondary copy of that object. Within the cluster, the secondary copies of the primary objects that were lost are then elevated to become the new primary objects. Then, secondary copies of those new primary objects are distributed to other cache hosts across the cluster. Secondary objects that were on the failed cache host are replaced by new secondary objects, which are also distributed across the cluster. This process also applies to regions.
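The recovery steps in the preceding paragraph can be sketched as a function over a placement map. This is an illustrative model only: the `fail_host` name, the dictionary layout, and the round-robin choice of replacement hosts are assumptions, not AppFabric's actual behavior.

```python
def fail_host(placement, failed, hosts):
    """Return a new placement after the host `failed` stops.

    placement maps key -> (primary_host, secondary_host).
    Hypothetical model: replacement secondaries are chosen round-robin
    among surviving hosts other than the primary.
    """
    survivors = [h for h in hosts if h != failed]
    if len(survivors) < 2:
        raise RuntimeError("not enough cache hosts left to keep two copies")
    new_placement = {}
    replaced = 0
    for key, (primary, secondary) in placement.items():
        if primary == failed:
            primary = secondary  # promote the secondary copy to primary
            secondary = None     # a fresh secondary copy is now needed
        elif secondary == failed:
            secondary = None     # the lost secondary copy must be replaced
        if secondary is None:
            candidates = [h for h in survivors if h != primary]
            secondary = candidates[replaced % len(candidates)]
            replaced += 1
        new_placement[key] = (primary, secondary)
    return new_placement
```

For a three-host cluster, failing one host leaves every key with a primary and a secondary on the two surviving hosts; failing a second host raises an error because two copies can no longer be kept on separate hosts.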

For the high availability feature to help insulate your application from the failure of a cache host, at least three cache hosts must be members of the cache cluster. This is due to a strong consistency requirement stating that there must always be two copies of a cached object or region in a high availability-enabled cache. To maintain two copies of a cached object or region, a high availability-enabled cache requires at least two running cache hosts to function, so a three-host cluster can lose one host and still keep the cache available.

For example, suppose you have created a high availability-enabled cache named HACache in a three-server cache cluster, as shown in the following table. Assume that SQL Server was configured to perform the cluster management role (so that this example does not need to consider the potential loss of lead hosts).

Time	Cache host 1	Cache host 2	Cache host 3	`HACache` (high availability-enabled named cache)
T1	running	running	running	available
T2	running	running	stopped	available
T3	running	stopped	stopped	not available

At T1, when three cache hosts are available, the two copies of each cached object or region are stored on two of the three available hosts. At T2, when one cache host fails, HACache continues to be available because there are still two cache hosts available to store the two copies of cached objects or regions. At T3, when the second cache host fails, HACache becomes unavailable. This is because there is no longer another cache host available to store the second copy of cached objects or regions.
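The rule behind the table can be written as a one-line check. This is a minimal sketch under the same assumption as the example (SQL Server performs the cluster management role, so lead hosts are ignored); the function name and the `REQUIRED_COPIES` constant are illustrative, not AppFabric settings.

```python
# Two copies (one primary, one secondary) must live on separate hosts.
REQUIRED_COPIES = 2


def ha_cache_available(running_hosts):
    """Return True if a high availability-enabled cache can stay available."""
    return running_hosts >= REQUIRED_COPIES


# Number of running cache hosts at each point in the table above.
timeline = {"T1": 3, "T2": 2, "T3": 1}
for label, running in timeline.items():
    state = "available" if ha_cache_available(running) else "not available"
    print(label, running, state)
```

Running the loop reproduces the last column of the table: available at T1 and T2, not available at T3.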

Other High Availability Recommendations

To optimize the availability of your cached data, consider the following recommendations:

- Employ a large number of cache hosts.
- Deploy your distributed cache system within the perimeter of a firewall, with all servers members of the same domain, including the cache clients, cache hosts, primary data source server, and the server hosting the cluster configuration storage location.
- Use SQL Server or a custom provider to store the cache cluster configuration settings.
  - Use SQL Server or a custom provider to perform the cluster management role. For more information, see Lead Hosts and Cluster Management (Windows Server AppFabric Caching).
  - When possible, use Microsoft Windows Server 2008 Failover Clustering (https://go.microsoft.com/fwlink/?LinkId=130692) to host a "clustered" database resource for the cache cluster configuration storage location.
- Minimize costly configuration changes that require stopping the cluster. When possible, re-create named caches instead of stopping the entire cache cluster to make cache configuration changes in the cluster configuration settings.
- Always use the Stop-CacheHost command to stop the cache service before rebooting a server. When lead hosts perform the cluster management role, the Stop-CacheHost cmdlet will not succeed if the act of stopping the cache service causes the entire cache cluster to shut itself down (because of no majority of running lead hosts).
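The lead-host condition in the last recommendation can be sketched as a simple majority check. This is an illustration only: the function name is hypothetical, and treating "majority" as strictly more than half of the designated lead hosts is an assumption rather than something stated in this article.

```python
def stop_would_keep_cluster_running(host, lead_hosts, running_hosts):
    """Return True if stopping `host` leaves a majority of lead hosts running.

    Applies only when lead hosts perform the cluster management role.
    Assumption: majority means strictly more than half of all lead hosts.
    """
    if not lead_hosts:
        raise ValueError("no lead hosts: this check does not apply")
    remaining = set(running_hosts) - {host}
    running_leads = len(remaining & set(lead_hosts))
    return running_leads > len(lead_hosts) / 2
```

With three lead hosts all running, stopping one still leaves two of three, so the check passes. Stopping a second lead host would leave one of three, which matches the case where the cmdlet is described as not succeeding.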