High Availability (Velocity)

[This topic is pre-release documentation and is subject to change in future releases. Blank topics are included as placeholders.]

The high availability feature in Microsoft project code named "Velocity" supports continuous availability of your cached data by storing copies of that data on separate cache hosts. When you have high availability enabled on a multi-server cluster, your application can still retrieve its cached data if a cache server fails.

Without high availability enabled, "Velocity" can provide some protection for your cached data because objects not stored in regions are distributed across all hosts in the cluster. Due to this distributed nature, the more cache servers there are in the cluster, the less chance there is that a single server failure can affect the data that your application was using. With high availability enabled, server failures pose even less risk to your application.

The high availability feature helps guard against computer and process failures from individual cache hosts while the cluster is running. There may also be scenarios when the entire cluster goes down. For this reason, your application code should be designed so that it can function without the cache, and not require that cached data always be available. Because data in the cache is not persisted in a durable fashion, there is always the possibility that it may become unavailable to your application, even with the high availability feature enabled.
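One way to follow this guidance is to treat the cache as an optional accelerator in front of the primary data source. The following is a minimal sketch in Python rather than the "Velocity" client API: the `cache` object with `get` and `put` methods, the `CacheUnavailableError` exception, and the `load_from_source` callback are all hypothetical names introduced for illustration.

```python
class CacheUnavailableError(Exception):
    """Hypothetical error raised when the cache cluster cannot be reached."""


def get_product(product_id, cache, load_from_source):
    """Return a product, falling back to the data source if the cache is down."""
    key = f"product:{product_id}"
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached  # cache hit
    except CacheUnavailableError:
        # The cluster is unavailable: serve the request from the data source.
        return load_from_source(product_id)

    # Cache miss: load from the data source and try to repopulate the cache.
    value = load_from_source(product_id)
    try:
        cache.put(key, value)
    except CacheUnavailableError:
        pass  # failing to repopulate the cache must not fail the request
    return value
```

With this shape, losing the entire cluster slows the application down but does not stop it from answering requests.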


For the high availability feature to help insulate your application from the failure of a cache host, at least three cache hosts must be members of the cache cluster. Due to strong consistency requirements, at least two cache hosts must always be running for a high availability-enabled cache to function. It is also important to consider that, in a small cluster managed by lead hosts, the failure of a lead host may force the entire cluster to shut down. For more information, see Lead Hosts and Cluster Management (Velocity).

How High Availability Works

When high availability is enabled, a copy of each cached object or region is maintained on a separate cache host. The cache cluster manages maintenance of these copies and supplies them to your application if the primary copies are not available. No code changes are required to make your cache-enabled applications highly available. The following figure illustrates how copies of objects and regions are stored on separate hosts when the high availability feature is enabled.


High Availability Configuration

High availability is configured at the cache level in the cluster configuration settings. As a property of the cache, you can enable it when you first create the cache using the New-Cache command with the Secondaries parameter equal to 1. This tells the cache administration PowerShell cmdlets that you want one copy of each cached object or region. If you set the Secondaries parameter to 0, you disable the high availability feature. By default, the high availability option is disabled when you create a new cache. For more information about editing cache configuration settings, see How to: Edit Cache Configuration Settings with PowerShell (Velocity).

Secondary Copy Storage

The cache cluster chooses where the secondary copies of objects and regions are stored. Just as "Velocity" distributes cached objects across all cache hosts in the cluster, it also distributes the secondary copies of those objects across all cache hosts in the cluster.

How Consistency is Maintained

Whether or not high availability is enabled, cache-enabled applications work as if only the primary copy of the cached object exists. All Add, Put, and Remove method calls are first initiated on the primary object, on whichever cache host it resides. What happens next differs only after those calls reach the cache host that maintains the primary object or region.

If high availability is enabled, there is an added step of notifying the host maintaining the secondary copy that a change is about to occur. Then, the cache host with the primary copy of the object waits for an acknowledgement from the other host before acknowledging back to the client that the operation is complete.

For an example, see cache hosts A and B in the following diagram. As soon as cache host A receives a request, it starts processing the request and notifies cache host B of the change. Then cache host B sends an acknowledgement back to cache host A. When cache host A receives the acknowledgement, it finishes the change and sends an acknowledgement back to the cache-enabled application. This process makes sure that the secondary copy of the object or region is always in the same state as the primary. This process is referred to as strong consistency.
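The acknowledgement sequence between cache hosts A and B can be sketched as a toy model. This is an illustration in Python, not "Velocity" code: the `CacheHost` class, the `apply_as_secondary` method, and the `put` function are hypothetical names, and network calls are replaced by direct method calls.

```python
class CacheHost:
    """Toy cache host that keeps its copies in a dictionary."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def apply_as_secondary(self, key, value):
        """Record the change on the secondary copy and acknowledge it."""
        self.store[key] = value
        return True  # acknowledgement sent back to the primary host


def put(primary, secondary, key, value):
    """Write through the primary host.

    When high availability is enabled, the client is acknowledged only
    after the secondary host has acknowledged the change. Pass
    secondary=None to model a cache without high availability.
    """
    if secondary is not None:
        acked = secondary.apply_as_secondary(key, value)
        if not acked:
            raise RuntimeError("secondary did not acknowledge the change")
    primary.store[key] = value  # the primary host finishes the change
    return "ack"  # acknowledgement returned to the cache-enabled application


host_a = CacheHost("A")
host_b = CacheHost("B")
result = put(host_a, host_b, "order:42", {"total": 10})
```

After the call returns, both hosts hold the same value for the key, which is the property the text calls strong consistency.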


Performance Considerations

Because the cache host maintaining the secondary copy of the object or region must acknowledge all changes that pertain to the primary copy, there is a small performance cost in the response time to the cache-enabled application. Weigh that cost against the time it would take to reload the cache with those objects if the cache host that maintains their primary copies were lost.

What Happens When a Cache Host Fails

If a cache host fails, and enough cache hosts remain available to keep the cluster running, nothing changes for the cache-enabled application. The cache cluster re-routes requests for an object to the cache host that maintained the secondary copy of that object. Within the cluster, the secondary copies of all primary objects that were on the failed host are elevated to become the new primary objects. Then, secondary copies of those new primary objects are distributed to other cache hosts across the cluster. Secondary objects that were on the failed cache host are replaced by new secondary objects, which are also distributed across the cluster. This process also applies to regions.
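The promotion and redistribution steps can be sketched as a small placement model. This is an illustration in Python only: the `fail_host` function and the `placement` mapping are hypothetical, and the rule used to pick a new secondary host (the first surviving host that does not hold the primary) is an assumption, because the cluster chooses the actual location itself.

```python
def fail_host(placement, hosts, failed):
    """Return (new_placement, remaining_hosts) after `failed` goes down.

    placement maps each key to a (primary_host, secondary_host) pair.
    """
    remaining = [h for h in hosts if h != failed]
    if len(remaining) < 2:
        # Two copies need two distinct hosts.
        raise RuntimeError("fewer than two cache hosts remain; the cache is unavailable")

    new_placement = {}
    for key, (primary, secondary) in placement.items():
        if primary == failed:
            primary = secondary  # the secondary copy is promoted to primary
            secondary = failed   # the promoted copy now needs a new secondary
        if secondary == failed:
            # Assumed rule: first surviving host other than the primary.
            secondary = next(h for h in remaining if h != primary)
        new_placement[key] = (primary, secondary)
    return new_placement, remaining


hosts = ["host1", "host2", "host3"]
placement = {
    "a": ("host1", "host2"),
    "b": ("host2", "host3"),
    "c": ("host3", "host1"),
}
after_failure, survivors = fail_host(placement, hosts, "host1")
```

In this run, key "a" is served from its promoted copy on host2, and every key still has two copies on two different surviving hosts.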

For the high availability feature to help insulate your application from the failure of a cache host, at least three cache hosts must be members of the cache cluster. This is due to a strong consistency requirement stating that there must always be two copies of a cached object or region in a high availability-enabled cache. To maintain two copies of a cached object or region, a high availability-enabled cache requires at least two cache hosts to function.

For example, suppose you have created a high availability-enabled cache named HACache in a three-server cache cluster, as shown in the following table. Assume that SQL Server was configured to perform the cluster management role (so that this example does not need to consider the potential loss of lead hosts).

Time   Cache hosts available   HACache (high availability-enabled named cache)
T1     3                       available
T2     2                       available
T3     1                       not available

At T1, when there are three cache hosts available, the two copies of each cached object or region can be stored on any two of the three available servers. At T2, when one cache server fails, HACache continues to be available because there are still two cache hosts available to store the two copies of cached objects or regions. At T3, when a second cache host fails, HACache becomes unavailable because no other cache host remains to store the second copy of cached objects or regions.
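The T1 to T3 timeline reduces to a single rule: a high availability-enabled cache needs at least two running cache hosts. The following Python sketch encodes that rule. The function and constant names are hypothetical, and, as in the example, it assumes SQL Server performs the cluster management role so that lead hosts do not affect the result.

```python
MIN_HOSTS_FOR_HA = 2  # two copies of each object need two distinct hosts


def ha_cache_available(running_hosts):
    """Return True if a high availability-enabled cache can function."""
    return running_hosts >= MIN_HOSTS_FOR_HA


# Number of running cache hosts at each point in the example timeline.
timeline = {"T1": 3, "T2": 2, "T3": 1}
status = {
    time: "available" if ha_cache_available(count) else "not available"
    for time, count in timeline.items()
}
```

This also shows why three hosts are the recommended minimum: a cluster that starts with only two hosts loses the cache on the first failure.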

Other High Availability Recommendations

To optimize the availability of your cached data, consider the following recommendations:

  • Employ a large number of cache hosts.

  • Deploy your distributed cache system within the perimeter of the corporate firewall, with all server members of the same domain, including the cache clients, cache hosts, primary data source server, and the server hosting the cluster configuration storage location.

  • Use SQL Server in your distributed cache system.

  • Minimize costly configuration changes that require stopping the cluster. When possible, re-create named caches instead of stopping the entire cache cluster to make cache configuration changes in the cluster configuration settings.

  • Always use the Stop-CacheHost command to stop the cache host service before rebooting a server. When lead hosts perform the cluster management role, the Stop-CacheHost cmdlet will not succeed if stopping the cache host service would cause the entire cache cluster to shut itself down (because a majority of lead hosts would no longer be running).

See Also


Deployment Options (Velocity)
General Concept Models (Velocity)
Concurrency Models (Velocity)
Expiration and Eviction (Velocity)
Data Classification (Velocity)

Other Resources

Programming Guide (Velocity)
Administration Guide (Velocity)