Requirements and Recommendations for a Multi-Site Failover Cluster
Updated: August 29, 2012
Applies To: Windows Server 2008
This topic provides information about requirements and recommendations for a multi-site failover cluster. For a list of the steps for implementing a design for a multi-site cluster, see Checklist: Clustered Service or Application in a Multi-Site Failover Cluster (http://go.microsoft.com/fwlink/?LinkId=129126).
For additional information about designs for a multi-site cluster, see Design for a Clustered Service or Application in a Multi-Site Failover Cluster and Example, Clustered Service or Application in a Multi-Site Failover Cluster.
Important: Multi-site failover clusters running Exchange Server 2007 use the Cluster Continuous Replication (CCR) feature of Microsoft Exchange Server 2007, and have a maximum of two nodes. For information about CCR and clustering, see the CCR topics at http://go.microsoft.com/fwlink/?Linkid=129111 and http://go.microsoft.com/fwlink/?Linkid=129112.
The following list provides information about requirements and recommendations for a multi-site cluster:
Hardware investment: A multi-site cluster requires an investment in redundant hardware, because it requires the additional servers and storage at the secondary site. Work closely with your hardware and software vendors to ensure that the solution you choose meets your requirements for server capacity, storage functionality, replication between sites, and network characteristics such as network latency.
Number of nodes and corresponding quorum configuration: For a multi-site cluster, we recommend having an even number of nodes and, for the quorum configuration, using the Node and File Share Majority option, that is, including a file share witness as part of the configuration. This is shown in the diagram in Design for a Clustered Service or Application in a Multi-Site Failover Cluster. The file share witness can be located at a third site, that is, a different location from the main site and secondary site, so that it is not lost if one of the other two sites has problems.
Any cluster with an even number of nodes should use a quorum configuration that includes a witness (disk witness or file-share witness) as a tie-breaker. For the witness for a multi-site cluster, we recommend a file share witness, not a disk witness, because it is easier to keep the file share witness accessible to both sites.
See the important note at the beginning of this topic about multi-site failover clusters running Exchange Server 2007.
It is also possible to design a multi-site cluster that has an odd number of nodes (except as previously noted for Exchange Server 2007), with the majority of nodes at the main site. This design should use the Node Majority quorum configuration (as should all configurations with an odd number of nodes). Note that with this design, complete failure of the main site requires you to intervene and force the cluster to start at the secondary site, because the secondary site has only a minority of nodes. Forcing the cluster to start in this way is called forcing quorum. For additional information about quorum configurations, see Appendix F: Reviewing Quorum Configuration Options for a Failover Cluster.
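The quorum recommendations above come down to vote counting: the cluster keeps running only while more than half of the configured votes can communicate. The following minimal Python sketch illustrates that arithmetic. The function name and the node counts are hypothetical and chosen for illustration only; the sketch models majority counting and is not the cluster service's actual implementation.

```python
def has_quorum(votes_online: int, total_votes: int) -> bool:
    """Return True when a strict majority of all configured votes is reachable."""
    return 2 * votes_online > total_votes


# Hypothetical layout A: 2 nodes at the main site, 2 nodes at the secondary
# site, and a file share witness at a third site (Node and File Share Majority).
total_a = 2 + 2 + 1
# The main site fails: 2 secondary nodes plus the witness remain.
even_with_witness = has_quorum(votes_online=2 + 1, total_votes=total_a)

# The same 4 nodes with no witness: losing either site leaves exactly half.
even_without_witness = has_quorum(votes_online=2, total_votes=4)

# Hypothetical layout B: 3 nodes at the main site, 2 at the secondary site
# (Node Majority, no witness).
total_b = 3 + 2
odd_main_site_lost = has_quorum(votes_online=2, total_votes=total_b)
odd_secondary_site_lost = has_quorum(votes_online=3, total_votes=total_b)

print(even_with_witness, even_without_witness)
print(odd_main_site_lost, odd_secondary_site_lost)
```

In layout B, losing the main site leaves only two of five votes, which is the situation in which an administrator must force quorum at the secondary site.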
Network configuration—deciding between multi-subnets and a VLAN: A multi-site cluster running Windows Server 2008 can contain nodes that are in different subnets, unless it is a cluster running SQL Server 2005 or SQL Server 2008 (which requires the use of a virtual local area network or VLAN). In other words, the cluster nodes can potentially communicate across network routers. However, when using multiple subnets, it is important to consider how clients will discover services or applications that have just failed over.
Although a clustered service or application keeps the same network name after failover, if it fails over to a server in a different subnet, that network name will then be associated with a new IP address. The DNS servers must update one another with this new IP address before clients can discover the service or application that has failed over. In addition, on the client, the cached DNS entries need to expire before the client queries a DNS server again. In other words, with multiple subnets, the amount of downtime that clients experience is dependent not just on how quickly failover occurs, but also on how quickly DNS replication occurs and how quickly the clients query for updated DNS information.
To minimize downtime in a multi-site cluster, consider the following approaches:
Review your options for using VLANs and for using multiple subnets to connect the nodes. Each approach has its advantages (but note that a cluster running SQL Server 2005 or SQL Server 2008 must be configured with a VLAN). One advantage of VLANs is that they avoid the issues associated with the time it takes for DNS replication to complete. However, multiple subnets can be simpler than VLANs to set up and manage.
If you prefer to use multiple subnets in your multi-site cluster, you might choose to modify two private properties associated with the network name resources in your cluster. One property is the Time to Live (TTL) property, which limits how long a given DNS record is used before it is discarded, that is, it limits the persistence of DNS information that might be stale because a failover occurred. The default Time to Live is 20 minutes (1200 seconds), but you can reduce it according to the recommendations for your application. (For example, the recommended value for Exchange Server 2007 is 5 minutes, or 300 seconds.) For more information, see http://go.microsoft.com/fwlink/?LinkId=128166 and http://go.microsoft.com/fwlink/?LinkId=130588.
The other private property that you might choose to modify controls which IP addresses are registered in DNS: either all IP addresses on which a network name resource depends, or only the IP address that successfully comes online (that is, the IP address on the subnet of the node that currently owns that network name resource). If you register all IP addresses on which a network name resource depends, any IP address that is needed by a network name will always be registered (regardless of subnet), minimizing downtime. This private property is most useful when the client side of your client-server application is capable of handling DNS records with multiple IP addresses associated with the network name. For more information, see http://go.microsoft.com/fwlink/?LinkId=130588.
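To make the effect of the Time to Live setting concrete, the following Python sketch adds up the three delays described above: the failover itself, DNS replication, and expiration of the cached record on the client. It is a rough upper-bound model under stated assumptions, not a measurement. The function name and the failover and replication times are hypothetical; only the 1200-second and 300-second TTL values come from the text.

```python
def worst_case_client_outage_s(failover_s: float,
                               dns_replication_s: float,
                               ttl_s: float) -> float:
    """Rough upper bound on how long a client might keep using a stale address.

    Assumes the client re-caches the old record just before its DNS server
    receives the update, so the full TTL is added after replication completes.
    """
    return failover_s + dns_replication_s + ttl_s


# Hypothetical timings for illustration only.
FAILOVER_S = 60
DNS_REPLICATION_S = 120

default_ttl = worst_case_client_outage_s(FAILOVER_S, DNS_REPLICATION_S, 1200)
reduced_ttl = worst_case_client_outage_s(FAILOVER_S, DNS_REPLICATION_S, 300)

print(f"TTL 1200 s: up to {default_ttl} s")
print(f"TTL  300 s: up to {reduced_ttl} s")
```

Under this model, the TTL dominates the total, which is why reducing it is one of the approaches listed above. A shorter TTL also means clients query DNS more often, so the value should follow the recommendations for your application.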
Network configuration—Hyper-V, DHCP, and static IP addresses: In a multi-site cluster where the nodes run Hyper-V and use multiple subnets, if the virtual machines use DHCP rather than static IP addresses, failover is fully automatic even when the new owner node is in a different subnet than the old. However, if the virtual machines use static IP addresses, when failover occurs to a node in a different subnet, you must adjust the IP addresses manually to an appropriate address.
Tuning of heartbeat settings: In a multi-site cluster, you might want to tune the "heartbeat" settings. The heartbeat settings include the frequency at which the nodes send heartbeat signals to each other to indicate that they are still functioning, and the number of heartbeats that a node can miss before another node initiates failover and begins taking over the services and applications that had been running on the failed node. You can tune these settings for heartbeat signals to account for differences in network latency caused by communication across subnets. For information about how to tune heartbeat settings, see http://go.microsoft.com/fwlink/?LinkId=130588.
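The relationship between the two heartbeat settings can be sketched with simple arithmetic: the heartbeat interval multiplied by the number of heartbeats that can be missed gives the approximate time before a node is treated as failed. The Python sketch below uses hypothetical function names and illustrative values that are not taken from this topic; consult the linked guidance for the actual property names and supported ranges.

```python
def failure_detection_ms(interval_ms: int, missed_threshold: int) -> int:
    """Approximate time without heartbeats before a node is considered failed."""
    return interval_ms * missed_threshold


def tolerates_interruption(interval_ms: int, missed_threshold: int,
                           interruption_ms: int) -> bool:
    """True if a network interruption of the given length would not trigger failover."""
    return interruption_ms < failure_detection_ms(interval_ms, missed_threshold)


# Illustrative values only: a heartbeat every 1000 ms, 5 missed heartbeats allowed.
baseline = failure_detection_ms(1000, 5)
# A hypothetical tuned setting for a higher-latency link between sites.
tuned = failure_detection_ms(2000, 10)

print(baseline, tuned)
# Would a hypothetical 8-second interruption on the inter-site link cause failover?
print(tolerates_interruption(1000, 5, 8000))
print(tolerates_interruption(2000, 10, 8000))
```

The trade-off is that a more tolerant setting avoids unnecessary failovers during brief network interruptions, but also delays failover when a node has actually failed.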
Replication of data: Replication of data between sites is very important in a multi-site cluster, and is accomplished in different ways by different hardware vendors. Therefore, the choice of the replication process requires careful consideration. When making this choice, consult with your hardware and software vendors, and review the following considerations:
Choosing replication level: block, file system, or application level: The replication process can function through the hardware (at the block level), through the operating system (at the file system level), or through certain applications such as Microsoft Exchange Server 2007 (which has a feature called Cluster Continuous Replication or CCR). Work with your hardware and software vendors to choose a replication process that fits the requirements of your organization.
Configuring replication to avoid data corruption: The replication process must be configured so that any interruptions to the process will not result in data corruption, but instead will always provide a set of data that matches the data from the main site as it existed at some moment in time. In other words, the replication must always preserve the order of I/O operations that occurred at the main site. This is crucial, because very few applications can recover if the data is corrupted during replication.
Not using Distributed File System Replication: You cannot use the feature in Windows Server 2008 called Distributed File System Replication (DFS-R) as your data replication method in a multi-site cluster. DFS-R replicates a file only after the file is closed. This works well for files such as documents, presentations, or spreadsheets, but it does not work for files that are held open, such as databases or virtual machines. You must choose a replication option other than DFS-R.
Choosing between synchronous and asynchronous replication: The replication process can be synchronous, where no write operation finishes until the corresponding data is committed at the secondary site, or asynchronous, where the write operation can finish at the main site and then be replicated (as a background operation) to the secondary site. Synchronous replication means that the replicated data is always up-to-date, but it slows application performance while each operation waits for replication. Asynchronous replication can help maximize application performance, but if failover to the secondary site is necessary, some of the most recent user operations might not be reflected in the data after failover. This is because some operations that were finished recently might not yet be replicated.
Synchronous replication is best for multi-site clusters that use high-bandwidth, low-latency connections. Typically, this means that a cluster using synchronous replication must not be stretched over a great distance. Asynchronous replication is best for clusters that you want to stretch over greater geographical distances without a significant impact on application performance.
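The trade-off between synchronous and asynchronous replication can be illustrated with a simplified model. With synchronous replication, each write waits for a round trip to the secondary site. With asynchronous replication, writes finish locally, but writes that have not yet been replicated can be missing after failover. The Python sketch below is a rough model under stated assumptions: the function names and all numbers are hypothetical, and real replication products behave in more complex ways.

```python
def sync_write_latency_ms(local_write_ms: float, round_trip_ms: float) -> float:
    """Simplified latency of one synchronous write: local commit plus one
    round trip to the secondary site (remote commit time is ignored)."""
    return local_write_ms + round_trip_ms


def async_writes_at_risk(writes_per_s: float, replication_lag_s: float) -> float:
    """Simplified count of completed writes not yet replicated at the moment
    the main site fails."""
    return writes_per_s * replication_lag_s


# Hypothetical links: a nearby site (2 ms round trip) and a distant site (40 ms).
nearby = sync_write_latency_ms(local_write_ms=1.0, round_trip_ms=2.0)
distant = sync_write_latency_ms(local_write_ms=1.0, round_trip_ms=40.0)

# Hypothetical asynchronous workload: 500 writes per second, 5 seconds of lag.
at_risk = async_writes_at_risk(writes_per_s=500, replication_lag_s=5)

print(f"Synchronous write, nearby site:  {nearby} ms")
print(f"Synchronous write, distant site: {distant} ms")
print(f"Asynchronous writes at risk on failover: {at_risk}")
```

In this model, the synchronous write penalty grows directly with the round-trip time between sites, which is why synchronous replication suits low-latency connections, whereas asynchronous replication keeps write latency local at the cost of possibly losing the most recent operations.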
For diagrams showing basic designs for a multi-site cluster, see Design for a Clustered Service or Application in a Multi-Site Failover Cluster and Example, Clustered Service or Application in a Multi-Site Failover Cluster.