Configuring Windows Failover Cluster Networks

In this blog, I will discuss the overall general practices to be considered when configuring networks in Failover Clusters.

Avoid single points of failure:

Identifying single points of failure and configuring redundancy at every point in the network is very critical to maintain high availability. Redundancy can be maintained by using multiple independent networks or by using NIC Teaming. Several ways of achieving this would be:

· Use multiple physical network adapter cards. Multiple ports of the same multiport card or backplane used for networks introduces a single point of failure.

· Connect network adapter cards to different independent switches. Multiple Vlans patched into a single switch introduces a single point of failure.

· Use of NIC teaming for non-redundant networks, such as client connection, intra-cluster communication, CSV, and Live Migration. In the event of a failure of the current active network card will have the communication move over to the other card in the team.

· Using different types of network adapters eliminates affecting connectivity across all network adapters at the same time if there is an issue with the NIC driver.

· Ensure upstream network resiliency to eliminate a single point of failure between multiple networks.

· The Failover Clustering network driver detects networks on the system by their logical subnet. It is not recommended to assign more than one network adapter per subnet, including IPV6 Link local, as only one card would be used by Cluster and the other ignored.

Network Binding Order:

The Adapters and Bindingstab lists the connections in the order in which the connections are accessed by network services. The order of these connections reflects the order in which generic TCP/IP calls/packets are sent on to the wire.

How to change the binding order of network adapters

  1. Click Start, click Run, type ncpa.cpl, and then click OK. You can see the available connections in the LAN and High-Speed Internet section of the Network Connections window.
  2. Press the <ALT><N> keys on the keyboard to bring up the Advanced Menu.
  3. On the Advanced menu, click Advanced Settings, and then click the Adapters and Bindings tab.
  4. In the Connections area, select the connection that you want to move higher in the list. Use the arrow buttons to move the connection. As a general rule, the card that talks to the network (domain connectivity, routing to other networks, etc should the first bound (top of the list) card.

Cluster nodes are multi-homed systems.  Network priority affects DNS Client for outbound network connectivity.  Network adapters used for client communication should be at the top in the binding order.  Non-routed networks can be placed at lower priority.  In Windows Server 2012/2012R2, the Cluster Network Driver (NETFT.SYS) adapter is automatically placed at the bottom in the binding order list.

Cluster Network Roles:

Cluster networks are automatically created for all logical subnets connected to all nodes in the Cluster.  Each network adapter card connected to a common subnet will be listed in Failover Cluster Manager.  Cluster networks can be configured for different uses.

Name

Value

Description

Disabled for Cluster Communication

0

No cluster communication of any kind sent over this network

Enabled for Cluster Communication only

1

Internal cluster communication and CSV traffic can be sent over this network

Enabled for client and cluster communication

3

Cluster IP Address resources can be created on this network for clients to connect to. Internal and CSV traffic can be sent over this network

Automatic configuration

The Network roles are automatically configured during cluster creation. The above table describes the networks that are configured in a cluster.

Networks used for ISCSI communication with ISCSI software initiators is automatically disabled for Cluster communication (Do not allow cluster network communication on this network).

Networks configured without default gateway is automatically enabled for cluster communication only (Allow cluster network communication on this network).

Network configured with default gateway is automatically enabled for client and cluster communication (Allow cluster network communication on this network, Allow clients to connect through this network).

Manual configuration

Though the cluster networks are automatically configured while creating the cluster as described above, they can also be manually configured based on the requirements in the environment.

To modify the network settings for a Failover Cluster:

· Open Failover Cluster Manager

· Expand Networks.

· Right-click the network that you want to modify settings for, and then click Properties.

· If needed, change the name of the network.

· Select one of the following options:

o Allow cluster network communication on this network.  If you select this option and you want the network to be used by the nodes only (not clients), clear Allow clients to connect through this network. Otherwise, make sure it is selected.

o Do not allow cluster network communication on this network.  Select this option if you are using a network only for iSCSI (communication with storage) or only for backup. (These are among the most common reasons for selecting this option.)

Cluster network roles can also be changed using PowerShell command, Get-ClusterNetwork.

For example:

(Get-ClusterNetwork “Cluster Network 1”). Role =3

This configures “Cluster Network 1” to be enabled for client and cluster communication.

Configuring Quality of Service Policies in Windows 2012/2012R2:

To achieve Quality of Service, we can either have multiple network cards or used, QoS policies with multiple VLANs can be created.

QoS Prioritization is recommended to configure on all cluster deployments. Heartbeats and Intra-cluster communication are sensitive to latency and configuring a QoS Priority Flow Control policy helps reduce the latency.

An example of setting cluster heartbeating and intra-node communication to be the highest priority traffic would be:

New-NetQosPolicy “Cluster”-Cluster –Priority 6
New-NetQosPolicy “SMB” –SMB –Priority 5
New-NetQosPolicy “Live Migration” –LiveMigration –Priority 3

Note:

Available values are 0 – 6

Must be enabled on all the nodes in the cluster and the physical network switch

Undefined traffic is of priority 0

Bandwidth Allocation:

It is recommended to configure Relative Minimum Bandwidth SMB policy on CSV deployments

Example of setting minimum policy of cluster for 30%, Live migration for 20%, and SMB Traffic for 50% of the total bandwidth.

New-NetQosPolicy “Cluster” –Cluster –MinBandwidthWeightAction 30
New-NetQosPolicy “Live Migration” –LiveMigration –MinBandwidthWeightAction 20
New-NetQosPolicy “SMB” –SMB –MinBandwidthWeightAction 50

Multi-Subnet Clusters:

Failover Clustering supports having nodes reside in different IP Subnets. Cluster Shared Volumes (CSV) in Windows Server 2012 as well as SQL Server 2012 support multi-subnet Clusters.

Typically, the general rule has been to have one network per role it will provide. Cluster networks would be configured with the following in mind.

Client connectivity

Client connectivity is used for the applications running on the cluster nodes to communicate with the client systems. This network can be configured with statically assigned IPv4, IPv6 or DHCP assigned IP addresses. APIPA addresses should not be used as will be ignored networks as the Cluster Virtual Network Adapter will be on those address schemes. IPV6 Stateless address auto configuration can be used, but keep in mind that DHCPv6 addresses are not supported for clustered IP address resources. These networks are also typically a routable network with a Default Gateway.

CSV Network for Storage I/O Redirection.

You would want this network if using as a Hyper-V Cluster and highly available virtual machines. This network is used for the NTFS Metadata Updates to a Cluster Shared Volume (CSV) file system. These should be lightweight and infrequent unless there are communication related events getting to the storage.

In the case of CSV I/O redirection, latency on this network can slow down the storage I/O performance. Quality of Service is important for this network. In case of failure in a storage path between any nodes or the storage, all I/O will be redirected over the network to a node that still has the connectivity for it to commit the data. All I/O is forwarded, via SMB, over the network which is why network bandwidth is important.

Client for Microsoft Networks and File and Printer Sharing for Microsoft Networks need to be enabled to support Server Message Block (SMB) which is required for CSV. Configuring this network not to register with DNS is recommended as it will not use any name resolution. The CSV Network will use NTLM Authentication for its connectivity between the nodes.

CSV communication will take advantage of the SMB 3.0 features such as SMB multi-channel and SMB Direct to allow streaming of traffic across multiple networks to deliver improved I/O performance for its I/O redirection.

By default, the cluster will automatically choose the NIC to be used for CSV for manual configuration refer the following article.

Designating a Preferred Network for Cluster Shared Volumes Communication
https://technet.microsoft.com/en-us/library/ff182335(WS.10).aspx

This network should be configured for Cluster Communications

Live Migration Network

As with the CSV network, you would want this network if using as a Hyper-V Cluster and highly available virtual machines. The Live Migration network is used for live migrating Virtual machines between cluster nodes. Configure this network as Cluster communications only network. By default, Cluster will automatically choose the NIC for Live migration.

Multiple networks can be selected for live migration depending on the workload and performance. It will take advantage of the SMB 3.0 feature SMB Direct to allow migrations of virtual machines to be done at a much quicker pace.

ISCSI Network:

If you are using ISCSI Storage and using the network to get to it, it is recommended that the iSCSI Storage fabric have a dedicated and isolated network. This network should be disabled for Cluster communications so that the network is dedicated to only storage related traffic.

This prevents intra-cluster communication as well as CSV traffic from flowing over same network. During the creation of the Cluster, ISCSI traffic will be detected and the network will be disabled from Cluster use. This network should set to lowest in the binding order.

As with all storage networks, you should configure multiple cards to allow the redundancy with MPIO. Using the Microsoft provided in-box teaming drivers, network card teaming is now supported in Win2012 with iSCSI.

Heartbeat communication and Intra-Cluster communication

Heartbeat communication is used for the Health monitoring between the nodes to detect node failures. Heartbeat packets are Lightweight (134 bytes) in nature and sensitive to latency. If the cluster heartbeats are delayed by a Saturated NIC, blocked due to firewalls, etc, it could cause the cluster node to be removed from Cluster membership.

Intra-Cluster communication is executed to update the cluster database across all the nodes any cluster state changes. Clustering is a distributed synchronous system. Latency in this network could slow down cluster state changes.

IPv6 is the preferred network as it is more reliable and faster than IPv4. IPv6 linklocal (fe80) works for this network.

In Windows Clusters, Heartbeat thresholds are increased as a default for Hyper-V Clusters.

The default value changes when the first VM is clustered.

Cluster Property

Default

Hyper-V Default

SameSubnetThreshold

5

10

CrossSubnetThreshold

5

20

Generally, heartbeat thresholds are modified after the Cluster creation. If there is a requirement to increase the threshold values, this can be done in production times and will take effect immediately.

Configuring full mesh heartbeat

The Cluster Virtual Network Driver (NetFT.SYS) builds routes between the nodes based on the Cluster property PlumbAllCrossSubnetRoutes.

Value Description

0 Do not attempt to find cross subnet routes if local routes are found

1 Always attempt to find routes that cross subnets

2 Disable the cluster service from attempting to discover cross subnet routes after node successfully joins.

To make a change to this property, you can use the command:

(Get-Cluster). PlumbAllCrossSubnetRoutes = 1

References for configuring Networks for Exchange 2013 and SQL 2012 on Failover Clusters.

Exchange server 2013 Configuring DAG Networks.
https://technet.microsoft.com/en-us/library/dd298065(v=exchg.150).aspx

Before Installing Failover Clustering for SQL Server 2012
https://msdn.microsoft.com/en-us/library/ms189910.aspx

At TechEd North America 2013, there was a session that Elden Christensen (Failover Cluster Program Manager) did that was entitled Failover Cluster Networking Essentials that goes over a lot of configurations, best practices etc.

Failover Cluster Networking Essentials
https://channel9.msdn.com/Events/TechEd/NorthAmerica/2013/MDC-B337#fbid=ZpvM0cLRvyX

S. Jayaprakash
Senior Support Escalation Engineer
Microsoft India GTSC