Review Bridgehead Server Load-Balancing Improvements with Windows Server 2008 RODCs
Updated: January 7, 2011
Applies To: Windows Server 2008
One of the benefits of deploying read-only domain controllers (RODCs) in branch offices is unidirectional replication. Bridgehead servers in a hub site do not replicate Active Directory data from the RODCs in each branch office, which reduces the inbound replication load on the bridgehead servers and also reduces administration and network usage. For outbound replication from the hub site to the branch office sites, RODCs provide load-balancing improvements that help distribute outbound replication connections evenly across a set of bridgehead servers. This topic explains how load-balancing was typically handled in branch office deployments that used previous versions of Windows Server. It also explains how RODCs in Windows Server 2008 can improve load-balancing by redistributing replication connections automatically.
With previous versions of Windows Server, after you add a new bridgehead domain controller in a hub site (for example, for hardware replacement), there is no automatic mechanism to redistribute the replication connections between the branch domain controllers and the hub domain controllers to take advantage of the new hub domain controller for replication.
For example, consider a hub site that has two Windows Server 2003 domain controllers that replicate with domain controllers in six different branch offices. If you add a third domain controller in the hub site, none of the replication connections is automatically redistributed to the additional domain controller in the hub site.
For Windows Server 2003 domain controllers, you can rebalance the workload by using a tool such as Adlb.exe, which is included in the download package for the Windows Server 2003 Active Directory Branch Office Guide (http://go.microsoft.com/fwlink/?LinkID=28523).
For Windows Server 2008 RODCs, normal operation of the Knowledge Consistency Checker (KCC) provides load-balancing, which eliminates the need to use an additional tool such as Adlb.exe. Automatic load-balancing is enabled by default. If you do not want RODC replication connections to be redistributed automatically and you prefer to distribute the load yourself (Adlb.exe does not distribute the load from RODCs), you can disable automatic load-balancing by adding the following registry key on each RODC that you want to exclude from load-balancing:
“Random BH Load Balancing Allowed”
1 = Enabled (default), 0 = Disabled
The new automatic load-balancing applies only to RODCs. If you have writable domain controllers in branch sites and you add a new bridgehead server in the hub site, continue to use a tool such as Adlb.exe to redistribute the workload across all the hub site bridgehead servers.
Example: How bridgehead server load-balancing works
When the KCC on an RODC detects a new bridgehead server candidate that it can replicate from, it determines whether to switch replication partners to that new bridgehead. This decision is based on an algorithm that provides probabilistic load-balancing.
For example, assume that there is a hub site with four bridgehead servers, named Hub-DC-01 through Hub-DC-04, as shown in the following illustration. There are 100 RODCs that perform inbound replication from the four bridgehead servers—a 25:1 ratio.
This ratio is intended only to illustrate how the new load-balancing for RODCs works. It is not intended to be a recommended ratio.
Another bridgehead server (Hub-DC-05) is added to the hub site, which creates a total of five. When each RODC replicates the Configuration partition from a bridgehead server, it detects the new bridgehead server. Then, on the next KCC run, the KCC determines whether the RODC should switch its replication connection to the new bridgehead server.
There is a one-in-five chance (a 20-percent probability) that an RODC will switch its replication connection. After all 100 RODCs have performed this operation, approximately 20 of them will have switched to replicate from the new bridgehead server.
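The probabilistic switch described above can be illustrated with a short simulation. This is only a sketch: it assumes that each RODC decides independently with probability 1/N, where N is the number of bridgehead servers after the addition. That assumption matches the one-in-five figure in the example, but it is not a description of the actual KCC algorithm. The function name `rebalance_after_add` and the fixed random seed are illustrative choices.

```python
import random
from collections import Counter


def rebalance_after_add(assignments, bridgehead_count, new_bridgehead, rng):
    """Return new assignments after one bridgehead is added.

    Assumption: each RODC switches independently with probability
    1 / (bridgehead_count + 1); the real KCC logic is not modeled.
    """
    p_switch = 1 / (bridgehead_count + 1)
    return [
        new_bridgehead if rng.random() < p_switch else current
        for current in assignments
    ]


rng = random.Random(2008)  # fixed seed only to make the run repeatable
hubs = [f"Hub-DC-{i:02d}" for i in range(1, 5)]

# 100 RODCs spread evenly over four bridgeheads (the 25:1 ratio above).
before = [hubs[i % len(hubs)] for i in range(100)]
after = rebalance_after_add(before, len(hubs), "Hub-DC-05", rng)

load = Counter(after)
for name in sorted(load):
    print(name, load[name])
```

Under this assumption, the expected number of switches is 100 × 1/5 = 20, which leaves about 20 RODCs on each of the five bridgehead servers. Any single run varies around that figure.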
If you remove a bridgehead server from the hub site, the KCC on each RODC that replicated with that bridgehead server automatically creates a new connection object with one of the remaining bridgehead servers. There is no minimum number of hub bridgehead servers required for the load-balancing to work.
If you add more RODCs in the branches without adding more bridgehead servers in the hub site, the new connections are also balanced across the static set of bridgehead servers.
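Removing a bridgehead server and adding RODCs can be sketched in the same way. This topic does not say how the KCC picks among the remaining bridgehead servers, so the example assumes a uniform random choice. The helper names `remove_bridgehead` and `add_rodcs` are hypothetical.

```python
import random
from collections import Counter


def remove_bridgehead(assignments, removed, remaining, rng):
    """Reassign only the RODCs that used the removed bridgehead.

    Assumption: a uniform random pick among the remaining bridgeheads.
    """
    return [
        rng.choice(remaining) if current == removed else current
        for current in assignments
    ]


def add_rodcs(assignments, count, bridgeheads, rng):
    """Append new RODCs, each picking a bridgehead uniformly at random."""
    return assignments + [rng.choice(bridgeheads) for _ in range(count)]


rng = random.Random(2008)  # fixed seed only to make the run repeatable
hubs = [f"Hub-DC-{i:02d}" for i in range(1, 6)]
assignments = [hubs[i % len(hubs)] for i in range(100)]

# Take Hub-DC-05 out of the hub site; its 20 RODCs find new partners.
remaining = hubs[:-1]
assignments = remove_bridgehead(assignments, "Hub-DC-05", remaining, rng)

# Add 20 branch RODCs without adding any bridgehead servers.
assignments = add_rodcs(assignments, 20, remaining, rng)

print(Counter(assignments))
```

RODCs that were not using the removed bridgehead server keep their existing partner in this model. Only the affected RODCs and the newly added ones choose a partner.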
Failover for FRS replication of SYSVOL
The File Replication Service (FRS) connection object on an RODC must use the same target as the connection object that the KCC generates on the RODC for Active Directory replication. To achieve this, the fromServer value on the two connections is synchronized. If you are using FRS to replicate SYSVOL, the connection that is used for replication of SYSVOL data will also be switched to the new bridgehead server.
However, only the removal of the old connection triggers the fromServer value on the FRS connection object to change. The removal step happens, depending on the environment, up to eight hours after the new connection object is created. Consequently, the fromServer value continues to reference the original partner until the old connection is removed by the KCC.
A side effect of this is that while Active Directory replication works successfully against the new partner, FRS replication fails during this period. The additional delay is by design; it avoids causing FRS to perform an expensive version vector join (vvjoin) operation against the new partner. FRS performs a vvjoin operation when it creates a new connection. In this case, the new connection is not necessary if the outage of the original partner is only temporary.
Whenever a new connection object is created, FRS performs a vvjoin operation with its new replication partner while the two servers synchronize their SYSVOL contents. Vvjoin is a CPU-intensive operation that can affect the performance of the server. It can also generate replication traffic. If a high number of connection failures causes the KCC to create new connection objects frequently, the vvjoin operations can seriously affect server performance. For more information, see the Windows Server 2003 Active Directory Branch Office Guide (http://go.microsoft.com/fwlink/?LinkID=28523).
For more information about the KCC and Active Directory replication, see the Active Directory Replication Topology Technical Reference (http://go.microsoft.com/fwlink/?LinkId=208600).
Under most circumstances, you do not have to take any action when the FRS connection object for an RODC fails over to a new server. If you plan for the original replication partner to be online again within eight hours or if there is no need to replicate SYSVOL contents during that time, wait for the KCC to update the connection objects automatically.
Corrected the name of the Registry key Random BH Load Balancing Allowed.