Get out and push! Getting the most out of DFSR pre-staging
Hi, Ned here again. Today I am going to explain the inner workings of DFSR pre-staging in Windows Server 2003 R2, debunk some myths, and hand out some best practices. Let’s get started.
To begin, this is the last time I will say ‘pre-staging’. While the term is commonly used, it’s a bit confusing once you start mixing in terminology like the Staging directories. So from here on I will refer to this as ‘pre-seeding’ and hope that it enters your vernacular.
Pre-seeding is the act of getting a recent copy of replicated data to a new DFSR downstream node before you add that server to the Replicated Folder content set. This means that we can minimize the amount of data we transfer over the wire during the initial sync process and hopefully have that downstream server be available much quicker than simply letting DFSR copy all the files in their entirety over potentially latent network links. Administrators typically do this with NTBACKUP or ROBOCOPY.
How Initial Sync works
Before we can start pre-seeding, we need to understand how this initial sync system works under the covers. The diagram below is grossly simplified, but gets across the gist of the process:
Take a long look here and tell me if you can see a performance pitfall for pre-seeding. Give up? In step 6 on the upstream server, files need to be added to the staging directory before the downstream server can decide if it needs the whole file, portions of a file, or no file (because they are identical between servers). Even if both servers have identical copies, the staging process must cycle through on the upstream server in order to decide what portions of the file to send. So while very little data will be on the wire when all is said and done, there is some inherent churn time upstream while we decide how to give the downstream server what it needs, and it ends up meaning that initial sync might take longer than expected on the first partner. So how can we improve this?
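To make the "whole file, portions of a file, or no file" decision concrete, here is a toy Python sketch. It is not how DFSR actually does it: the fixed 4-byte blocks, the use of SHA-256, and the names block_hashes and transfer_decision are all assumptions made purely for illustration. The point to notice is that the upstream hashes get computed even when both copies are identical, which is the same kind of upstream work that staging represents.

```python
import hashlib
from typing import List, Optional

BLOCK_SIZE = 4  # hypothetical and tiny, so the example is easy to follow


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of a file's contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def transfer_decision(upstream: bytes, downstream: Optional[bytes]) -> str:
    """Return 'whole', 'partial', or 'none' for one file (toy model only)."""
    # The upstream side always does this work, even for identical files.
    upstream_blocks = block_hashes(upstream)
    if downstream is None:
        return "whole"      # nothing pre-seeded: send everything
    downstream_blocks = block_hashes(downstream)
    if upstream_blocks == downstream_blocks:
        return "none"       # identical copies: nothing to send
    known = set(downstream_blocks)
    shared = [h for h in upstream_blocks if h in known]
    return "partial" if shared else "whole"
```

Calling transfer_decision(b"abcdefgh", b"abcdXXXX") reports a partial transfer, because only the first block matches.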
How initial sync works with pre-seeding
First let’s take a look at how things will work on our third and all subsequent DFSR members in a Replication Group:
Since the staging directory upstream is already packed full of files, a big step is skipped for much of the process and the servers can concentrate on actually moving data or file hashes around. This means things go much faster (keeping in mind that the staging directory is a cache and is finite; the longer one waits, the more likely changes are to push out previously staged data). In one repro I did for this post, I found these results in my virtual server environment:
- Three Windows Server 2003 Enterprise R2 SP2 servers running in Virtual Server 2005 VMs on a private virtual network.
- 4GB staging (the default).
- 5.7GB data on a separate volume on upstream server.
- To determine replication time, I measured the difference between DFSR Event Log event 4102 and 4104 (like so):
Event Type: Warning
Event Source: DFSR
Event Category: None
Event ID: 4102
Date: 2/8/2008
Time: 11:40:35 AM
User: N/A
Computer: 2003MEM21
Description: The DFS Replication service initialized the replicated folder at local path e:\dbperf and is waiting to perform initial replication. The replicated folder will remain in this state until it has received replicated data, directly or indirectly, from the designated primary member.

Event Type: Information
Event Source: DFSR
Event Category: None
Event ID: 4104
Date: 2/8/2008
Time: 11:40:36 AM
User: N/A
Computer: 2003MEM21
Description: The DFS Replication service successfully finished initial replication on the replicated folder at local path e:\dbperf.
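If you want to do the same subtraction without squinting at the Event Viewer, a minimal Python sketch follows. It assumes the Date and Time fields are formatted exactly as in the two sample events above (US-style month/day/year with AM/PM); the function names are hypothetical. The sample events shown are only one second apart, so they illustrate the format rather than a real sync time.

```python
from datetime import datetime, timedelta

# Assumed format of the "Date:" and "Time:" fields in the event text.
EVENT_TIME_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def parse_event_time(date_text: str, time_text: str) -> datetime:
    """Combine an event's Date and Time fields into one datetime."""
    return datetime.strptime(f"{date_text} {time_text}", EVENT_TIME_FORMAT)


def initial_sync_duration(start_4102, end_4104) -> timedelta:
    """Elapsed time between event 4102 (waiting) and event 4104 (finished).

    Each argument is a (date_text, time_text) pair copied from the event.
    """
    return parse_event_time(*end_4104) - parse_event_time(*start_4102)


# Values copied from the two sample events above.
sample = initial_sync_duration(
    ("2/8/2008", "11:40:35 AM"),
    ("2/8/2008", "11:40:36 AM"),
)
print(sample.total_seconds())
```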
- New Replication Group with no pre-seeding
  - Initial sync took 28 minutes (baseline speed)
- New Replication Group with one downstream server
  - Pre-seeded data with NTBACKUP on the downstream server
  - Initial sync took 24 minutes (~15% faster than baseline)
- Same Replication Group with original two servers
  - Added a new third DFSR member
  - Pre-seeded data with NTBACKUP on the new downstream server
  - Initial sync took 13 minutes (~55% faster than baseline)
55% faster is nothing to blow your nose at – and this is just a small amount of low-latency data. If you take a very large set of data on a very slow, high-latency link, a baseline initial sync could take, for example, 2 weeks, of which only 2 hours are spent staging files and computing hashes and the rest is spent sending data across the wire. In this case pre-seeding may be (2 weeks - 2 hours) / 2 weeks ≈ 99% faster. As you can see, the fact that data was already staged upstream meant that we spent considerably less time rolling through the staging directory and didn’t spend most of our time verifying the servers are in sync.
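For anyone who wants to check the arithmetic, here is a small Python sketch. The function name percent_faster is hypothetical, and the slow-link case simply assumes a two-week baseline where pre-seeding leaves only the two hours of staging and hashing.

```python
def percent_faster(baseline: float, measured: float) -> float:
    """Percentage of the baseline time saved by the measured run."""
    return (baseline - measured) / baseline * 100.0


# Measured in the lab above, in minutes.
second_member = percent_faster(28, 24)   # pre-seeded, first downstream partner
third_member = percent_faster(28, 13)    # pre-seeded, staging already populated

# Hypothetical slow-link case, in hours: 2 weeks versus 2 hours.
slow_link = percent_faster(14 * 24, 2)

print(round(second_member), round(third_member), round(slow_link, 1))
```

The lab figures come out slightly under the rounded ~15% and ~55% quoted above, and the slow-link case lands just over 99%.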
To get the most bang for our buck, we can do some of the following to spend the least amount of time populating the staging directory and the most time syncing files:
- Set the staging directory quota on your hub servers as close to the size of your data as possible. Since hub servers tend to be beefier boxes and certainly closer to home than your remote branches, this isn’t a problem for most administrators. If you have the disk space, a staging quota that is the same size as the data volume will give the absolute best results.
- When pre-seeding, always use the most recent backup possible and pre-seed off hours. The less data that is in flux in the staging directory while we run through initial replication the better. This may seem like a no-brainer, but customers frequently contact us about slow initial sync that they started at 9AM on a Monday with a terabyte of highly dynamic data!
- The latest firmware, chipset, network and disk drivers from your hardware vendor will usually give an incremental performance increase (and not just with DFSR performance). You wouldn’t dream of running your servers without service packs and security hotfixes – why wouldn’t you treat your hardware the same way?
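To size the staging quota against your data, you first need to know how big the replicated folder actually is. The sketch below is one rough way to total it up in Python. The function names are hypothetical, rounding up to whole megabytes is just an assumption about a convenient unit, and file sizes on disk are only an approximation of what staging will need.

```python
import os

BYTES_PER_MB = 1024 * 1024


def folder_size_bytes(root: str) -> int:
    """Sum the sizes of all files under root, including subfolders."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Skip files that disappear or cannot be read mid-scan.
                pass
    return total


def suggested_staging_quota_mb(root: str) -> int:
    """Data size rounded up to whole megabytes (ceiling division)."""
    return -(-folder_size_bytes(root) // BYTES_PER_MB)
```

For the lab above you would call suggested_staging_quota_mb(r"e:\dbperf") on the upstream server and compare the result with the quota you have room for.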
Important Technical Notes (updated 2/28/09)
1. ROBOCOPY - If you use robocopy.exe to pre-seed your data, ensure that the permissions on the replicated folder root (i.e. c:\my_replicated_folder) are identical on the source and target servers before beginning your robocopy commands. Otherwise, when you have robocopy mirror the files and copy the permissions, you will get unnecessary 4412 conflict events and perform redundant replication (your data will be fine). The issue here is in how robocopy.exe handles security inheritance from a root folder, and how that can change the overall hash of a file. So using the command-line /COPYALL /MIR /Z /R:0 is perfectly fine as long as the permissions on the source and destination folder are *identical*. After pre-seeding your data with robocopy, you can always use ICACLS.EXE to verify and synchronize the security if necessary.
2A. NTBACKUP (on Win2003 R2) - If you use NTBACKUP to pre-seed your data on a server that already hosts DFSR data on that same volume (i.e. you are going to use a new Replicated Folder on the E: drive, and some other data was already being replicated to that E: drive), and you plan on restoring from a full disk backup, you need to understand an important behavior. NTBACKUP is aware of DFSR; NTBACKUP will set a restore key under the DFSR services key in the registry (HKLM\System\CurrentControlSet\Services\DFSR\Restore\<date time>) and mark the DFSR service with a non-authoritative restore flag for that volume. The DFSR service will be restarted and the Replicated Folders on that volume will do a non-authoritative sync. This should not be destructive to data, but it can mean that your downstream server becomes unresponsive for minutes or hours while it syncs. When DFSR was written, the thought was that NTBACKUP would be used for disaster recovery, where you would certainly be suspicious of the data and the DFSR Jet database and want a consistency sync performed at restore time.
2B. Windows Server Backup (Windows Server 2008 and Windows Server 2008 R2) - same as above but with newer tools. Do not use NTBACKUP to remotely back up or restore Windows Server 2008 or later. This is unsupported and will mark files HIDDEN and SYSTEM, which you certainly don't want...
3. XCOPY - The XCOPY /O command works correctly even without having the root folder permissions set identically, unlike robocopy. However, it is certainly not as robust and sophisticated as robocopy in other regards. So XCOPY is a valid option, but maybe not powerful enough for many users.
4. Third party solutions - be wary of third party tools and test them carefully before committing to using them for wide-scale pre-seeding. The key thing to remember is that the file hash is everything - if DFSR cannot match the upstream and downstream hashes, it will replicate the file on initial sync. This includes file metadata, such as security ACLs (which are not calculated by tools that do checksum calculating). In Windows Server 2008 R2 beta, check out the DFSRDIAG tool to see how we have made this a bit easier for people. If you really need a file hash checking tool, contact us with a support case; we have some internal ones.
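Going back to note 1 for a moment: if you script your pre-seeding, the robocopy switches mentioned there can be assembled in a small Python helper like the sketch below. It only builds the command line and does not run anything. The helper names and the upstream share path are hypothetical; the switches are the ones quoted in note 1, and the exit-code check relies on robocopy's convention that values of 8 or higher mean at least one failure.

```python
from typing import List

# Switches quoted in note 1: copy all file info, mirror the tree,
# use restartable mode, and do not retry failed copies.
PRESEED_SWITCHES = ["/COPYALL", "/MIR", "/Z", "/R:0"]


def build_robocopy_command(source: str, destination: str) -> List[str]:
    """Return the robocopy argument list for a pre-seeding copy."""
    return ["robocopy", source, destination] + PRESEED_SWITCHES


def robocopy_succeeded(exit_code: int) -> bool:
    """Robocopy exit codes below 8 indicate no copy failures."""
    return exit_code < 8


command = build_robocopy_command(
    r"\\UPSTREAM\share\my_replicated_folder",  # hypothetical upstream path
    r"c:\my_replicated_folder",
)
print(" ".join(command))
```

Remember that the command is only safe to use once the permissions on the two root folders match, as described in note 1.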
Finally – I don’t have numbers here for Windows Server 2008 yet, sorry. I can tell you that DFSR behaves the same way with regard to the staging process. Based on the performance improvements made elsewhere though (specifically the 16 concurrent file downloads combined with asynchronous RPC and IO), it should be much faster, pre-seeded or not; that’s the Win2008 DFSR mandate.
- Ned Pyle