Log collection for DFSR replication issues
DFSR (DFS Replication) is the successor to FRS (File Replication Service) and replicates files and folders between Windows servers. DFSR was first introduced in Windows 2003 R2 for replicating DFS shares and data folders. From Windows 2008 onwards we can also use DFSR to replicate the SYSVOL share across DCs.
Below are a few important new features in DFSR:
- DFSR can replicate a folder even if it is not shared or part of DFS.
- When a file is changed, only the changed blocks are replicated, not the entire file. The RDC (Remote Differential Compression) protocol determines which changed blocks need to be replicated.
- DFS Replication uses a conflict resolution heuristic of last writer wins for files that are in conflict (that is, a file that is updated at multiple servers simultaneously) and earliest creator wins for name conflicts. Files and folders that lose the conflict resolution are moved to a folder known as the Conflict and Deleted folder. You can also configure the service to move deleted files and folders to the Conflict and Deleted folder so that they can be retrieved later.
- DFS Replication is self-healing and can automatically recover from USN journal wraps, USN journal loss, or loss of the DFS Replication database.
- You can configure DFS Replication to use a limited amount of bandwidth on a per-connection basis (bandwidth throttling).
- Better management and diagnostic tools (‘DFS management’, DFSR diagnostic health report and propagation test/reports, DFSRdiag etc.)
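To make the conflict resolution heuristic above concrete, here is a minimal Python sketch. It is only a toy model of "last writer wins" and "earliest creator wins", not how the DFSR service is implemented; the FileVersion class, its fields, and the server names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FileVersion:
    """One member's copy of a file (hypothetical model, not a DFSR structure)."""
    server: str       # member that holds this version
    path: str         # path inside the replicated folder
    created: float    # creation time, seconds since epoch
    modified: float   # last update time, seconds since epoch


def resolve_update_conflict(versions: List[FileVersion]) -> Tuple[FileVersion, List[FileVersion]]:
    """Last writer wins: keep the version with the latest update time."""
    winner = max(versions, key=lambda v: v.modified)
    losers = [v for v in versions if v is not winner]
    return winner, losers


def resolve_name_conflict(versions: List[FileVersion]) -> Tuple[FileVersion, List[FileVersion]]:
    """Earliest creator wins: keep the version that was created first."""
    winner = min(versions, key=lambda v: v.created)
    losers = [v for v in versions if v is not winner]
    return winner, losers
```

In this model the losers list plays the role of the Conflict and Deleted folder: the losing versions are set aside rather than silently discarded.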
Additions in Windows 2008 R2:
- Support for Windows Failover Clusters
- Read-only Replicated Folders (now with true filter driver support)
- SYSVOL on Read-only Domain Controllers (leveraging the improved Read-only functionality)
For more information about DFS Replication, please refer to the articles below:
DFS Replication: Frequently Asked Questions (FAQ)
DFSR Does Not Replicate Temporary Files
Five Common Causes of “Waiting for the DFS Replication service to retrieve replication settings from Active Directory”
Get out and push! Getting the most out of DFSR pre-staging
Top 10 Common Causes of Slow Replication with DFSR
DFSR replication dependencies:
- Connectivity with a DC to get the DFSR configuration (DFSR PollAD)
- AD replication: if AD replication has issues, the members of a DFSR replication group (RG) might get different views of the RG configuration.
- RPC port connectivity between the DFSR members: the RPC ports that the DFSR service listens on should be allowed through the firewall.
- Enough free space on the drive hosting the replica to add more data and do staging.
- Access to the data (the System account should not be denied access): the DFSR service should be able to read, write, and lock the files it's replicating.
If there is any issue with any of the above dependencies, DFSR replication may stop or become slower.
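Two of these dependencies, free space and basic RPC reachability, are easy to pre-check with a script. The sketch below is only a rough first pass under stated assumptions: the function names and any threshold you pass in are my own, and port 135 is the RPC endpoint mapper rather than the port the DFSR service itself listens on (that one is normally assigned dynamically), so a successful connection shows only that the endpoint mapper is reachable.

```python
import shutil
import socket


def has_free_space(path: str, min_free_bytes: int) -> bool:
    """Return True if the volume holding `path` has at least min_free_bytes free."""
    return shutil.disk_usage(path).free >= min_free_bytes


def can_connect(host: str, port: int = 135, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout.

    Port 135 is the RPC endpoint mapper; this does not prove that the
    DFSR service's own RPC port is open on the firewall.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, has_free_space("D:\\", 10 * 1024**3) checks for 10 GB free on a hypothetical D: drive hosting the replica, and can_connect("partner-server") tests the endpoint mapper on a hypothetical replication partner.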
I’ve also observed a few times that improper management of a DFSR RG or RF (replicated folder) can lead to replication issues. For example:
- In Windows 2003 R2 and Windows 2008, it’s not recommended to disable or delete DFSR connection objects to force one-way replication. Support for one-way replication was added in Windows 2008 R2. If you need to stop replication for some time, either change the schedule on the connection objects or disable them temporarily. Keeping connection objects disabled, or deleting them to make replication one-way, may cause unexpected data deletion when the connection objects are re-created, or may cause issues with the DFSR database.
- Do not disable and re-enable an RG membership to stop replication for some time. When a disabled membership is enabled again, that member performs an initial sync again.
- After changing the DFSR configuration, it is always recommended to wait for AD replication to complete so that all DFSR members get the information about the change. If not all DFSR members are updated with the configuration stored in AD, it may cause issues such as a downstream server still replicating changes even though the connection object was disabled using DFS Management on the upstream server.
If you suspect or know that DFSR replication between members is failing, I suggest you collect the following logs. These logs will help you as well as the ‘Support Engineer’ you will work with if you create a ticket with Microsoft support.
From the servers involved in DFSR replication, collecting the following information would help:
1. DFSR diagnostic health report
2. DFSR propagation test and report
After a few minutes of running the DFSR Propagation test, you can run the Propagation report to get the status of the replication.
3. DFSR event logs in CSV format
4. Collect the DFSR debug logs from each member, from the c:\windows\debug folder
5. Run MSinfo32 and, once it has loaded, save the output as a .nfo file.
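Step 4 is the easiest one to script. The sketch below bundles the debug logs from one member into a single zip file. It assumes the debug logs are named Dfsr*.log, with older logs compressed as Dfsr*.log.gz; the function name and the paths in the example are my own, so adjust the pattern if your folder looks different.

```python
import zipfile
from pathlib import Path
from typing import List


def archive_debug_logs(debug_dir: str, zip_path: str) -> List[str]:
    """Zip every file matching Dfsr*.log* in debug_dir and return their names.

    The Dfsr*.log* pattern is an assumption about how the debug logs are named.
    """
    files = sorted(p for p in Path(debug_dir).glob("Dfsr*.log*") if p.is_file())
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for log_file in files:
            # Store only the file name so the archive has a flat layout.
            archive.write(log_file, arcname=log_file.name)
    return [p.name for p in files]
```

A call such as archive_debug_logs(r"c:\windows\debug", r"c:\temp\server1_dfsr_logs.zip") would collect the logs from the default location; the output path here is hypothetical.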
Apart from the above logs, a “Support Engineer” (SE) in Microsoft support may ask you to increase the DFSR debug log severity before collecting the debug logs. An SE may also send you a tool named DFSRInfo (a Microsoft CSS internal tool) to collect more information from the DFSR member.
The following controls the log settings and describes the defaults:
SETTING: Debug Log Severity
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogseverity=5
SETTING: Debug Log Messages
Range: 1000 to 4294967295 (FFFFFFFF)
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogmessages=500000
SETTING: Debug Log Files
Range: 1 to 10000
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set maxdebuglogfiles=200
SETTING: Debug Log File Path
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set debuglogfilepath="d:\dfsrlogs"
NOTE: The path must be created manually; if not, at service restart, the default value %windir%\debug will be used.
SETTING: Enable Debug Logging (NOTE: Debug logging is enabled by default)
Range: TRUE or FALSE
wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set enabledebuglog=true
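If you have to apply these settings on several members, it can help to generate the command lines rather than type them. The sketch below only builds the wmic command strings shown above and checks the two ranges quoted in this post; it does not run anything. The function name and the RANGES table are my own, and settings for which no range is quoted here are passed through unchecked.

```python
WMIC_PREFIX = r"wmic /namespace:\\root\microsoftdfs path dfsrmachineconfig set"

# Ranges quoted in this post; other settings are not validated.
RANGES = {
    "maxdebuglogmessages": (1000, 4294967295),
    "maxdebuglogfiles": (1, 10000),
}


def build_wmic_command(setting: str, value) -> str:
    """Return the wmic command line that sets one DFSR debug log setting."""
    name = setting.lower()
    if name in RANGES:
        low, high = RANGES[name]
        if not low <= int(value) <= high:
            raise ValueError(f"{setting}={value} is outside {low}..{high}")
    if name == "debuglogfilepath":
        # The path is quoted, as in the command shown above.
        value = f'"{value}"'
    return f"{WMIC_PREFIX} {name}={value}"
```

For example, build_wmic_command("maxdebuglogfiles", 200) returns the same command line as the Debug Log Files setting above, while a value of 0 raises a ValueError instead of producing a command that is out of range.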
- Abizer Hazrat