DFSN and DFSR-RO interoperability: When Good DFS Roots Go Bad…
Hello, Ken here. Today I want to talk about some behaviors that can occur when using Distributed File System Namespaces (DFSN) along with Distributed File System Replication (DFSR) Read-Only members.
The DFSN and DFSR services are independent of each other, and as such have no idea what the other is doing at any given time. Aside from having a name that is confusingly similar, they do completely different jobs. DFSN is for creating transparent connectivity between shared folders located on different servers and DFSR is for replicating files.
With the advent of Windows Server 2008 R2, we gave you the ability in DFSR to create “Read-Only” members. This was a frequent request in previous versions, and it works well.
Historically, a common configuration we have seen with DFSN is replication of the actual namespace root. Replicating a namespace root folder can cause weirdness. Every time the DFSN service starts, it creates reparse points that link to the various shared folders on the network. If a replication service is monitoring the file system for changes, then we can run into timing issues with the creation and deletion of those reparse points, which can leave clients unable to connect to a namespace on a given server. If you have ever talked to a Microsoft Directory Services engineer, they probably discouraged your plans to replicate roots.
Today I am going to show you what will happen if you have a DFS Namespace and you are using DFSR to replicate it, and then you decide to use the handy-dandy read-only feature on one or more of the members.
The setup and the issue
First we start with a word from our sponsor about DFSR: if you would like to know more specifically how read-only members work, check out Ned’s Bee-Log, Read-Only Replication in R2.
Here we have a basic namespace:
Here I am accessing it from a Windows 7 machine, so far so good:
Now I have realized that the scheduled robocopy job that “the guy” who used to work here before me set up is not the greatest solution in the world. I’m going to implement DFSR and keep all my files in sync all the time, 'cause I can and it’s awesome, so I create a DFSR replication group.
Now because I’m working at the DFS root, I don’t have the “Replication” tab option in the namespace properties…hmmm. That’s ok, I can just go down to the Replication section of the DFS management console and create one, like this:
Now that’s all replicating fine and I can create and modify new data.
Quick check of my connection, and I see that I am now connected to the read-only server:
I attempt to make changes to the data in the share, and get the expected error:
This is the DFSRRO.sys filter driver blocking us from making the changes. When a member is marked as read-only, and the DFSRRO.sys driver is loaded, only DFSR itself can change the data in the replicated folder. You cannot give yourself or anybody enough permission to modify the data.
So far everything is great and working according to plan.
Fast-forward a few weeks, and now it’s time to reboot the server for <insert reason here>. No big deal, we have to do this from time to time, so I put it in the change schedule for Friday night and reboot that bad boy like a champ. The server comes up, looks good, and I go home to enjoy the weekend.
Come Monday I get to work and start hearing reports that users in the remote site cannot access the DFS data and they are getting a network error 0x80070035:
This doesn’t add up because all the servers are online and I can RDP to them, but I am seeing this same error on the clients and the server itself. In addition, if I try to access the file share on the server outside of the namespace I get this error "A device attached to the system is not functioning":
What is happening is normal and expected, given my configuration. All the servers are online and on the network, but what I have done is locked myself out of my own share using DFSR. However, the issue here is not really DFSR, but rather with DFSN.
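If you want to sanity-check the two errors above, here is a minimal Python sketch that splits an HRESULT such as 0x80070035 into its fields and names the Win32 error code it wraps. The function name `decode_hresult` and the small lookup table are my own for illustration; the table only covers the codes that come up in this post and is nowhere near complete.

```python
# Sketch: split an HRESULT into its fields and name the Win32 code it wraps.
FACILITY_WIN32 = 7

# Only the codes that appear in this post; not a complete table.
WIN32_ERRORS = {
    0: "ERROR_SUCCESS",
    5: "ERROR_ACCESS_DENIED",
    31: "ERROR_GEN_FAILURE",  # "A device attached to the system is not functioning"
    53: "ERROR_BAD_NETPATH",  # "The network path was not found"
}


def decode_hresult(hresult):
    """Return (is_failure, facility, code, name) for a 32-bit HRESULT."""
    is_failure = bool(hresult & 0x80000000)   # top bit set means failure
    facility = (hresult >> 16) & 0x1FFF       # same mask as the HRESULT_FACILITY macro
    code = hresult & 0xFFFF                   # low 16 bits carry the error code
    if facility == FACILITY_WIN32:
        name = WIN32_ERRORS.get(code, "unknown")
    else:
        name = "not a Win32 code"
    return is_failure, facility, code, name


if __name__ == "__main__":
    print(decode_hresult(0x80070035))  # (True, 7, 53, 'ERROR_BAD_NETPATH')
```

So the client-side error is simply "network path not found" wrapped in an HRESULT, which fits a namespace root that never came up, rather than a server that is actually offline.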
Remember earlier, I said that when a member is marked as read-only and the DFSRRO.sys driver is loaded, only the DFSR service can make changes to the data in the folder. This includes the “folder” itself, and this is where we run into the issue. When the DFS Namespace service starts, it attempts to create the reparse points for the namespace and all the shares below it. The reparse points are stored on the root target server; in my case, it’s the default “DFSRoots” folder. What I have done here has made the root share inaccessible to the DFSN service. Using ETW tracing for the DFSN service, we see the following errors happening under the covers:
· [serverservice]Root 00000000000DF110, name Stuffs
· [serverservice]Opened DFSRoots: Status 0
· [serverservice]Opened Stuffs: Status c0000022
· [serverservice]DFSRoots\Stuffs: 0xc0000022(STATUS_ACCESS_DENIED)
· [serverservice]IsRootShareMountPoint failed share for root 00000000000DF110, (Stuffs) (\??\C:\DFSRoots\Stuffs) 0xc0000022(STATUS_ACCESS_DENIED)
· [serverservice]Root 00000000000DF110, Share check status 5(ERROR_ACCESS_DENIED)
· [serverservice]AcquireRoot share for root 00000000000DF110, (Stuffs) 5(ERROR_ACCESS_DENIED)
· [serverservice]Root folder for Stuffs, status: 5(ERROR_ACCESS_DENIED)
· [serverservice]Done with recognize new dfs, status 0(ERROR_SUCCESS), rootStatus 5(ERROR_ACCESS_DENIED)
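When a translated trace runs to thousands of lines, it helps to pull out only the entries that carry a non-success status. The following is a rough Python sketch that does this for lines shaped like the excerpt above. The line format is assumed purely from that excerpt, and `failures` is a made-up helper name, so treat this as a starting point rather than a parser for real tracelog output.

```python
import re

# Matches tokens such as 0xc0000022(STATUS_ACCESS_DENIED) or 5(ERROR_ACCESS_DENIED).
STATUS_RE = re.compile(r"(0x[0-9a-fA-F]+|\d+)\(((?:STATUS|ERROR)_[A-Z0-9_]+)\)")
LINE_RE = re.compile(r"\[(\w+)\](.*)")


def failures(lines):
    """Yield (component, symbolic_status, message) for each non-success status."""
    for raw in lines:
        line = raw.lstrip("· ").strip()   # drop the bullet used in the excerpt
        match = LINE_RE.match(line)
        if not match:
            continue
        component, message = match.groups()
        for _value, name in STATUS_RE.findall(message):
            if not name.endswith("_SUCCESS"):
                yield component, name, message


# A few lines copied from the excerpt above.
SAMPLE = r"""
· [serverservice]Opened DFSRoots: Status 0
· [serverservice]DFSRoots\Stuffs: 0xc0000022(STATUS_ACCESS_DENIED)
· [serverservice]Done with recognize new dfs, status 0(ERROR_SUCCESS), rootStatus 5(ERROR_ACCESS_DENIED)
""".strip().splitlines()

if __name__ == "__main__":
    for component, name, message in failures(SAMPLE):
        print(f"{component}: {name}: {message}")
```

Lines that only show a bare number, such as "Status c0000022", are skipped on purpose, because the sketch relies on the symbolic name in parentheses to decide what counts as a failure.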
The DFSRRO.sys filter driver is doing its job of not letting anyone change the data.
Note: You may be wondering how I gathered this awesome logging for DFS. I used a utility called tracelog.exe. Tracelog is part of the Windows Driver Kit; it is an event tracing controller that runs from the command line and can be used for all kinds of other ETW tracing as well. Since the tracelog output requires translation, you will need to open a support case with Microsoft in order to read it. Your friendly neighborhood Microsoft Directory Services engineer will be able to help you get it translated.
So what do we do to fix this? Well, the quickest way to resolve the issue is to remove the read-only configuration for the replicated folder. When you add or remove the read-only configuration for a replicated folder, DFSR will run an initial sync process. This should not take too long, as the data should already be mostly in sync. If the data is not fully in sync, then it could take longer, and it will be handled as pre-seeded. Once the initial sync is complete, you will need to restart the DFS Namespace service on the member to allow DFSN to recreate the reparse points. After doing this you will be back in business.
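The service restart at the end of that fix can be scripted. Below is a small Python sketch that wraps `net stop` and `net start`. It assumes the short service name of the DFS Namespace service is `Dfs`, which is not stated in this post, so confirm the name on your own server first. The helper names are hypothetical, and the script defaults to a dry run so nothing is touched until you ask it to.

```python
import subprocess

# Assumption: short service name of the DFS Namespace service. Verify before use.
DFSN_SERVICE = "Dfs"


def restart_commands(service=DFSN_SERVICE):
    """Return the stop and start commands as argument lists, without running them."""
    return [["net", "stop", service], ["net", "start", service]]


def restart_service(service=DFSN_SERVICE, dry_run=True):
    """Stop and then start the service; only print the commands when dry_run is True."""
    for cmd in restart_commands(service):
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            # check=True raises CalledProcessError if the command fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    restart_service(dry_run=True)
```

Run it with `dry_run=False` from an elevated prompt on the namespace server, and only after DFSR has finished its initial sync on the folder.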
Moving forward, we need to make some decisions about how to avoid outages.
The best recommendation would be to stop storing data in the root folder of the DFS Namespace. Instead, create a new folder (either on the same server in a different location or on another server) and create a new file share for it; I called mine “Replicated-Data”. Then you create a new DFS share under the namespace; I named this “Data”. Configure the folder target to point to the new file share with the recently moved data. Once you have your new DFS share, DFSN will even give you the option of configuring replication once you add a second folder target:
When you select the “Replicate Folder Wizard”, it launches the DFSR creation wizard, and we can configure the replication group. Once we run through that, it will look something like this:
Now we have our namespace root and our replicated folder separated. The namespace root is located at “C:\DFSRoots\Stuffs” and the replicated folder is located at “C:\DFSR_Replicated_Folders\Data”, so when we configure the replicated folder as read-only, it will not affect our reparse points in the DFS root folder.
Now if we reboot the server, DFSN and DFSR are able to do their respective jobs without any conflict or client disruptions.
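If you manage more than a handful of namespaces, a quick path check can flag this kind of overlap before it bites you. Here is a minimal Python sketch that compares a namespace root path against a replicated folder path. The function name `overlaps` is hypothetical, and flagging nesting in either direction is my own conservative choice; the scenario in this post is the case where the two paths are the same folder.

```python
from pathlib import PureWindowsPath


def overlaps(namespace_root, replicated_folder):
    """Return True if the two paths are the same folder or one contains the other."""
    root = PureWindowsPath(namespace_root)
    repl = PureWindowsPath(replicated_folder)
    # PureWindowsPath comparisons ignore case, like the Windows file system.
    return root == repl or root in repl.parents or repl in root.parents


if __name__ == "__main__":
    # The separated layout described above: no overlap.
    print(overlaps(r"C:\DFSRoots\Stuffs", r"C:\DFSR_Replicated_Folders\Data"))  # False
    # Replicating the namespace root itself: overlap.
    print(overlaps(r"C:\DFSRoots\Stuffs", r"c:\dfsroots\stuffs"))  # True
```

Because it uses `PureWindowsPath`, the check only compares path strings and never touches the disk, so you can run it from any machine against a list of paths exported from your servers.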
And all was right with the DFSR world once again. Thanks for sticking it out this long.
Ken "I don’t like the name Ned gave me" McMahan