question

MaurySC-7320 asked DSPatrick commented

USN Rollback on a virtualized DC

Hello all,

We're having replication problems between our two domain controllers after restoring one of them from backup. I have since learned that restoring DC VMs is not a good idea, and indications are that the problem is due to a USN rollback.

We have one physical domain controller (running Server 2012 R2) and another one in a VM running on VMware. Both DCs have the same services running: DNS, NPS, NTP. The VM's host became unavailable, so we had to restore a backup of that VM onto another VMware host to get one of our applications working again.

A few days later, we noticed that no one could RDP in to the VM DC. Also, the VM DC has all FSMO roles, and I wanted to transfer those to the physical server prior to installing updates. However, I got an error when attempting that from the physical server, stating the other server could not be contacted.

So I ran "repadmin /replsummary" and "dcdiag", which indicated replication wasn't happening. I did find that the time was way off on the VM DC, but fixing that didn't help. I checked the NTDS registry settings on both machines, and the VM DC has a "Dsa Not Writable" entry with a value indicating a USN rollback.
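For anyone wanting to repeat the registry check, here is a minimal Python sketch. It assumes the entry lives under HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters and that a value of 4 is the USN rollback marker, as Microsoft's USN rollback guidance describes. The function names are made up for illustration, and the registry read only works on Windows.

```python
import sys

# Assumed location and marker value, taken from Microsoft's USN rollback guidance.
NTDS_PARAMETERS_KEY = r"SYSTEM\CurrentControlSet\Services\NTDS\Parameters"
VALUE_NAME = "Dsa Not Writable"
USN_ROLLBACK_MARKER = 4


def interpret_dsa_not_writable(value):
    """Turn the raw registry value (None when absent) into a short verdict."""
    if value is None:
        return "not set: no quarantine recorded on this DC"
    if value == USN_ROLLBACK_MARKER:
        return "USN rollback detected: this DC has been quarantined"
    return f"set to {value}: unexpected value, investigate further"


def read_dsa_not_writable():
    """Read the value from the local registry; returns None if it is absent."""
    import winreg  # Windows-only module, so it is imported lazily

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NTDS_PARAMETERS_KEY) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except FileNotFoundError:
        return None


if __name__ == "__main__" and sys.platform == "win32":
    print(interpret_dsa_not_writable(read_dsa_not_writable()))
```

Run it on each DC in turn. Only the quarantined one should report the rollback verdict.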

We have since brought the original VMware host back up, and the original DC VM on it is still available to start. But I am a bit leery of trying that now that the restored VM has been running. And that DC (the VM) still holds all of the FSMO roles.

So now I'm at a bit of a roadblock, deciding on the best way to resolve this. Would it be advisable to just shut down the restored DC VM and boot the original one back up, and see if everything just works? Or should I demote the VM DC, clean up, and promote it again? If so, what about the FSMO roles on it? (I do also have recent filesystem backups of that VM DC, from before it was restored, which include the system state and AD information.)

Below are the results of running repadmin /replsummary and dcdiag on both DCs.

DC4 is our physical DC, and DC5 is the VM.

From repadmin on DC4:

Source DSA       largest delta      fails/total  %%   error
DC4              11d.22h:47m:07s    5 / 5        100  (8457) The destination server is currently rejecting replication requests.
DC5              11d.17h:18m:38s    5 / 5        100  (8456) The source server is currently rejecting replication requests.

Destination DSA  largest delta      fails/total  %%   error
DC4              11d.17h:18m:39s    5 / 5        100  (8456) The source server is currently rejecting replication requests.
DC5              11d.22h:46m:19s    5 / 5        100  (8457) The destination server is currently rejecting replication requests.

From repadmin on DC5:
Source DSA       largest delta      fails/total  %%   error
DC4              11d.23h:00m:54s    5 / 5        100  (8457) The destination server is currently rejecting replication request
DC5              11d.17h:32m:25s    5 / 5        100  (8456) The source server is currently rejecting replication requests.

Destination DSA  largest delta      fails/total  %%   error
DC4              11d.17h:33m:16s    5 / 5        100  (8456) The source server is currently rejecting replication requests.
DC5              11d.23h:00m:56s    5 / 5        100  (8457) The destination server is currently rejecting replication request
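To compare summaries like these across DCs, the rows can be pulled apart with a small parser. The sketch below is only an illustration: the regular expression is fitted to the rows pasted above (whitespace-separated columns, a delta shaped like 11d.22h:47m:07s), not to every form repadmin can print, and the names ROW, ReplRow and parse_replsummary are made up for this example.

```python
import re
from typing import NamedTuple, Optional

# Pattern fitted to the rows shown in this thread; other delta formats are not handled.
ROW = re.compile(
    r"^\s*(?P<dsa>\S+)\s+"
    r"(?P<delta>(?:\d+d\.)?(?:\d+h:)?\d+m:\d+s)\s+"
    r"(?P<fails>\d+)\s*/\s*(?P<total>\d+)\s+"
    r"(?P<pct>\d+)"
    r"(?:\s+\((?P<code>\d+)\)\s*(?P<message>.*))?$"
)


class ReplRow(NamedTuple):
    dsa: str
    largest_delta: str
    fails: int
    total: int
    percent: int
    error_code: Optional[int]
    message: str


def parse_replsummary(text):
    """Return one ReplRow per data row; header and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue
        code = m.group("code")
        rows.append(ReplRow(
            dsa=m.group("dsa"),
            largest_delta=m.group("delta"),
            fails=int(m.group("fails")),
            total=int(m.group("total")),
            percent=int(m.group("pct")),
            error_code=int(code) if code else None,
            message=m.group("message") or "",
        ))
    return rows


SAMPLE = """\
Source DSA largest delta fails/total %% error
DC4 11d.22h:47m:07s 5 / 5 100 (8457) The destination server is currently rejecting replication requests.
DC5 11d.17h:18m:38s 5 / 5 100 (8456) The source server is currently rejecting replication requests.
"""

for row in parse_replsummary(SAMPLE):
    print(row.dsa, row.error_code, f"{row.fails}/{row.total}")
```

Both DCs report the same pair of codes, 8456 and 8457, for the same partners. That is what you would expect when inbound and outbound replication have been disabled on one side only.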



From dcdiag on DC4:

   ......................... DC4 passed test ObjectsReplicated

Starting test: Replications
[Replications Check,DC4] A recent replication attempt failed:
From DC5 to DC4
Naming Context: DC=ForestDnsZones,DC=(Domain),DC=com
The replication generated an error (8456):
The source server is currently rejecting replication requests.
The failure occurred at 2021-08-18 10:30:30.
The last success occurred at 2021-08-06 17:15:23.
1125 failures have occurred since the last success.
Replication has been explicitly disabled through the server options.
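
As a quick sanity check, the two timestamps and the failure count in this dcdiag output can be compared with the "largest delta" of about 11d 17h that repadmin reports for DC5 as a source. The sketch below just does the arithmetic in Python, using only numbers copied from the output above.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

# Values copied from the dcdiag output above.
last_success = datetime.strptime("2021-08-06 17:15:23", FMT)
last_failure = datetime.strptime("2021-08-18 10:30:30", FMT)
failures = 1125

outage = last_failure - last_success
seconds_per_failure = outage.total_seconds() / failures

print(outage)                      # 11 days, 17:15:07
print(round(seconds_per_failure))  # 900
```

The gap of 11 days 17 hours lines up with the repadmin delta, and the failure count works out to roughly one failed attempt every 15 minutes for this naming context.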

…And from DC5, the VM DC:


Testing server: Default-First-Site-Name\DC5
Starting test: Advertising
Warning: DsGetDcName returned information for \\DC4…, when we were trying to reach DC5.
SERVER IS NOT RESPONDING or IS NOT CONSIDERED SUITABLE.
......................... DC5 failed test Advertising
Starting test: FrsEvent
......................... DC5 passed test FrsEvent
Starting test: DFSREvent
There are warning or error events within the last 24 hours after the SYSVOL has been shared. Failing SYSVOL replic
......................... DC5 passed test DFSREvent
Starting test: SysVolCheck
......................... DC5 passed test SysVolCheck
Starting test: KccEvent
......................... DC5 passed test KccEvent
Starting test: KnowsOfRoleHolders
......................... DC5 passed test KnowsOfRoleHolders
Starting test: MachineAccount
......................... DC5 passed test MachineAccount
Starting test: NCSecDesc
......................... DC5 passed test NCSecDesc
Starting test: NetLogons
......................... DC5 passed test NetLogons
Starting test: ObjectsReplicated
......................... DC5 passed test ObjectsReplicated
Starting test: Replications
[Replications Check,Replications Check] Inbound replication is disabled.
To correct, run "repadmin /options DC5 -DISABLE_INBOUND_REPL"
[Replications Check,DC5] Outbound replication is disabled.
To correct, run "repadmin /options DC5 -DISABLE_OUTBOUND_REPL"
......................... DC5 failed test Replications
Starting test: RidManager
......................... DC5 passed test RidManager
Starting test: Services
w32time Service is stopped on [DC5]
NETLOGON Service is paused on [DC5]
......................... DC5 failed test Services
Starting test: SystemLog

<Group policy errors>

    ......................... DC5 failed test SystemLog
   Starting test: VerifyReferences
      ......................... DC5 passed test VerifyReferences


Running partition tests on : DomainDnsZones
Starting test: CheckSDRefDom
......................... DomainDnsZones passed test CheckSDRefDom
Starting test: CrossRefValidation
......................... DomainDnsZones passed test CrossRefValidation

Running partition tests on : ForestDnsZones
Starting test: CheckSDRefDom
......................... ForestDnsZones passed test CheckSDRefDom
Starting test: CrossRefValidation
......................... ForestDnsZones passed test CrossRefValidation

Running partition tests on : Schema
Starting test: CheckSDRefDom
......................... Schema passed test CheckSDRefDom
Starting test: CrossRefValidation
......................... Schema passed test CrossRefValidation

Running partition tests on : Configuration
Starting test: CheckSDRefDom
......................... Configuration passed test CheckSDRefDom
Starting test: CrossRefValidation
......................... Configuration passed test CrossRefValidation

Running partition tests on : <Domain>
Starting test: CheckSDRefDom
......................... <Domain>passed test CheckSDRefDom
Starting test: CrossRefValidation
......................... <Domain>passed test CrossRefValidation

Running enterprise tests on : <Domain>.com
Starting test: LocatorCheck
......................... <Domain>.com passed test LocatorCheck
Starting test: Intersite
......................... <Domain>.com passed test Intersite


Thanks for any guidance!

windows-active-directory

DSPatrick answered DSPatrick commented

It isn't recommended to restore a domain controller from backup in a multi-domain-controller environment. The safer, cleaner option is to seize the roles to another healthy DC:
https://docs.microsoft.com/en-us/troubleshoot/windows-server/identity/transfer-or-seize-fsmo-roles-in-ad-ds


Then perform cleanup of the failed one:
https://docs.microsoft.com/en-us/windows-server/identity/ad-ds/deploy/ad-ds-metadata-cleanup
https://techcommunity.microsoft.com/t5/itops-talk-blog/step-by-step-manually-removing-a-domain-controller-server/ba-p/280564

Then rebuild the failed one from clean install media. Before starting any of these operations, use the dcdiag and repadmin tools to verify health, and correct all errors found. Then stand up the new one: patch it fully, license it, join it to the existing domain, add Active Directory Domain Services, and promote it, also making it a GC (recommended). Transferring the FSMO roles over, including the PDC emulator role, is optional. Use the dcdiag and repadmin tools again to verify health. When all is good, you can decommission or demote the old one.


--please don't forget to upvote and Accept as answer if the reply is helpful--









Thanks Patrick! I'll give this a shot this weekend.


Another thought I had: we still have the original VM that the DC was on (which was not restored from backup). It has just been shut down since the host crashed. After reinstalling ESXi on the host, we found the datastore was still intact. Can you think of any issues with just trying to start up that VM again (after shutting down the VM that was restored from backup)?

Thanks again


It's difficult to say at this point. You could give it a try.





I started up our original DC VM and everything looks to be running fine with it. Thanks again for your help Patrick.

DSPatrick answered

Just checking if there's any progress or updates?



