Friday Mail Sack: The Gang’s All Here Edition
Hi folks, Ned here again with your questions and our answers. This is a pretty long one; looks like everyone is back from vacation, winter storms, and hiding from the boss. Today we talk Kerberos, KCC, SPNs, PKI, USN journaling, DFSR, auditing, NDES, PowerShell, SIDs, RIDs, DFSN, and other random goo.
- DC NIC teaming
- SPNs with IP addresses
- Moving the DFSR conflict folder
- Issuing user certs to unmanageable Apple devices
- DFSR USN Journal recommendations
- Windows 2008 DFS Event Log Messages
- DFSR and object access auditing SACLs
- SID uniqueness
- USN journal loss
- Setting SPN with AD PowerShell
- KCC nomination
- Other random goo
Is NIC teaming recommended on domain controllers?
It’s a sticky question – MS does not make a NIC teaming solution, so you are at the mercy of 3rd party vendor software and if there are any issues, we cannot help other than to break the team. So the question you need to answer is “do you trust your NIC vendor support?”
Generally speaking, we are not huge fans of NIC teaming, as we see customers having frequent driver issues and because a DC probably doesn’t need it. If clients are completely consuming 1Gbit or 10Gbit network interfaces, the DC is probably being overloaded with requests. Doubling that network would make things worse; it’s better to add more DCs. And if the DC is also running Exchange, file server, SQL, etc. you are probably talking about an environment without many users or clients.
A failover NIC solution is probably a better option if your vendor supports it. Meaning that the second NIC is only used if the first one burns out and dies, all on the same network.
We used to manually create SPNs with IP addresses to allow Kerberos without network name resolution. This worked in Windows XP and 2003 but stopped working in later operating systems. Is this expected?
Yes it is. Starting in Windows Vista and forever more, the OS examines the format of the SPN being requested and if it is only an IP address, Kerberos is not even attempted. There’s no way to override this behavior. If I look at it in practical terms, having manually set an IP Address for SPN:
Then I actually try mapping a drive here with an IP address (which would have worked in XP in this scenario):
No tickets were cached above. And in the network capture below, it’s clear that I am using NTLM:
This is why in this previous post – see the “I want to create a startup script via GPO” and “NTLM is not allowed for computer-to-computer communication” sections – I highly discouraged customers from this sort of hacking. What I didn’t realize when I wrote the old post was that I now have the power to control the future with my mind.
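To make the rule from this answer concrete (an SPN whose host part is only an IP address never gets a Kerberos attempt), here is a minimal Python sketch. It is not the Windows implementation, which the post does not show; the function names, the IPv4-only check, and the SPN layout `class/host[:port][/name]` are assumptions made for illustration.

```python
import ipaddress


def spn_host(spn: str) -> str:
    """Return the host part of an SPN shaped like class/host[:port][/name]."""
    parts = spn.split("/")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"not a recognizable SPN: {spn!r}")
    host = parts[1]
    # Drop an optional :port suffix (IPv4 and DNS names only in this sketch).
    if ":" in host:
        host = host.rsplit(":", 1)[0]
    return host


def host_is_ip_literal(spn: str) -> bool:
    """True when the SPN's host part parses as a dotted IPv4 address."""
    try:
        ipaddress.IPv4Address(spn_host(spn))
    except ValueError:
        return False
    return True


def expected_auth(spn: str) -> str:
    """Mirror the behavior described in the answer, as a label only."""
    return "NTLM (Kerberos skipped)" if host_is_ip_literal(spn) else "Kerberos attempted"


if __name__ == "__main__":
    for candidate in ("cifs/10.1.2.3", "cifs/srv01.contoso.com"):
        print(candidate, "->", expected_auth(candidate))
```

The host names and addresses in the demo loop are made up; substitute your own SPNs to see which ones fall into the IP-only bucket.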
I see that the DFSR staging folder can be moved, but can the Conflict and Deleted (\dfsrprivate\conflictanddeleted) folder be relocated? If so, how?
It cannot be moved or renamed – this was once planned (and there is even an AD attribute that makes one think the location could be specified) but it never happened in the service code. Regardless of what you put in that attribute, DFSR ignores it and creates a C&D folder at the default location.
For example, here I specified a completely different C&D path using ADSIEDIT.MSC before DFSR even created the folder. Once I started the DFSR service, it ignored my setting and created the conflict folder with defaults:
We are trying to find the best way to issue Active Directory "User" certificates to iPhones and iPads, so these users can authenticate to our third party VPN appliance using their "user" certificate. We were thinking that MS NDES could help us with this. Everything I have read says that NDES is used for non-domain "computer or device" enrollment.
[From Rob Greene, author of previous post iPad / iPhone Certificate Issuance ]
Just because the certificate template that is used by NDES must be of type computer does not mean you cannot build a SCEP protocol message to the NDES Server for use by a user account on the iPhone in question.
Keep in mind that the SCEP protocol was designed by Cisco for their network appliances to be able to enroll for certificates online. Also understand what NDES means - Network Device Enrollment Service.
Realistically there is no reason why you cannot enroll for a certificate via the SCEP interface with NDES and have a user account use the issued certificate. However, NDES is coded specifically to allow enrollment of computer-based certificate templates only. If you put a user-based template name in the registry for it to issue, it will fail with a not-so-easily deciphered message.
That said, keep in mind that the Subject or Subject Alternative Name field identifies the user of the certificate, not the template.
So what you could do is:
- Duplicate the computer certificate template.
- Then change the subject to “Supply in the Request”
- Then give the template a unique name.
- Make sure that the NDES account and Administrator have security access to the template for Enroll.
- Assign the Template to be issued.
- Then you need to assign the template to one of the purposes in the NDES registry (You might want to use the one for both signing and encrypting). See the blog.
Now that you have a certificate with the EKU of Client Authentication and a subject / SAN of the user account, I don’t see why you could not use that for what you need. Not that I have tested this or can test this, mind you…
Is there a “proper” USN Journal setting versus replicated data sizes, etc. on the respective volumes housing DFSR data? I've come across USN journal wrap issues (that properly self heal ... and then occur again a month or so later). I’m hoping to know a happy medium on USN journal sizing versus size of volume or data that resides on that volume.
I did a quick bit of research - in the history of all MS DFSR support cases, it was necessary to increase the USN journal size for five customers – not exactly a constant need. Our recommendation is not to alter it unless you get multiple 2202 events that can’t be fixed any other way:
The DFS Replication service has detected an NTFS change journal wrap on volume %2.
A journal wrap can occur for the following reasons:
1. The USN journal on the volume has been truncated. Chkdsk can truncate the journal if it finds corrupt entries at the end of the journal.
2. The DFS Replication service was not running on this computer for an extended period of time.
3. The DFS Replication service could not keep up with the rate of file changes on the volume.
The service has automatically initiated the journal wrap recovery process.
Since you are getting multiple 2202 occurrences, I would recommend first figuring out why you are getting the journal wraps. The three reasons listed in the event need to be considered – the first two are avoidable (fix your disk or controller and stop turning the service off) and should be handled without a need to alter the USN journal.
The third one may mean you are not using DFSR as recommended, but that may be unavoidable. In that case, set the USN size value to 1GB and validate that the issue stops occurring. We have no real formula here (remember, only five customers ever), but if you cannot spare another 512MB on the drive you have much more important problems to consider around disk capacity. If still not enough, revisit if DFSR is the right solution for you – the amount of changes occurring would have to be so incredibly rapid that I doubt DFSR could ever realistically keep up and converge. And make sure that nothing else is updating all the files outside of the journal on that drive – there is only one journal and it contains entries for all files, even the ones not being replicated!
Just to answer the inevitable question: you use WMI to increase the USN journal size.
On Win2003 R2 only:
1. Determine the volume in question (USN journals are volume specific) and the GUID for that volume by running the following:
WMIC.EXE /namespace:\\root\Microsoftdfs path DfsrVolumeInfo get VolumePath
WMIC.EXE /namespace:\\root\Microsoftdfs path DfsrVolumeInfo get VolumeGUID
This will return (for example):
2a. Raise the USN Journal Size (for one particular volume):
WMIC /namespace:\\root\microsoftdfs path dfsrvolumeconfig.VolumeGuid="%GUID%" set minntfsjournalsizeinmb=%MB SIZE%
where you replace '%GUID%' with the volume GUID and '%MB SIZE%' with a larger USN size in MB. For example:
WMIC /namespace:\\root\microsoftdfs path dfsrvolumeconfig.VolumeGuid="D1EB0B66-9403-11DA-B12E-0003FFD1390B" set minntfsjournalsizeinmb=1024
This will return 'Property Update Successful' for that GUID.
2b. Raise the USN Journal Size (for all volumes):
WMIC /namespace:\\root\microsoftdfs path dfsrvolumeconfig set minntfsjournalsizeinmb=%MB SIZE%
This will return 'Property Update Successful' for ALL the volumes.
3. Restart server for new journal size to take effect in NTFS.
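The WMIC lines in step 2 are easy to mistype, so here is a minimal Python sketch that only builds the command string for either case (one volume or all volumes). It does not run WMIC or touch the journal; `journal_size_command` is a hypothetical helper name, and the validation rules (a well-formed GUID, a positive whole number of megabytes) are assumptions rather than anything DFSR enforces.

```python
import uuid
from typing import Optional

# Namespace exactly as used in the WMIC examples above.
NAMESPACE = r"\\root\microsoftdfs"


def journal_size_command(size_mb: int, volume_guid: Optional[str] = None) -> str:
    """Build the WMIC command line that raises minntfsjournalsizeinmb.

    With volume_guid set, the command targets that one volume (step 2a);
    without it, the command targets every volume (step 2b).
    """
    if not isinstance(size_mb, int) or size_mb <= 0:
        raise ValueError("size_mb must be a positive whole number of megabytes")

    if volume_guid is None:
        target = "dfsrvolumeconfig"
    else:
        # uuid.UUID raises ValueError for a malformed GUID string.
        guid = str(uuid.UUID(volume_guid)).upper()
        target = f'dfsrvolumeconfig.VolumeGuid="{guid}"'

    return f"WMIC /namespace:{NAMESPACE} path {target} set minntfsjournalsizeinmb={size_mb}"


if __name__ == "__main__":
    print(journal_size_command(1024, "D1EB0B66-9403-11DA-B12E-0003FFD1390B"))
    print(journal_size_command(1024))
```

Printing the command instead of executing it keeps a human in the loop, which seems appropriate for a change that needs a server restart to take effect.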
Update 4/15/2011 - On Win2008 or later:
1. Open Windows Explorer.
2. In Tools | Folder Options | View - uncheck 'Hide protected operating system files'.
3. Navigate to each drive's 'system volume information\dfsr\config' folder (you will need to add 'Administrators, Full Control' to prevent access denied error).
4. In Notepad, open the 'Volume_%GUID%.xml' file for each volume you want to increase.
5. There will be a set of tags that look like this:
6. Stop the DFSR service.
7. Change '512' to the new increased value.
8. Close and save that file, and repeat for any other volumes you want to up the journal size on.
9. Start the DFSR service back up.
There is a list of DFS Namespace events for Server 2000 at http://support.microsoft.com/kb/315919. I was wondering if there is a similar list of Windows 2008 DFS Event Log Messages?
That event logging system in KB315919 exists only in Win2000 – Win2003 and later OSs don’t have it anymore. That KB is a bit misleading also: these events will never write unless you enable them through registry settings.
Registry Key: HKEY_LOCAL_MACHINE\SOFTWARE\MicroSoft\Windows NT\CurrentVersion\Diagnostics
Value name: RunDiagnosticLoggingDfs
Value type: REG_DWORD
Value data: 0 (default: no logging), 2 (verbose logging)
Registry Key: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Dfs
Value name: DfsSvcVerbose
Value type: REG_DWORD
Value data: Any one of the below three values:
0 (no debug output)
1 (standard debug output)
0x80000000 (standard debug output plus additional Dfs volume call info)
Value name: IDfsVolInfoLevel
Value type: REG_DWORD
Value data: Any combination of the following 3 flags:
Dave and I scratched our heads and in our personal history of supporting DFSN, neither of us recalled ever turning this on or using those events for anything useful. Not that it matters now, Windows 2000 is as dead as fried chicken.
We currently have inherited auditing settings on a lot of files and folders that live on our two main DFSR servers. The short story is that before the migration to DFSR, the audit settings were apparently added by someone to the majority of the files/folders. This was replicated by DFSR and now is set on both servers. Thankfully we do not have any audit policies turned on for those servers currently.
That is where the question comes in: there may be a time in the relatively near future that we will want to enable some auditing for a subset of files/folders. Any suggestions on how we could remove a lot of the audit entries on these servers, without forcing nearly every file to get processed by DFSR?
Nope, it’s going to cause an unavoidable backlog as DFSR reconciles all the security changes you just made – the audit security is part of the file just like the discretionary security. Don’t do that until you have a nice big change control window open. Maybe just do some folders at a time.
In the future, using Global Object Access Auditing would be an option (if you have Win2008 R2 on all DFSR servers). Since it is all derived by LSA and not directly stamped, DFSR won’t replicate the file – the files are never actually changed. It’s slick:
In theory, you could get rid of the auditing currently in place and just use GOAA someday when you need it. It’s the future of file auditing, in my opinion; using direct SACLs on files should be discouraged forever more.
Does the SID for an object have to be unique across the entire forest? It is pretty clear from existing documentation that the SID does have to be unique within a domain because of the way the RID Master distributes RID pools to the DCs. Does the RID Master in the Forest Root domain actually keep track of all the unique base SIDs of all domains to ensure that there is no accidental duplication of the unique base domain SIDs?
A SID will be unique within a forest, as each domain has a unique base SID that combines with a RID. That’s why there’s a RID master per domain. There is no reasonable way for the domain SIDs to ever be duplicated by Windows, although I have seen some third party products that made it happen. All hell broke loose, we don’t plan for the impossible. :) Even if you use ADMT to migrate users with SID History within a forest, it will not be duplicated as the migration will always destroy the old user when it is “moved”.
The RID Masters don’t talk to each other within the forest (any more than they would between different forests, where a duplicate SID would cause just as many problems when you tried to create a trust). The base SID includes three randomly generated 32-bit values (96 bits in total), so there is no reasonable way it could be duplicated by accident in the same environment. It comes down to us relying on the odds of two domains that know of each other ending up with the same SID through pure random chance – highly unlikely math.
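To put a rough number on "highly unlikely", here is a small Python sketch. It splits an account SID into its base domain SID and RID, then estimates the chance that any two of a given number of domains end up with the same random base, using the standard birthday approximation. The function names and the sample SID are made up for illustration, and the number of random bits is passed in as a parameter rather than asserted here.

```python
import math


def split_sid(sid: str) -> tuple:
    """Split 'S-1-5-21-a-b-c-rid' into (base domain SID, RID)."""
    parts = sid.split("-")
    if len(parts) != 8 or parts[:4] != ["S", "1", "5", "21"]:
        raise ValueError(f"not a domain account SID: {sid!r}")
    return "-".join(parts[:-1]), int(parts[-1])


def collision_probability(domains: int, random_bits: int) -> float:
    """Birthday approximation: chance that any two domains share a base SID.

    Uses 1 - exp(-n(n-1) / (2 * 2**bits)); expm1 keeps precision when the
    result is extremely small.
    """
    space = 2 ** random_bits
    return -math.expm1(-domains * (domains - 1) / (2 * space))


if __name__ == "__main__":
    base, rid = split_sid("S-1-5-21-1004336348-1177238915-682003330-512")
    print("base domain SID:", base)
    print("RID:", rid)
    print("collision chance, 1000 domains:", collision_probability(1000, 96))
```

Two accounts in different domains can share a RID without any conflict, because the base domain SID in front of it differs; that is the reason a per-domain RID master is enough.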
You’ll also find no mention of inter-RID master communication needs or error messages here:
I have this message in a health report:
“A USN journal loss occurred 2 times in the past 7 days on E:. DFS Replication monitors the USN journal to detect changes made to the replicated folder. Although DFS Replication automatically recovers from this problem, replication stops temporarily for replicated folders stored on this volume. Repeated journal loss usually indicates disk issues. Event ID: 2204”
Is this how the health report indicates a journal wrap, or can I take “loss” literally?
Ouch. That’s not a wrap, the journal was deleted or irrevocably damaged. I have never actually seen that event in the field, only in a test lab where I deleted my journal intentionally (using the nasty command: FSUTIL.EXE USN DELETEJOURNAL). I would suspect either a failing disk or 3rd party disk management software. It’s CHKDSK and disk diagnostic time for you.
For event 2204, the net recovery process is similar to a wrap: the journal gets recreated, then repopulated like a wrap recovery (it uses the same code). You get event 2206 to let you know that it’s fixed.
How come there is no “Set-SPN” cmdlet in AD PowerShell?
Specifies the service principal names for the account. This parameter sets the ServicePrincipalNames property of the account. The LDAP display name (ldapDisplayName) for this property is servicePrincipalName. This parameter uses the following syntax to add, remove, replace or clear service principal name values.
To add values:
To remove values:
To replace values:
To clear all values:
You can specify more than one change by using a list separated by semicolons. For example, use the following syntax to add and remove service principal names.
The operators will be applied in the following sequence:
The following example shows how to add and remove service principal names.
We do not have any special handling to retrieve SPNs using Get-AdComputer or Get-Aduser (nor any other attributes – they treat all as generic properties). For example:
get-adcomputer name -properties serviceprincipalnames | select-object -expand serviceprincipalnames
I used select-object -expand because when you get a really long returned list, PowerShell likes to start truncating the readable output. Also, when I don’t know which cmdlets support which things, I sometimes cheat, or rather use educated guesses:
I have posted a TechNet forum question around the frequency of KCC nomination and rebuilding and I was hoping you could reply to it.
“…He had made an update to the Active Directory Schema and as a safety-net had switched off one of our domain controllers whilst he did it. The DC (2008 R2) that was switched off was at the time acting as the automatically determined bridgehead server for the site.
Obviously the next thing that has to happen is for the KCC to run, discover the bridgehead server is still offline and re-nominate. My colleague thinks that this re-nomination should take up to 2 hours to happen. However all the documentation I can find suggests that this should be every 15 minutes. His argument is that it is a process of sampling, that it realises the problem every 15 minutes but can take up to 2 hours to actually action the change of bridgehead.
Can anyone tell me which of us is right please and if we could have a problem?”
We are running an exchange program between MS Support and MS Premier Field Engineering and our current guest is AD topology guru Keith Brewer. He replied in exhaustive detail here:
Attaboy Keith, now you’re doing it our way – when in doubt, use overwhelming force.
Other random goo
- If you’re going to jailbreak phones, do it with Microsoft – you get a free handset and t-shirt instead of a subpoena.
- The 2011 CES Innovation Honoree awards are out. Holy crap, the Digital Storm Online gaming rig is nom nom nom. I also want the Recon goggles for no legitimate reason.
- SlingBox and Win7 Phone have had a beautiful baby.
- It’s utterly impossible, but Duke Nukem Forever comes out May 3rd. Trailer is not SFW, as you would expect.
Unless it doesn’t.
- Star Wars on Blu-ray coming in September, now up for pre-order. Damn, I guess I have to get Blu-ray. Hopefully Lucas uses the opportunity to remove all midichlorian references.
- The 6 Most Insane Cities Ever Planned. This is from Cracked, so as usual… somewhat NSFW due to swearing.
- Not sure which sci-fi apocalypse is right for you? Use this handy chart.
- It was an interesting week for Artificial Intelligence and gaming, between Starcraft and Jeopardy.
Until next time.
Ned “and return to Han shooting first!” Pyle