It turns out that weird things can happen when you mix Windows Server 2003 and Windows Server 2012 R2 domain controllers
UPDATE: The hotfix is now available for this issue! Get it at http://support.microsoft.com/kb/2989971
This hotfix applies to Windows Server 2012 R2 domain controllers and should prevent the specific problem discussed below from occurring.
It’s important to note that the symptoms of users and computers not being able to log on can happen for a number of different reasons. Many of the folks in the comments have posted that they have these sorts of issues but don’t have Windows Server 2003 domain controllers, for example. If you’re still having problems after you have applied the hotfix, please call in a support case so that we can help you get those fixed!
We have been getting quite a few calls lately where Kerberos authentication fails intermittently and users are unable to log on. By itself, that’s a type of call that we’re used to and we help our customers with all the time. Most experienced AD admins know that this can happen because of broken AD replication, unreachable DCs on your network, or a variety of other environmental issues that all of you likely work hard to avoid as much as possible - because let’s face it, the last thing any admin wants is to have users unable to log in – especially intermittently.
Anyway, we’ve been getting more calls than normal about this lately, and that led us to take a closer look at what was going on. What we found is that there’s a problem that can manifest when you have Windows Server 2003 and Windows Server 2012 R2 domain controllers serving the same domain. Since many of you are trying very hard to get rid of your last Windows Server 2003 domain controllers, you might be running into this. In the case of the customers that called us, the login issues were actually preventing them from being able to complete their migration to Windows Server 2012 R2.
We want all of our customers to be running their Active Directory on the latest supported OS version, which is frankly a lot more scalable, robust, and powerful than Windows Server 2003. We realize that upgrading an enterprise environment is not easy, and much less so when your users start to have problems during your upgrade. So we’re just going to come out and say it right up front:
We are working on a hotfix for this issue, but it’s going to take us some time to get it out to you. In the meantime, here are some details about the problem and what you can do right now.
1. When any domain user tries to log on to their computer, the logon may fail with “unknown username or bad password”. Only local logons are successful.
If you look in the system event log, you may notice Kerberos events with ID 4 that look like this:
Event ID: 4
"The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server host/myserver.domain.com. This indicates that the password used to encrypt the Kerberos service ticket is different than that on the target server. Commonly, this is due to identically named machine accounts in the target realm (domain.com), and the client realm. Please contact your system administrator."
2. Operating Systems on which the issue has been seen: Windows 7, WS2008 R2, WS2012 R2
3. This can affect clients and servers (including domain controllers).
4. This problem specifically occurs after the affected machine has changed its password. The symptoms can take anywhere from a few minutes to a few hours after the change to manifest.
So, if you suspect you have a machine with this issue, check when it last changed its password and whether this was around the time when the issue started.
This can be done using the repadmin /showobjmeta command:
Repadmin /showobjmeta * "CN=mem01,OU=Workstations,DC=contoso,DC=com"
This command retrieves the object metadata for the mem01 server from all DCs.
In the output, check the pwdLastSet attribute and see whether the timestamp is close to the time you started to see the problem on this machine.
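If you read pwdLastSet directly from the directory instead of from the repadmin output, the raw value is a Windows FILETIME (100-nanosecond ticks since 1601-01-01 UTC). The following is a minimal Python sketch of the comparison described above. The function names and the 12-hour window are assumptions for illustration only; the post only says "a few minutes to a few hours".

```python
from datetime import datetime, timedelta, timezone

# Windows FILETIME epoch: 1601-01-01 00:00:00 UTC, counted in 100-nanosecond ticks.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(filetime):
    """Convert a raw pwdLastSet value to a timezone-aware UTC datetime."""
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)


def changed_shortly_before(pwd_last_set, first_failure, window=timedelta(hours=12)):
    """Return True if the password change precedes the first failure by at most `window`.

    The 12-hour default is an assumption, not a documented threshold.
    `first_failure` must be a timezone-aware datetime.
    """
    changed = filetime_to_datetime(pwd_last_set)
    return timedelta(0) <= first_failure - changed <= window
```

A result of True only tells you that the timing fits the pattern described here; it does not by itself confirm that this is the issue you are hitting.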
Why this happens:
The Kerberos client depends on a “salt” from the KDC in order to create the AES keys on the client side. These AES keys are used to hash the password that the user enters on the client, and protect it in transit over the wire so that it can’t be intercepted and decrypted. The “salt” refers to information that is fed into the algorithm used to generate the keys, so that the KDC is able to verify the password hash and issue tickets to the user.
When a Windows Server 2012 R2 DC is promoted in an environment where Windows Server 2003 DCs are present, there is a mismatch in the encryption types that are supported on the KDCs and used for salting. Windows Server 2003 DCs do not support AES and Windows Server 2012 R2 DCs don’t support DES for salting.
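To see why the salt matters, here is a rough Python sketch showing that the same password combined with two different salts produces two different keys. It covers only the PBKDF2 stage of the AES string-to-key function from RFC 3962 (with that RFC's default of 4096 iterations) and leaves out the final key-derivation step, so the output is not a real Kerberos key. The password and both salt strings are made-up values for illustration.

```python
import hashlib


def pbkdf2_stage(password, salt, iterations=4096):
    """PBKDF2-HMAC-SHA1 stage of the RFC 3962 string-to-key (AES256 key length).

    The full Kerberos derivation applies one more step that is omitted here,
    so this is an illustration of salting, not a usable Kerberos key.
    """
    return hashlib.pbkdf2_hmac(
        "sha1", password.encode("utf-8"), salt.encode("utf-8"), iterations, dklen=32
    )


password = "example-machine-password"           # hypothetical value
salt_kdc = "CONTOSO.COMhostmem01.contoso.com"   # hypothetical salt the KDC uses
salt_client = "CONTOSO.COMmem01"                # hypothetical mismatched salt

key_kdc = pbkdf2_stage(password, salt_kdc)
key_client = pbkdf2_stage(password, salt_client)

# Same password, different salt: the two sides end up with different keys.
print(key_kdc == key_client)  # prints False
```

If the client and the KDC disagree on the salt, their keys do not match even though the password is correct, which is consistent with the "bad password" style of failure described above.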
You might be wondering why these encryption types matter. As computer hardware gets more powerful, older encryption methods become easier and easier to break. Thus, we are constantly incorporating newer, more powerful encryption into Windows and Kerberos in order to help protect your user passwords (and your data and your network).
If users are having the problem:
Restart the computer that is experiencing the issue. This recreates the AES key as the client machine or member server reaches out to the KDC for the salt. Usually, this will fix the issue temporarily (at least until the next password change).
To prevent this from happening, please apply the hotfix to all Windows Server 2012 R2 domain controllers in the environment.
How to prevent this from happening:
Option 1: Query Active Directory for the list of computers that are about to change their machine account password, proactively reset their passwords against a Windows Server 2012 R2 DC, and then reboot them.
There’s an advantage to doing it this way: since you are not disabling any encryption type and keeping things set at the default, you shouldn’t run into any other authentication related issue as long as the machine account password is reset successfully.
Unfortunately, doing this will mean a reboot of machines that are about to change their passwords, so plan on doing this during non-business hours when you can safely reboot workstations.
We’ve created a quick PowerShell script that you can run to do this.
Sample PS script:
Import-Module ActiveDirectory
Get-ADComputer -Filter * -Properties PasswordLastSet | Export-Csv machines.csv
This will get you the list of machines and the dates they last set their password. By default, machines reset their password every 30 days. Open the created CSV file in Excel and identify the machines that last set their password 28 or 29 days ago. (If you see a lot of machines with dates well beyond 30 days, it is likely that these machines are no longer active.)
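If you would rather not sort through the file by hand, the filtering step can be sketched in Python. This is a rough example, not a supported tool. It assumes the CSV has Name and PasswordLastSet columns, that it may start with a "#TYPE" line, and that dates were written in the en-US format (adjust DATE_FORMAT for other locales). The function name and sample rows are made up.

```python
import csv
from datetime import datetime, timedelta

# Assumption: dates were exported in en-US format, e.g. "8/1/2014 9:00:00 AM".
DATE_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def machines_due(lines, now, min_age_days=28, max_age_days=30, date_format=DATE_FORMAT):
    """Return (name, age_in_days) for machines whose password age is in
    [min_age_days, max_age_days), oldest first."""
    # Skip the optional "#TYPE ..." header line that Export-Csv may write.
    rows = csv.DictReader(line for line in lines if not line.startswith("#TYPE"))
    due = []
    for row in rows:
        stamp = (row.get("PasswordLastSet") or "").strip()
        if not stamp:
            continue  # no recorded password change for this account
        age = now - datetime.strptime(stamp, date_format)
        if timedelta(days=min_age_days) <= age < timedelta(days=max_age_days):
            due.append((row["Name"], age.days))
    return sorted(due, key=lambda item: item[1], reverse=True)


# Made-up sample rows in the shape the export is assumed to have.
sample = [
    '#TYPE Microsoft.ActiveDirectory.Management.ADComputer\n',
    '"Name","PasswordLastSet"\n',
    '"MEM01","8/1/2014 9:00:00 AM"\n',
    '"MEM02","8/20/2014 9:00:00 AM"\n',
    '"OLD01","1/5/2014 9:00:00 AM"\n',
]
print(machines_due(sample, now=datetime(2014, 8, 29, 12, 0)))  # [('MEM01', 28)]
```

To run it against the real export, pass an open file handle (for example one opened with open("machines.csv", newline="")) and datetime.now() in place of the sample values.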
Once you have identified the machines that are most likely to hit the issue in the next couple of days, proactively reset their password by running the below command on those machines. You can use tools such as psexec, system center or other utilities that allow you to remotely execute the command instead of logging in interactively to each machine.
nltest /SC_CHANGE_PWD:<DomainName> /SERVER:<Target Machine>
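For more than a handful of machines, it can help to generate the command lines up front and hand them to whatever remote execution tool you use. The Python sketch below only builds the strings in the form shown above and does not execute anything. The function name, domain, and machine names are made up.

```python
def nltest_commands(domain, machines):
    """Build one nltest password-change command line per target machine."""
    commands = []
    for name in machines:
        name = name.strip()
        if not name:
            continue  # ignore blank entries in the input list
        commands.append(f"nltest /SC_CHANGE_PWD:{domain} /SERVER:{name}")
    return commands


# Hypothetical domain and machine names.
for line in nltest_commands("contoso.com", ["mem01", "mem02"]):
    print(line)
```

Review the generated list before running it, since each of these machines will also need the reboot described in Option 1.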
Option 2: Disable machine password change or increase duration to 120 days.
You should not run into this issue at all if password change is disabled. Normally we don’t recommend doing this, since machine account passwords are a core part of your network security and should be changed regularly. However, because it’s an easy workaround, the best mitigation right now is to set the maximum age to 120 days. That way you buy time while you wait for the hotfix.
If you go with this approach, make sure you set your machine account password duration back to normal after you’ve applied the hotfix that we’re working on.
Here are the relevant Group Policy settings to use for this option:
Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options
Domain Member: Maximum machine account password age:
Domain Member: Disable machine account password changes:
Option 3: Disable AES in the environment by modifying Supported Encryption Types for Kerberos using Group Policy. This tells your domain controllers to use RC4-HMAC as the encryption algorithm, which is supported in Windows Server 2003, Windows Server 2012, and Windows Server 2012 R2.
You may have heard that we had a security advisory recently to disable RC4 in TLS. Such attacks don’t apply to Kerberos authentication, but there is ongoing research in RC4 which is why new features such as Protected Users do not support RC4. Deploying this option on a domain computer will make it impossible for Protected Users to sign on, so be sure to remove the Group Policy once the Windows Server 2003 DCs are retired.
The advantage to doing this is that once the policy is applied consistently, you don’t need to chase individual workstations. However, you’ll still have to reset machine account passwords and reboot computers to make sure they have new RC4-HMAC keys stored in Active Directory.
You should also make sure that the hotfix https://support.microsoft.com/kb/2768494 is in place on all of your Windows 7 clients and Windows Server 2008 R2 member servers, otherwise they may have other issues.
Remember that if you take this option, then after the hotfix for this particular issue is released and applied on your Windows Server 2012 R2 KDCs, you will need to change the policy again to re-enable AES in the domain, and all the machines will require a reboot.
Here are the relevant group policy settings for this option:
Computer Configuration\Windows Settings\Security Settings\Local Policies\Security Options
Network Security: Configure encryption types allowed for Kerberos:
Be sure to check: RC4_HMAC_MD5
If you have UNIX/Linux clients that use keytab files that were configured with DES, also enable: DES_CBC_CRC, DES_CBC_MD5
Make sure that AES128_HMAC_SHA1 and AES256_HMAC_SHA1 are NOT checked
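If you want to sanity-check a numeric encryption-type value against the checkboxes above, here is a small Python sketch. It uses the bit values documented for the msDS-SupportedEncryptionTypes attribute. Whether the value you are looking at uses exactly this layout is something to verify in your environment, and any additional bits (such as a "future encryption types" flag) are ignored here. The class and function names are made up.

```python
from enum import IntFlag


class EncType(IntFlag):
    """Bit values as documented for msDS-SupportedEncryptionTypes."""
    DES_CBC_CRC = 0x1
    DES_CBC_MD5 = 0x2
    RC4_HMAC_MD5 = 0x4
    AES128_HMAC_SHA1 = 0x8
    AES256_HMAC_SHA1 = 0x10


def describe(mask):
    """List the names of the encryption types set in a numeric mask."""
    return [flag.name for flag in EncType if mask & flag]


def has_aes(mask):
    """True if either AES type is still enabled in the mask."""
    return bool(mask & (EncType.AES128_HMAC_SHA1 | EncType.AES256_HMAC_SHA1))


rc4_only = int(EncType.RC4_HMAC_MD5)
rc4_with_des = int(EncType.RC4_HMAC_MD5 | EncType.DES_CBC_CRC | EncType.DES_CBC_MD5)

print(rc4_only, describe(rc4_only), has_aes(rc4_only))
print(rc4_with_des, describe(rc4_with_des), has_aes(rc4_with_des))
```

With the settings in this option, RC4 alone corresponds to 4 and RC4 plus both DES types corresponds to 7; a value for which has_aes returns True means AES is still enabled somewhere.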
Finally, if you are experiencing this issue please revisit this blog regularly for updates on the fix.
- The Directory Services Team