AD RMS: Recovering from a database failure with an old backup
You are an AD RMS administrator and the SQL team calls you because the worst possible thing has happened…the AD RMS SQL instance has died and the database backup that was scheduled to run daily hasn’t run in months. Doom and dread fill your heart until they assure you that they do have full backups from a few months ago, and they are restoring them to a new SQL server. They changed the old CNAME in DNS and it is now pointed to the new instance. Hooray! Life is good! Right??? Or maybe not…
You log in to the AD RMS server after the restore and you can open the console. Everything appears to be right where you left it. Unfortunately, users are unable to protect content and new users cannot bootstrap to open protected content. They are getting error messages like "Sorry, something went wrong opening information rights management protected content. The request is not supported." Being the resourceful admin that you are, you download the RMS Analyzer from http://aka.ms/RMSAnalyzer and run it with local admin rights on one of the affected clients. On the RMS Analyzer main screen, you select AD RMS User, click OK, and then click Run Diagnostics.
After running diagnostics, you see an error that names InvalidPrivateKeyPasswordException.
You also look in the Application log on the AD RMS server and find an Event ID 11 error reporting the same exception.
Very interesting, and both errors certainly look bad, but how do you fix them?
To resolve this specific type of error, you (once again being the resourceful admin you are) go to your favorite search engine (that’s Bing, right?) and search for AD RMS Event ID 11 site:technet.microsoft.com. The first link on every search engine (I tested several) should take you right to the answer you are looking for at https://technet.microsoft.com/en-us/library/cc726178(v=ws.10).aspx. On this page, you find a reference to the specific error you noticed in the RMS Analyzer and in the Event ID 11 on the server, InvalidPrivateKeyPasswordException. The page also lists the steps to reset the private key password in the AD RMS console. You take a deep breath, reset the private key password, and test on the client using the RMS Analyzer. This time you get a clean bill of health and send out a nice note to your users asking them to test. I am sure at this point your company throws you a parade and gives you a big raise, right? Well, maybe not. Regardless, you know you saved the day and the world is a safer place because of nameless heroes like yourself. You prepare a nice, pleasant email to the SQL admins about fixing their backups so this doesn’t happen again, and go home feeling like a rock star.
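If you would rather script the server-side check than click through Event Viewer, here is a minimal Python sketch of the idea. It assumes you run it on the AD RMS server itself and that the built-in wevtutil tool is on the path. The function names are hypothetical, and the only detail taken from this post is that the error shows up as Event ID 11 in the Application log with InvalidPrivateKeyPasswordException in the text. Other products can log Event ID 11 too, so the sketch also checks the event text for the exception name.

```python
import subprocess

# Exception name quoted in this post; the helper names below are illustrative.
KEY_PASSWORD_ERROR = "InvalidPrivateKeyPasswordException"


def build_query_command(event_id=11, max_events=5, log_name="Application"):
    """Build a wevtutil command line that returns the newest matching events."""
    xpath = f"*[System[(EventID={event_id})]]"
    return [
        "wevtutil", "qe", log_name,
        f"/q:{xpath}",        # XPath filter on the event ID
        f"/c:{max_events}",   # limit how many events come back
        "/rd:true",           # read newest events first
        "/f:text",            # plain text output instead of XML
    ]


def mentions_key_password_error(event_text):
    """Return True if the event text names the private key password exception."""
    return KEY_PASSWORD_ERROR.lower() in event_text.lower()


def check_server_log():
    """Query the local Application log (Windows only) and look for the exception."""
    result = subprocess.run(
        build_query_command(), capture_output=True, text=True, check=True
    )
    return mentions_key_password_error(result.stdout)
```

Calling check_server_log() on the server returns True when one of the five most recent Event ID 11 entries mentions the exception, which is a hint (not proof) that resetting the private key password is the fix to try.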
Why would this ever happen?
Now, normally this would never happen. However, when admins change, it is good policy to change important things like the private key password. You, being the good admin you are, did that when one of your fellow admins moved on. And let’s just assume that, although you are following best practices and taking full backups of your AD RMS databases, something happened and no backup has been taken since you changed said private key password. The restored database still holds the private key protected with the old password, while the server is configured with the new one, and that mismatch is what produces the error. I doubt this would ever happen in production, but in test and QA environments it is very possible that something like this could occur. Then you have the catastrophic SQL server outage described above and the scenario plays out.
Hopefully, there are enough keywords and SEO magic in this blog post that if someone runs into this problem they will stumble on this post and know quickly how to fix this error. I have also walked through some of the troubleshooting steps I like to take for any AD RMS error. Between using the RMS Analyzer to identify client (and server) issues, reviewing AD RMS server logs, and doing some good advanced web searching, there are likely not many errors in AD RMS you can’t solve. Now…if you throw in third-party add-ins and Hardware Security Modules (HSMs), that is a completely different story and things can go sideways pretty darn quick. But for most basic implementations, this should get you going in the right direction.
What comes next?
So, you restored from a database that is god knows how old and you don’t exactly have a warm, fuzzy feeling about the stability and fidelity of your server. That is not a great place to be when you are using it to protect your valuable corporate data. Lucky for you, in my next post I will propose a way to stand up a new infrastructure and test it without affecting your current infrastructure in any way: http://blogs.technet.microsoft.com/kemckinn/2017/01/10/adrms-side-by-side-migration-from-ad-rms-on-2008-r2-to-2012-r2/
Please feel free to comment and let me know if this has been helpful or if there is anything I have missed. Thanks!
PREMIER FIELD ENGINEER – PLATFORMS/SECURITY