AD RMS Performance Insight from Microsoft IT's Implementation

Applies To: Windows Server 2008, Windows Server 2008 R2

Microsoft’s internal AD RMS deployment was one of the first in the world, and it has been in operation longer than most other deployments. Microsoft IT’s experience in operating the AD RMS infrastructure, particularly in managing performance and logging, might be of interest to you if you are responsible for an implementation.

Microsoft’s AD RMS architecture was initially deployed in 2003 by using Windows RMS beta, prior to its release for the Windows Server 2003 platform. The infrastructure was later moved to the final version of RMS v.1, and then to the SP1 and SP2 versions. In 2007 it was upgraded to AD RMS in Windows Server 2008, and during 2009 it was upgraded again to Windows Server 2008 R2. Nevertheless, during all these changes, the core architecture remained basically the same.

Microsoft operates with five main independent Active Directory forests. One of them, the Corp forest, hosts the majority of users and systems. The other forests are used for the development of specific software, or for testing. There are also support environments. These host a significant number of users and thus have to support both production and consumption of rights-protected documents.

To centralize and simplify operation as much as possible, Microsoft implemented a single RMS licensing cluster in the Corp forest and placed only certification clusters in the other forests (with some exceptions related to product development, which use a separate licensing cluster that handles relatively little volume and is kept independent for testing purposes).

Certification clusters are composed of server pairs with their own standalone database server. The majority of logging volume in certification clusters is related to the issuance of Rights Account Certificates (RACs), which are generally issued the first time a computer is used for creating or accessing rights-protected content. Because this process is generally performed only once by each user, the logging volume generated by RACs is typically an order of magnitude lower than the volume in licensing clusters. Another source of logging activity in certification clusters is the issuance of Server Licensor Certificates to sub-enrolled licensing clusters, a process that generates very low volume, even in complex environments. As a result, certification activity does not generate a significant amount of logging volume, and no AD RMS-specific maintenance needs to be performed on these databases.

The central AD RMS cluster, which performs both certification for the Corp forest users and licensing for all users in the company (with the exception of the users in the Exchange test forest), is composed of four servers connected to a single database server.

Because the main certification and licensing cluster at Microsoft handles a large volume of operations, its logging database tends to grow significantly in a short period of time. With the initial Windows RMS implementation, logging database growth was significant, because that version logged copies of all issued certificates by default. In AD RMS in Windows Server 2008, this is no longer the case: logging copies of certificates is still an option, but the default logging configuration produces much more modest volumes of data. Additionally, the AD RMS logging database has a more normalized structure, so less redundant data is stored. This further reduces logging database volume.

Nevertheless, logging activity can yield a significant volume of data being written to the log database for large environments with significant certification and licensing activity. Such is the case of Microsoft’s own deployment, with almost a hundred thousand users and significant volumes of protected mail and documents.

When Windows RMS was originally deployed, RMS logging volume grew very quickly for the reasons explained previously. Because the technology was new, detailed certificate logging was considered desirable, so that option was kept enabled. To prevent the logging databases from growing indefinitely, log trimming and consolidation were implemented.

A single aggregation database was implemented, which would consolidate partial logs from all the separate licensing and certification clusters. In the main certification and licensing cluster database server, a stored procedure was deployed that identified all records in the logging database older than 31 days, extracted potentially useful data and shipped it to the consolidation database for aggregation. It also deleted the original, old records from the local logging database.
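The trim-and-consolidate flow described above can be sketched as follows. This is only an illustration and not the stored procedure Microsoft IT deployed: it uses Python with SQLite in place of SQL Server, and the table names (`request_log`, `daily_summary`), their columns, and the choice of daily counts per request type as the "potentially useful data" are all assumptions made for the example. Only the 31-day cutoff comes from the text.

```python
import sqlite3
from datetime import datetime, timedelta

RETENTION_DAYS = 31  # retention window stated in the text


def create_schemas(local, central):
    """Create the hypothetical log and summary tables used by this sketch."""
    local.execute(
        "CREATE TABLE request_log (logged_at TEXT, request_type TEXT, detail TEXT)"
    )
    central.execute(
        "CREATE TABLE daily_summary "
        "(cluster TEXT, day TEXT, request_type TEXT, total INTEGER)"
    )


def trim_and_consolidate(local, central, cluster, now):
    """Summarize rows older than the retention window, ship them, delete them.

    Returns the number of rows removed from the local log.
    """
    # ISO-8601 timestamps sort chronologically when compared as text.
    cutoff = (now - timedelta(days=RETENTION_DAYS)).isoformat()

    # Extract the data worth keeping: one count per day and request type.
    summary = local.execute(
        "SELECT substr(logged_at, 1, 10) AS day, request_type, COUNT(*) "
        "FROM request_log WHERE logged_at < ? "
        "GROUP BY day, request_type",
        (cutoff,),
    ).fetchall()

    # Ship the aggregates to the consolidation database before deleting.
    central.executemany(
        "INSERT INTO daily_summary VALUES (?, ?, ?, ?)",
        [(cluster, day, kind, total) for day, kind, total in summary],
    )
    central.commit()

    # Remove the original, expired records from the local log.
    deleted = local.execute(
        "DELETE FROM request_log WHERE logged_at < ?", (cutoff,)
    ).rowcount
    local.commit()
    return deleted
```

Committing to the consolidation database before deleting locally means a failure between the two steps leaves duplicate aggregates rather than lost data. A production version would also need to guard against that double counting on a rerun.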

This procedure led to local logging databases that stayed at nearly constant volume, while maintaining, at the consolidation database, aggregated data that enabled long term analysis and diagnostics. These trimming and aggregation scripts were shared with Microsoft’s customers and published as part of the RMS service kit.

Later, with the deployment of AD RMS, the logging volumes became much less significant, as explained earlier. Nevertheless, a solution to trim the databases and keep them at a relatively constant size was still desired. Because the logging database schema in AD RMS changed significantly from Windows RMS, the original procedures no longer applied, and new database purging procedures were developed that run every night and delete old records. Unlike in the original Windows RMS solution, the consolidation database schema is not significantly different from the main AD RMS logging database schema, and consolidation is done more for aggregation than for space reduction.

The script developed by Microsoft IT for this procedure is detailed in the AD RMS Log Purging Sample. While this script might be useful in most AD RMS deployments, it was developed specifically for Microsoft’s internal needs. You should therefore carefully review and adapt it to your own environment before you put it into operation.

At Microsoft, AD RMS logging information is used for various purposes. Daily performance and error reports give IT managers an overview of the health of the AD RMS system. Reports include the number of licenses issued, licensing failures, errors, the distribution of licensing requests by client type, and other useful information in a graphical format. Reports are generated through SQL Reporting Services and are sent automatically by e-mail every day to interested parties.

One important use of such reports is identifying trends in the environment that might lead to future problems. In one such situation, a custom e-mail add-in used by a small number of users had an error that generated large numbers of invalid requests to the AD RMS servers. Because those requests weren’t visible to users, the error wasn’t evident at first. But as the add-in became more popular, the AD RMS managers saw a trend of increasing invalid requests being received by the servers. This led them to research the issue, identify the offending application, and correct the problem before it caused massive numbers of licensing errors and possible performance problems.
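As an illustration of the kind of trend check such reports make possible, the following sketch fits a least-squares line to a series of daily invalid-request counts and flags the series when the fitted daily increase is large relative to the average level. It is not how Microsoft's reports work (those are built with SQL Reporting Services). The function names, the input format (one count per day, oldest first), and the 5 percent threshold are assumptions chosen for the example.

```python
from statistics import mean


def daily_slope(counts):
    """Least-squares slope of counts against day index (requests per day)."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two days of data")
    x_bar = (n - 1) / 2          # mean of the day indexes 0..n-1
    y_bar = mean(counts)
    num = sum((i - x_bar) * (c - y_bar) for i, c in enumerate(counts))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def is_rising(counts, min_growth=0.05):
    """True if the fitted daily increase exceeds min_growth of the mean level."""
    level = mean(counts)
    if level == 0:
        return False
    return daily_slope(counts) / level > min_growth
```

Normalizing the slope by the mean level keeps the threshold meaningful for both low-volume and high-volume request types. A check like this could be run per client type, so that a single misbehaving application stands out before its volume becomes a problem.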

Custom reports are also used for troubleshooting. In some cases, troubleshooting requires detailed logging of issued licenses. In these cases, detailed logging is temporarily enabled and reports that include information from the certificates are generated. Once the troubleshooting tasks end, the logging level is returned to the default. Deleting the certificates inserted into the logs during the troubleshooting period is not necessary, because the scheduled purging process described earlier removes them automatically after one month. Standard reports run from the AD RMS console are used for some troubleshooting and help desk tasks. These are run against the source AD RMS logging databases rather than the aggregated databases.

Logged information can also be used for protected document discovery, for administrative or legal purposes, by enabling identification of all documents protected by a specific user in a specific period. Such reports can be complemented by automated bulk protection tools that remove protection to produce unprotected copies of all protected documents. Microsoft uses the superusers group, which can be enabled to override license restrictions, to perform these tasks.

In addition, AD RMS logging is used at Microsoft for performance analysis and server sizing, as well as to provide feedback to the product development groups on AD RMS usage, errors and performance.

Microsoft IT’s internal implementation is described in more detail in the “Deploying and Managing AD RMS at Microsoft” whitepaper.