Uniqueness requirements for attributes and objects in Active Directory
If you write or use provisioning code for Active Directory, you will be aware of uniqueness problems. What do you do with an account for John Smith if he is the tenth of his name? It helps to know which conditions Active Directory imposes when you create or modify objects. For the sake of our sanity, let's limit the scope of this discussion to a single Active Directory forest. The goal of this post is to understand uniqueness from the point of view of AD. Applications such as Exchange Server, or your own provisioning code, will have additional requirements.
There are multiple kinds and degrees of uniqueness. Let's start with the forest as a whole and work our way down. As a teaser, these are some of the objects and attributes we will be covering:
- domain name
- schema definitions
- service principal name (SPN) and its cousin, user principal name (UPN)
- distinguished name
- sAMAccountName and trust names
- email addresses
- display name
Unique in the forest
Good examples of forest-wide unique objects are domain names, partition names, schema definitions and similar. Updates and changes to these are done by the forest FSMO servers: the Domain Naming Master and the Schema Master. Uniqueness is enforced by limiting updates to a single server that can do the required checks in its own database. A nice feature of this method is that it does not depend on replication latency of any kind. I like to call this the strict uniqueness enforcement type. Not so nice about this method is that it doesn't scale well: an FSMO is a single point of failure, and any kind of high-frequency update does not belong here. But there are other ways of making objects unique.
Take for example the objectGUID. Every object in Active Directory has a GUID to identify it. Here, uniqueness is achieved (though it is not mathematically certain) by a random generator producing a very large number. It's also an attribute that is determined by the system and that cannot be set or changed (1) by admins. This type of uniqueness is also quite strong.
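To get a feeling for "not mathematically certain", here is a rough back-of-the-envelope sketch in Python. It assumes the GUID behaves like a random version-4 UUID with 122 random bits; that is my assumption for the illustration, not a statement about how a DC generates the value. The function name collision_probability is made up for this example, and Python's uuid module only stands in for the real generator.

```python
import math
import uuid

# Assumption: a version-4 UUID has 128 bits, 6 of which are fixed
# (4 version bits + 2 variant bits), leaving 122 random bits.
RANDOM_BITS = 122

def collision_probability(n_objects: int, bits: int = RANDOM_BITS) -> float:
    """Birthday-bound approximation: p ~ 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = -n_objects * (n_objects - 1) / 2 ** (bits + 1)
    # expm1 keeps precision for the extremely small values we expect here.
    return -math.expm1(exponent)

new_guid = uuid.uuid4()          # stand-in for a system-generated GUID
print(new_guid)
print(collision_probability(10**9))  # chance of any clash among a billion objects
```

Even with a billion objects the probability of a clash stays vanishingly small, which is why this kind of uniqueness can be treated as quite strong in practice without any central check.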
Yet another well-known case involves the Service Principal Name (SPN) and the User Principal Name (UPN). Kerberos requires that both of these are unique in the forest. Before Windows Server 2012 R2 this uniqueness was not verified by the base Active Directory engine. Rather, it was left up to provisioning apps such as AD Users & Computers to do their best to keep the attribute unique. This would be soft uniqueness enforcement. It's more like a wish than a real effort to make it unique.
So yes, this soft requirement leads to a lot of Kerberos problems. With 2012 R2, we make a better attempt at enforcing uniqueness of SPN and UPN. Whenever the SPN or UPN is updated, the DC checks with a global catalog, which will usually be itself, to see if the new value is unique. If not, the update is refused. This has been known to break existing provisioning code, which is a good thing.
But if you look carefully, you will see problems with any uniqueness check that depends on the Global Catalog.
- If forest-wide replication has not converged, the GC may not have the most recent information, which could allow a duplicate to happen. Normally, this should translate to a replication conflict because both accounts should have the same name.
- Existing duplicates are not discovered until the attributes involved are updated. Forests created before 2012 R2 are a good example: they will very likely still have duplicate SPNs.
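Since existing duplicates stay hidden until somebody touches the attribute, it can be worth scanning for them yourself. The following is a minimal sketch that works on an in-memory export rather than on a live directory. The function name find_duplicate_spns and the sample accounts are hypothetical, and I am assuming that SPNs should be compared case-insensitively.

```python
from collections import defaultdict

def find_duplicate_spns(accounts):
    """Return {normalized SPN: sorted list of DNs} for every SPN
    that is registered on more than one account."""
    owners = defaultdict(set)
    for dn, spns in accounts.items():
        for spn in spns:
            # Assumption: SPN comparison ignores case.
            owners[spn.casefold()].add(dn)
    return {spn: sorted(dns) for spn, dns in owners.items() if len(dns) > 1}

# Hypothetical sample data, not taken from a real forest.
sample_accounts = {
    "cn=web01,ou=Servers,dc=sol,dc=local": ["HTTP/intranet.sol.local", "HOST/web01"],
    "cn=svc-web,ou=Org1,dc=sol,dc=local": ["http/intranet.sol.local"],
    "cn=sql01,ou=Servers,dc=sol,dc=local": ["MSSQLSvc/sql01.sol.local:1433"],
}

duplicates = find_duplicate_spns(sample_accounts)
for spn, dns in duplicates.items():
    print(spn, "is registered on", dns)
```

In a real forest the input would have to come from a forest-wide query of servicePrincipalName, which is outside the scope of this sketch.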
This is what I'd call hard uniqueness enforcement. Better than soft, but not on the same level as strict. Let's move on to another good one: object SIDs.
Strictly speaking, SIDs must be unique in the domain, but because each domain has a different base SID, SIDs are also unique in the forest. How does this work? When the domain is created, it gets the domain SID from the server that was the promotion source for the first DC. As long as the SID of this server is unique, the domain will be OK. Here you see one of the prime reasons why SYSPREP is required when building server images. I'm pointing that out explicitly because Mark Russinovich's famous blog post is sometimes misunderstood as saying that SYSPREP is not strictly needed.
Each domain has the RID Master FSMO taking care of dealing out relative identifiers (RIDs) to all DCs. So this is an interesting construction where, contrary to the forest FSMOs, new SIDs are created by all DCs and are still unique. In theory, SID creation has strict uniqueness. This theory can be defeated by people playing around with unsupported virtualization procedures such as snapshots.
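The RID construction is easier to see in a toy model. The sketch below is only an illustration of the idea: one role hands out non-overlapping blocks, and each DC builds SIDs locally from its own block. The class names, the domain SID, the starting RID of 1000 and the block size of 500 are all assumptions chosen for the example.

```python
# Made-up domain SID for illustration only.
DOMAIN_SID = "S-1-5-21-1004336348-1177238915-682003330"

class RidMaster:
    """The only place that touches the domain-wide RID counter."""
    def __init__(self, first_rid=1000, block_size=500):
        self._next = first_rid
        self._block_size = block_size

    def allocate_pool(self):
        start = self._next
        self._next += self._block_size
        return range(start, start + self._block_size)

class DomainController:
    """Creates SIDs locally; contacts the RID master only when its pool is empty."""
    def __init__(self, name, domain_sid, rid_master):
        self.name = name
        self._domain_sid = domain_sid
        self._rid_master = rid_master
        self._pool = iter(())  # start with an empty pool

    def new_sid(self):
        try:
            rid = next(self._pool)
        except StopIteration:
            self._pool = iter(self._rid_master.allocate_pool())
            rid = next(self._pool)
        return f"{self._domain_sid}-{rid}"

master = RidMaster()
dc1 = DomainController("DC1", DOMAIN_SID, master)
dc2 = DomainController("DC2", DOMAIN_SID, master)

# Both DCs create 600 principals each, interleaved, without talking to each other.
sids = [dc.new_sid() for _ in range(600) for dc in (dc1, dc2)]
print(len(sids), "SIDs created,", len(set(sids)), "distinct")
```

The model also shows why snapshots are dangerous: rolling a DC back to an earlier state would rewind its local pool position, so it could hand out RIDs it has already used.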
Moving on to the next subject: object naming. Each AD object has a unique path, known as the Distinguished Name (DN). This is enforced by making sure that each object is unique in its own container: users in an OU, Group Policies in their container, etc. To make this explicit, let's take an example of a DN: cn=user,ou=Org1,dc=sol,dc=local. The leaf object is cn=user. The user object has a Relative Distinguished Name (RDN) which makes it unique in its container. This container is an organizational unit, ou=Org1.
The question is, what exactly is the name of an object? An object is just a bag of attributes packed together in a class. Which one defines the object? In other words, what attribute defines the Relative Distinguished Name? The Schema has the answer. Each class definition has an attribute called rDNAttID, which has the name of the attribute defining the RDN. For most object types, this is simply the Common Name (CN), as in the case of the user. Other examples are OU for the Organizational Unit, or DC for Domain Component. In my test forest, these are the statistics for all class definitions:
rDNAttID     #Classes
--------     --------
c            1
cn           565
dc           4
l            1
msTAPI-uid   1
o            1
ou           1
uid          2
Takeaway: as long as the RDN is unique in its container, the object is unique in the forest. Observe that it is perfectly valid for multiple objects to have the same RDN, as long as they are not in the same container. AD enforces this strict uniqueness of DNs. So what happens if the same DN is created on different DCs before replication happens? That is certainly possible, because local uniqueness is still true. A replication conflict will happen, and one of the objects will get renamed to enforce the uniqueness.
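Here is a small sketch of that behaviour: each replica refuses a second object with the same DN locally, yet two replicas can still accept the same DN independently, and the clash only surfaces at replication time. The Replica class and the replicate function are hypothetical, and the rule that the incoming object is the one that gets renamed is a simplification of mine; the real engine has its own rules for picking the winner. The renamed RDN imitates the CNF-style conflict names you may have seen in a directory.

```python
import uuid

def split_dn(dn):
    """Split a DN into (rdn, parent). Naive: assumes no escaped commas."""
    rdn, _, parent = dn.partition(",")
    return rdn, parent

class Replica:
    def __init__(self):
        self.objects = {}  # casefolded DN -> (DN, object GUID)

    def add(self, dn):
        key = dn.casefold()
        if key in self.objects:
            raise ValueError(f"RDN already exists in this container: {dn}")
        guid = uuid.uuid4()
        self.objects[key] = (dn, guid)
        return guid

def replicate(source, target):
    """One-way replication. Simplification: the incoming object loses a name conflict."""
    for key, (dn, guid) in list(source.objects.items()):
        existing = target.objects.get(key)
        if existing is None:
            target.objects[key] = (dn, guid)
        elif existing[1] != guid:
            rdn, parent = split_dn(dn)
            renamed = f"{rdn}\\0ACNF:{guid},{parent}"
            target.objects[renamed.casefold()] = (renamed, guid)

dc1, dc2 = Replica(), Replica()
dc1.add("cn=user,ou=Org1,dc=sol,dc=local")
dc1.add("cn=user,ou=Org2,dc=sol,dc=local")  # same RDN, other container: fine
dc2.add("cn=user,ou=Org1,dc=sol,dc=local")  # still locally unique on DC2
replicate(dc2, dc1)

for dn, _ in dc1.objects.values():
    print(dn)
```

After replication DC1 holds three objects: the two originals plus the renamed conflict object, so every DN is unique again.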
The next case that I want to discuss is that of (forest) trust relations: forest trusts, shortcut trusts, tree-root trusts, etc. Trust names must be unique, otherwise there would be confusion about where accounts are hosted. Technically, trusts are represented by AD objects in the container CN=System,<domain>. As we just learned, this enforces a good level of uniqueness by itself. And because the trust object is created on the PDC, you cannot create conflicting trusts: strict uniqueness. There is a lot more to this, especially in forests with multiple domains, but to keep it simple: avoid external trusts because they only know about NTLM and are not transitive, and avoid trusts with domains having the same name.
Unique in the domain
Domain uniqueness clearly includes anything that is also unique in the forest, because a domain is simply part of the forest. Additionally, there is data that must be unique in the domain but need not be unique in the forest. The most important of these is the sAMAccountName attribute. For various object types this attribute has a different "friendly" name. On users, this is the User Logon Name, for groups it's the Group Name, and for computers it's the Computer Name. If a sAMAccountName is modified, the DC looks into its own database for duplicates. If any are found, the update is refused. If duplicates are created on different DCs simultaneously, you can have duplicate sAMAccountName attributes for a while. There is nothing in the AD replication engine that will catch the duplicate. Any object update relating to account management will detect the duplicate and will rename one of the sAMAccountName values.
So the hard rule is that sAMAccountName must be unique in the domain, or there will be trouble. However, you should try to make the sAMAccountName unique in the forest as well. Security principals (users, computers) may have Service Principal Names (SPNs) as discussed, and these must be unique in the forest. One of the SPNs a computer will have is: HOST/sAMAccountName. That's right, there will be an SPN conflict at the forest level if the host name is not unique. Any DC with 2012 R2 or higher will block updating the SPN record with this value.
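This brings us back to the tenth John Smith. A provisioning script typically picks the next free name before it even talks to a DC. The sketch below shows one possible naming scheme (first initial plus surname, then a numeric suffix); the scheme, the function name and the 20-character cap on user logon names are assumptions of this example. The set of taken names would have to come from the directory, and the DC still has the final say when the object is written.

```python
import itertools

MAX_LEN = 20  # assumed length limit for a user logon name

def next_free_sam_account_name(given_name, surname, taken):
    """Return the first candidate not present in `taken`, compared case-insensitively."""
    taken_folded = {name.casefold() for name in taken}
    base = (given_name[:1] + surname).lower()
    for n in itertools.count(1):
        suffix = "" if n == 1 else str(n)
        # Truncate the base so that base + suffix never exceeds MAX_LEN.
        candidate = base[: MAX_LEN - len(suffix)] + suffix
        if candidate.casefold() not in taken_folded:
            return candidate

# Nine John Smiths already exist: JSmith, jsmith2 ... jsmith9.
existing = {"JSmith"} | {f"jsmith{i}" for i in range(2, 10)}
tenth = next_free_sam_account_name("John", "Smith", existing)
print(tenth)
```

Because two provisioning runs against different DCs can both decide that the same name is free, a check like this reduces collisions but cannot rule them out.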
Application dependent uniqueness
A final category of uniqueness is where the AD core engine does not care about it, but applications do. Some good examples:
- email addresses, kept in the attributes mail or proxyAddresses. The SMTP protocol requires that these are unique, preferably worldwide. But AD does not care. Technically, it would not be hard for AD to check any email address modification with the Global Catalog, but this functionality was never implemented.
- Display Name, hopefully unique to avoid confusing your end users. I have seen applications that actually require this uniqueness.
- Employee ID, which by its nature should be unique in a company and should never be reused.
Here, AD cannot help. Your provisioning processes must take care of it.
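As an illustration of what "your provisioning processes must take care of it" can look like for email addresses, here is a minimal pre-check that runs before any write. It works on an in-memory mapping of DNs to mail and proxyAddresses values. The function names and sample data are hypothetical, and treating addresses as case-insensitive is an assumption that matches common practice rather than a rule taken from AD.

```python
def smtp_key(value):
    """Normalize 'SMTP:John@sol.local', 'smtp:john@sol.local' or a bare
    mail value to one comparison key. Returns None for non-SMTP address
    types such as X500."""
    prefix, sep, rest = value.partition(":")
    if sep and prefix.casefold() != "smtp":
        return None
    address = rest if sep else value
    return address.strip().casefold()

def conflicting_addresses(candidate_values, directory):
    """Return {normalized address: DN that already uses it}."""
    in_use = {}
    for dn, values in directory.items():
        for value in values:
            key = smtp_key(value)
            if key:
                in_use.setdefault(key, dn)
    return {key: in_use[key] for key in map(smtp_key, candidate_values) if key in in_use}

# Hypothetical directory export.
directory = {
    "cn=John Smith,ou=Org1,dc=sol,dc=local": ["SMTP:john.smith@sol.local", "X500:/o=Sol/cn=jsmith"],
    "cn=Jane Doe,ou=Org1,dc=sol,dc=local": ["jane.doe@sol.local"],
}

conflicts = conflicting_addresses(
    ["smtp:John.Smith@sol.local", "SMTP:john.smith2@sol.local"], directory
)
print(conflicts)
```

The same pattern works for display names or employee IDs: normalize the value, look it up in whatever source of truth you have, and refuse the change before it reaches the directory.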
In this post I have tried to give you an idea of uniqueness requirements in AD. It is by no means complete, but it should help you understand what is happening when you run into a uniqueness conflict.
(1) If you think carefully, you will realize that there are some instances where it must be possible to determine up front what the GUID on an object must be. However, that is not something that we want non-Microsofties to do...