Design Concepts for Implementing Reverse Joins
Applies To: Windows Server 2003 with SP1
About MIIS 2003 Design Concepts Guides
The success of an identity integration solution is closely tied to the planning you do upfront. To properly design and plan an identity integration solution, you must have a complete understanding of how your design decisions affect the data flow in and out of your Microsoft® Identity Integration Server 2003 (MIIS 2003) environment. The key to making good design decisions is to understand the nuances of how the different components in an MIIS environment work together so that the interaction between those components can be utilized to create the most effective and efficient solution. The pertinent information that will assist you in your solution-development effort is covered in the MIIS Design and Planning Guide collection available on the Microsoft Web site (http://go.microsoft.com/fwlink/?linkid=30436).
The Design Concepts guides contain detailed discussions of specific challenges that are often encountered during the design of MIIS solutions. These documents present some of the most common design issues that are discussed in newsgroups and in e-mail discussion groups. In each document, you will find the following:
Detailed explanations of particular design challenges.
Possible solutions with best recommendations.
Discussions of the pros and cons of potential solutions.
These challenges, and their proposed solutions, have been discovered and documented through numerous discussions with MIIS deployment experts. Once documented, further review cycles have been conducted on each solution by several MIIS deployment experts from both within and outside Microsoft.
While these guides present information about specific design challenges, anyone doing any type of MIIS design work might find them interesting. These solutions can provide insight into design issues not specifically addressed in these documents.
We hope you find these documents useful. If you would like to discuss the content of a document or if you have any questions, feel free to post a message on the MIIS newsgroup (http://go.microsoft.com/fwlink/?linkid=45219).
This document is targeted at advanced MIIS users. It is written with the assumption that you have:
A good understanding of the different components in MIIS 2003 and how they work together.
An understanding of MIIS concepts such as data aggregation, account management, synchronization rules, and the provisioning functionality as implemented in MIIS 2003.
Before you begin reading this document, make sure that you are familiar with the topic “MIIS 2003 Identity Management Process” documented in the MIIS 2003 Technical Reference (http://go.microsoft.com/fwlink/?linkid=30737).
The Provisioning Challenge
The goal of an identity integration scenario is to implement a central point of administration for distributed identities. This central point of administration consists of a metaverse (MV) object that is linked to all connector space (CS) objects representing the same physical identity. You can build this architecture in the following two ways:
You can project a connector space object into the metaverse and then join the existing connector space objects to the metaverse object, or
You can project a connector space object into the metaverse and then create new connector space objects during provisioning and push them out to the connected data sources.
The following illustration outlines a simple architecture of a metaverse object with two connectors.
Figure 1: Metaverse object with two connectors
An object from connected data source 1 (CDS 1) is staged in the connector space and then projected into the metaverse. Once the object exists in the metaverse, you can use the provisioning functionality to create the object in connected data source 2 (CDS 2).
The provisioning function has an important role in the identity integration process. Provisioning is triggered anytime a change is made to a metaverse object except when the metaverse object itself is deleted. Examples of changes are newly applied attribute values as well as new or removed links during data aggregation. During provisioning, you need to determine whether the applied change to the metaverse object requires further processing in the form of new connectors to be created in the connector spaces of your environment. If your provisioning code detects that the applied metaverse change satisfies your requirements for a connector in a connector space, you can count the number of connectors for the metaverse object in that connector space. If your connector counter indicates that there is no connector, your code can create one.
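The connector counter check described above can be sketched as a toy model. This is not MIIS code (MIIS provisioning extensions are written against the .NET object model); the class MVObject, the function provision, the container OU=Users,DC=example,DC=com, and the rule that only objects with a samAccountName qualify are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MVObject:
    """Toy metaverse object: attributes plus connectors grouped by connector space."""
    attributes: dict
    connectors: dict = field(default_factory=dict)  # CS name -> list of connector DNs

    def connector_count(self, cs_name: str) -> int:
        return len(self.connectors.get(cs_name, []))


def provision(mv: MVObject, target_cs: str) -> bool:
    """Create a connector in target_cs if one is required and none exists yet.

    Returns True when a new connector was created.
    """
    # Hypothetical requirement: only objects with a samAccountName qualify.
    if "samAccountName" not in mv.attributes:
        return False
    # The connector counter: an existing connector means nothing to do.
    if mv.connector_count(target_cs) > 0:
        return False
    dn = f"CN={mv.attributes['samAccountName']},OU=Users,DC=example,DC=com"
    mv.connectors.setdefault(target_cs, []).append(dn)
    return True


user1 = MVObject({"samAccountName": "User1"})
created_first = provision(user1, "Target CS")   # counter is 0, connector created
created_again = provision(user1, "Target CS")   # counter is 1, nothing created
```

Note that a counter value of 0 only says that no connector is linked yet; it says nothing about unlinked objects that may already be staged in the connector space.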
An attempt to create connectors in a connector space during provisioning can generate conflicts with already existing connector space objects. In an Active Directory® directory service scenario, if the conflict is based on a collision of the distinguished name (also known as DN), which means your code tries to create a connector with a distinguished name that has already been assigned to another staging object in that connector space, the system throws an exception. Since this type of conflict can be addressed by using exception handling, it is also known as a visible conflict. In your provisioning logic, you can then either skip the attempt to create a new connector or try to create one with a different distinguished name. In any case, a visible conflict cannot result in the creation of a conflicting connector.
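A minimal sketch of handling a visible conflict, assuming a toy connector space that rejects duplicate distinguished names. The exception class ObjectAlreadyExists, the class ConnectorSpace, and the numeric-suffix retry strategy are hypothetical and only illustrate the "skip or retry with a different DN" choice described above.

```python
class ObjectAlreadyExists(Exception):
    """Raised by the toy connector space when a DN is already taken."""


class ConnectorSpace:
    def __init__(self, existing_dns):
        self.dns = set(existing_dns)

    def add_connector(self, dn: str) -> str:
        if dn in self.dns:
            raise ObjectAlreadyExists(dn)
        self.dns.add(dn)
        return dn


def provision_with_retry(cs, cn, container, max_attempts=5):
    """Try cn, then cn1, cn2, ... until a free DN is found or attempts run out."""
    for attempt in range(max_attempts):
        candidate = cn if attempt == 0 else f"{cn}{attempt}"
        try:
            return cs.add_connector(f"CN={candidate},{container}")
        except ObjectAlreadyExists:
            continue  # visible conflict: caught, so no conflicting connector is created
    return None  # give up: equivalent to skipping the connector


container = "OU=Users,DC=example,DC=com"
target = ConnectorSpace({f"CN=User1,{container}"})
new_dn = provision_with_retry(target, "User1", container)
```

Whether retrying with a different name is appropriate depends on the scenario; if the colliding object is actually the join partner, a renamed connector would be a duplicate identity.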
This scenario looks different if the provisioning code tries to create a connector that is in conflict with an already existing connector space object and the conflicting attribute is not the distinguished name. For example, your provisioning code creates a connector with a samAccountName that is already assigned to another connector space object as outlined in the following figure:
Figure 2: Duplicate attribute values in the Target CS
In the above figure, there exists a conflict in the Target CS as User1 is the samAccountName that has already been assigned to an object. The bar on top indicates the direction of the synchronization run.
Such a conflict caused by duplicate attribute values cannot be detected by the provisioning logic, and therefore no exception is thrown. This type of conflict is known as an invisible conflict. An attempt to export an object to Active Directory with a samAccountName that is already assigned to a different security principal generates an export error.
It is important to realize that it is possible to create connector space objects that are in conflict without detecting them during provisioning.
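The difference between the two conflict types can be illustrated with a toy model in which the connector space checks only distinguished names, while the connected directory also enforces unique samAccountName values. The classes ToyConnectorSpace and ToyDirectory and the exception ExportError are assumptions for this sketch, not MIIS or Active Directory APIs.

```python
class ExportError(Exception):
    """Raised by the toy directory when an export is rejected."""


class ToyDirectory:
    """Stands in for the connected directory, which enforces unique samAccountName."""
    def __init__(self):
        self.by_sam = {}

    def create(self, dn, sam):
        if sam in self.by_sam:
            raise ExportError(f"samAccountName {sam!r} already used by {self.by_sam[sam]}")
        self.by_sam[sam] = dn


class ToyConnectorSpace:
    """Checks DN uniqueness only, so duplicate attribute values go unnoticed."""
    def __init__(self):
        self.entries = {}  # dn -> samAccountName

    def add(self, dn, sam):
        if dn in self.entries:
            raise ValueError("DN collision: " + dn)
        self.entries[dn] = sam


directory = ToyDirectory()
directory.create("CN=Existing User,OU=Users,DC=example,DC=com", "User1")

cs = ToyConnectorSpace()
cs.add("CN=Existing User,OU=Users,DC=example,DC=com", "User1")  # imported earlier
cs.add("CN=User1,OU=Users,DC=example,DC=com", "User1")          # provisioned: no error

export_failed = False
try:
    directory.create("CN=User1,OU=Users,DC=example,DC=com", "User1")
except ExportError:
    export_failed = True  # the conflict only surfaces at export time
```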
Regardless of whether a conflict is visible or invisible, the important question you need to ask is whether the conflict was caused by a preexisting connector space object in the Target CS that the metaverse object should be joined to.
The challenge of provisioning is to find a solution that provides an answer to this question. The connector counter approach does not necessarily provide sufficient information to determine whether a new connector is needed. If the value of this counter is 0, it can also mean that a matching partner already exists and needs to be linked to the newly created metaverse object.
This scenario requires a solution that is outside of your provisioning code. During provisioning, you can create new connectors. However, you cannot join a new metaverse object with an existing connector space object.
The concept of a reverse join is a solution to this scenario. In the following sections, you can find a detailed explanation of what a reverse join is and its practical implementation and recommendations.
Reverse Joins Overview
In the previous section, you have been introduced to visible and invisible conflicts during the provisioning phase. This section introduces the concept of a reverse join as a solution to address these types of conflicts that are caused by preexisting connector space objects that are to be joined to newly created metaverse objects. This solution does not pertain to objects in the connected data sources that are not yet represented in the connector space.
In some scenarios, it is difficult to determine if a new connector space object needs to be created during provisioning. The connector space is a staging area used to store various states of an identity in the identity integration process. The actual attributes and attribute values are encapsulated within a data blob. This means that all the information that MIIS has about an identity is stored as one column value in a row of a database table. As a consequence, the connector space is not searchable. During provisioning, there is no direct way to determine whether a corresponding connector space object already exists in a Target CS if it is not yet linked to a metaverse object. However, in some scenarios, a corresponding connector space object already exists, but you cannot link to it because this would require a “forward join” feature, which does not exist in MIIS 2003. The following picture illustrates this scenario:
Figure 3: No link to the object in the Target CS
In the above figure, you can see a staging object with the samAccountName, User1, in a connector space called Source CS and a representation of User1 in the Target CS. The bar on top indicates the direction of the synchronization run. During a synchronization run on the Source CS, User1 is projected into the metaverse, which also causes provisioning to be triggered. In this scenario, creating a new connector in the Target CS would generate an invisible conflict.
To ensure that the corresponding object in the Target CS is linked to the right object in the metaverse, you can initiate a synchronization run on the Target CS, which “reversely” joins the staging object in the Target CS on the basis of the samAccountName attribute with the newly created metaverse object. The following figure outlines this scenario:
Figure 4: Synchronization run from the Target CS to the Source CS
In the above figure, the bar on top indicates the synchronization run being initiated from the Target CS. The technique outlined above is also known as “reverse join”. A reverse join is a conceptual solution that consists of several steps including a combination of several run profiles used to connect related objects with each other. The term “reverse” refers to the fact that the join operation to the metaverse object is initiated from the Target CS or any other connector space that is different from the one that was used to project an object into the metaverse.
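The join step of a reverse join can be sketched as follows. This is a toy model, assuming connector space and metaverse objects are plain dictionaries and that samAccountName is the join attribute; the function name sync_target_cs and the dictionary keys are hypothetical.

```python
def sync_target_cs(target_cs, metaverse):
    """Join each unlinked Target CS object to the metaverse object
    that has the same samAccountName. Returns the joined account names."""
    index = {mv["samAccountName"]: mv for mv in metaverse}
    joined = []
    for cs_obj in target_cs:
        if cs_obj.get("mv") is not None:
            continue  # already linked to a metaverse object
        mv = index.get(cs_obj["samAccountName"])
        if mv is not None:
            cs_obj["mv"] = mv  # the join is initiated from the Target CS side
            mv.setdefault("connectors", []).append(cs_obj)
            joined.append(cs_obj["samAccountName"])
    return joined


# User1 was projected from the Source CS and already exists in the Target CS.
metaverse = [{"samAccountName": "User1"}]
target_cs = [{"samAccountName": "User1", "mv": None}]
joined = sync_target_cs(target_cs, metaverse)
```

The sketch shows why the run has to start from the Target CS: the lookup goes from a staging object to the metaverse, never from the metaverse into the unlinked staging objects.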
Understanding Reverse Joins
The previous section introduced the reverse join process as a concept to link matching objects together. In this section, you will get an in-depth understanding of how you can implement the reverse join process.
A synchronization run is implemented as a complete “CS-MV-CS” run. This means that MIIS 2003 processes identity data from a Source CS towards the metaverse, and if you apply a change to a metaverse object that triggers provisioning, staging objects in all connector spaces can be updated according to your synchronization rules. This operation is implemented as one cycle per processed object in a synchronization run. MIIS 2003 supports two different types of synchronization runs – full synchronization and delta synchronization. While all staging objects in a connector space are processed in a full synchronization, a delta synchronization only processes objects with a pending import.
MIIS initiates a full synchronization run only if you make changes to your synchronization rules logic.
A “pending import” is a connector space object with staged identity data that is waiting to be processed during the next synchronization run. The MIIS synchronization engine marks pending imports with a flag. This flag is cleared during a synchronization run. If you start two consecutive delta synchronization runs, no object is processed in the second run because the pending import flag is already cleared in the first delta synchronization run. This is important because in some scenarios, the synchronization process does not have enough information to make a good processing decision in a single run. As mentioned before, it is possible to create staging objects during provisioning that are in conflict with already existing objects. Moreover, the conflicting object can be a join partner of the newly created metaverse object.
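The effect of the pending import flag on delta and full synchronization runs can be modeled in a few lines. This is a simplified sketch, assuming each staging object is a dictionary with a hypothetical pending_import key; it does not reflect how MIIS stores the flag internally.

```python
def run_sync(connector_space, mode):
    """Return the names of processed objects and clear their pending flags.

    mode "delta" processes only objects with a pending import;
    mode "full" processes every object in the connector space.
    """
    if mode not in ("delta", "full"):
        raise ValueError(mode)
    processed = []
    for obj in connector_space:
        if mode == "full" or obj["pending_import"]:
            processed.append(obj["name"])
            obj["pending_import"] = False  # flag is cleared by the run
    return processed


source_cs = [
    {"name": "User1", "pending_import": True},
    {"name": "User2", "pending_import": True},
]
first_delta = run_sync(source_cs, "delta")   # both objects are processed
second_delta = run_sync(source_cs, "delta")  # nothing left to process
full = run_sync(source_cs, "full")           # everything is processed again
```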
The general idea of a reverse join is to use a synchronization run from a Target CS, rather than from the projecting connector space, to link corresponding objects. However, the implemented solution must also address those objects in a Source CS without a matching object in a Target CS. These are the objects that require a connector to be created in the Target CS.
This means the “CS-MV-CS” transition for all objects has to be divided into three phases:
Projection of new objects from the Source CS into the metaverse.
Reversely joining matching partners from a Target CS.
Creation of new objects in a Target CS.
This phased approach of processing objects from a Source CS to a Target CS requires a combination of different synchronization run profiles because the reverse join activity has to be initiated from the Target CS.
The following figure illustrates this scenario:
Figure 5: No corresponding object for User2 in the Target CS
In this scenario, two objects (User1, User2) are staged in the Source CS and one object (User1) is staged in the Target CS. While User1 has a matching partner in the Target CS that can be reversely joined, there is no matching partner for User2.
In the analysis of reverse joins so far, there is a missing component. A complete reverse join solution needs to encompass two components, a component for objects that need to be reversely joined and a component for objects without a matching partner in the Target CS. After all the objects with matching partners in the Target CS are linked to the related metaverse objects, the solution must ensure that staging objects for the metaverse objects without matching partners are available in the Target CS. These objects are to be pushed out to the connected data sources during the next export run.
An early solution to this problem was based on using the enabled or disabled states of the provisioning functionality to implement the three phases. If provisioning was disabled, the system did not process the provisioning function. In this case, the synchronization process stopped in the metaverse, which was the desired result for phase 1. However, there were usually also objects that needed to be created. This was done in the third synchronization run with provisioning enabled.
The following table lists the three phases and the related implementation:
Phase 1: Projection of new objects from the Source CS into the metaverse
Implementation: Synchronization run on the Source CS with provisioning disabled.
Phase 2: Reverse join of matching partner objects from the Target CS
Implementation: Synchronization run on the Target CS.
Phase 3: Creation of new objects in a Target CS
Implementation: Full synchronization run on the Source CS with provisioning enabled.
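The three phases listed above can be walked through with a toy simulation. All names here (sync_source, sync_target, the dictionary keys, and the provisioning_enabled parameter) are hypothetical; in MIIS 2003 the provisioning state is a setting, not a parameter passed to a run.

```python
def sync_source(source_cs, metaverse, target_cs, provisioning_enabled):
    """Synchronization run on the Source CS: project, then optionally provision."""
    for sam in source_cs:
        mv = metaverse.setdefault(sam, {"target_connectors": 0})  # projection
        if provisioning_enabled and mv["target_connectors"] == 0:
            target_cs.append({"sam": sam, "joined": True, "origin": "provisioning"})
            mv["target_connectors"] += 1


def sync_target(target_cs, metaverse):
    """Synchronization run on the Target CS: reversely join matching partners."""
    for obj in target_cs:
        if not obj["joined"] and obj["sam"] in metaverse:
            obj["joined"] = True
            metaverse[obj["sam"]]["target_connectors"] += 1


source_cs = ["User1", "User2"]
target_cs = [{"sam": "User1", "joined": False, "origin": "import"}]
metaverse = {}

sync_source(source_cs, metaverse, target_cs, provisioning_enabled=False)  # phase 1
sync_target(target_cs, metaverse)                                         # phase 2
sync_source(source_cs, metaverse, target_cs, provisioning_enabled=True)   # phase 3
```

In this model the third call revisits every Source CS object, which mirrors the need for a full synchronization in the last phase.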
Implementing Reverse Joins
Implementation of the reverse join process comprises the following steps:
The first synchronization run on the Source CS is, in this scenario, initiated with the provisioning feature disabled. As a result, the synchronization process stops in the metaverse with two new metaverse objects. With provisioning disabled, the logic to determine whether an object needs to be created in a Target CS is not processed. However, projection takes place and creates new objects in the metaverse.
The following figure outlines this scenario:
Figure 6: Synchronization run with provisioning disabled
Next, a synchronization run needs to be initiated on the Target CS. In this scenario, matching objects in the Target CS can now synchronize with the objects in the Source CS, as outlined in the following figure:
Figure 7: Matching objects in the Target CS are synchronized with the Source CS objects
The synchronization run on the Target CS “reversely” joins the matching objects from Target CS with the newly projected metaverse objects. The reverse join process is at this point not yet completed. There are still objects that do not conflict with each other in the Source CS that need to be provisioned to the Target CS. To complete this process, synchronization has to be initiated on the Source CS again. However, this time the provisioning feature must be switched on. After this synchronization run, the desired result is achieved. All objects from the Source CS have a corresponding non-conflicting partner in the Target CS. This is outlined in the following figure:
Figure 8: All objects in the Source CS have corresponding objects in the Target CS
Although the steps detailed above were an early solution and are a common approach for implementing reverse joins, they have some significant flaws that you must consider before you implement them:
First, there is no programmatic way to switch provisioning off and on. MIIS 2003 requires you to manually set the required state of provisioning in the Identity Manager.
Additionally, the second synchronization run on the Source CS requires a full synchronization. Since there has been a synchronization run on this management agent already, none of the objects in this connector space are pending anymore. Therefore, processing objects from this connector space during a synchronization run requires a full synchronization. However, running a full synchronization on the Source MA is inefficient because all objects in this connector space are processed: objects that require processing (which are in this case the minority) and objects that do not.
Although this example of how a reverse join is implemented is not recommended for a practical implementation because of its obvious flaws, trying it out in a lab setup is a good way to understand the idea of a reverse join. A reverse join is more than just a single feature. It is a series of steps that need to be implemented to achieve a specific goal.
In the following sections, you will find two recommended practical examples for reverse join implementations.
Reverse Joins based on Transient Provisioning
In the previous section, you were introduced to the general idea of a reverse join solution. In this solution, a synchronization run on the Target CS is required to initiate joins of preexisting connector space objects with newly projected metaverse objects. The first implementation discussed in this document is based on an attempt to avoid the creation of duplicated objects. The challenge of this approach is to find a solution to reprocess those objects that are not in conflict and need to be provisioned to the target data source. The initiative to process these objects comes from the Source CS. Since the “pending flag” is cleared during the first synchronization run on the Source CS, a full synchronization on the Source CS is the only remaining option to process its objects again.
Instead of avoiding the creation of duplicate objects, you can also disregard them during the first synchronization run on the Source CS and remove those objects that are not required later. This is the approach used in this section.
A reverse join solution always consists of at least two synchronization runs – one on the Source CS and one on the Target CS for the reverse join. Creating duplicates during the initial synchronization run on the Source CS is only a problem if these objects are not removed prior to an export run on the Target CS. The removal of duplicates, however, can also take place during the second synchronization run on the Target CS that reversely joins matching objects. With this approach you can implement the three reverse join phases in two synchronization runs.
The technique of intentionally disregarding the creation of duplicate objects during a synchronization run is also known as Transient Provisioning. The following example outlines how this solution works. The starting point of this solution is the same as in the previous discussion. There are two objects in the Source CS and there is one matching object in the Target CS.
Figure 9: Two objects in the Source CS and one object in the Target CS
The first synchronization run on the Source CS is based on transient provisioning and creates in this example a conflicting object in the Target CS.
Figure 10: Conflicting object User1 in the Target CS
The next synchronization run on the Target CS reversely joins the already existing disconnector for User1 in the Target CS with the metaverse object. The following figure outlines this scenario:
Figure 11: Conflicting object in the Target CS linked to the metaverse object
A change to a metaverse object that is not its deletion triggers provisioning. In the current example, the metaverse object change is provided by the join operation of User1 from the Target CS.
In your provisioning logic, you can now detect that User1 has two connectors from the Target CS, which indicates a conflict. In this scenario, you can now leverage a special treatment of export objects. If an export object – a connector that has been created during provisioning but has not been exported yet – is deprovisioned, the object “disappears”. This special treatment of deprovisioned export objects is the basis of the transient provisioning technique.
By deprovisioning the export object representation of User1, the conflict situation is resolved and the environment is in the desired state as outlined in the following figure:
Figure 12: Deprovisioning resolves the conflict situation in the Target CS
How can the provisioning logic determine whether a connector needs to be deprovisioned? You can find the matching candidates by counting the connectors. If a metaverse object has more than one connector, one of them must be the export object that was created during the provisioning process from the Source CS. This object must be deleted. You can locate the right object by looping through the connectors and checking the connection rule. One object is connected by provisioning (during the synchronization run on the Source MA) and the other object by a join (during the synchronization run on the Target MA).
The connector that is created by provisioning is the object that can now be safely deprovisioned, which also deletes the object.
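The duplicate detection described above can be sketched as a small function. This is a toy model: the Connector class, its rule and exported fields, and the function resolve_duplicates are assumptions for illustration and stand in for whatever the provisioning code inspects on real connectors.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    sam: str
    rule: str       # how the connector was linked: "provisioning" or "join"
    exported: bool  # False for an export object that has not been exported yet


def resolve_duplicates(connectors):
    """Given the Target CS connectors of one metaverse object,
    drop the transient duplicate and return the connectors that remain."""
    if len(connectors) <= 1:
        return list(connectors)  # no conflict, nothing to deprovision
    # More than one connector: keep joined connectors and drop the
    # not-yet-exported connector that was created by provisioning.
    return [c for c in connectors
            if not (c.rule == "provisioning" and not c.exported)]


# User1: one connector from provisioning (Source run), one from the reverse join.
user1 = [Connector("User1", "provisioning", False), Connector("User1", "join", True)]
# User2: only the provisioned connector, which must be kept and exported.
user2 = [Connector("User2", "provisioning", False)]

remaining_user1 = resolve_duplicates(user1)
remaining_user2 = resolve_duplicates(user2)
```

The count check comes first on purpose: a single provisioned connector is the normal case for objects without a matching partner and must not be removed.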
The following flowchart illustrates the processing logic of Transient Provisioning:
Figure 13: The Transient Provisioning flowchart
Transient Provisioning is an elegant way to handle invisible conflicts. Rather than being avoided, conflicts are removed from the Target CS. However, this technique also introduces dependencies on the order in which synchronization run profiles are started. It is paramount to run a synchronization run profile on the Target CS prior to an export run profile to ensure that all duplicates are removed.
The following table summarizes the steps and the associated goals of Transient Provisioning:
Step 1: Synchronization run on the Source MA
Goals: Projection of new objects into the metaverse. Creation of new objects in the Target CS.
Step 2: Synchronization run on the Target MA
Goals: Reverse join of matching partners in the Target CS. Deprovisioning of unnecessary connectors in the Target CS.
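Because Transient Provisioning depends on run order, a planned sequence of run profiles can be checked before it is started. The following is a minimal sketch, assuming a run sequence is a list of (management agent, profile) tuples with the hypothetical labels "Source MA", "Target MA", "sync", and "export"; it is not part of MIIS.

```python
def export_is_safe(run_sequence):
    """Return True if every export on the Target MA is preceded by a
    Target MA synchronization run that follows the latest Source MA run."""
    duplicates_possible = False
    for ma, profile in run_sequence:
        if ma == "Source MA" and profile == "sync":
            duplicates_possible = True   # transient duplicates may now exist
        elif ma == "Target MA" and profile == "sync":
            duplicates_possible = False  # reverse join removed the duplicates
        elif ma == "Target MA" and profile == "export" and duplicates_possible:
            return False                 # export would push out duplicates
    return True


good_order = [("Source MA", "sync"), ("Target MA", "sync"), ("Target MA", "export")]
bad_order = [("Source MA", "sync"), ("Target MA", "export")]

good_result = export_is_safe(good_order)
bad_result = export_is_safe(bad_order)
```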
Reverse Joins based on Auxiliary Management Agents
A reverse join that is based on Transient Provisioning may produce conflicting connectors, which have to be removed in the synchronization run that follows on the Target CS.
Using an auxiliary management agent (MA) is an alternative that is based on a different approach. By using an Auxiliary MA, you can avoid the creation of unnecessary objects during provisioning.
The concept of an Auxiliary MA has already been introduced in the document titled "MIIS 2003 Design Concepts for Managing Reference Attributes", which you can download from the Microsoft Web site (http://go.microsoft.com/fwlink/?linkid=58875). An Auxiliary MA is an operational management agent. This means that connectors created in the connector space of the Auxiliary MA are never supposed to be exported. These objects only act as a memory of objects that need to be reprocessed.
The challenge of provisioning is that counting the available connectors to a Target CS might not give you a completely reliable answer for whether a new connector is needed or not. In case of Transient Provisioning, objects are created in the Target CS in the synchronization run on the Source CS. It is possible that some of them are not required. This is why you need the synchronization run on the Target CS – to remove superfluously created connectors. These connectors are unavoidable due to invisible conflicts.
You may not want to create objects in the Target CS that are unnecessary, even if the processing logic of Transient Provisioning has a concept to remove them. These objects can cause export errors if the synchronization run on the Target MA is not directly initiated after the synchronization run on the Source MA. The only alternative to creating duplicate connectors during the synchronization run on the Source CS is not to create any connector in the Target CS at all during this synchronization run.
The reverse join solution that uses an Auxiliary MA follows this approach. During a synchronization run that is initiated from the Source CS, connectors are only created in the connector space of the Auxiliary MA if the connector counter indicates that there might be a need to create a connector in the Target CS. These connectors act as a memory for potentially required connectors in the Target CS.
The next synchronization run on the Target CS is, as in case of Transient Provisioning, used to reversely join matching objects that have been projected into the metaverse with the Target CS.
In addition to that, connectors in the connector space of the Auxiliary MA are deprovisioned for each reversely joined object, because they do not need a new representation in the Target CS.
After a synchronization run on the Target CS, only objects requiring a representation in the Target CS are left in the connector space of the Auxiliary MA.
By running a full synchronization on the Auxiliary MA, you can now get the missing (non-conflicting) connectors created in the Target CS.
As in the case of Transient Provisioning, a reverse join that is based on an Auxiliary MA requires the synchronization run profiles to be processed in a specific order.
The rest of this section uses the same example that you saw in case of Transient Provisioning to explain the Auxiliary MA solution.
There are two staging objects in the Source CS and one staging object representing an invisible conflict in the Target CS. The first synchronization run on the Source MA creates two new connectors in the connector space of the Auxiliary MA.
The following figure outlines the state of the current scenario after the first synchronization run on Source CS.
Figure 14: Synchronization run on the Source CS
The connectors in the connector space of the Auxiliary MA function as a memory for objects requiring reprocessing. However, User1 already has a representation in the Target CS, so its connector in the Auxiliary CS is not needed. Superfluous connectors in the connector space of the Auxiliary MA have to be removed during the synchronization run on the Target CS. These objects already have a representation in the Target CS. During this run, matching objects are reversely joined, which also triggers provisioning. The provisioning logic can detect that the reversely joined object has a connector in the connector space of the Auxiliary MA.
In this case, the provisioning logic deprovisions the connector in the connector space of the Auxiliary MA since it is not needed anymore. The deprovisioned connector is an export object – a connector that was created during provisioning but has not been exported yet. As such, the connector just disappears if it is deprovisioned.
The following figure outlines the state of the current scenario after the synchronization run on the Target CS:
Figure 15: Synchronization run on the Target CS
The remaining connectors in the connector space of the Auxiliary MA are objects that have not been reversely joined and need to get a representation in the Target CS. By running a full synchronization on the connector space of the Auxiliary MA, you can now remove the remaining connectors in this connector space and create the missing representations in the Target CS. This technique leverages a feature of full synchronization. During a full synchronization, all export objects are removed and reprocessed as desired in this scenario. Since provisioning is triggered for connectors in the connector space of the Auxiliary MA, you can now create the missing objects in the Target CS.
The following figure outlines the state of the current scenario after the synchronization run on the Auxiliary CS:
Figure 16: Synchronization run on the Auxiliary CS
After the last synchronization cycle is completed, all objects are in the desired state. User1 is reversely joined and a new connector for User2 is staged in the connector space of the Target CS.
The following table summarizes the steps and the associated goals of a reverse join that is based on an Auxiliary MA:
Step 1: Synchronization run on the Source MA
Goals: Projection of new objects into the metaverse. Creation of memory objects in the Auxiliary CS.
Step 2: Synchronization run on the Target MA
Goals: Reverse join of matching partners in the Target CS. Deprovisioning of the memory object in the Auxiliary CS for each joined object.
Step 3: Synchronization run on the Auxiliary MA
Goals: Creation of objects without a matching partner in the Target CS. Removal of all objects from the Auxiliary CS.
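The three runs summarized above can be traced with a toy simulation. The functions sync_source, sync_target, and sync_auxiliary, and the representation of connector spaces as lists and sets of account names, are hypothetical simplifications and not MIIS APIs.

```python
def sync_source(source_cs, metaverse, aux_cs):
    """Run 1: project objects and remember candidates in the Auxiliary CS.
    Nothing is created in the Target CS during this run."""
    for sam in source_cs:
        mv = metaverse.setdefault(sam, {"target": None})
        if mv["target"] is None:  # connector counter for the Target CS is 0
            aux_cs.add(sam)


def sync_target(target_cs, metaverse, aux_cs):
    """Run 2: reversely join matching partners and drop their memory objects."""
    for sam in target_cs:
        if sam in metaverse and metaverse[sam]["target"] is None:
            metaverse[sam]["target"] = "joined"
            aux_cs.discard(sam)  # deprovision the memory object


def sync_auxiliary(aux_cs, metaverse, target_cs):
    """Run 3: create the missing Target CS objects and clear the memory."""
    for sam in sorted(aux_cs):
        if metaverse[sam]["target"] is None:
            target_cs.append(sam)
            metaverse[sam]["target"] = "provisioned"
    aux_cs.clear()


source_cs, target_cs = ["User1", "User2"], ["User1"]
metaverse, aux_cs = {}, set()

sync_source(source_cs, metaverse, aux_cs)
after_run1 = set(aux_cs)   # both objects are remembered
sync_target(target_cs, metaverse, aux_cs)
after_run2 = set(aux_cs)   # only the object without a partner remains
sync_auxiliary(aux_cs, metaverse, target_cs)
```

Compared with Transient Provisioning, the Target CS never holds a duplicate at any point in this model; the cost is the extra connector space and the third run.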
It is important to note that a reverse join based on Auxiliary MA does not necessarily represent the "better" solution. With the Auxiliary MA approach, you can avoid the creation of unnecessary objects and it is a cleaner solution. However, using an Auxiliary MA adds a noticeable amount of complexity to your overall solution.
Recommendations for Implementing Reverse Joins
This document introduced you to two recommended solutions for implementing reverse joins in your scenario. A reverse join implementation based on Transient Provisioning is a very elegant method that introduces less complexity than using an Auxiliary MA. However, with Transient Provisioning you can generate export errors by starting the required synchronization run profiles in an incorrect order.
Auxiliary MAs are helpful in all scenarios where you need to keep a memory of objects requiring reprocessing. In the document titled "MIIS 2003 Design Concepts for Managing Reference Attributes", which you can download from the Microsoft Web site (http://go.microsoft.com/fwlink/?linkid=58875), you can find another example of a scenario that can be solved by using Auxiliary MAs. This solution is the preferred method if you are already planning to implement Auxiliary MAs in your environment.
A reverse join solution that requires manual changes of the provisioning state should only be used if the required manual interaction and the requirement for running a full synchronization are not considered to be an issue in a given scenario. In any case, this solution does not represent a recommended solution. You can try out the solution in a lab setup to understand the idea of a reverse join. A reverse join is more than just a single feature. It is a series of steps that need to be implemented to achieve a specific goal.