Microsoft Sync Framework, Part 3: Sync Knowledge

Before we proceed to an overview of the synchronization process and walk through a simple sync provider implementation, it's important to understand how sync metadata works and which knowledge operations the sync engine and sync providers use. This can be difficult to grasp at first, but it will make things much clearer going forward, once you see how the different pieces of Microsoft Sync Framework interact with each other.

In this post I'll briefly walk through the most basic knowledge operations and explain where to use them.

Sync knowledge structure

In the previous post, I described the sync metadata required to implement a sync solution. Part of the sync metadata that a sync endpoint needs to store and maintain is the sync knowledge. Sync knowledge is a compact representation of all the changes that a particular sync endpoint knows about; it is used during the change enumeration and conflict detection phases of the sync process. As you could deduce from the sync knowledge operations, it has the following components:

  1. Scope clock vector.
  2. Range exceptions.
  3. Single item or change unit exceptions.

The scope clock vector is the most compact knowledge representation: its size is proportional to the number of endpoints that synchronize with each other. It consists of components called clock vector elements. Each clock vector element is currently a replica key paired with the maximum tick count of a change from that replica which our endpoint knows about.

Each clock vector contains at least one element, dedicated to the local replica (the sync endpoint which is local for the given instance of sync knowledge). Because replicas have unique Ids, the local-replica elements of different replicas never collide.
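To make the clock vector idea concrete, here is a minimal sketch in Python. It is a toy model, not the Sync Framework API: the dictionary layout, the function name vector_contains, and the convention that key 0 is the local replica are all assumptions made for illustration.

```python
# Toy model only: a clock vector as a dict of replica key -> max known tick count.
# This is not the Sync Framework API; all names here are hypothetical.

def vector_contains(vector, replica_key, tick_count):
    """Return True if the vector already covers the given change version."""
    return tick_count <= vector.get(replica_key, 0)

# Replica key 0 is assumed to be the local replica, key 1 a remote one.
scope_vector = {0: 12, 1: 5}

print(vector_contains(scope_vector, 1, 5))  # change 5 from replica 1 is known
print(vector_contains(scope_vector, 1, 6))  # change 6 from replica 1 is not
print(vector_contains(scope_vector, 2, 1))  # replica 2 has never been seen
```

The point of the sketch is the size argument: the vector has one entry per replica, no matter how many items the replica stores.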

Range exceptions associate a clock vector with a range of items: for items whose item Ids fall into the range, the knowledge is equal to the clock vector of the range exception. As you can imagine, range exceptions consume less space in the sync knowledge structure than single item exceptions because they typically specify a clock vector for more than one item.

Single item exceptions associate a clock vector with a single item or change unit. This is the least compact sync knowledge representation, and it's used only when it's impossible to use the other size optimization techniques such as the scope clock vector or ranges.

When life is good and there are no conflicts, errors or sync interruptions, the sync knowledge remains compact and has just a scope clock vector. When life is not that good and there are errors, conflicts or sync interruptions, the knowledge becomes fragmented and single item exceptions or range exceptions appear in it. Those exceptions identify "deviations" from the scope clock vector for certain items, change units or ranges of items. Note that these exceptions are not permanent: they go away after a subsequent successful sync or when conflicts get resolved, and the knowledge returns to its normal, very compact representation.
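The three components can be sketched as a small Python class. This is a toy model under stated assumptions, not the real knowledge structure: the class and field names are hypothetical, item Ids are plain integers, and the lookup order (single item exception first, then range exception, then scope clock vector) is my assumption about how the most specific entry wins.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeSketch:
    """Hypothetical model of sync knowledge; not the Sync Framework type."""
    scope_vector: dict                                     # replica key -> max tick count
    range_exceptions: list = field(default_factory=list)  # (first_id, last_id, vector)
    item_exceptions: dict = field(default_factory=dict)   # item id -> vector

    def vector_for_item(self, item_id):
        """Pick the clock vector that applies to one item (assumed precedence)."""
        if item_id in self.item_exceptions:
            return self.item_exceptions[item_id]
        for first_id, last_id, vector in self.range_exceptions:
            if first_id <= item_id <= last_id:
                return vector
        return self.scope_vector

    def is_compact(self):
        """True when the knowledge is just a scope clock vector."""
        return not self.range_exceptions and not self.item_exceptions

knowledge = KnowledgeSketch(
    scope_vector={0: 10, 1: 7},
    range_exceptions=[(100, 199, {0: 10, 1: 3})],  # items 100..199 lag behind
    item_exceptions={42: {0: 10}},                 # item 42 deviates on its own
)
```

In this model, the "life is good" case is simply an instance where is_compact() returns True.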

Note that in the managed Microsoft Sync Framework API knowledge exceptions are called overrides.

Replica key/map

Replica key/map is defined in the unmanaged API as an object which implements the IReplicaKeyMap interface; the managed API uses the ReplicaKeyMap class. The replica key/map associates the replica Ids assigned to sync endpoints with the 32-bit replica keys which are used internally in the sync knowledge to reduce its size. The replica key/map can be serialized and deserialized along with the sync knowledge, and a sync solution typically doesn't need to use it directly (although it's used internally by the sync knowledge).
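As a rough illustration of what such a map does, here is a toy Python version. It is not the real ReplicaKeyMap class: the class name, its methods, and the string replica Ids are hypothetical, and the rule that keys are handed out in order of first appearance is an assumption.

```python
class ReplicaKeyMapSketch:
    """Hypothetical stand-in for a replica key/map; not the real ReplicaKeyMap."""

    def __init__(self):
        self._replica_ids = []  # list index doubles as the compact replica key

    def key_for(self, replica_id):
        """Return the key for a replica Id, assigning the next free key if new."""
        if replica_id not in self._replica_ids:
            self._replica_ids.append(replica_id)
        return self._replica_ids.index(replica_id)

    def id_for(self, replica_key):
        """Return the replica Id that a key stands for."""
        return self._replica_ids[replica_key]

key_map = ReplicaKeyMapSketch()
local_key = key_map.key_for("replica-A")   # first Id seen gets the first key
remote_key = key_map.key_for("replica-B")
```

The knowledge then stores only the small integer keys, and the map is consulted whenever a full replica Id is needed.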

Operations on sync knowledge

Sync knowledge implements a set of operations which allow you to serialize it (for storage), combine it with another knowledge, project it onto a given item or change unit, exclude an item or change unit from it, check whether it knows about a given item or change unit change, discover its structure, and perform some other operations. The knowledge operations are also described in the following MSDN article.

Mapping of remote knowledge to local

The ISyncKnowledge::MapRemoteToLocal method in the unmanaged API (SyncKnowledge.MapRemoteKnowledgeToLocal method in the managed API) creates a knowledge instance which is compatible with the sync knowledge the method is called on. This is required because, as you probably remember, the sync knowledge references a replica key/map object and internally uses 32-bit replica keys to identify the replica Ids of sync endpoints in the community. If we don't map foreign sync knowledge which uses a different replica key/map, operations on that knowledge may return incorrect results, because the same 32-bit replica key may refer to different replica Ids in different replica key/maps. The source of sync changes typically needs to perform this operation in order to do containment checks on the destination knowledge correctly.
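The key mismatch is easy to show in a toy Python sketch. This is not how MapRemoteToLocal is implemented; the function name, the use of plain lists as key maps (list index = replica key) and the string replica Ids are assumptions for illustration, and only a scope clock vector is mapped.

```python
def map_remote_to_local(remote_vector, remote_ids, local_ids):
    """Re-key a remote clock vector so that it uses the local replica keys.

    remote_ids and local_ids are lists of replica Ids in which the list
    index is the replica key. Returns the re-keyed vector and the local
    Id list, extended with any replica Ids it had not seen before.
    """
    local_ids = list(local_ids)  # work on a copy, leave the caller's list alone
    mapped = {}
    for remote_key, tick_count in remote_vector.items():
        replica_id = remote_ids[remote_key]
        if replica_id not in local_ids:
            local_ids.append(replica_id)
        mapped[local_ids.index(replica_id)] = tick_count
    return mapped, local_ids

# The two endpoints number the same replicas in opposite order.
local_ids = ["replica-A", "replica-B"]
remote_ids = ["replica-B", "replica-A"]
remote_vector = {0: 8, 1: 3}  # remote keys: replica-B at tick 8, replica-A at tick 3

mapped_vector, merged_ids = map_remote_to_local(remote_vector, remote_ids, local_ids)
```

Without the mapping step, key 0 of the remote vector would be read as replica-A locally, and a containment check would compare tick counts of the wrong replica.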

Change containment check

This operation is used during change enumeration and conflict detection. For example, suppose an item's last update version is A5, where A is the replica key and 5 is the tick count. Whether a given knowledge instance knows about this change can be established with the IBasicKnowledge::ContainsChange method in the unmanaged API (SyncKnowledge.Contains method in the managed API). For the purposes of change enumeration, the sync solution should return a change to the sync destination when the destination's knowledge doesn't contain the change. Symmetrically, if the destination version of an item or change unit whose change we're trying to apply is not contained in the source's made-with knowledge, the item or change unit was updated independently on both the source and the destination, which is otherwise known as a conflict.
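Both uses of the containment check can be sketched in a few lines of Python. This is a simplified model, not the real ContainsChange or Contains method: knowledge is reduced to a scope clock vector (no exceptions), versions are (replica key, tick count) tuples, and all function names are hypothetical.

```python
def contains(knowledge, version):
    """True if the knowledge (a plain clock vector here) covers the version."""
    replica_key, tick_count = version
    return tick_count <= knowledge.get(replica_key, 0)

def enumerate_changes(source_items, destination_knowledge):
    """Return Ids of the source items whose versions the destination lacks."""
    return [item_id
            for item_id, version in sorted(source_items.items())
            if not contains(destination_knowledge, version)]

def is_conflict(destination_version, source_made_with_knowledge):
    """A change conflicts if the source never saw the destination's version."""
    return not contains(source_made_with_knowledge, destination_version)

# The source holds item1 at version A5 and item2 at version A3.
source_items = {"item1": ("A", 5), "item2": ("A", 3)}
destination_knowledge = {"A": 4, "B": 7}

to_send = enumerate_changes(source_items, destination_knowledge)

# The destination last changed item1 at B7, but the source only knew up to B6.
conflict = is_conflict(("B", 7), {"A": 5, "B": 6})
```

Here only item1 needs to be sent, because the destination already knows everything from replica A up to tick 4, and the change to item1 is a conflict because B7 was made without the source's knowledge.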

Knowledge combining

The ISyncKnowledge::Union method in the unmanaged API (SyncKnowledge.Combine method in the managed API) merges information from another sync knowledge into the current one. This operation is typically used during change application.
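For two scope clock vectors, a merge can be sketched as taking the larger tick count per replica. The Python below is a toy model under that assumption and is not the Union or Combine implementation: it ignores range and single item exceptions, assumes both vectors already use the same replica keys, and returns a new vector instead of updating one in place.

```python
def combine(local_vector, other_vector):
    """Merge two clock vectors by keeping the highest tick count per replica."""
    merged = dict(local_vector)
    for replica_key, tick_count in other_vector.items():
        merged[replica_key] = max(merged.get(replica_key, 0), tick_count)
    return merged

local_vector = {"A": 5, "B": 2}
learned_vector = {"B": 7, "C": 1}  # knowledge that arrived with applied changes

combined = combine(local_vector, learned_vector)
```

After the merge, the local endpoint knows everything it knew before plus everything the other knowledge covered.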

Item exclusion

The ISyncKnowledge::ExcludeItem method in the unmanaged API (SyncKnowledge.ExcludeItem method in the managed API) is used to indicate in the sync knowledge that it doesn't know anything about the item specified. This operation is used during change application to create exceptions in the knowledge.
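One way to picture item exclusion is as a single item exception with an empty clock vector. The Python sketch below works that way, which is my assumption and not a description of the ExcludeItem internals; the dictionary layout and the function names are hypothetical.

```python
def contains(knowledge, item_id, version):
    """Check a version against the item's exception, else the scope vector."""
    replica_key, tick_count = version
    vector = knowledge["item_exceptions"].get(item_id, knowledge["scope_vector"])
    return tick_count <= vector.get(replica_key, 0)

def exclude_item(knowledge, item_id):
    """Record that nothing is known about the item (an empty exception)."""
    knowledge["item_exceptions"][item_id] = {}

knowledge = {"scope_vector": {"A": 5, "B": 7}, "item_exceptions": {}}

known_before = contains(knowledge, "item1", ("A", 5))
exclude_item(knowledge, "item1")
known_after = contains(knowledge, "item1", ("A", 5))
other_item_known = contains(knowledge, "item2", ("A", 5))
```

The exclusion affects only item1; every other item is still covered by the scope clock vector.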

Summary

I've just given a quick overview of some sync knowledge operations. Those operations are sufficient to implement a simple sync solution. There are many other sync knowledge operations useful in advanced scenarios, which we'll cover later. Also, the knowledge operations described above are much easier to understand during a practical walkthrough of the sync process, which will be covered in the next post.