TFS Version Control Concepts 2: Item Names

Last time we learned that the principal objects in the source control system are called Items, along with a few of their basic properties.  That was a while ago, so let's recap:

  1. Items are unique.  They have an ID that no other item does.
  2. Items are versioned.  Like all version control systems, TFVC is about making it easy to store successive versions of the same item and retrieve old ones when necessary.
  3. Items have names.  Yes, plural.  At a rudimentary level, each item has a server path showing where it lives on the server and a local path showing where it lives on your filesystem.  More on this later.  Suffice it to say, names are not unique.
  4. Items have a type.  Either they are an ANSI, UTF-16, Binary, etc. file; or they are a directory.

Today's topic is #3.  Names (aka paths) are what users usually see when they interact with Items.  Rarely do you know in advance that you want to operate on ItemID #12345.  However, since names are not first-class objects in the system, we need a way to translate them back to Items.  As it turns out, there are several ways to do this. 



We call each consistent class of item names a namespace.   The rules for translating a name within a certain namespace to an Item - or to another namespace - we call mappings, aka functions or transformations.  It's easiest to describe mappings using terms borrowed from math.  Some mappings are injective (aka "one-to-one") or surjective.  Sometimes an explicit set of rules defines the mapping; sometimes the mapping is more of a side effect.
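
To make the math terms concrete, here is a minimal sketch that models a mapping as a plain dictionary and checks the two properties.  The item IDs, paths, and helper names are made up for illustration; none of this is part of the TFVC API.

```python
def is_injective(mapping):
    """True if no two keys map to the same value."""
    values = list(mapping.values())
    return len(values) == len(set(values))


def is_surjective(mapping, codomain):
    """True if every element of the codomain is the image of some key."""
    return set(codomain) <= set(mapping.values())


# Hypothetical item IDs mapped to names in some namespace.
item_to_name = {101: "$/project/a.cs", 102: "$/project/b.cs"}

# Every name we could write down, including one that maps to nothing.
all_names = {"$/project/a.cs", "$/project/b.cs", "$/bogus"}

print(is_injective(item_to_name))              # distinct items, distinct names
print(is_surjective(item_to_name, all_names))  # "$/bogus" has no item
```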

Committed space

Committed space is the closest thing we have to a canonical namespace.  It's what you see when you browse Source Control Explorer: the server paths of the items checked in to source control.  The mapping between items and committed space is one of the fundamental properties we track with each revision, although there is no API to perform the transform explicitly.  In addition, the mapping is injective: at any given point in time, each item has exactly one name in committed space.  It's not surjective: there are zillions of $/bogus or c:\bogus names that don't map to any item.  The only tricky part is variation over time, as users commit renames.

Local space

Local space is where local filesystem paths live.  It is defined relative to committed space by workspace mappings, which are explicit and readily exposed to the user.  This mapping is surjective: every item in your local workspace maps back to an item in committed space.  It is usually not injective, unless you map the entire repository ($/) and don't have any cloaks.

(On the server, there is also a mapping directly from items to each local space.  As you might guess it's injective.  We won't spend time on it because it's really just an optimization - part of the performance boost you get by tracking local version info - not conceptually important.)

Pending space

Pending space is also a property of a local workspace.  It represents the server paths of items as they would look in committed space after you checked in all of your currently pending changes.  In other words, it is defined implicitly, by transforming committed space according to any pending renames you have in that workspace.  Like local space, this mapping is usually surjective only -- items in pending space map back to committed space, but not vice versa.

We sometimes talk about the pending space of local paths, too.  Since the mapping between server & local paths is straightforward, and the idea is the same, it doesn't deserve its own section :)

Target space

Target space is a reflection of committed space from one branch onto another.  The Merge operation must generate a mapping between items in the source branch (committed space) and items in the target branch (target space) in order to determine the correct target items to pend changes on.  The mapping is implicitly defined by the history of the two branches: any renames that have occurred, whether they were merged, and if so how they were resolved.  In the general case the mapping is neither injective nor surjective, because items can be moved in & out of the two branches at will.



That was all pretty abstract.  Let's give an example. 

Changeset  Action
10         add $/project/branch1/foo/bar.cs
20         branch it to $/project/branch2/foo/bar.cs

We now have two items under consideration.  Their itemIDs don't really matter except to say that they'll stay the same throughout this exercise.  Their names in committed space are trivial: exactly what is shown in the history chart.  We'll create a workspace with these mappings:
$/project -> c:\myProj
$/project/branch2 -> c:\myProj2

After we synchronize our workspace to changeset 20, we see clearly that our two items have local names c:\myProj\branch1\foo\bar.cs and c:\myProj2\foo\bar.cs.  Edits are the simplest kind of pending change in this context.  If we pended an Edit on them, their names in pending space would simply be the same server paths as found in committed space.

Committed                     Local                         Pending
$/project/branch1/foo/bar.cs  c:\myProj\branch1\foo\bar.cs  $/project/branch1/foo/bar.cs
$/project/branch2/foo/bar.cs  c:\myProj2\foo\bar.cs         $/project/branch2/foo/bar.cs
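
The Local column above can be reproduced with a few lines of code.  This is only a sketch of the idea, not how the server or client actually implements it: the function name server_to_local is hypothetical, it assumes the most specific (longest) matching mapping wins, and it ignores cloaks and case-insensitive comparison.

```python
def server_to_local(server_path, mappings):
    """Translate a server path to a local path via workspace mappings.

    Assumption: the longest matching server root takes precedence.
    Returns None when the path is not mapped in this workspace.
    """
    best = None
    for server_root, local_root in mappings.items():
        matches = (server_path == server_root
                   or server_path.startswith(server_root + "/"))
        if matches and (best is None or len(server_root) > len(best[0])):
            best = (server_root, local_root)
    if best is None:
        return None
    server_root, local_root = best
    remainder = server_path[len(server_root):]  # e.g. "/foo/bar.cs"
    return local_root + remainder.replace("/", "\\")


# The workspace mappings from the example.
mappings = {
    "$/project": r"c:\myProj",
    "$/project/branch2": r"c:\myProj2",
}

print(server_to_local("$/project/branch1/foo/bar.cs", mappings))
print(server_to_local("$/project/branch2/foo/bar.cs", mappings))
```

Going the other way (local path to server path) would be the same loop with the roles of the two roots swapped.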

Pending a namespace operation (Add, Branch, Delete, Undelete, or Rename) is a little more interesting.  For example, if we created a 3rd branch, it would have a local path and a server path in pending space (c:\myProj\branch3 and $/project/branch3) but would not exist in committed space.  Add and Undelete work similarly; Delete is essentially the reverse.  Rename is the most complex, so let's check in one and pend another:

30 rename $/project/branch1/foo => $/project/branch1/foo-ren
40 merge branch1 => branch2.  Resolve the rename conflict as AcceptYours (aka keep the target name unchanged)
[pending] rename $/project/branch2/foo/bar.cs => $/project/branch2/foo/bar-ren.cs

At this point, our original two items have the following names:

Committed                         Local                             Pending
$/project/branch1/foo-ren/bar.cs  c:\myProj\branch1\foo-ren\bar.cs  n/a
$/project/branch2/foo/bar.cs      c:\myProj2\foo\bar-ren.cs         $/project/branch2/foo/bar-ren.cs

A couple of things to note.  First of all, the downstream effect of recursive operations: our first item has changed names even though it has never been renamed.  To answer the question "what was item X's name at time T" it's not enough to look at that item's history.  You have to consider the history of its parent folder, grandparent folder, etc. as well as any parent folder it was ever part of.  While the changeset that became #30 was still pending, a similar process had to be followed in order to compute its name in pending space.
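
One way to picture that computation is to replay renames in order, letting a folder rename carry along everything beneath it.  The sketch below is a simplification under that assumption: apply_renames is a hypothetical helper, and the real system tracks this per item rather than by rewriting path strings.

```python
def apply_renames(path, renames):
    """Replay (old, new) renames in order against a server path.

    A rename applies if it targets the path itself or any ancestor
    folder, which is how an item's name changes without the item
    ever being renamed directly.
    """
    for old, new in renames:
        if path == old or path.startswith(old + "/"):
            path = new + path[len(old):]
    return path


# Changeset 30 renamed the parent folder, not the file.
committed = [("$/project/branch1/foo", "$/project/branch1/foo-ren")]
print(apply_renames("$/project/branch1/foo/bar.cs", committed))

# The pending rename in branch2 targets the file directly.
pending = [("$/project/branch2/foo/bar.cs", "$/project/branch2/foo/bar-ren.cs")]
print(apply_renames("$/project/branch2/foo/bar.cs", pending))
```

Feeding committed renames up to time T gives a name in committed space; appending a workspace's pending renames to that list gives the name in pending space.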

Second, note that the names of related items have diverged between the two branches.  Target space arises because we need to preserve that relationship.  To demonstrate, let's check in our 2nd rename and pend another merge.

50 rename $/project/branch2/foo/bar.cs => $/project/branch2/foo/bar-ren.cs
[pending] merge branch2 => branch1.  Resolve the rename conflict as AcceptMerge

Merge sees that there has been a rename to bar.cs in the source (branch2), and needs to pend the equivalent change in the target (branch1).  By looking at the merge history, it determines which target item is related to branch2/foo/bar-ren.cs, i.e. its name in target space.  During the first merge, it was straightforward: it just substituted the branch root ($/project/branch2).  This time it's more complex:

Committed                         Local                                 Pending                               Target
$/project/branch1/foo-ren/bar.cs  c:\myProj\branch1\foo-ren\bar-ren.cs  $/project/branch1/foo-ren/bar-ren.cs  n/a
$/project/branch2/foo/bar-ren.cs  c:\myProj2\foo\bar-ren.cs             n/a                                   $/project/branch1/foo-ren/bar.cs

Note again that names in target space correspond to names in the other branch's committed space.