It’s Time for the Version Control Migration Feud!

Introducing the Team Foundation Server family! We have Work Item Tracking, Team Build, Version Control and the Data Warehouse! And the “Other System” family! With Proprietary Bug Tracking, Informal Build Process and Your Version Control System of Choice!

On Your Marks! Let’s start the…


“We asked 100 people to name the version control artifact they most expected a migration utility to migrate to the target system.”

You’ve just rung in. You have 10 seconds to decide. What gets migrated during a version control migration? The clock is ticking. Can you get the number one answer? Will you even be on the board?

Whether you are writing a VC migration tool or trying to identify the right migration tool to use, this is a fundamentally important question.

What gets migrated?

The easy answer is “migrate the source” – or more generally migrate all of the versioned items in the system. But the easy answer is ambiguous. Does that mean to migrate just the latest versions or to migrate history as well?

Everyone loves history, right? Baseball, apple pie and version control history.

But how much history is enough? Is partial history sufficient? What exactly is partial history? Does it mean history back to a certain point in time or does it mean all points in time but not every versioned item? Both? Something else?

Does it have to be done in a single big-bang migration or can many small migrations be done over time?

So we’ve started to think about how to migrate versioned items, but is that all there is?

What about labels? Change metadata? Encodings? Integration history? Users? Workspaces? Item locks? Checked out items? Links to external systems (e.g. links that associate a changeset with work items)? Destroyed items (what if the two systems don’t both support a notion of destroy)?

Migration Feud

There are many questions that need to be answered in order to create or evaluate a migration tool – my (current) top 20 are:

1) What versioned items do you absolutely have to have in the target system?

2) Do you need history for those items?

a. If so – how much?

3) What would happen if you did not have history for those items?

4) Do you have internal processes that depend on the ability to query against well-known labels?

a. If so – can the process be changed easily?

5) Can those labels be recreated manually in the target system?

6) What would happen if those labels were not available?

7) Do you need complete and accurate integration history in the target system? (i.e. if there was a branch of Foo to Bar in the source system, can the new system just have Foo and Bar as unrelated versioned items or does the integration link need to exist?)

8) What would happen if that integration history were lost?

9) Can you recreate it manually using a baseless merge?

10) Does your development environment depend on complex workspace or branch mappings in order to allow sparse branches or workspaces in a manner that would be incompatible with your target system? (e.g. ClearCase’s branching model is fundamentally different from that of TFS, and the Perforce client mapping syntax is more robust than TFS’s)

11) Do the source and target systems have operational parity?

a. If not – how will you address migrating those? Examples include:

i. Rename? (TFS does, Perforce does not)

ii. Destroy? (ClearCase does, TFS does not)

iii. Sharing? (VSS does, TFS does not)

iv. Keyword expansion? (Perforce and VSS do, TFS does not)

12) Are there naming convention rules that need to be addressed between the source and target system?

a. For example, TFS cannot have items with a path segment that begins with ‘$’, whereas Perforce reserves ‘#’ and ‘@’ for revision identifiers

b. TFS has a 260 character path limit (other systems have more, others less)

c. TFS is not case sensitive (E.txt is the same as e.TXT), whereas solutions with Unix clients often are case sensitive (E.txt is not the same as e.TXT).

d. What would happen if those files were not migrated because of this?

13) Will users, security groups and permissions be migrated?

14) Will items that are locked in the source system remain locked in the target?

15) Will items that have an encoding associated with them retain that encoding in the target system?

16) Will workspaces be migrated?

17) Will links between items be migrated?

18) Will changeset metadata be retained?

a. Will timestamps, comments and the original author be retained?

b. If the source system allows custom metadata (e.g. ClearCase) but the target system does not, will that metadata simply be lost or captured somehow?

19) How does the migration tool respond to failure mid-way through the process?

a. Can it recover and pick up where it left off?

b. Does it require manual intervention to get going again?

c. Is the tool able to consolidate the logs for multiple runs into a single view?

20) How long will the migration take (or how will I know how much time is left)?

Start with those. Spend a day thinking about them and see where it goes.
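Some of these questions can be partly answered before any tool is chosen. As a rough illustration of question 12, here is a minimal Python sketch that scans a list of source-system paths for the three naming problems mentioned there: a path segment that begins with ‘$’, a path longer than 260 characters, and paths that differ only by case. The function name, the "/" separator and the way the limit is measured (against the source path as-is) are assumptions made for the example; a real check would need to measure the path as it will appear in the target system.

```python
from collections import defaultdict

# Assumptions taken from question 12 above: a 260-character path limit,
# no path segment starting with '$', and case-insensitive path comparison.
MAX_PATH_LENGTH = 260
RESERVED_PREFIX = "$"


def find_naming_problems(paths):
    """Return (path, reason) pairs for paths that may not migrate cleanly.

    `paths` is any iterable of '/'-separated source paths (hypothetical
    input; how you list them depends on the source system).
    """
    problems = []
    seen_by_folded = defaultdict(set)

    for path in paths:
        # Length is measured on the source path as given (an assumption).
        if len(path) > MAX_PATH_LENGTH:
            problems.append((path, "exceeds %d characters" % MAX_PATH_LENGTH))

        # Check every non-empty segment for the reserved leading character.
        segments = [s for s in path.split("/") if s]
        if any(s.startswith(RESERVED_PREFIX) for s in segments):
            problems.append((path, "segment begins with '$'"))

        # Group paths by their case-folded form to find collisions later.
        seen_by_folded[path.casefold()].add(path)

    # Any group with more than one spelling collides on a case-insensitive target.
    for variants in seen_by_folded.values():
        if len(variants) > 1:
            for path in sorted(variants):
                problems.append((path, "differs from another path only by case"))

    return problems


if __name__ == "__main__":
    sample = ["src/E.txt", "src/e.TXT", "src/$tmp/a.c", "ok/file.c"]
    for path, reason in find_naming_problems(sample):
        print(path, "->", reason)
```

The output is only a list of candidates to review; what to do with each one (rename, skip, or stop the migration) is still the decision that question 12d asks about.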

And one last question … What questions am I missing?