Plan your migration to Git
Moving your team from a centralized version control system to Git requires more than just learning new commands. To support distributed development, Git stores information about file history and branches in a way that is fundamentally different from a centralized version control system. A successful migration requires that you understand these differences and plan your migration accordingly.
Microsoft has helped numerous customers, including teams within Microsoft, migrate from centralized version control systems to Git. Years of experience have produced a significant amount of guidance based on practices that consistently succeed.
Best practices include:
- Evaluate the tools and processes in use
- Select a branching strategy for Git
- Decide if and how to migrate history
- Maintain the previous version control system
- Remove binaries and executables from source control
- Train teams in the concepts and practices of Git
- Perform the actual migration to Git with care
Evaluate the tools and processes
Changing version control systems will naturally disrupt the development workflow as developers begin using new tools and practices. This disruption can be an opportunity to improve other aspects of the development process.
Teams should consider:
- When are builds and tests run? Adopting continuous integration, so that every check-in triggers a build and a test pass, helps identify defects early and provides a strong safety net for the project.
- Is the team performing regular code reviews? Are they required, and are they happening before the code is checked in? Git's branching model makes a pull request-based code review workflow a natural part of the development process. This complements a continuous integration workflow nicely.
- Are teams performing continuous delivery? Moving to different version control tools will require teams to make changes to their deployment processes, so a migration is a good time to adopt a modern release pipeline and automate deployment processes.
Select a branching strategy
Before migrating code, the team should select a branching strategy. Using long-lived, isolated feature branches is discouraged; this tends to delay merges until integration becomes very challenging. By using modern continuous delivery techniques like feature flags, teams can integrate code into the main branch quickly, but still keep in-progress features hidden from users until they're complete.
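As a minimal sketch of the feature flag idea, the following Python example keeps an unfinished feature merged into the main branch but switched off by default. The flag name `bulk_discount`, the `FEATURE_` environment variable prefix, and the `checkout_total` function are all hypothetical, chosen only for illustration; real teams often use a dedicated flag service instead.

```python
import os


def feature_enabled(name, environ=os.environ):
    """Return True when the variable FEATURE_<NAME> holds a truthy value."""
    value = environ.get(f"FEATURE_{name.upper()}", "")
    return value.strip().lower() in {"1", "true", "on", "yes"}


def checkout_total(prices_in_cents, environ=os.environ):
    """Sum an order; the in-progress discount applies only when flagged on."""
    total = sum(prices_in_cents)
    if feature_enabled("bulk_discount", environ) and len(prices_in_cents) >= 3:
        # Unfinished feature: merged to the main branch, hidden by default.
        total = total * 90 // 100
    return total
```

Because the flag defaults to off, the code can be integrated early without changing what users see.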
Short-lived topic branches allow developers to work close to the main branch and integrate quickly, avoiding merge problems. Two common topic branch strategies are GitFlow and a simpler variation, GitHub Flow.
Teams currently using a long-lived feature branch strategy may find it easiest to begin adopting feature flags before migrating to Git. This will simplify the migration by minimizing the number of branches. Either way, teams should document the mapping between legacy branches and the new branches in Git so that everyone understands where they should commit their new work.
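One way to document the branch mapping is to keep it in a small, machine-readable form that scripts and people can both consult. The sketch below assumes the legacy system exposes branches as folders. The folder names and Git branch names in `BRANCH_MAP` are illustrative assumptions, not a recommended layout.

```python
# Hypothetical mapping from legacy branch folders to Git branch names.
BRANCH_MAP = {
    "/trunk": "main",
    "/branch/two": "release/two",
}


def resolve_legacy_path(legacy_path, branch_map=BRANCH_MAP):
    """Translate a legacy server path into (git_branch, path_in_repo).

    Returns None when the path is not under any mapped branch folder.
    """
    # Check longer folder names first so nested folders win over their parents.
    for folder in sorted(branch_map, key=len, reverse=True):
        if legacy_path == folder or legacy_path.startswith(folder + "/"):
            relative = legacy_path[len(folder):].lstrip("/")
            return branch_map[folder], relative
    return None
```

A path that falls outside every mapped folder returns `None`, which can help flag legacy branches that nobody has decided how to handle yet.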
Decide if and how to migrate history
Teams may be tempted to migrate their existing source code's history to Git. There are numerous tools that claim to migrate a complete history of all branches from a centralized tool to Git, and at first glance this may seem like a good solution. A Git commit appears to map relatively well to the changeset or check-in model that the previous version control tool used, but there are some serious limitations with this translation.
- In most centralized version control systems, all branches exist as folders inside the repository. For example, the main branch may be a folder named /trunk while other branches exist as folders like /branch/two. In a Git repository, branches instead apply to the entire repository, and a 1:1 translation may be difficult.
- In some version control systems, a tag or a label is a collection that can contain various files in the tree, perhaps even files at different versions. In Git, a tag is a snapshot of the entire repository at a specific point in time and it cannot represent a subset of the repository or combine files at different versions.
- Most version control systems store details about the way files change between versions, recording fine-grained change types like rename, undelete and rollback. Git stores versions as a snapshot of the entire repository, and the metadata about how files change is not available.
These differences mean that a full history migration will be lossy at best, and possibly even misleading. Given this lossiness, the effort involved in migrating history, and how rarely that history is used, teams should avoid importing history. Instead, they should perform a tip migration, bringing only a snapshot of the most recent version of a branch into Git.
For most development teams, the time spent trying to migrate history is typically better spent on other areas of the migration that have a higher return on investment, especially improving processes. This recommendation is based on experience with a vast number of migrations.
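A tip migration starts from a clean snapshot of the latest files on a branch. The following Python sketch copies such a snapshot out of a legacy working copy while leaving the old tool's metadata folders behind. The folder names in `LEGACY_METADATA` are assumed examples and should be adjusted for the system in use. The function name `export_tip` is hypothetical.

```python
import shutil
from pathlib import Path

# Metadata folders left behind by some centralized tools (assumed examples).
LEGACY_METADATA = (".svn", "$tf", "CVS")


def export_tip(legacy_checkout, destination):
    """Copy the current files of one branch, skipping legacy metadata folders.

    The destination must not exist yet. Returns the sorted relative paths
    of the files that were copied.
    """
    legacy_checkout, destination = Path(legacy_checkout), Path(destination)
    shutil.copytree(
        legacy_checkout,
        destination,
        ignore=shutil.ignore_patterns(*LEGACY_METADATA),
    )
    return sorted(
        p.relative_to(destination).as_posix()
        for p in destination.rglob("*")
        if p.is_file()
    )
```

The resulting folder can then be turned into a repository with `git init`, `git add`, and `git commit`. The returned file list is a convenient record of what the snapshot contained.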
Maintaining the old version control system
During and after a migration, developers may still need access to the history in the previous version control system. It is therefore highly recommended that the old system be maintained indefinitely, especially for teams that perform only a tip migration. Although the previous version control history becomes less relevant over time, it is still important to be able to refer back to it, and highly regulated environments may have specific legal and auditing requirements about version control history. Rather than decommissioning the old version control system, set it to read-only once the migration has been performed.
For large development teams and regulated environments, it is recommended that teams place breadcrumbs in Git that point users back to the old version control system. A simple example is a text file at the root of a Git repository, added as the first Git commit, before the tip migration, pointing to the URL of the old version control server. If many branches are migrated, a text file in each should explain how the Git branches were migrated from the old system.
Adding breadcrumbs is especially helpful for developers who start working on a project long after it's been converted to Git and who don't have familiarity with the old version control system.
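A breadcrumb file of the kind described above could be generated with a short script so that every migrated repository gets the same wording. This is only a sketch: the file name `LEGACY_HISTORY.txt`, the wording, and the function name are assumptions, and the URL passed in should be the real address of the old server.

```python
from pathlib import Path


def write_breadcrumb(repo_root, legacy_url, legacy_branch, git_branch):
    """Create a plain-text pointer back to the legacy version control server."""
    text = (
        "Legacy version control history\n"
        "==============================\n"
        f"This repository was created by a tip migration of {legacy_branch}.\n"
        f"Git branch: {git_branch}\n"
        "History from before the migration remains available (read-only) at:\n"
        f"  {legacy_url}\n"
    )
    path = Path(repo_root) / "LEGACY_HISTORY.txt"
    path.write_text(text, encoding="utf-8")
    return path
```

Committing this file first, before the migrated snapshot, keeps the pointer at the very start of the Git history.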
Binary files and tools
Due to the way Git stores history, developers should avoid adding binary files to a repository, especially binaries that are very large or that change regularly. Git's storage model is optimized for versioning text files like source code, which are compact and highly compressible. Binary files typically are neither, and once they've been added to a repository, they will remain in the repository history and in every future clone.
Migrating to Git provides an opportunity to remove these binaries from the codebase. It is recommended that libraries, tools, and build output are excluded from repositories. Instead, use package management systems like NuGet to manage dependencies.
Assets like icons and artwork may need to align with a specific version of source code. Small, infrequently changed assets like icons can be included directly in a repository. Since they do not change often, they will not bloat the history. Large or frequently changing assets, however, should be stored using the Git LFS (Large File Storage) extension.
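Before migrating, it can help to inventory which files in the legacy tree are binary or unusually large. The sketch below uses a simple heuristic (a NUL byte in the first few kilobytes) and a size threshold. Both the 8000-byte sample and the 1 MB default are assumptions to tune per project. The last helper formats a `.gitattributes` entry of the kind that `git lfs track` writes.

```python
from pathlib import Path


def looks_binary(path, sample_size=8000):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with Path(path).open("rb") as handle:
        return b"\0" in handle.read(sample_size)


def find_binary_candidates(root, max_bytes=1_000_000):
    """Yield (relative_path, size, reason) for files worth reviewing."""
    root = Path(root)
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        relative = path.relative_to(root).as_posix()
        if looks_binary(path):
            yield relative, size, "binary"
        elif size > max_bytes:
            yield relative, size, "large"


def lfs_attribute_line(pattern):
    """Format a .gitattributes entry that routes a pattern through Git LFS."""
    return f"{pattern} filter=lfs diff=lfs merge=lfs -text"
```

The output is a review list rather than a verdict: each flagged file still needs a decision about whether it moves to a package manager, moves to Git LFS, or stays in the repository.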
Training
Perhaps the biggest challenge in migrating to Git is helping developers understand how Git stores changes and how commits form a history of development. It's not enough to just prepare a cheat sheet that maps commands in the old system to Git commands. Developers need to stop thinking about version control history in terms of a centralized, linear model and need to understand Git's history model and the commit graph.
Since people learn in different ways, plan on making several types of training material available. Live, lab-based training with an expert instructor works well for some people. The Pro Git book is available for free online and is an excellent starting point.
There are also several free hands-on training courses available, including:
- Microsoft Learn's Introduction to version control with Git learning path.
- The Get started with Azure Repos and Visual Studio quickstart.
- GitHub's Ramp up on Git and GitHub learning path.
Organizations should work to identify key members of the team as Git experts. Then empower them to help others and make sure that the rest of the team is encouraged to ask them questions.
Perform the migration
Once teams have updated their processes, analyzed their code, and started training, it's finally time to perform the source code migration. Whether performing a tip migration or also migrating history, perform one or more test migrations into a test repository first. Before performing a final migration, ensure:
- All code files have migrated and there are no stray binaries in the repository.
- Users have the appropriate permissions to fetch and push.
- All branches are available.
- Builds are successful, and all tests are passing.
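The first item in the list above, confirming that all code files have migrated, can be partly automated during a test migration. The following sketch compares a legacy snapshot with the migrated working tree by content hash. The function names are hypothetical, and the comparison is byte-for-byte, so any line-ending conversion would be reported as a change.

```python
import hashlib
from pathlib import Path


def tree_digest(root, skip_dirs=(".git",)):
    """Map each file's relative path to the SHA-256 of its contents."""
    root = Path(root)
    digests = {}
    for path in root.rglob("*"):
        relative = path.relative_to(root)
        if not path.is_file() or any(part in skip_dirs for part in relative.parts):
            continue
        digests[relative.as_posix()] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def compare_trees(legacy_root, git_root):
    """Report files that are missing, unexpected, or changed after migration."""
    legacy, migrated = tree_digest(legacy_root), tree_digest(git_root)
    return {
        "missing": sorted(set(legacy) - set(migrated)),
        "unexpected": sorted(set(migrated) - set(legacy)),
        "changed": sorted(
            p for p in legacy.keys() & migrated.keys() if legacy[p] != migrated[p]
        ),
    }
```

Files reported as missing may be binaries that were excluded on purpose, so the report is best read alongside the decisions already made about binaries and tools.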
The final migration should be performed at a time when few people are working, ideally between milestones when there is some natural downtime. Migrating at the end of a sprint may cause issues when developers are rushing to finish work. Aim to migrate over a weekend when nobody needs to check in.
Plan to make a firm cutover from the old version control system to Git. Trying to keep multiple systems operating in parallel is confusing since developers may not know how, or where, to check in. Setting the old version control system to read-only will help avoid this. Otherwise a second migration that includes interim changes may be necessary.
The exact process will vary depending on the system you are migrating from. Learn more about migrating from Team Foundation Version Control.
Migration checklist

| Area | Task |
|---|---|
| Team workflows | Determine how builds will run |
| | Determine when tests will run |
| | Develop a release management process |
| | Move code reviews to pull requests |
| Branching strategy | Pick a Git branching strategy |
| | Document the branching strategy, including why it was selected and how legacy branches map |
| History | Decide how long to keep legacy VC running |
| | Identify branches which need to migrate |
| | If needed, create breadcrumbs to help engineers navigate back to the legacy system |
| Binaries and tools | Identify which binaries and undiffable files to remove from the repo |
| | Decide on an approach for large files, such as Git-LFS |
| | Decide on an approach for delivering tools and libraries, such as NuGet |
| Training | Identify training materials |
| | Plan training: events, written material, videos, etc. |
| | Identify members of the team to serve as local Git experts |
| Code migration | Run multiple test runs to ensure the migration will go smoothly |
| | Identify and communicate a time to make the cutover |
| | Create the new Git repo on Azure DevOps |
| | Migrate the mainline branch first, followed by any additional branches needed |