October 2019

Volume 34 Number 10

[Data Points]

Hybrid Database Migrations with EF Core and Flyway

By Julie Lerman

Julie LermanDuring development, it’s common to use EF Core’s migration commands to not only create migration files but to also execute migrations directly on your development database. You can apply the migrations with the PowerShell update-database command or the command-line interface (CLI): dotnet ef database update. What’s nice about letting EF Core update the database this way is that you have total control over when the database is migrated to a new schema. And I love having control over things. I can trigger the database migration, then open up the database to verify I got what I expected. But this can only happen during design time. It’s an explicit action to perform.

Another path for triggering EF Core to migrate the database using the migration files is to call the Database.Migrate or the Database.EnsureCreated method somewhere in your code. For example, in the first installation of my recent article series, “EF Core in a Docker Containerized App” (msdn.com/magazine/mt833405), I worked out a way to trigger EF Core to migrate my database at application startup. This is handy during development (and for demos, such as the article’s code) if you don’t want to manually use commands to update your database—especially during early stages of development when your model is evolving rapidly.

But applying migrations at application runtime has some critical drawbacks. A major one is that you’re depending on your software to be responsible for the database. Is that what you want? Is that what your DBA wants? Even if you’ve added custom migration files to handle administrative tasks, it’s probably not going to make your DBA happy. What if you have multiple APIs that use the same database? Which one should be responsible for the database?

But there’s also a deeper technical problem you’re inviting by calling Migrate from within the software. In that earlier article, I was publishing my API into a Docker image. What if I were to start up several containers to serve up that API? Each container would be hitting that Migrate method on startup. While unlikely, it’s possible that multiple servers might call Migrate at the same time and try to perform a migration on the database concurrently. This could create unexpected (and detrimental) side effects.

So, when it’s time to move from development and QA to production, letting EF Core update the database for you is generally not the best approach. Even the EF Core documentation (bit.ly/2XVQEHy) makes a note about applying migrations at runtime: “While it’s great for apps with a local database, most applications will require more robust deployment strategy like generating SQL scripts.”

EF Core migrations do let you create SQL scripts, even idempotent scripts (that avoid applying schema modifications that have already been applied). An app like Octopus Deploy can use deployment steps to read EF Core migrations, generate the SQL scripts and then execute them. Or you can generate the SQL yourself using EF Core migration commands (PowerShell: Script-Migration or CLI: dotnet ef migrations script). You (or the database guru in your org) can tweak these scripts to add more database knowledge into them if needed. Either way, this gives you the ability to then feed them to some other application that’s designed to manage your database updates, rather than you having to manage and execute the SQL manually. Redgate’s SQL Change Automation (formerly called Ready Roll), for example, is an ace at managing SQL Server updates with your scripts. There are also tools that aren’t just for SQL Server, such as Flyway (flywaydb.org), now part of the Redgate stable, or Liquibase (liquibase.org).

Recently I used a combination of EF Core migrations and Flyway to demonstrate using Docker containers to apply database migrations. This alternative allows integrating the evolution of the database schema into the same automated Continuous Integration/Continuous Deployment (CI/CD) process you use to deploy new versions of your application code. You can see a video of this from my recent NDC Oslo session at youtu.be/BL3TfA0UEr8. Having the database servers hosted in a Docker container for developer and tester machines, as well as in an automated CI/CD pipeline, eliminates the need to install the server anywhere. And I used a separate Docker container for the short period needed to run Flyway, letting it apply any migrations to the database and then shut itself down.

Flyway Basics

The Flyway app describes itself as “version control for your database” and it works with a large number of relational databases. There are three editions: the free Community edition, Pro—which adds more features—and Enterprise. It’s pretty straightforward: You provide a series of SQL script files and it will run them for you. Similar to EF migrations, it stores a history of which scripts have been run in a special migrations history table. This table will get stored in whatever default database is indicated by the database connection string. You’ll see more about that later.

Flyway is cross-platform and can run on a variety of operating systems—including mobile platforms. I’ve used it in Docker containers to execute scripts as part of a CI/CD process, as demonstrated in the video I mentioned earlier. Flyway also has a rich API, although I’ll only be using the most basic function—migrate—for this article.

The order in which scripts are executed is critical. Flyway has a naming convention to ensure that scripts are executed in the correct order. Key elements of the name are “V” for version (to differentiate from U for Undo scripts and R for Repeatable scripts). I’m going to focus only on Version scripts in this article. Next comes the version number, where you can use an underscore separator for sub versions. The critical double underscore follows, to separate the version info from a descriptive name. I once struggled with SQL files getting ignored because I used a single underscore by mistake in this position, so watch out for this. A typical file name might look like V1__BuildTables or V1_1__CreateRoles.

I installed Flyway to my MacBook using Homebrew (brew install flyway). You can find install instructions for other operating systems on Flyway’s website. The brew installation ensured that flyway was on my file path so I had easy access to it from the command line without having to go look for it. Flyway requires Java and brew guided me to easily install what I needed.

Here’s how I went about the hybrid migration using EF Core and Flyway.

Start with EF Core Migration Files

I let EF Core create migration files for the model I’m working with. For example, my simple model for the conference session I mentioned is shown in Figure 1. There’s an Agilista class (to keep track of all of my gal pals in the Agile community) and a Category class to represent the areas of Agile they focus on, such as Agile Testing.

My Simple Domain Model
Figure 1 My Simple Domain Model

I then let EF Core create my first migration using dotnet ef migration add initial. The resulting file uses the migration API to describe defining the tables, relationship constraints and indexes.

I also want a little bit of test data that I’ll seed so I can do a bit of testing (though not performance testing). Therefore, I added the EF Core HasData method in the DbContext class to define some seed data. Figure 2 shows some of that code, although the downloadable sample code inserts a few more agilistas and categories. You can read more about HasData in my August 2018 column (msdn.com/magazine/mt829703).

Figure 2 Seeding Some Data with EF Core HasData

protected override void OnModelCreating (ModelBuilder modelBuilder
{
  modelBuilder.Entity<Category> ().HasData (
    new {Id = new Guid ("167d1f6b-a93d-49e4-8a0d-e651369e018b"),
         Description = "Agile Testing" },
    new {Id = new Guid ("5f6d6f80-9f9a-469e-9036-07ecbb3971ea"),
         Description = "Exploratory Testing" }
    );
    modelBuilder.Entity<Agilista> ().HasData (
      new { Id = new Guid ("5efdb55d-1205-419f-8a0b-9cc7a15f8565"),
            Name = "Lisa Crispin",
            PrimaryFocusId = new Guid ("167d1f6b-a93d-49e4-8a0d-e651369e018b") },
      new { Id = new Guid ("83eda86f-c652-4666-ba17-db90b218a54b"),
            Name = "Linda Rising",
            PrimaryFocusId = new Guid ("c5b6a0e8-e43f-4765-906f-e15e019a19d8") }
    );

I created a second migration with dotnet ef migration add seeddata, which describes inserting the data into the tables created in the first migration.

Rather than letting EF Core apply these migrations (which means it will generate the relevant SQL in memory and then execute it), I’ll tell EF Core to just output the SQL for me. I’m intentionally excluding the –idempotent flag when generating these scripts. Flyway will take care to avoid running scripts more than once. However, EF Core’s script command, by default, will script every migration file found. That means if you script your first two migrations, then add a third migration and call script again, a script will be created for all three. This won’t work for my plans.

Instead, you want to be explicit about which migrations to script; that way, the SQL that’s output can represent incremental changes to the database and be applied as needed. I could combine both of my existing migrations into a single script, but because they perform different types of tasks, I’ll create separate scripts for explicitness.

The PowerShell syntax uses the -To and -From parameters. From specifies the last run migration and therefore doesn’t include it in the generated script:

script-migration -To initial -Output createAgilistaCategoryTables.sql

script-migration -From initial -To seeddata -Output SeedData.sql

The CLI syntax doesn’t use the parameter names. You only supply the names of the from and to migrations. While these aren’t required arguments, if you do want to specify which migrations to script, you must provide both arguments. Therefore, if you’re starting from the beginning, you need to use the number zero in the from position as I’m doing in the first of these two CLI commands:

dotnet ef migrations script 0 initial -o createAgilistaCategoryTables.sql

dotnet ef migrations script initial seeddata -o SeedData.sql

You can see the details of these files in the accompanying download.

Transforming the Scripts for Flyway

There are some facets of the new script files that are specific to EF Core. Because of that, I need to do a little extra work to use them outside of EF Core migrations. There are five tasks:

  • Create the database
  • Remove EF Core migrations-specific logic
  • Ensure the database is targeted prior to running SQL
  • Rename the files to follow Flyway naming conventions
  • Move the files into a sub-folder called sql

Creating the Database

The first step is to ensure the database gets created. Why? Although the EF Core Migrations Update Database command creates a database if needed, that logic is internal to the API, not expressed in the migration files. So, you’ll need SQL that creates the database before the new scripts are run, and perhaps more tasks to be performed on the database server or the database itself, such as creating roles or users. I want to create a login, user and role on the SQL Server and then, after creating the new database, set up the same login, user and role in the database.

I’ve created another SQL file, InitDatabase.sql, to perform all of those tasks and, again, because of space constraints, you’ll need to grab the download to see the details. However, key in the file is the command to create the database:

CREATE DATABASE DB_Agilistas

Removing EF Core-Specific SQL

The second task I need to perform is to remove EF Core migrations-­specific logic from the generated SQL scripts. That’s all of the references to the Migrations_History table that EF Core maintains. You won’t need these because you won’t be using EF Core to keep track of the migrations.

The first script file (createAgilistaCategoryTables.sql) contains SQL to create the table and to update it after the migration has been performed.

At the top of the file you’ll see:

IF OBJECT_ID(N'[__EFMigrationsHistory]') IS NULL
BEGIN
  CREATE TABLE [__EFMigrationsHistory] (
    [MigrationId] nvarchar(150) NOT NULL,
    [ProductVersion] nvarchar(32) NOT NULL,
    CONSTRAINT [PK___EFMigrationsHistory] PRIMARY KEY ([MigrationId])
  );
END;

This needs to be removed, as well as any commands to insert data into that table (which you’ll find at the end of each SQL file). For example:

INSERT INTO [__EFMigrationsHistory] ([MigrationId], [ProductVersion])
VALUES (N'20190418172940_seeddata', N'2.2.3-servicing-35854');

Specifying the Correct Database

EF Core knows which database to execute migrations on. But your output script files don’t contain this information. Be sure to include it at the top of each of the SQL files, although not the one that’s creating the database. That way you don’t have to worry about the connection string defaulting to a different database such as master. I’m adding the following to the top of the two generated SQL files:

USE DB_Agilistas

GO

Renaming the Files for Flyway

Finally, you’ll need to be sure the file names align with Flyway convention so they’ll be run in the proper order. Note again that I’m focusing only on forward-movement files that start with the letter V for Version, and not Flyway’s ability to have Undo or Repeatable scripts.

Currently my three files are named:

  • InitDatabase.sql
  • createAgilistaCategoryTables.sql
  • SeedData.sql

To ensure that Flyway will execute them in the proper order, I renamed them:

  • V1_0__InitDatabase.sql
  • V1_1__createAgilistaCategoryTables.sql
  • V1_2__SeedData.sql

They all start with V1 because they’re part of my first pass. In addition to one last reminder about the importance of that double underscore, I’ll admit it took me a number of tries before I realized that the name of one of the files (which Flyway kept ignoring) began with a lowercase v rather than upper.

After making all of these changes, I moved the three SQL files into a sub-folder in my project called sql, which follows a convention of Flyway you can override if needed.

Running Flyway from the Command Line

I already had a SQL Server running in a Docker container that was exposed to my environment on localhost:1601. (There are now many articles in the Data Points series that use SQL Server in Docker. The initial introductory article from July 2017 can be found at msdn.com/magazine/mt784660.)

With Flyway, you need to specify the database connection with a URL. I’m using the syntax for SQL Server and setting  master as the default database:

jdbc:sqlserver://localhost:1601;database=master

The Flyway docs provide examples for various database types. Rather than including the URL, user, password and other needed parameters in every command to Flyway, I’ve stored them in a Flyway configuration file, flyway.conf, in my project:

flyway.url=jdbc:sqlserver://localhost:1601;database=master
flyway.user=sa
flyway.password=P@ssword1
flyway.locations=filesystem:sql/
flyway.mixed=true

The last configuration—mixed—allows both transactional and non-transactional SQL commands in the same migration file.

Because I want you to see Flyway perform an update, I’m going to muck up the name of the third SQL file, making the initial letter a lowercase “v”—v1_2__SeedData.sql—causing Flyway to ignore it on the first run.

With this in place, I can finally run flyway migrate at the command line in my project folder.

Flyway will read the configuration file and then look to see if its migration table, named flyway_schema_history, exists on the server yet. In my case it doesn’t, so Flyway creates that table inside the default database (master). Then it notes how many migration files it found, compares those files to what’s in the flyway table (none yet) and runs any that aren’t yet listed in the table.

Flyway’s logs note the current version of the schema, which is “<Empty>” at this point; then that it “successfully validated 2 migrations”; and that it “successfully applied 2 migrations.” The applied migrations are inserted as rows in the flyway_schema_history table. Figure 3 shows part of the listing of the two new rows.

The Flyway_schema_history Table Shows Two Migrations Have Been Applied
Figure 3 The Flyway_schema_history Table Shows Two Migrations Have Been Applied

Had there been any problems, Flyway would have stopped exe­cuting and output details about the issue. As I’m not doing anything advanced in this scenario, the migrations aren’t being run in a transaction. Therefore, at this early stage of learning, you’d need to roll back any changes that were made before any failures.

Now let’s introduce the database change represented by the third SQL file. I’ll change the name back to V1_2__SeedData.sql so Flyway will see it.

I’ll rerun flyway migrate and this time, the logs tell me it found three migrations, noted that the schema version was 1.1 and that it applied the seedData migration. Here’s some of the detail from the ouput:

Successfully validated 3 migrations
Current version of schema [dbo]: 1.1
Migrating schema [dbo] to version 1.2 - seedData
WARNING: DB: Changed database context to 'DB_Agilistas'.
Successfully applied 1 migration to schema [dbo]

Looking at the database in Azure Data Studio, I can see all of the assets created by the various migrations, including the new tables and their data. The flyway_schema_history table also has a new row representing the latest migration.

A More Robust Path for Server Migrations

Flyway uses concepts similar to those you may be familiar with from working with EF or EF Core, especially with respect to the history table in the database server. However, using an explicit mechanism to perform migrations on a production database rather than depending on EF Core to do them from within your application is preferable when your deployment has any level of complexity.

There’s still more to learn if you plan to use Flyway in a hybrid approach to leverage EF Core migrations for certain tasks and to perform migrations on your production database. But my goal here was to introduce you to the basic concept and let you get a feel for how the pieces go together. And while I chose Flyway as the tool for running my SQL scripts, you might want to translate this hybrid approach to some other tool that allows you to run scripts. Remember, though, that one advantage of Flyway (and Liquibase) is that it works with a number of relational databases. Other tools focus on a single database such as SQL Server.


Julie Lerman is a Microsoft Regional Director, Microsoft MVP, Docker Captain and software team coach who lives in the hills of Vermont. You can find her presenting on data access and other topics at user groups and conferences around the world. She blogs at thedatafarm.com/blog and is the author of “Programming Entity Framework,” as well as a Code First and a DbContext edition, all from O’Reilly Media. Follow her on Twitter: @julielerman and see her Pluralsight courses at bit.ly/PS-Julie.

Thanks to the following technical experts for reviewing this article: Julia Hayward (Redgate), Diego Vega (Microsoft)
Julia Hayward is a senior software engineer at Redgate, for several years technical lead for the flagship SQL Toolbelt and now for the newly-acquired Flyway. She’s also an unashamed board-game obsessive in her spare time, chairing the UK Backgammon Federation and a community volunteer at home in Eaton Socon, UK. Follow her on Twitter: @Julia\_Hayward


Discuss this article in the MSDN Magazine forum