[Service Fabric] Full/Incremental backup and restore sample
I recently worked with a client who had trouble implementing incremental backup and restore for their stateful services, so I was tasked with building a sample that does both full and incremental backups to Azure blob storage.
I fully expect great improvements to happen in Service Fabric in the coming months regarding backup and restore, but for now, the customer needed to get moving.
I've posted the Visual Studio solution on GitHub for you to look at if you need something like this.
I am using Visual Studio 2017 15.3.3 with the Service Fabric SDK v2.7.198.
Here are some notes for the code you'll find in the sample. I've put as many comments as seem logical within the code, but more description is always better.
- At the top of the file is an int named backupCount - this specifies how many counts must elapse before the next full or incremental backup is taken.
- In RunAsync, there is a bool, takeBackup, that keeps track of whether a backup needs to take place.
- When takeBackup is true, based on the logic I'm using, a full backup takes place first and every backup after that is incremental. There is a flag, takeFullBackup, and an incremental counter, incrementalCount, that you will need to replace with your own logic for when you want to do backups and under what conditions.
- BackupCallbackAsync - this is the primary backup method. It calls SetupBackupManager (discussed below) and ArchiveBackupAsync in the AzureBackupStore.cs file. ArchiveBackupAsync takes the data that needs to be backed up, zips it up and pushes it out to blob storage. If this is a full backup, any files sitting in your blob storage container are deleted, because you should not need them at this point. (If you are not sure of your backup strategy, you may want to archive them here instead.) For a stateful service with multiple partitions, you would generally have multiple blob containers, one for each partition.
- SetupBackupManager - this method gets settings from the Settings.xml file in .\PackageRoot\Config to determine storage account keys, names etc. Then, an instance of the AzureBlobBackupManager class is created. This method is called right before you perform a backup or a restore just to make sure a connection is available.
- OnDataLossAsync - this method is called when you execute the .\SFBackup\Scripts\EnumeratePartitions.ps1 script. It calls the backup manager's RestoreLatestBackupToTempLocation, which is responsible for:
  - Getting a list of the blobs in blob storage that make up the backup.
  - Looping through each blob and downloading/extracting it to your hard drive (in a folder you specify).
  - Once unzipped, deleting the zip file from your hard drive.
  - Returning the root directory under which you have the unzipped full backup plus all incremental backups. The correct structure of a full + incremental backup should look like:
    Backup Folder (the root folder)/
    - Full/ (name may be different depending on your naming convention)
    - Incremental0 (name may be different depending on your naming convention)
    - Incremental1 (name may be different depending on your naming convention)
  Next, RestoreAsync is called, which looks at your root backup directory and then cycles through the full backup and the incremental backups to do the restore. The root directory is deleted after the restore.
- For the parameters in this file, I have added comments so you'll know what they mean. Note that you will need to add code to the app to programmatically create the directory into which your backup files are dropped. I'm not calling this the 'root' directory because the root directory IS created programmatically and its name will change based on the partition name/logic.
- I won't go into each method here because I've commented the utility methods pretty well, but one key thing to watch for is the constructor, which shows you how to get the parameters you need out of the Settings.xml file. Note how tempRestoreDir (the folder YOU create on your hard drive) is combined with the partitionId name/number to form the actual 'root' folder, which is deleted later, after you have done the restore.
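The restore staging steps described above can be sketched in a language-neutral way. The Python below is only an illustration under stated assumptions, not the sample's C# code: a local folder of zip files stands in for the blob container, the function names are hypothetical analogues of the methods mentioned in the notes, and the Full/IncrementalN folder names simply follow the example structure shown earlier.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path
from typing import List


def restore_latest_backup_to_temp_location(
    blob_dir: Path, temp_restore_dir: Path, partition_id: str
) -> Path:
    """Stage every zipped backup under <temp_restore_dir>/<partition_id>.

    Hypothetical stand-in: blob_dir is a local folder playing the role of
    the blob container, so "download" is just a file copy here.
    """
    root = temp_restore_dir / partition_id        # the 'root' folder
    root.mkdir(parents=True, exist_ok=True)
    for blob in sorted(blob_dir.glob("*.zip")):   # list the blobs
        local_zip = root / blob.name
        shutil.copy(blob, local_zip)              # "download" the blob
        with zipfile.ZipFile(local_zip) as archive:
            archive.extractall(root / blob.stem)  # e.g. Full/, Incremental0/
        local_zip.unlink()                        # delete the zip once unzipped
    return root


def restore_order(root: Path) -> List[str]:
    """Return folder names with the full backup first, then incrementals."""
    names = [p.name for p in root.iterdir() if p.is_dir()]
    full = [n for n in names if n == "Full"]
    incrementals = sorted(
        (n for n in names if n.startswith("Incremental")),
        key=lambda n: int(n[len("Incremental"):]),  # numeric, not lexical
    )
    return full + incrementals


def demo() -> List[str]:
    """Build three fake backup zips, stage them, and report the restore order."""
    with tempfile.TemporaryDirectory() as work:
        work_dir = Path(work)
        blob_dir = work_dir / "blobs"
        blob_dir.mkdir()
        for name in ("Incremental1", "Full", "Incremental0"):
            with zipfile.ZipFile(blob_dir / f"{name}.zip", "w") as archive:
                archive.writestr("data.txt", name)
        root = restore_latest_backup_to_temp_location(
            blob_dir, work_dir / "restore", "partition-0"
        )
        order = restore_order(root)
        shutil.rmtree(root)  # the root folder is removed after the restore
        return order


if __name__ == "__main__":
    print(demo())
```

Sorting the incrementals by their numeric suffix rather than by name is a choice made for this sketch; it keeps Incremental10 after Incremental9 if a chain grows that long.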
So, how do you run this application?
- Open the solution and build the app to get your NuGet packages pulled down.
- Create an Azure storage account with a container to use for blob storage backup
- Create a directory on your hard drive to store the backup files
- Fill in the information in Settings.xml
- Open EnumeratePartitions.ps1 in PowerShell ISE
When you start running the app, you can set breakpoints in various places to see how it runs. What I do is let it go through a few backups and write down the count it has reached (from the diagnostics window). Then, roughly halfway through another count run, I execute the PS script. It will take a few seconds to trigger the restore.
The app is written so that when you trigger a restore, it performs the restore and the next backup taken afterwards is a full backup. Something you will notice as you do restores is that the service (the particular partition) seems to freeze while the restore is taking place. This is because a restore is expected to be a disaster recovery situation in which the service partition would not normally be available. Other partitions may still be running, but not the one being restored.
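To make the backup cadence concrete, here is a minimal Python sketch of the decision logic as described in this post: a counter that triggers a backup every backupCount counts, a full backup first, incrementals after that, and a full backup again after a restore. The BackupScheduler class and its method names are hypothetical and only mirror the flags mentioned above; the sample itself is C# and its details may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackupScheduler:
    backup_count: int               # counts between backups (backupCount)
    take_full_backup: bool = True   # first backup is a full backup
    incremental_count: int = 0      # incrementals since the last full backup
    ticks: int = 0                  # counts since the last backup

    def tick(self) -> Optional[str]:
        """Advance the counter; return the backup to take, or None."""
        self.ticks += 1
        if self.ticks < self.backup_count:
            return None
        self.ticks = 0
        if self.take_full_backup:
            self.take_full_backup = False
            self.incremental_count = 0
            return "Full"
        name = f"Incremental{self.incremental_count}"
        self.incremental_count += 1
        return name

    def on_restore(self) -> None:
        """After a restore, the next backup should be a full backup."""
        self.take_full_backup = True


if __name__ == "__main__":
    scheduler = BackupScheduler(backup_count=3)
    taken = [b for b in (scheduler.tick() for _ in range(9)) if b]
    print(taken)                    # backups taken before the restore
    scheduler.on_restore()
    print([scheduler.tick() for _ in range(3)][-1])  # first backup afterwards
```

Restarting the incremental numbering at zero after each full backup is an assumption of this sketch; it matches the idea that a new full backup makes the earlier files in the container unnecessary.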
Hope this helps you in your backup and restore efforts!