Exercise - Move Azure Storage blobs using the .NET Storage Client

The Azure Storage client library for .NET lets you work with Azure Storage from your own code. You can use this library to write custom applications that move data around Azure Storage.

In this exercise, you'll see how to write an application that can migrate blobs from hot to cool storage.

Create and add data to hot storage

First, create two storage accounts by using the Azure CLI.

The free sandbox allows you to create resources in a subset of the Azure global regions. Select a region from this list when you create resources:

  • westus2
  • southcentralus
  • centralus
  • eastus
  • westeurope
  • southeastasia
  • japaneast
  • brazilsouth
  • australiasoutheast
  • centralindia

  1. Create environment variables for your storage account names and region. Replace <location> with a region from the previous list.

    HOT_STORAGE_NAME=hotstorage$RANDOM
    COOL_STORAGE_NAME=coolstorage$RANDOM
    LOCATION=<location>
    
  2. Next, run the following command to create a storage account to hold blobs.

    az storage account create \
      --location $LOCATION \
      --name $HOT_STORAGE_NAME \
      --resource-group <rgn>[Sandbox resource group]</rgn> \
      --sku Standard_RAGRS \
      --kind BlobStorage \
      --access-tier Hot
    
  3. Create a storage account for holding the archived blobs. Use the Cool access tier. The command reuses the region in $LOCATION and takes the account name from $COOL_STORAGE_NAME.

    az storage account create \
      --location $LOCATION \
      --name $COOL_STORAGE_NAME \
      --resource-group <rgn>[Sandbox resource group]</rgn> \
      --sku Standard_RAGRS \
      --kind BlobStorage \
      --access-tier Cool
    
    
  4. Obtain the keys for your hot storage account.

    az storage account keys list \
      --account-name $HOT_STORAGE_NAME \
      --resource-group <rgn>[Sandbox resource group]</rgn> \
      --output table
    
  5. Create an environment variable for your account key. Use the value of the first key retrieved by the previous command.

    HOT_KEY="<source account key>"
    
  6. Create a container named specifications in your storage account.

    az storage container create \
      --name specifications \
      --account-name $HOT_STORAGE_NAME \
      --account-key $HOT_KEY
    
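  7. Upload the specification files to your storage account, saving each one as a blob. This command uploads every *.md file in the ~/sample/specifications folder, so it assumes the sample repository has already been cloned to ~/sample (see Set up your project).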

    az storage blob upload-batch \
      --destination specifications \
      --pattern "*.md" \
      --source ~/sample/specifications \
      --account-name $HOT_STORAGE_NAME \
      --account-key $HOT_KEY
    
  8. Verify that the blobs have been created.

    az storage blob list \
      --container-name specifications \
      --output table \
      --account-name $HOT_STORAGE_NAME \
      --account-key $HOT_KEY
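
Steps 4 and 5 above can also be combined so that the key never has to be copied by hand. The following is a minimal sketch of one way to do that, assuming the Azure CLI's global --query and --output tsv options. The function name get_first_key is hypothetical and isn't part of the exercise.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the exercise): print the first access key
# of a storage account so that it can be captured in a variable.
# Arguments: $1 = storage account name, $2 = resource group name.
get_first_key() {
  az storage account keys list \
    --account-name "$1" \
    --resource-group "$2" \
    --query "[0].value" \
    --output tsv
}

# Example usage (requires a signed-in Azure CLI session):
#   HOT_KEY=$(get_first_key "$HOT_STORAGE_NAME" "<resource group name>")
```

The JMESPath expression [0].value selects the value of the first key in the list, which is the same key that step 5 asks you to copy from the table.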
    

Set up your project

  1. Run the following command to download the sample source code for this exercise.

    git clone https://github.com/MicrosoftDocs/mslearn-copy-move-blobs-from-containers-or-storage-accounts sample
    
  2. Move to the sample/code folder.

    cd sample/code
    
  3. Open the TransferBlobs.csproj project file using the Code editor.

    code TransferBlobs/TransferBlobs.csproj
    
  4. Change the <TargetFramework> value to net6.0.

    <PropertyGroup>
      <OutputType>Exe</OutputType>
      <TargetFramework>net6.0</TargetFramework>
    </PropertyGroup>
    
  5. Save the file by pressing Ctrl+S, and close the Code editor by pressing Ctrl+Q.

  6. Build the sample application.

    dotnet build TransferBlobs
    

Examine the TransferBlobs application

  1. Move to the TransferBlobs subdirectory. This subdirectory contains the source code for the sample application.

    cd TransferBlobs
    
  2. Open the Program.cs file using the Code editor.

    code Program.cs
    
  3. Look at the first few lines of the application in the Main method:

    string sourceConnection = args[0];
    string sourceContainer = args[1];
    string destConnection = args[2];
    string destContainer = args[3];
    DateTimeOffset transferBlobsModifiedSince = DateTimeOffset.Parse(args[4]);
    Console.WriteLine($"Moving blobs modified since {transferBlobsModifiedSince}");
    

    The TransferBlobs application takes the following command-line parameters:

    • A connection string for accessing the source storage account
    • The name of the container in the source storage account containing the blobs that you want to move
    • A connection string for accessing the destination storage account
    • The name of the container in the destination storage account for holding the blobs after they've been moved
    • A date/time string (in UTC). Blobs in the source container that have been modified since this date and time will be moved to the destination

    Note

    This application performs no validation or error handling. This is to keep the code short and concise. In a production system, you should validate all input carefully, and implement error handling for all storage account operations.

  4. Examine the code under the comment Connect to Azure Storage.

    // Connect to Azure Storage
    BlobServiceClient sourceClient = new BlobServiceClient(sourceConnection);
    BlobServiceClient destClient = new BlobServiceClient(destConnection);
    
    BlobContainerClient sourceBlobContainer = sourceClient.GetBlobContainerClient(sourceContainer);
    sourceBlobContainer.CreateIfNotExists();
    
    BlobContainerClient destBlobContainer = destClient.GetBlobContainerClient(destContainer);
    destBlobContainer.CreateIfNotExists();
    

    This block of code creates BlobServiceClient objects for the source and destination accounts. Then, it creates BlobContainerClient objects that you can use to access blobs in these accounts. The sourceBlobContainer variable is a reference to the container in the source account, containing the blobs to be moved. The destBlobContainer variable is a reference to the container in the destination account, where the blobs are to be transferred and stored.

  5. Scroll down to the method FindMatchingBlobsAsync.

    // Find all blobs that have been modified since the specified date and time
    private static async Task<IEnumerable<BlobClient>> FindMatchingBlobsAsync(BlobContainerClient blobContainer, DateTimeOffset transferBlobsModifiedSince)
    {
        List<BlobClient> blobList = new List<BlobClient>();
    
        // Iterate through the blobs in the source container
        List<BlobItem> segment = await blobContainer.GetBlobsAsync(prefix: "").ToListAsync();
        foreach (BlobItem blobItem in segment)
        {
            BlobClient blob = blobContainer.GetBlobClient(blobItem.Name);
    
            // Check the source file's metadata
            Response<BlobProperties> propertiesResponse = await blob.GetPropertiesAsync();
            BlobProperties properties = propertiesResponse.Value;
    
            // Check the last modified date and time
            // Add the blob to the list if it has been modified since the specified date and time
            if (DateTimeOffset.Compare(properties.LastModified.ToUniversalTime(), transferBlobsModifiedSince.ToUniversalTime()) > 0)
            {
                blobList.Add(blob);
            }
        }
    
        // Return the list of blobs to be transferred
        return blobList;
    }
    

    This method takes a blob container and a DateTimeOffset object. The method iterates through the container to find all blobs that have a last modified date after the value specified in the DateTimeOffset object. The blobList collection is populated with a reference to each matching blob. When the method finishes, the blobList collection is passed back to the caller.

    In the Main method, this method is invoked by the following statement.

    // Find all blobs that have been changed since the specified date and time
    IEnumerable<BlobClient> sourceBlobRefs = await FindMatchingBlobsAsync(sourceBlobContainer, transferBlobsModifiedSince);
    
  6. Scroll down to the MoveMatchingBlobsAsync method.

    // Iterate through the list of source blobs, and transfer them to the destination container
    private static async Task MoveMatchingBlobsAsync(IEnumerable<BlobClient> sourceBlobRefs, BlobContainerClient sourceContainer, BlobContainerClient destContainer)
    {
        foreach (BlobClient sourceBlobRef in sourceBlobRefs)
        {
            // Copy the source blob
            BlobClient sourceBlob = sourceContainer.GetBlobClient(sourceBlobRef.Name);
    
            // Check the source file's metadata
            Response<BlobProperties> propertiesResponse = await sourceBlob.GetPropertiesAsync();
            BlobProperties properties = propertiesResponse.Value;
            BlobClient destBlob = destContainer.GetBlobClient(sourceBlobRef.Name);
            CopyFromUriOperation ops = await destBlob.StartCopyFromUriAsync(GetSharedAccessUri(sourceBlobRef.Name, sourceContainer));
    
            // Display the status of the blob as it is copied
            while(ops.HasCompleted == false)
            {
                long copied = await ops.WaitForCompletionAsync();
                Console.WriteLine($"Blob: {destBlob.Name}, Copied: {copied} of {properties.ContentLength}");
                await Task.Delay(500);
            }
            Console.WriteLine($"Blob: {destBlob.Name} Complete");
    
            // Remove the source blob
            bool blobExisted = await sourceBlobRef.DeleteIfExistsAsync();
        }
    }
    

    The parameters to this method are the list of blobs to be moved, and the source and destination containers. The code iterates through the list of blobs and uses the StartCopyFromUriAsync method to start copying each blob in turn. Once the copy operation has been initiated, the code calls WaitForCompletionAsync, which waits until the service reports that the copy has finished, and then displays the number of bytes copied. If the operation still isn't marked as complete, the loop pauses for 0.5 seconds and checks again. When the blob has been copied, it's removed from the source container.

    The StartCopyFromUriAsync method call takes a URL containing a SAS token for the source object, as described in the previous unit.

    In the Main method, this method is invoked by the following statement.

    // Move matching blobs to the destination container
    await MoveMatchingBlobsAsync(sourceBlobRefs, sourceBlobContainer, destBlobContainer);
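
The same copy, poll, and delete sequence can be sketched with the Azure CLI. This is not the sample's code, only a rough shell analogue of MoveMatchingBlobsAsync for a single blob. The function name move_blob and the variables DEST_ACCOUNT, DEST_KEY, DEST_CONTAINER, SRC_ACCOUNT, SRC_KEY, SRC_CONTAINER, and POLL_SECONDS are assumptions made for illustration, and the source URL is assumed to already carry a SAS token.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not part of the sample): move one blob between accounts
# by copying it, polling the copy status, and deleting the source on success.
# Arguments: $1 = blob name, $2 = source blob URL including a SAS token.
# Assumes DEST_ACCOUNT, DEST_KEY, DEST_CONTAINER, SRC_ACCOUNT, SRC_KEY,
# and SRC_CONTAINER are set by the caller.
move_blob() {
  local name="$1" source_uri="$2" status

  # Start the server-side copy into the destination container.
  az storage blob copy start \
    --source-uri "$source_uri" \
    --destination-container "$DEST_CONTAINER" \
    --destination-blob "$name" \
    --account-name "$DEST_ACCOUNT" \
    --account-key "$DEST_KEY" \
    --output none || return 1

  # Poll the destination blob until the copy leaves the "pending" state.
  while :; do
    status=$(az storage blob show \
      --container-name "$DEST_CONTAINER" \
      --name "$name" \
      --account-name "$DEST_ACCOUNT" \
      --account-key "$DEST_KEY" \
      --query "properties.copy.status" \
      --output tsv) || return 1
    echo "Blob: $name, copy status: $status"
    if [ "$status" != "pending" ]; then
      break
    fi
    sleep "${POLL_SECONDS:-0.5}"
  done

  # Delete the source blob only if the copy succeeded.
  if [ "$status" != "success" ]; then
    return 1
  fi
  az storage blob delete \
    --container-name "$SRC_CONTAINER" \
    --name "$name" \
    --account-name "$SRC_ACCOUNT" \
    --account-key "$SRC_KEY"
}
```

Deleting the source only after the destination reports success mirrors the order used in the C# method: a "move" between storage accounts is a copy followed by a delete, so a failed copy leaves the original blob in place.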
    

Test the TransferBlobs application

  1. Using the Azure portal, move to your source (hot) storage account.

  2. Under Security + networking, select Access keys. Copy the connection string for one of the keys (for example, key1) into a text file on your local computer.

  3. Under Data storage, select Containers.

  4. Select the specifications container.
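
As an alternative to copying the connection string from the portal, you can read it with the Azure CLI. The following is a minimal sketch, assuming the az storage account show-connection-string command and a signed-in CLI session. The function name get_connection_string is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the exercise): print the connection string
# for a storage account.
# Arguments: $1 = storage account name, $2 = resource group name.
get_connection_string() {
  az storage account show-connection-string \
    --name "$1" \
    --resource-group "$2" \
    --query connectionString \
    --output tsv
}

# Example usage (requires a signed-in Azure CLI session):
#   SOURCE_CONNECTION=$(get_connection_string "$HOT_STORAGE_NAME" "<resource group name>")
```

Keeping the connection string in a shell variable also avoids pasting a secret into a text file.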

To have some blobs with a different modified date from the batch upload time, you'll modify a few of them.

  1. From the specifications container, select one of the specification files (for example, specifications04.md).

  2. Select the Edit tab from the blob panel, and add any text you want.

  3. Select Save to commit the changes to the blob.

  4. Repeat these steps to modify one or two other blob files.

With several blobs showing newer modification dates, you can differentiate between them when you run the .NET app.

  1. In the list of blobs in this container, note the modification dates of the blobs. Choose a date and time that falls between the earliest and latest modification times, so that some blobs were modified before your chosen time and others after it.

    Note

    The Azure portal shows times in your local time zone, but the program expects them in UTC. Convert the time shown in the Azure portal to its UTC value. For example, if the time was 6/15/2021, 10:04:27 AM in Korea Standard Time (KST), subtract 9 hours to get the UTC time: 6/15/2021, 01:04:27 AM.

  2. Using the portal, move to your destination (cool) storage account.

  3. Under Security + networking, select Access keys. Copy the connection string for one of the keys (for example, key1) into a text file on your local computer.

  4. In the Data storage section, select Containers.

  5. Select + Container, and create a new container named transfer-test.

  6. In the Cloud Shell window, run the following command. Replace <source connection string> and <destination connection string> with the connection strings you recorded in your text file. Replace <selected date and time> with the date and time you chose for your blobs, in the same format as it appeared in the Azure portal. Enclose the connection strings and the date/time in double quotes to prevent the Bash shell from interpreting them:

    dotnet run "<source connection string>" specifications "<destination connection string>" transfer-test "<selected date and time>"
    

    Note

    If the application doesn't find any blobs to move, you might need to convert the date and time from the local time zone shown in the Azure portal to its UTC value, because the program works in UTC. For example, if the edit time was 6/15/2021, 10:04:27 AM in Korea Standard Time (KST), subtract 9 hours to get the UTC time: 6/15/2021, 01:04:27 AM.

  7. The application should list the name of each matching blob that it finds and then move it.

  8. When the application has finished, return to the Azure portal.

  9. Move to your destination (cool) storage account.

  10. Browse the transfer-test container. Verify that it contains the blobs that were moved.

  11. Move to your source (hot) storage account.

  12. Browse the specifications container. Verify that the blobs that were transferred have been removed from this container.
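
The UTC adjustment described in the notes above can also be done in the shell rather than by hand. The following is a minimal sketch, assuming GNU date (as found on Linux; BSD and macOS date take different options). The function name to_utc is hypothetical, and you supply the UTC offset of your own time zone.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the exercise): convert a local date and time
# to UTC, printed in a format similar to the one the Azure portal uses.
# Arguments: $1 = local date and time, $2 = UTC offset of that time (for example +0900).
# Relies on GNU date: -d parses the input and %-m / %-d drop leading zeros.
to_utc() {
  LC_ALL=C date -u -d "$1 $2" '+%-m/%-d/%Y %I:%M:%S %p'
}

# Korea Standard Time is 9 hours ahead of UTC.
to_utc "2021-06-15 10:04:27" "+0900"   # prints 6/15/2021 01:04:27 AM
```

The printed value can then be passed, in double quotes, as the last argument of the dotnet run command.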