Tutorial: Build a highly available application with Blob storage
This tutorial is part one of a series. In it, you learn how to make your application data highly available in Azure.
When you've completed this tutorial, you will have a console application that uploads and retrieves a blob from a read-access geo-zone-redundant (RA-GZRS) storage account.
Geo-redundancy in Azure Storage replicates transactions asynchronously from a primary region to a secondary region that is hundreds of miles away. This replication process guarantees that the data in the secondary region is eventually consistent. The console application uses the circuit breaker pattern to determine which endpoint to connect to, automatically switching between endpoints as failures and recoveries are simulated.
Download the sample project and extract (unzip) the storage-dotnet-circuit-breaker-pattern-ha-apps-using-ra-grs.zip file. You can also use git to download a copy of the application to your development environment. The sample project contains a console application.
Download the sample project and extract (unzip) the storage-python-circuit-breaker-pattern-ha-apps-using-ra-grs.zip file. You can also use git to download a copy of the application to your development environment. The sample project contains a basic Python application.
Download the sample project and unzip the file. You can also use git to download a copy of the application to your development environment. The sample project contains a basic Node.js application.
In the application, you must provide the connection string for your storage account. You can store this connection string in an environment variable on the local machine running the application. Follow the example below that matches your operating system to create the environment variable.
In the Azure portal, navigate to your storage account. Select Access keys under Settings in your storage account. Copy the connection string from the primary or secondary key. Run one of the following commands based on your operating system, replacing <yourconnectionstring> with your actual connection string. This command saves an environment variable to the local machine. In Windows, the environment variable is not available until you reload the Command Prompt or shell you are using.
In the application, you must provide your storage account credentials. You can store this information in environment variables on the local machine running the application. Follow the example below that matches your operating system to create the environment variables.
In the Azure portal, navigate to your storage account. Select Access keys under Settings in your storage account. Paste the Storage account name and Key values into the following commands, replacing the <youraccountname> and <youraccountkey> placeholders. This command saves the environment variables to the local machine. In Windows, the environment variable is not available until you reload the Command Prompt or shell you are using.
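Once the variables are set, the application can read them at startup. The following is a minimal sketch rather than the sample's actual code: the variable names `accountname` and `accountkey` and the helper `load_storage_credentials` are assumptions made for illustration, so substitute whatever names you used when creating the variables.

```python
import os


def load_storage_credentials(environ=os.environ):
    """Return (account_name, account_key) read from environment variables.

    The variable names used here are assumptions for this sketch.
    """
    account_name = environ.get("accountname")
    account_key = environ.get("accountkey")
    missing = [name for name, value in
               (("accountname", account_name), ("accountkey", account_key))
               if not value]
    if missing:
        # Fail early with a clear message instead of an authentication error later.
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return account_name, account_key
```

Passing the environment as a parameter keeps the helper easy to exercise with a plain dictionary, without changing the real process environment.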
In Visual Studio, press F5 or select Start to begin debugging the application. Visual Studio automatically restores missing NuGet packages if package restore is configured. To learn more, see Installing and reinstalling packages with package restore.
A console window launches and the application begins running. The application uploads the HelloWorld.png image from the solution to the storage account. The application checks to ensure the image has replicated to the secondary RA-GZRS endpoint. It then begins downloading the image up to 999 times. Each read is represented by a P or an S, where P represents the primary endpoint and S represents the secondary endpoint.
In the sample code, the RunCircuitBreakerAsync task in the Program.cs file is used to download an image from the storage account using the DownloadToFileAsync method. Prior to the download, an OperationContext is defined. The operation context defines event handlers that fire when a download completes successfully or when a download fails and is retried.
To run the application in a terminal or command prompt, go to the directory that contains circuitbreaker.py, then enter python circuitbreaker.py. The application uploads the HelloWorld.png image from the solution to the storage account. The application checks to ensure the image has replicated to the secondary RA-GZRS endpoint. It then begins downloading the image up to 999 times. Each read is represented by a P or an S, where P represents the primary endpoint and S represents the secondary endpoint.
In the sample code, the run_circuit_breaker method in the circuitbreaker.py file is used to download an image from the storage account using the get_blob_to_path method.
The Storage object retry function is set to a linear retry policy. The retry function determines whether to retry a request and specifies the number of seconds to wait before retrying. Set the retry_to_secondary value to true if the request should be retried against the secondary endpoint when the initial request to the primary fails. In the sample application, a custom retry policy is defined in the retry_callback function of the storage object.
Before the download, the service object's retry_callback and response_callback functions are defined. These functions define event handlers that fire when a download completes successfully or when a download fails and is retried.
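To make the idea of a retry function concrete, here is a minimal, library-independent sketch of a linear retry policy. It is not the SDK's implementation: the class name `LinearRetrySketch` and its parameters `backoff` and `max_attempts` are hypothetical. The function returns the number of seconds to wait before the next attempt, or `None` when the request should not be retried.

```python
class LinearRetrySketch:
    """Hypothetical linear retry policy: wait the same interval between attempts."""

    def __init__(self, backoff=15, max_attempts=3):
        self.backoff = backoff            # seconds to wait between attempts
        self.max_attempts = max_attempts  # retries allowed before giving up

    def retry(self, attempt_count):
        """Return seconds to wait before retrying, or None to stop retrying.

        attempt_count is the number of retries already made for this request.
        """
        if attempt_count >= self.max_attempts:
            return None
        return self.backoff


policy = LinearRetrySketch(backoff=0, max_attempts=3)
waits = [policy.retry(n) for n in range(5)]
print(waits)  # [0, 0, 0, None, None]
```

A backoff of zero retries immediately, which suits a demonstration where failures are simulated; a real workload would typically wait between attempts.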
To run the sample, open a command prompt, navigate to the sample folder, then enter node index.js.
The sample creates a container in your Blob storage account, uploads HelloWorld.png into the container, then repeatedly checks whether the container and image have replicated to the secondary region. After replication, it prompts you to enter D or Q (followed by ENTER) to download or quit. Your output should look similar to the following example:
Created container successfully: newcontainer1550799840726
Uploaded blob: HelloWorld.png
Checking to see if container and blob have replicated to secondary region.
[0] Container has not replicated to secondary region yet: newcontainer1550799840726 : ContainerNotFound
[1] Container has not replicated to secondary region yet: newcontainer1550799840726 : ContainerNotFound
...
[31] Container has not replicated to secondary region yet: newcontainer1550799840726 : ContainerNotFound
[32] Container found, but blob has not replicated to secondary region yet.
...
[67] Container found, but blob has not replicated to secondary region yet.
[68] Blob has replicated to secondary region.
Ready for blob download. Enter (D) to download or (Q) to quit, followed by ENTER.
> D
Attempting to download blob...
Blob downloaded from primary endpoint.
> Q
Exiting...
Deleted container newcontainer1550799840726
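The replication check in the output above is a polling loop: ask the secondary endpoint whether the data is there, wait, and ask again. The idea is language-neutral, so it is sketched here in Python rather than taken from the Node.js sample. The function `wait_for_replication` and the `check_secondary` callable are hypothetical names; in a real application the check would be a read against the secondary endpoint.

```python
import time


def wait_for_replication(check_secondary, max_attempts=100, delay_seconds=1.0,
                         sleep=time.sleep):
    """Poll until check_secondary() returns True; return the attempt index.

    check_secondary is a hypothetical callable that returns True once the blob
    is readable from the secondary endpoint. Raises TimeoutError otherwise.
    """
    for attempt in range(max_attempts):
        if check_secondary():
            return attempt
        sleep(delay_seconds)
    raise TimeoutError("Blob did not replicate within the allowed attempts.")


# Simulated check: the blob becomes visible on the fourth poll.
results = iter([False, False, False, True])
attempt = wait_for_replication(lambda: next(results), sleep=lambda s: None)
print(attempt)  # 3
```

Capping the number of attempts matters because replication is asynchronous and the delay is not fixed; without a cap, a misconfigured account would make the loop run forever.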
The OperationContextRetrying event handler is called when the download of the image fails and is set to retry. If the maximum number of retries defined in the application is reached, the LocationMode of the request is changed to SecondaryOnly. This setting forces the application to attempt to download the image from the secondary endpoint. This configuration reduces the time taken to request the image because the primary endpoint is not retried indefinitely.
private static void OperationContextRetrying(object sender, RequestEventArgs e)
{
    retryCount++;
    Console.WriteLine("Retrying event because of failure reading the primary. RetryCount = " + retryCount);
    // Check if we have had more than n retries in which case switch to secondary.
    if (retryCount >= retryThreshold)
    {
        // Check to see if we can fail over to secondary.
        if (blobClient.DefaultRequestOptions.LocationMode != LocationMode.SecondaryOnly)
        {
            blobClient.DefaultRequestOptions.LocationMode = LocationMode.SecondaryOnly;
            retryCount = 0;
        }
        else
        {
            throw new ApplicationException("Both primary and secondary are unreachable. Check your application's network connection. ");
        }
    }
}
Request completed event handler
The OperationContextRequestCompleted event handler is called when the download of the image is successful. If the application is using the secondary endpoint, the application continues to use this endpoint up to 20 times. After 20 times, the application sets the LocationMode back to PrimaryThenSecondary and retries the primary endpoint. If a request is successful, the application continues to read from the primary endpoint.
private static void OperationContextRequestCompleted(object sender, RequestEventArgs e)
{
    if (blobClient.DefaultRequestOptions.LocationMode == LocationMode.SecondaryOnly)
    {
        // You're reading the secondary. Let it read the secondary [secondaryThreshold] times,
        // then switch back to the primary and see if it's available now.
        secondaryReadCount++;
        if (secondaryReadCount >= secondaryThreshold)
        {
            blobClient.DefaultRequestOptions.LocationMode = LocationMode.PrimaryThenSecondary;
            secondaryReadCount = 0;
        }
    }
}
The retry_callback event handler is called when the download of the image fails and is set to retry. If the maximum number of retries defined in the application is reached, the LocationMode of the request is changed to SECONDARY. This setting forces the application to attempt to download the image from the secondary endpoint. This configuration reduces the time taken to request the image because the primary endpoint is not retried indefinitely.
def retry_callback(retry_context):
    global retry_count
    retry_count = retry_context.count
    sys.stdout.write(
        "\nRetrying event because of failure reading the primary. RetryCount= {0}".format(retry_count))
    sys.stdout.flush()
    # Check if we have more than n-retries in which case switch to secondary
    if retry_count >= retry_threshold:
        # Check to see if we can fail over to secondary.
        if blob_client.location_mode != LocationMode.SECONDARY:
            blob_client.location_mode = LocationMode.SECONDARY
            retry_count = 0
        else:
            raise Exception("Both primary and secondary are unreachable. "
                            "Check your application's network connection.")
Request completed event handler
The response_callback event handler is called when the download of the image is successful. If the application is using the secondary endpoint, the application continues to use this endpoint up to 20 times. After 20 times, the application sets the LocationMode back to PRIMARY and retries the primary endpoint. If a request is successful, the application continues to read from the primary endpoint.
def response_callback(response):
    global secondary_read_count
    if blob_client.location_mode == LocationMode.SECONDARY:
        # You're reading the secondary. Let it read the secondary [secondaryThreshold] times,
        # then switch back to the primary and see if it is available now.
        secondary_read_count += 1
        if secondary_read_count >= secondary_threshold:
            blob_client.location_mode = LocationMode.PRIMARY
            secondary_read_count = 0
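To see how the two handlers interact, the switching logic can be simulated without a storage account. The sketch below is an illustration under stated assumptions, not the sample's code: `CircuitBreaker` is a hypothetical wrapper, the `LocationMode` class is a stand-in for the SDK's constants, and the retry threshold of 5 is an assumed value. The secondary threshold of 20 matches the description above.

```python
class LocationMode:
    """Stand-in for the SDK's LocationMode constants (assumption for this sketch)."""
    PRIMARY = "primary"
    SECONDARY = "secondary"


class CircuitBreaker:
    """Hypothetical wrapper around the switching logic of the two handlers."""

    def __init__(self, retry_threshold=5, secondary_threshold=20):
        self.location_mode = LocationMode.PRIMARY
        self.retry_threshold = retry_threshold
        self.secondary_threshold = secondary_threshold
        self.secondary_read_count = 0

    def on_retry(self, retry_count):
        """Mirror retry_callback: fail over once retries reach the threshold."""
        if retry_count >= self.retry_threshold:
            if self.location_mode != LocationMode.SECONDARY:
                self.location_mode = LocationMode.SECONDARY
            else:
                raise Exception("Both primary and secondary are unreachable.")

    def on_response(self):
        """Mirror response_callback: return to primary after enough secondary reads."""
        if self.location_mode == LocationMode.SECONDARY:
            self.secondary_read_count += 1
            if self.secondary_read_count >= self.secondary_threshold:
                self.location_mode = LocationMode.PRIMARY
                self.secondary_read_count = 0


breaker = CircuitBreaker(retry_threshold=5, secondary_threshold=20)
trace = []

# Primary fails: retries 1..5 trip the breaker on the fifth retry.
for retry_count in range(1, 6):
    breaker.on_retry(retry_count)
trace.append(breaker.location_mode)

# 19 successful secondary reads keep the breaker on the secondary endpoint.
for _ in range(19):
    breaker.on_response()
trace.append(breaker.location_mode)

# The 20th successful read switches back to the primary endpoint.
breaker.on_response()
trace.append(breaker.location_mode)

print(trace)  # ['secondary', 'secondary', 'primary']
```

Keeping the state in one object rather than in module-level globals makes the transitions easier to test, at the cost of diverging from the layout of the sample.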
With the Node.js V10 SDK, callback handlers are unnecessary. Instead, the sample creates a pipeline configured with retry options and a secondary endpoint. This allows the application to automatically switch to the secondary pipeline if it fails to reach your data through the primary pipeline.
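The fallback idea can be sketched independently of any SDK. The following Python example is a conceptual illustration only and does not reproduce how the Node.js pipeline is implemented: `read_with_fallback` and both reader callables are hypothetical, and the choice of three tries against the primary is an assumption.

```python
def read_with_fallback(read_primary, read_secondary, max_tries=3):
    """Try the primary reader up to max_tries times, then the secondary once.

    Both readers are hypothetical zero-argument callables that return the blob
    content or raise OSError on failure. Returns (endpoint_name, content).
    """
    for _ in range(max_tries):
        try:
            return "primary", read_primary()
        except OSError:
            continue  # transient failure: try the primary again
    # Primary exhausted its tries; any error from the secondary propagates.
    return "secondary", read_secondary()


def failing_primary():
    raise OSError("primary endpoint unreachable")


endpoint, content = read_with_fallback(failing_primary, lambda: b"image-bytes")
print(endpoint)  # secondary
```

Unlike the event-handler approach, this form keeps no state between requests, so every request starts by trying the primary endpoint again.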