February 2019

Volume 34 Number 2

[Azure]

Build and Deploy Highly Available and Resilient Solutions on Azure

By Srikantan Sankaran

Some of the technologies discussed in this article are in preview. All information is subject to change.

Organizations these days have a global presence and they expect their mission-critical applications to reach users across multiple geographies. IT staff in such organizations are under pressure to deliver on those expectations, and their success depends on how well their applications and platforms are architected and built to tackle the complexities inherent in that goal. Microsoft Azure offers numerous services and features that provide an almost turnkey implementation to address the relevant requirements, whether they involve enabling geo-replication of services and data, latency-sensitive routing of user requests to a service closer to end users, or providing seamless application failover to other regions when disaster strikes. In this article I’ll take a look at a few of these objectives and discuss how Azure can be used to implement highly available and resilient global applications.

I’m going to use a three-tier solution consisting of an MVC and Web API application to illustrate the key aspects of building and deploying highly available and resilient applications in Azure. The application is accessed by a user to capture basic information pertaining to his family, which is passed on to the API tier, which in turn persists the data into a geo-replicated Cosmos DB database. The application is deployed Active-Active across multiple regions to ensure resiliency against regional outages. Within a region, it’s deployed across Availability Zones to protect against datacenter failures (bit.ly/2zXlRll). The database has a multi-region write feature enabled (bit.ly/2zV6OI0), ensuring that users can write to and read data from a database that’s close to their point of access, to minimize latency (bit.ly/2QRr1Il).

Architecture of the Solution

Figure 1 represents the architecture of the solution. Azure Service Fabric hosts the application components packaged as Docker containers for Linux, provides orchestration capabilities, and ensures the availability and reliability of the application. There are two zonal Service Fabric clusters deployed to each of the two Azure regions, in South East Asia and East US 2. A zone-redundant Azure Load Balancer is used to round-robin the requests between the two Availability Zones in each region.

Figure 1 Solution Architecture

Azure Front Door Service (still in public preview) directs user requests to one of the load-balanced endpoints across the two regions, based on the path of least latency. It also provides resiliency in the event of the application or an entire datacenter going down in one of the regions, by directing the requests to the endpoint in the other region.

The Azure Cosmos DB database used in the application is geo-replicated across the two Azure Regions and configured for multi-region write. Thus, data writes originating from the API Service in each region are written to a Cosmos DB collection local to that region.

Azure DevOps is the source code repository for the application. Azure Pipelines is used to build the applications, package them as Docker containers and upload them to a Docker Hub Registry.

The IT policies at the organization hosting this solution forbid administrative access to the Service Fabric cluster over the public Internet. A continuous delivery (CD) pipeline is implemented using Jenkins deployed inside an Azure Virtual Network. It pulls the deployment packages and containers from the Git repository in Azure DevOps and Docker Hub, respectively, and deploys the application to the Service Fabric clusters in the two regions.

The Service Fabric cluster deployed to each region consists of two node types. The primary node type runs the Service Fabric platform services and exposes the management endpoints using an internal load balancer. The secondary node type runs the MVC and Web API applications packaged using Docker containers. A public-facing load balancer routes requests from end users accessing the application over the Internet.

A third Azure Load Balancer (not depicted in the diagram) is used in this architecture to permit outbound calls to the Internet from the cluster, which are needed to download the Service Fabric platform components. The internal load balancer doesn’t expose a public-facing endpoint, so the nodes behind it can’t reach the Internet. Without this additional public load balancer for outbound connectivity, the Service Fabric cluster can’t be set up (bit.ly/2A0VBpp).

Note that only the standard SKU of the Azure Load Balancer and Public IP address resources supports deployment to Azure Availability Zones; the basic SKU does not.

Azure Service Fabric relies on the zonal deployment capability of the underlying Azure Virtual Machine Scale Sets (VMSS) resources.

Developing the Applications

There are two Visual Studio 2017 solution files shared with this article:

• censusapp.sln: The ASP.NET Core 2.0 MVC application.

• censusapi.sln: The ASP.NET Core 2.0 Web API application.

Keep in mind that though this sample application has been built using ASP.NET Core 2.0, it could very well be a Web application created using other technologies, such as Node.js, PHP and so forth. Also, the sample application shared with this article doesn’t implement any coding best practices. It’s intended only to illustrate select design implementation detail.

The MVC application implements the UI that lets users submit census data and displays that data. It uses the service discovery feature in Service Fabric to call the REST API that persists the data in Azure Cosmos DB. The API project uses the Cosmos DB SDK to implement a prioritized list of database locations to connect to in order to read and write data. The application that’s deployed to the South East Asia Region would add this region as “priority 1” and East US 2 as “priority 2.” For deployment to the East US 2 Region, it would be the reverse. Hence, there would be two separate container images for the Web API Project, one each for deployment to the respective regions. The container image for the MVC project would be the same, regardless of the Azure Region to which it’s deployed.

The following snippet shows how the prioritized list of locations is added to the Cosmos DB connection, and how to configure multi-region writes in the application:

using System;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Prefer the region closest to this deployment first; the SDK fails over to the
// next location in the list. Region names must match the locations configured
// on the Cosmos DB account.
ConnectionPolicy policy = new ConnectionPolicy
{
  ConnectionMode = ConnectionMode.Direct,
  ConnectionProtocol = Protocol.Tcp,
  UseMultipleWriteLocations = true,
};
policy.PreferredLocations.Add("East US 2");
policy.PreferredLocations.Add("South East Asia");
DocumentClient client = new DocumentClient(
  new Uri(this.accountEndpoint), this.accountKey, policy);

The Web API project uses the “Bogus” NuGet package to generate fictitious data.
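For instance, a minimal sketch of how Bogus might be used to generate such data follows; the FamilyMember type and the rules are illustrative, not the actual model in the sample:

using System.Collections.Generic;
using Bogus;

// Hypothetical model for illustration; the sample project defines its own types.
public class FamilyMember
{
  public string FirstName { get; set; }
  public string LastName { get; set; }
  public int Age { get; set; }
}

public static class FakeData
{
  public static List<FamilyMember> CreateMembers(int count)
  {
    var faker = new Faker<FamilyMember>()
      .RuleFor(m => m.FirstName, f => f.Name.FirstName())
      .RuleFor(m => m.LastName, f => f.Name.LastName())
      .RuleFor(m => m.Age, f => f.Random.Int(1, 90));
    return faker.Generate(count); // Returns 'count' fictitious family members
  }
}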

The Cosmos DB connection string and access keys are stored in the appsettings.json file of the Web API project for simplicity. For production use, they could instead be stored in Azure Key Vault. Refer to my article, “Secure Your Sensitive Business Information with Azure Key Vault” for guidance (msdn.com/magazine/mt845653).
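For reference, the corresponding section of appsettings.json might look something like the following sketch; the key names and values are illustrative:

{
  "CosmosDb": {
    "AccountEndpoint": "https://<account-name>.documents.azure.com:443/",
    "AccountKey": "<primary-or-secondary-key>",
    "DatabaseName": "censusdb",
    "CollectionName": "families"
  }
}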

Enabling the Web Applications for Docker Containers Visual Studio 2017 Tools for Docker provide a turnkey implementation to enable an ASP.NET Core 2.0 Web application for Windows and Linux containers. Right-clicking the project in Visual Studio 2017 and choosing the option to enable Docker support adds a Dockerfile to it; choosing the option to enable orchestration support creates a Docker compose file.

The applications in this solution were packaged as Docker Linux containers. Minor edits were made to the Docker compose file to add the port mappings to be used when deploying the application (port 80 on the VMSS nodes to access the MVC application and port 5002 for the Web API).
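As a rough sketch, the relevant parts of the compose file could look like the following, assuming both containers listen on port 80 internally; the service, image and Dockerfile names are placeholders:

version: '3.4'

services:
  censusapp:
    image: <dockerhub-account>/censusapp
    build:
      context: .
      dockerfile: CensusApp/Dockerfile
    ports:
      - "80:80"     # MVC application exposed on port 80 of the node
  censusapi:
    image: <dockerhub-account>/censusapi
    build:
      context: .
      dockerfile: CensusApi/Dockerfile
    ports:
      - "5002:80"   # Web API exposed on port 5002 of the node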

Continuous Integration (CI) The Visual Studio 2017 solution files are checked into an Azure DevOps Git repository. An Azure Pipeline is created that runs the Docker compose files for the MVC and Web API projects created in the previous steps. This builds the ASP.NET Core 2.0 applications and creates the Docker container images.

In the next step in the pipeline, a Bash script is used to tag the container images and push them to Docker Hub. This step needs to be performed once for deployment to the South East Asia Region and once for the East US 2 Region. Again, the MVC container is the same no matter which Azure Region it’s deployed to. Figure 2 shows a snapshot of the CI pipeline created to accomplish this step.

Figure 2 Azure DevOps Pipeline
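A minimal sketch of such a tag-and-push script follows; the Docker Hub account, image names and tags are placeholders, and the credentials would come from pipeline secret variables:

#!/bin/bash
# Log in to Docker Hub, then tag and push the locally built images.
docker login -u "$DOCKER_USER" -p "$DOCKER_PASSWORD"

docker tag censusapp:latest <dockerhub-account>/censusapp:latest
docker push <dockerhub-account>/censusapp:latest

# The API image gets a region-specific tag, since it embeds the region priority.
docker tag censusapi:latest <dockerhub-account>/censusapi:sea
docker push <dockerhub-account>/censusapi:sea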

Packaging the Applications for Deployment to Service Fabric I used the Yeoman generator for Windows to generate the application and service manifest files and package the solution. You’ll find guidance in using this tool at bit.ly/2zZ334n.

A single application package is created that bundles the MVC and API apps as Service Types. In the service manifest files, enter the container names uploaded to Docker Hub in the previous steps.

Note that the Web API project’s service manifest defines a service DNS name that should match the value in the MVC project’s appsettings.json file, to be used in service discovery.
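As an illustration, the MVC application could call the API through its Service Fabric DNS name roughly as shown in the following sketch; the DNS name "censusapi.census" and the port are assumptions and would, in the actual project, be read from appsettings.json:

using System;
using System.Net.Http;
using System.Threading.Tasks;

public class FamilyApiClient
{
  // The host name is resolved by the Service Fabric DNS service to the nodes
  // running the Web API containers.
  private static readonly HttpClient client = new HttpClient
  {
    BaseAddress = new Uri("http://censusapi.census:5002/")
  };

  public async Task<string> GetFamiliesAsync()
  {
    // Returns the raw JSON payload from the Web API.
    return await client.GetStringAsync("api/family");
  }
}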

The number of instances of each service type (that is, the Web and API projects) in the manifest is set to two. This instructs Service Fabric to have two instances of the container of that service type always running in the cluster. This number can be modified to suit the deployment, or can be set to auto-scale within a min and max range.
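A sketch of how the DNS name and instance count could appear in the default services section of the application manifest is shown here; the service name and DNS name are illustrative:

<DefaultServices>
  <Service Name="censusapiService" ServiceDnsName="censusapi.census">
    <StatelessService ServiceTypeName="censusapiType" InstanceCount="2">
      <SingletonPartition />
    </StatelessService>
  </Service>
</DefaultServices>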

The service manifest supports the notion of a placement constraint. In the following snippet, a placement constraint is defined on the node type name in the Service Fabric cluster to which the service should be deployed. If not specified, the code package could get deployed across all node types, including the primary node type, in the Service Fabric cluster. However, I intend to host only the Service Fabric Platform Services in the primary node type.

<ServiceTypes>
  <StatelessServiceType ServiceTypeName="censusapiType" UseImplicitHost="true">
    <PlacementConstraints>(NodeTypeName==nt-sfazvm0)</PlacementConstraints>
  </StatelessServiceType>
</ServiceTypes>

The node type name configured in the Azure Resource Manager (ARM) template (discussed later) to deploy the Service Fabric Cluster should match that in the placement constraint of the service manifest.

Deploying the Applications to Service Fabric

Now let’s create the Service Fabric cluster and then deploy the application packages to it.

The first step is to provision a Cosmos DB database, which can be performed from the Azure Portal. Start by creating a database in South East Asia and enable geo-replication to East US 2 Region. Select the “enable Multi-Region Write” option in the wizard.

I retained the Session Consistency configuration, which is the default for the Cosmos DB database, and the default setting that indexes all the properties in the schema. You could select only specific properties to index if you prefer, based on the need to query the data.

To handle any write conflicts that can arise from enabling multi-region writes, I used the default last-writer-wins conflict resolution policy.
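The defaults I retained index every property; if you chose to index only specific properties instead, the container-level settings could look roughly like this sketch (the included path is illustrative), shown here together with the default last-writer-wins conflict resolution policy:

{
  "indexingPolicy": {
    "indexingMode": "consistent",
    "includedPaths": [ { "path": "/familyName/?" } ],
    "excludedPaths": [ { "path": "/*" } ]
  },
  "conflictResolutionPolicy": {
    "mode": "LastWriterWins",
    "conflictResolutionPath": "/_ts"
  }
}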

Next, you provision the Service Fabric cluster using an ARM template (bit.ly/2swVkI5 and bit.ly/2rBvFfi). Template SF-Std-ELB-ZonalDeployment.Json deploys two zonal clusters into an existing virtual network. There are certain prerequisites to run this template.

First, use a certificate stored in Azure Key Vault for node-to-node security in the cluster. The thumbprint, Key Vault URL and certificate URL from this step need to be entered in the Parameters section of the ARM template, as shown in Figure 3. This must be done for each of the Regions separately, because both the Service Fabric cluster and the Key Vault used by it should reside in the same region.

Figure 3 Storing the Thumbprint, Azure Key Vault URL and Certificate URL

"certificateThumbprint": {
      "type": "string",
      "defaultValue": "<Certificate Thumb print>
    },
    "sourceVaultValue": {
        "type": "string",
        "defaultValue": "/subscriptions/<SubscriptionId>/resourceGroups/
          lpvmvmssrg/providers/Microsoft.KeyVault/vaults/<vaultname>”
    },
    "certificateUrlValue": {
        "type": "string",
        "defaultValue": "https://<vaultname>.vault.azure.net/secrets/
          soloclustercert/<GUID>
        }
  }

Second, a self-signed certificate is generated using OpenSSL. The thumbprint of this certificate is entered in the Parameters section of the ARM template:

"clientCertificateStoreValue": {
      "type": "string",
      "defaultValue": "<Client certificate thumbprint
    },
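For example, such a certificate could be generated and its thumbprint read off as in the following sketch; the file names and subject are placeholders:

# Generate a self-signed certificate and private key, valid for one year.
openssl req -x509 -newkey rsa:2048 -nodes -days 365 \
  -subj "/CN=sfadminclient" -keyout client.key -out client.crt

# Package the certificate and key as a .pfx for import into a certificate store.
openssl pkcs12 -export -in client.crt -inkey client.key -out client.pfx

# The SHA-1 fingerprint (with the colons removed) is the thumbprint to use.
openssl x509 -in client.crt -noout -fingerprint -sha1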

This certificate is used in Jenkins to connect to the Service Fabric cluster to deploy the application. See bit.ly/2GfLHG0 for more information.

Note that this article uses a self-signed certificate for illustration. For production deployments, you’d use Azure Credentials to connect to the management endpoint securely and to deploy the application.

For more on Service Fabric security, see bit.ly/2CeEzpi and bit.ly/2SR1E73.

The health probes used in the load balancer for the MVC application and Web API need to be configured for HTTP, with the right port numbers and request paths, as shown in Figure 4.

Figure 4 Configuring the Ports and Request Paths

{
  "name": "SFContainerProbe1",
  "properties": {
    "protocol": "Http",
    "port": 5002,
    "requestPath": "/api/family",
    "intervalInSeconds": 5,
    "numberOfProbes": 2
  }
},
{
  "name": "SFContainerProbe2",
  "properties": {
    "protocol": "Http",
    "port": 80,
    "requestPath": "/",
    "intervalInSeconds": 5,
    "numberOfProbes": 2
  }
}

Here are the other salient configurations in the ARM template, specific to zonal deployments:

• The standard SKU is chosen for all load balancers used in the cluster.

• The standard SKU is chosen for the Public IP address resource.

• The Availability Zone property needs to be specified. Because this is a zonal deployment, a single zone is specified in a property value. For zonal SF Cluster 1 in South East Asia, this value would be “1” and for zonal SF Cluster 2 in South East Asia, the value would be “2.”

• The property “SinglePlacementGroup” is set to “true.”

• To enable service discovery in the cluster, the Service Fabric DNS must be enabled.

Figure 5 shows how these settings are configured in the ARM template. Verify that the template has deployed successfully and that the resources shown in Figure 5 are visible in the Azure Portal.

Figure 5 VMSS Resources Used in the ARM Template for Zonal Deployment
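In text form, the zone-related settings on the VMSS resource look roughly like the following sketch; the node type name matches the placement constraint shown earlier, while the VM size and API version are illustrative:

{
  "type": "Microsoft.Compute/virtualMachineScaleSets",
  "apiVersion": "2018-10-01",
  "name": "nt-sfazvm0",
  "location": "southeastasia",
  "zones": [ "1" ],
  "sku": { "name": "Standard_D2s_v3", "tier": "Standard", "capacity": 5 },
  "properties": { "singlePlacementGroup": true }
}

On the Microsoft.ServiceFabric/clusters resource, service discovery is turned on by adding "DnsService" to the addonFeatures array.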

Note that Azure Availability Zones are not available in all Azure Regions yet. Also, the feature has reached general availability (GA) in some regions but is only in preview in others.

Now let’s connect to the Service Fabric cluster. The Service Fabric management endpoint is deployed behind an internal load balancer. Hence, to access the Service Fabric Explorer to deploy the application, deploy a jump box virtual machine (VM) running Windows Server 2016, either in the same virtual network as the Service Fabric cluster, or to a different virtual network that’s peered to that of the cluster.

Copy the certificate for the Service Fabric admin client (the self-signed certificate that was created earlier) to the user certificate store on the jump box VM. When launching the Service Fabric Explorer, select this certificate when prompted.

Next, provision Jenkins to deploy the application. For the scenario in this article, I’ll use a Docker container image running Jenkins and the Service Fabric plug-in. This image is deployed to an Ubuntu VM in Azure running in the same virtual network as the Service Fabric cluster.

You’ll find more information on the steps required to download, install and configure the container image at bit.ly/2RRRt1Q.

To prepare Jenkins to connect to the Service Fabric cluster, I performed the following additional steps:

• Copy the certificate for the Service Fabric admin client (the self-signed certificate created earlier) to the home directory on the Jenkins VM, using tools like WinSCP.

• Start the Jenkins container on this VM and copy this certificate into the Jenkins home directory inside the container, for example with the "docker cp" command (see the sketch after this list).
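A minimal sketch of these two steps follows; the user, host, file and container names are placeholders:

# From the local machine: copy the client certificate to the Jenkins VM.
scp ./client.pfx azureuser@<jenkins-vm-public-ip>:~/

# On the Jenkins VM: copy the certificate into the running Jenkins container.
docker cp ~/client.pfx <jenkins-container-name>:/var/jenkins_home/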

Launch Jenkins and log in to https://<Public-IP-of-JenkinsVM>:8080 to configure the Service Fabric plug-in.

The steps to configure the Service Fabric plug-in for Jenkins are described at bit.ly/2UBVhWV. The “Post Build Actions” configuration that was used to run the deployment of the application to one of the zonal clusters is shown in Figure 6. Add an additional Post Build Action for the second zonal cluster, along the same lines.

Figure 6 Service Fabric Plug-in for Jenkins

Manually trigger the build job in Jenkins, and ensure that the post-build actions complete successfully. From the jump box VM, launch Service Fabric Explorer to verify that the application is deployed to both zonal clusters. Figure 7 shows Service Fabric Explorer after the application is deployed.

Figure 7 The Management Endpoint in Service Fabric Explorer

Repeat these steps to deploy to the second Azure Region. Ensure that the service manifest file of the API project is modified so it points to the container image of the API Service built for this region. Recall that the two container images of the API Service each specify a different region priority when connecting to Cosmos DB, in order to implement multi-region writes.

Running the Application

To bring up the application deployed to Region 1, launch the following URLs:

• https://<Public-IP-Address-LB Region1> starts the MVC Application

• https://<Public-IP-Address-LB Region1>:5002/api/family starts the Web API directly. The MVC project calls the API internally, using Service discovery.

Another set of URLs would be available for Region 2.

Figure 8 shows the application page accessed through the endpoint exposed by the Azure Front Door Service. You’ll notice a column called DataOrigin, which shows the Cosmos DB write region through which each record was inserted, showcasing the multi-region write capability of Cosmos DB.

Figure 8 Sample Census Application

Provision Azure Front Door Service

Use the Azure Portal to provision Azure Front Door Service and add the public-facing endpoints of the MVC and Web API applications as back ends to the Front Door Service.

Health probes configured in the Front Door Service ensure that when one of the back-end endpoints isn’t accessible, due either to an outage in that region or to the application becoming unresponsive, subsequent requests are directed to the other, healthy endpoint. The configuration settings in Figure 9 show how the application endpoints in the two regions have been mapped to a single endpoint exposed by the Front Door Service. While Azure Traffic Manager could have been used as an alternative to Azure Front Door, I chose the latter here because it provides Layer 7 reverse proxy, SSL termination and request routing, which are required in such applications.

Figure 9 Azure Front Door Configuration

You’ll find more information on creating a Front Door for global Web apps at bit.ly/2QXRkwu.

Wrapping Up

In this article I discussed how Azure Service Fabric can be used to package and deploy Docker container-enabled applications, and to implement the orchestration and service discovery features that are fundamental to a microservices architecture. With the support for Availability Zones in Azure, I deployed two zonal Service Fabric clusters per region, spanning multiple datacenters, to eliminate the effects of datacenter failures.

To increase the reach of the application, I deployed the application to multiple Azure regions, and leveraged the multi-region write support in Azure Cosmos DB, which ensures that not only are the stateless application tiers geo-replicated, but the database also is truly distributed and replicated. Finally, to ensure that users experience the least amount of latency when accessing the application, I implemented a single Azure Front Door Service that routes requests to the right application endpoint. To show how you can implement this architecture with the least amount of disruption to business, and to ensure that security policies are honored, I discussed how CI and CD practices can be implemented using Azure DevOps Service and Jenkins, respectively, and how these can be carried out within the confines of a corporate network.

You can download the sample application at bit.ly/2Lra9Dm. See “Software Prerequisites” for information on the software required to implement the sample application.


Srikantan Sankaran is a principal technical evangelist from the One Commercial Partner team in India, based out of Bangalore. He works with numerous ISVs in India and helps them architect and deploy their solutions on Microsoft Azure. Reach him at sansri@microsoft.com.

Thanks to the following Microsoft technical experts for reviewing this article: Sudhanva Huruli, Muni Pulipalyam

