Microsoft IT transitions core business applications to Microsoft Azure PaaS
Technical Case Study
Microsoft IT has transitioned many line-of-business applications to Microsoft Azure platform as a service (PaaS). This article describes why they moved four of these applications and how the transition helped Microsoft IT, other Microsoft teams, and customers.
In the early 2010s, Microsoft IT decided that any new line-of-business (LOB) application would be built in Azure PaaS or software as a service (SaaS), and that major application redesigns would also be done in Azure PaaS or SaaS.
There are other cloud delivery models, so why adopt this strategy? Because the PaaS model conveys the greatest value, especially for complex applications that cannot be supported through SaaS. While the other, simpler cloud-delivery models—SaaS and IaaS (infrastructure as a service)—are appropriate for some tasks, clients who use them realize only some of the capabilities of PaaS. The three cloud delivery models are:
SaaS. In this model, the client pays a monthly fee to use a prepackaged program (software that belongs to the hosting organization). Examples are Microsoft Office 365, Microsoft Dynamics Online, Intune, Microsoft Dynamics CRM, Exchange, and SharePoint.
IaaS. This model shifts the responsibility for hosting away from the client to a hosting organization. Examples of IaaS are servers that are hosted in Azure—virtual servers or virtual OS images. When using IaaS, clients do not create new applications; they run the same applications on the virtual server that they would run in their private datacenter.
PaaS. This model starts with the hosting freedom of IaaS and adds prepackaged components that can be used as building blocks for modern LOB applications.
SaaS and IaaS are used for migrating existing applications or data. In contrast, PaaS is a completely new model: a cloud platform on which to write new applications or re-architect existing ones. This model requires more development time, but in the long term it provides the most benefit because it lets application owners and developers work at a higher level. These benefits include:
Using PaaS avoids many of the activities necessary to maintain the computing environment. For example, in Azure PaaS, you do not need to manage patching. (PaaS has this attribute in common with IaaS for OS patches and with SaaS for application patches.)
Application owners can use Azure PaaS components, such as storage, web sites, and worker roles. In IaaS, by contrast, developers must still build the applications and then redeploy them to the cloud-based environment.
But it takes an engineering team time to redesign a working application—so why do it? What value did Microsoft realize? The following sections describe both the general advantages of moving applications to Azure PaaS and the actual value realized by moving four important applications used at Microsoft.
The strategy to move applications to PaaS was based on these advantages. Soon after the initial availability of portions of Microsoft Azure, Microsoft IT began to move applications to the new cloud computing platform. By 2015, it had re‑architected and deployed numerous LOB applications, including BCWeb, Connect, Paystub and Employee Data Management, and Returns Service.
Application 1: BCWeb
Business Case Web (BCWeb) is a web-based application that Microsoft employees use to create business cases for exemptions to product pricing. The application is critical to sales operations and has been used by hundreds of users daily since its deployment in 2007. BCWeb has a predictable pattern of use, peaking several times through the year. This pattern made it a great candidate for the inherent scalability of Azure PaaS, saving the cost and operational overhead of maintaining a large on-premises infrastructure to support these peaks.
BCWeb has a user base of 2,500 Microsoft employees, who use it to process approximately 27,000 pricing-exception requests each year. In 2010, it became one of the earliest Microsoft IT applications to be migrated to Azure.
How things worked before
Originally deployed as a solution that used SharePoint, BCWeb used separate user interfaces for the production of business-case documents and for the management of the workflow and approval process. The transactional data for pricing exceptions was stored as individual SharePoint documents within a SharePoint database.
Before it was moved to Azure, BCWeb had undergone important architectural changes, including the implementation of network load balancing, redesigning to increase reliability, and migration of its storage from SharePoint to a solution based on Microsoft SQL Server.
Deciding to make the change
In 2008, Microsoft IT introduced a new pricing exception type in BCWeb that caused an eightfold increase in BCWeb traffic. This traffic increase resulted in significant performance and reliability issues because the existing application infrastructure could not handle the workload. The application became unstable and even went offline occasionally during peak usage at month-end, quarter-end, and year-end. Despite attempts to alleviate these issues through load balancing, architectural changes, and data-storage changes, the performance and reliability of BCWeb were not adequate.
After Azure became available, Microsoft IT searched for an application to showcase both the extensive capabilities of Azure and the ease of migrating applications to the new Azure PaaS platform. Additionally, Microsoft IT planned to use the migration to develop best practices and reusable components that it could use in future migrations.
Description of the ported application
BCWeb has three distinct application components:
The BCWeb core component provides a web-based UI and underlying functionality that enable users to generate business cases for pricing exceptions.
The Workflow Routing and Approval system (WRAP) routes the user-generated pricing-exception requests for approval within the Microsoft corporate infrastructure.
Rapport provides an additional web-based user interface for the WRAP approval process.
The diagram in Figure 1 shows the relationships among the components of the re‑architected solution.
Figure 1. Architecture of the BCWeb application
A key design goal of the project was to provide an application that was scalable in performance and resilient to single or multiple component failures, while retaining the functionality of the original application. Fortunately for Microsoft IT, Azure natively supports architecture components that can fulfill these requirements.
The various capabilities of BCWeb in its three major components were mapped to Azure components. The two web-based user interfaces mapped directly to Azure web roles. The BCWeb services and background processes were split into two groups and mapped to Azure worker roles. The Windows Communication Foundation (WCF) services were assigned to one worker role, while the background processes and notification mechanisms were assigned to another worker role. Finally, Microsoft IT mapped the SQL Server database components to Azure SQL Database.
Architecting BCWeb for resiliency
For the critical aspects of the application, such as the UI and the core WCF components, the team implemented multiple instances of Azure roles running concurrently. In this way, each role can maintain performance by providing more application resources, and provide continued service should a component within any instance fail.
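The resiliency idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the BCWeb implementation: the `RoleInstance` and `RoundRobinPool` names are hypothetical, and in Azure the platform's load balancer, not application code, would normally route requests across role instances.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class RoleInstance:
    """Hypothetical stand-in for one running instance of a web or worker role."""
    name: str
    healthy: bool = True

    def handle(self, request: str) -> str:
        if not self.healthy:
            raise RuntimeError(f"{self.name} is unavailable")
        return f"{self.name} handled {request}"


class RoundRobinPool:
    """Spreads requests across instances and skips any instance that fails."""

    def __init__(self, instances):
        self._instances = list(instances)
        self._order = cycle(self._instances)

    def dispatch(self, request: str) -> str:
        # Try each instance at most once per request.
        for _ in range(len(self._instances)):
            instance = next(self._order)
            try:
                return instance.handle(request)
            except RuntimeError:
                continue
        raise RuntimeError("no healthy instances available")


web_0 = RoleInstance("web_0")
web_1 = RoleInstance("web_1")
pool = RoundRobinPool([web_0, web_1])

first = pool.dispatch("request-1")    # served by web_0
web_0.healthy = False                 # simulate a failure in one instance
second = pool.dispatch("request-2")   # served by web_1
third = pool.dispatch("request-3")    # web_0 is skipped, web_1 serves again
```

With two instances, the loss of one reduces capacity but does not interrupt service, which is the property the team was after for the UI and WCF components.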
Benefits of Azure PaaS
Microsoft IT realized a number of benefits from porting BCWeb to Azure PaaS:
Increased performance and stability. The development team used built-in Azure components to increase performance and reliability. Multiple role instances in Azure enabled BCWeb to scale according to performance demands, while providing redundancy to protect against single points of failure within the application architecture.
Cost savings. Because BCWeb is hosted in an Azure datacenter and uses the distributed cloud architecture, Microsoft IT does not need to purchase or maintain hardware for it.
Code refactoring savings. Much of the original application code from the previous version of BCWeb was reused or slightly refactored for use in the Azure version.
Reusable components. The development team created several different components to enable BCWeb to function properly on Azure. Many of these components can now be reused with minimal refactoring in other Azure migrations.
Working with a distributed application brought many challenges to the migration process, but the flexibility of the cloud development platform allowed Microsoft IT to migrate the application while maintaining design goals.
Migrating BCWeb to Azure provided the application with the stability and flexibility that it lacked in its original architecture. Microsoft IT learned numerous best practices for distributed migration that will be used in future implementations.
Application 2: Connect
The Connect application, formerly called "Performance," is used by all Microsoft full-time employees for writing and submitting performance reviews and obtaining feedback from managers. Employees and managers can also use it any time of the year to arrange business-aligned conversations. While Connect can be used spontaneously, it is primarily used on a cyclical basis, two or four times per year. It stores records of these conversations, performance reviews, and feedback.
How things worked before
Originally, Microsoft IT ran an on-premises employee performance application that facilitated a prescribed, cumbersome process of assigning, documenting, and tracking employee commitments on an annual basis. This application required a large web cluster that sat idle much of the year.
Next, a cloud performance application called "Performance@Microsoft" was developed, rolled out, and used for about two years. It was a hybrid version, with parts living in the cloud. It used the same process of tracking employee commitments year by year.
Deciding to make the change
In 2013, Microsoft changed its employee performance review model. This change was part of a new "One Microsoft" philosophy for fostering teamwork and collaboration.
The change in the employee review process called for a new solution. It was time to move to version 2 and upgrade the hybrid performance application to an entirely cloud-based version. Azure provided an elastic computing model that would allow the application to increase capacity as needed to meet peak load while minimizing costs in the quiet periods.
Description of the ported application
The new performance application, Connect, was the first second-generation cloud LOB solution at Microsoft. It was designed specifically to work with the newly introduced performance-management process.
In the new solution, a "connect" is an opportunity for an employee to meet with their manager and have a brief, meaningful discussion about the past few months and the upcoming months. The data from this meeting is captured in a simple form that is configurable for different roles in the company. While connects can take place at any time, the performance review model still has a basic cyclical nature that demands elasticity in the accompanying performance tool.
Previously, with the Performance application, there was a yearly review and a mid-year review. With Connect, managers and employees can agree on any number of performance review discussions, which the tool supports.
The new application also includes a workflow that supports traditional performance reviews: the employee enters their accomplishments during the period, the manager reads the employee accomplishments, and then responds in the tool. Connect stores all of this information indefinitely.
Connect also allows for feedback from both direct reports and other coworkers, which is collected in the system. An employee can request feedback from other people whom they have worked with over a relevant period. The tool then sends a request to provide feedback. This feedback should be taken into account when the yearly cycle is finalized. (A yearly cycle for the performance reviews is still in place.)
Connect is now a second-generation Azure PaaS application. The first version had a hybrid architecture because it used both PaaS components (including Azure databases) and on-premises components (including SQL Server databases).
This application now maintains all of its data in the cloud. Premium service tiers in Azure SQL Database enable this scenario by guaranteeing a level of cores, worker threads, and disk I/O. Microsoft IT used performance tests and SQL Database views on resource consumption to gauge the need for Premium tiers and chose the P2 level, which was sufficient for the peak load scenarios. (Premium/P2 is the second highest of the Service Tier/Performance Level offerings for the Azure SQL Database service.)
Azure Blob Storage (ABS) is used to store rich-text content and attachments. Traditional Azure storage (such as ABS) is less expensive than Azure SQL Database storage, so keeping this content in ABS and mostly metadata in Azure SQL Database lowers overall storage costs. Moving all data to the cloud (including the role-based authorization data) made it possible to remove the on-premises components of Connect, so it was no longer necessary to rely on the Azure Service Bus or on a VPN.
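The split between bulk content and metadata can be illustrated with a small sketch. It is not Connect's code: a dictionary stands in for blob storage, an in-memory SQLite database stands in for Azure SQL Database, and the table, column, and function names are all hypothetical.

```python
import sqlite3
import uuid

# Hypothetical stand-ins: a dict plays the role of blob storage and an
# in-memory SQLite database plays the role of the relational database.
blob_store: dict[str, bytes] = {}
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE attachment ("
    "id INTEGER PRIMARY KEY, owner TEXT, file_name TEXT, "
    "size_bytes INTEGER, blob_key TEXT)"
)


def save_attachment(owner: str, file_name: str, content: bytes) -> int:
    """Store the bytes in the blob store and only a small metadata row in SQL."""
    blob_key = uuid.uuid4().hex
    blob_store[blob_key] = content
    cursor = db.execute(
        "INSERT INTO attachment (owner, file_name, size_bytes, blob_key) "
        "VALUES (?, ?, ?, ?)",
        (owner, file_name, len(content), blob_key),
    )
    db.commit()
    return cursor.lastrowid


def load_attachment(attachment_id: int) -> bytes:
    """Look up the blob key in SQL, then fetch the content from the blob store."""
    row = db.execute(
        "SELECT blob_key FROM attachment WHERE id = ?", (attachment_id,)
    ).fetchone()
    if row is None:
        raise KeyError(attachment_id)
    return blob_store[row[0]]


attachment_id = save_attachment("employee-1", "notes.rtf", b"{\\rtf1 sample}")
```

The relational row stays a few dozen bytes regardless of how large the attachment is, which is why the more expensive database storage grows slowly under this pattern.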
Connect now takes advantage of Foundation services, shown at top right in Figure 2. Connect and most of the other Human Resources LOB applications were originally designed to take advantage of several shared Foundation services. This allows features such as configuration, logging, and email notifications to be available in the cloud.
The Foundation services and database were a great fit for the PaaS model, and porting them to the cloud was straightforward. It took about a day to get them running on the cloud development platform. Today, they are shared across many cloud properties in Microsoft IT, including the Paystub application.
The Foundation configuration service hosts all of the configuration data for Connect. All environment-specific data is stored here, including connection strings, web service URLs, and configurable content. This configuration service is important in the cloud for two reasons. First, it lets Microsoft IT build one cloud package that can be deployed across all environments (this is not possible if environment-specific data is embedded in the cloud package, such as in a web.config file). Second, it enables application configuration to be updated without requiring redeployment of the service.
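Both reasons can be seen in a short sketch of an externalized configuration lookup. This is an illustration only, not the Foundation service's interface: the `ConfigClient` class, the environment names, the keys, and the URLs are all hypothetical.

```python
# Hypothetical configuration store, keyed by environment name. In the
# scenario described above this data would live in a shared service, not in
# the deployed package; every key and value here is illustrative only.
CONFIG_STORE = {
    "test": {
        "review_service_url": "https://test.example.invalid/api",
        "page_size": "25",
    },
    "production": {
        "review_service_url": "https://prod.example.invalid/api",
        "page_size": "50",
    },
}


class ConfigClient:
    """Resolves settings at run time for whichever environment hosts the package."""

    def __init__(self, environment: str, store: dict):
        if environment not in store:
            raise KeyError(f"unknown environment: {environment}")
        self._environment = environment
        self._store = store

    def get(self, key: str) -> str:
        # Read through to the store on every call, so an updated value is
        # picked up without redeploying the application.
        return self._store[self._environment][key]


# The same code (the same "package") runs in both environments.
test_client = ConfigClient("test", CONFIG_STORE)
prod_client = ConfigClient("production", CONFIG_STORE)

before = prod_client.get("page_size")
CONFIG_STORE["production"]["page_size"] = "100"   # configuration-only change
after = prod_client.get("page_size")
```

Only the environment name differs between deployments, and the change to `page_size` takes effect on the next read with no new package.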
Figure 2 shows the relationships among the components of the Connect solution.
Figure 2. Architecture of the Connect application
Benefits of Azure PaaS
Microsoft IT realized the following benefits from creating the second-generation Connect application in Azure PaaS:
Agility. This application was built from scratch in four months with six developers. That speed is unheard of for an application that supports 120,000 people. One reason for the speed is that this is a second-generation application, so the developers were well versed in how Azure works. They were able to easily connect blob storage to worker roles, and use web roles with cache services for the web sites.
Elasticity. Because it is used on a cyclical basis, this application is idle for much of the year. It performs well at peak usage and conserves resources during idle times.
Scalability. The biggest benefit from moving to the cloud has been an increased ability to scale out. This application has reach and scale, as it is used by all of the more than 120,000 Microsoft employees. Now that it runs 100 percent on Azure, it can scale up and down nearly invisibly.
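A rough calculation shows why elasticity matters for a cyclical application like this one. The sketch below is illustrative only: the monthly user counts, the per-instance capacity, and the two-instance redundancy floor are invented numbers, not measurements from Connect.

```python
import math


def instances_needed(concurrent_users: int, users_per_instance: int, minimum: int = 2) -> int:
    """Instances required for a load, never dropping below a redundancy floor."""
    return max(minimum, math.ceil(concurrent_users / users_per_instance))


# Hypothetical monthly peak loads for a cyclical application (illustrative only).
monthly_peak_users = [500, 500, 20000, 500, 500, 500, 500, 500, 20000, 500, 500, 500]
USERS_PER_INSTANCE = 1000

elastic = [instances_needed(u, USERS_PER_INSTANCE) for u in monthly_peak_users]
fixed = [max(elastic)] * len(monthly_peak_users)   # sized for the peak all year

elastic_instance_months = sum(elastic)
fixed_instance_months = sum(fixed)
```

Under these assumed numbers, scaling with demand uses 60 instance-months in a year, compared with 240 for a deployment sized permanently for the peak.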
Application 3: Paystub and Employee Data Management
The Microsoft Paystub application is a typical corporate paystub application that shows employees current and year-to-date payroll information such as gross pay, net pay, and itemized deductions. Paystub is one of a group of employee-based online activities that also includes applications for managing benefits, talent and performance, and employee learning.
How things worked before
The original Paystub tool was suboptimal in several ways:
Redundant data. When an employee used the application, data was copied from the source and maintained in a local copy. This data was synchronized by using batch jobs. This model resulted in several problems, including:
Delay. There was a delay of 24 to 48 hours to drive changes into the master dataset.
Storage cost. This redundant dataset incurred a significant storage cost.
Reduced security. Copies of sensitive data (such as employee social security numbers) were pulled from the master dataset and managed locally. This had security implications.
Error message latency. If errors occurred, they were not visible for this same period of time (24 to 48 hours), so any issues with updates wouldn’t be known until the next day at the earliest.
Business logic location. The business logic lived in the front-end, so code changes were required whenever business rules were reconfigured. For example, the simple addition of a region/country code required an update to the source code, which caused disruption to the tool’s users, and put the application at risk if the developer inadvertently made mistakes in the code.
On-premises hardware. The hardware required to maintain the Paystub application included two separate environments, for a total of four IIS virtual machines (two processors, 16 GB each) and two SQL virtual machines (four processors, 16 GB each).
Deciding to make the change: moving to EDM
Paystub differs from the other applications in this article because it was re-architected not singly but within a broader framework of applications. This broader framework is known as Employee Data Management (EDM), an effort meant to streamline processes, systems, and operational support for global "hire to retire" transactions across Microsoft. The main goals of EDM are to provide a single, consistent experience for all employee activities and to do this by building in Azure PaaS.
Within the EDM ecosystem of applications are the Time and Absence Reporting tool, Direct Deposit, W-4, HeadTrax, and MS-Vacation. EDM provides users with a central hub experience as well as mobile access for particular functions. EDM also enables self-service usage for employees, managers, and Human Resources staff.
The EDM model is based on these principles:
Use only one source of data. This means that data is not to be overridden downstream.
Provide real-time views (such as in dashboards) and updates.
Automate transactions as much as possible.
Provide a single destination for managers and employees.
Behind the Paystub application is a process for the approval of changes to business processes; this approval process is region/country-specific with regard to vacation accrual and other details. Using the EDM model, this approval process (including escalations) is configurable by region/country.
Microsoft expected to realize the following advantages for all applications re-architected in the EDM model:
Elimination of inefficiency in "hire to retire" transactions. Increased IT agility, more real-time data, improved compliance data flow, and expanded automation of business rules.
An improved employee experience and a better-connected workforce. The employee profile integrates LinkedIn, user-entered, and corporate data to improve personal data management, job searching, mentor connections, talent identification, and other scenarios.
Manager accountability and capability. Managers (or their delegates) can initiate and monitor organizational activities by using a dashboard.
Description of the ported application
In the new model, data redundancy was eliminated because direct calls from Azure to SAP were enabled through the Azure Service Bus. This eliminated the need to host local datasets and run daily batch jobs, which in turn let all data and error handling in SAP be reflected in the cloud in real time rather than with the 24-to-48-hour latency of the previous model.
This change also resolved the security implications of hosting and updating multiple copies of sensitive information. Now, when an employee accesses their paystub, the call is made directly to SAP and gives the employee real-time visibility into paystub details. This connection is severed the moment the employee closes their browser window.
Business rules were all moved to SAP. With this move, simple changes (such as adding a new region/country) require only configuration changes, without the need to modify application code. This means that region/country and company code configurations can be made in real time, which enables rapid onboarding.
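The difference between rules in code and rules in configuration can be sketched briefly. This is not how the rules are represented in SAP: the rule table, region codes, field names, and values below are hypothetical and exist only to show that onboarding a region becomes a data change.

```python
# Hypothetical rule table. Region codes, field names, and values are
# invented for illustration and do not reflect actual Microsoft rules.
REGION_RULES = {
    "US": {"approval_chain": ["manager"], "escalation_days": 5},
    "DE": {"approval_chain": ["manager", "hr_partner"], "escalation_days": 3},
}


def approval_chain(region: str, rules: dict) -> list:
    """Return the approvers for a region using data only, with no per-region code."""
    try:
        return list(rules[region]["approval_chain"])
    except KeyError:
        raise ValueError(f"region {region!r} is not configured") from None


def add_region(rules: dict, region: str, chain: list, escalation_days: int) -> None:
    """Onboarding a region is a configuration change, not a code change."""
    rules[region] = {"approval_chain": list(chain), "escalation_days": escalation_days}


add_region(REGION_RULES, "JP", ["manager", "payroll"], 4)
```

The `approval_chain` function never changes when a region is added, so there is no redeployment and less opportunity to introduce code errors.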
Finally, the on-premises hardware is no longer needed because an Azure website is used not only for Paystub, but also for other tools in the EDM model, such as HeadTrax.
Figure 3 shows the relationships among the components of the re-architected Paystub solution.
Figure 3. Architecture of the Paystub application
As part of EDM, a new services layer was implemented in Azure that serves more than just the Paystub application. For example, the internal Human Resources site HRWeb, which shows employee benefits and other information, uses the same services layer that Paystub uses. Related components of this services layer that support reports, transactions, configurable rules, and approvals are implemented in Azure.
Benefits of Azure PaaS
The primary benefit of Azure PaaS is that applications in the EDM model can be built quickly by using common services and data. Paystub is just one application within a cohesive ecosystem of applications centered on data and services.
In addition to reduced development time, the following benefits were realized when the Paystub application was ported:
The application now uses real-time data.
Data synchronization issues are limited.
Microsoft realizes data-storage cost savings.
Infrastructure and associated capital expenditures have been reduced.
Benefits from decoupling of business logic:
System changes (such as configuration updates) now take effect in real time, whereas previously they required development effort.
The cost and time of adding features is greatly reduced.
There is reduced potential for code errors when changing business logic.
Mobile applications can now be supported.
Benefits that extend beyond Paystub
Because of this platform-level change, adding functionality is simplified. Specifically, other HR-related features will be able to take advantage of this configurable software model. In many cases, as with Paystub, it is as simple as setting up an Azure website. This website will be able to seamlessly interact with the EDM services layer, where the business rules are configurable. And with the business rules exposed, and by leveraging Azure AD phone authentication, mobile devices can also be enabled.
One such example of new functionality that the HR IT team is finalizing is the end-to-end hiring process within Microsoft, enabling reduction of steps and manual processes by collecting data upfront, automating approvals and notifications, and validating data with business rules.
Application 4: Returns Service
The Microsoft online and retail stores had supported a product-return process that required manual data entry and multiple systems. This process introduced errors at various stages and created dissatisfaction among customers.
To replace the manual system, a new service was designed that automates customer self-serve requests for return shipping labels and integrates with microsoftstore.com.
The new Returns Service system has a responsive, user-friendly flow that has increased satisfaction for customers while reducing support time and minimizing errors. It is elastic because it is built on Azure PaaS, using Azure services that can scale on demand.
How things worked before
The manual process for returning merchandise had often been frustrating and time consuming. Customers had to work with a customer service agent, who manually entered the relevant data, after which the customer needed to wait 24 hours to receive the shipping label.
This was painful for customers because it was manual, slow, and sometimes introduced errors such as assigning refund credit incorrectly. This system was also a problem for Microsoft in that it had invisible failure points such as errors in inventory numbers. This became costly for Microsoft because resources were required to address these issues.
Deciding to make the change
Microsoft Customer Service realized that this was a business problem that lowered customer satisfaction. Microsoft IT was tasked with developing an automated system to provide a user-friendly flow to increase customer satisfaction. While complying with existing business processes, this new service had to integrate immediately with the Microsoft online store and potentially with other tenants in the future.
Description of the ported application
The Returns Service application was created and deployed in the second half of 2014. This first version fulfilled most of the design requirements. Others, such as integration with the inventory system, were not included but could appear in a future version.
Figure 4 shows the relationships among the components of the re-architected solution.
Figure 4. Architecture of the Returns Service application
The new Returns Service has little in common with the process that came before. In the new service, a consumer submits an online request for a return shipping label through a tenant such as microsoftstore.com. This HTTP POST request must specify several things, including the Ship From address and the Locale.
A custom Web API (an Azure Cloud Service web role) accepts the request, authenticates it, and stores its payload data, in JSON format, in ABS. It also places a message for this request in an Azure Service Bus component that acts as a subscribe-and-publish queue.
The Return Label Worker Processor (an Azure Cloud Service worker role) queries configuration data that has been stored in Azure Table Storage to determine the name and address of the correct returns operations center (ROC). The Return Label Worker Processor calls the web service of the appropriate carrier (such as UPS) to request the creation of a return shipping label. The web service of the carrier generates the label and returns it to the Return Label Worker Processor. On the label, the Ship To address is that of the ROC.
The Return Label Worker Processor stores the generated label as a PDF in ABS. Within seconds of the first request, the tenant submits a call to retrieve the generated PDF from ABS. Finally, the customer receives the PDF by email from the tenant. This label enables the customer to return the merchandise to the correct ROC without charge.
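The request flow described above can be sketched end to end with in-memory stand-ins. This is a simplified illustration, not the Returns Service code: the dictionary, queue, and table stand in for ABS, the Service Bus, and Azure Table Storage, and every function name, key, address, and the carrier stub is hypothetical. Authentication and the email step are omitted.

```python
import json
import queue

# In-memory stand-ins for the Azure services named above. All names, keys,
# addresses, and the carrier stub are hypothetical.
blob_store = {}                      # stands in for ABS
request_queue = queue.Queue()        # stands in for the Service Bus queue
roc_table = {                        # stands in for Azure Table Storage
    "en-US": {"name": "Example ROC", "address": "1 Example Way"},
}


def accept_request(request_id: str, payload: dict) -> None:
    """Web API role: store the JSON payload, then enqueue a message for it."""
    blob_store[f"requests/{request_id}.json"] = json.dumps(payload)
    request_queue.put(request_id)


def fake_carrier_label(ship_from: str, ship_to: str) -> bytes:
    """Stub for the carrier web service; a real call would return a label."""
    return f"LABEL from={ship_from} to={ship_to}".encode()


def process_next_request() -> str:
    """Worker role: look up the ROC, request a label, and store it."""
    request_id = request_queue.get_nowait()
    payload = json.loads(blob_store[f"requests/{request_id}.json"])
    roc = roc_table[payload["locale"]]
    label = fake_carrier_label(payload["ship_from"], roc["address"])
    blob_store[f"labels/{request_id}.pdf"] = label
    return request_id


def fetch_label(request_id: str) -> bytes:
    """Tenant: retrieve the generated label so it can be emailed to the customer."""
    return blob_store[f"labels/{request_id}.pdf"]


accept_request("r-1", {"ship_from": "42 Customer St", "locale": "en-US"})
processed = process_next_request()
label = fetch_label(processed)
```

Because the web role only stores and enqueues, it can acknowledge requests quickly, while worker instances drain the queue at their own pace and can be scaled independently.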
The Returns Service also stores log information for all its transactions and uses Microsoft StreamInsight to analyze this log data and format it for display in a dashboard. This lets the Microsoft DevOps team monitor the health of the service in real time.
To create the Returns Service web application, the Microsoft IT development team used several off-the-shelf Azure components, primarily a Cloud Service web role, a Cloud Service worker role, the Azure Service Bus, ABS, and Azure Table Storage.
The Returns Service was completed and deployed relatively quickly. Because the development team could use existing Azure services from the start instead of building them from scratch, it could concentrate on coding the API, integration, and testing rather than reinventing services for data storage, queueing, and authentication.
Benefits of Azure PaaS
Microsoft IT realized a number of benefits from creating the new automated Returns Service in Azure PaaS:
Elasticity and scalability. Azure gives the Returns Service the ability to scale up and scale down to exactly match the level of demand from tenants and customers.
Global coverage. The network of Microsoft-managed datacenters gives Azure global reach. The Returns Service uses this feature to serve customers globally.
High availability. Most Azure services carry a guarantee of 99.9 percent availability, which is reflected in the availability of the Returns Service.
Load balancing. Azure Traffic Manager improves the availability and network performance of applications by distributing incoming traffic among service instances in cloud services or on virtual machines.
Agility. The service was developed and deployed both rapidly—in just a few months—and cost effectively.
Cost savings. The old process required a team of customer service agents to process label requests. The new, automated process reduced this need and began saving money right after deployment. The agility of Azure let the cost savings appear quickly.
Low maintenance. Because the elements of Azure are available in the cloud and provided by Microsoft, there is no need to be concerned with system updates; they are provided automatically.
User satisfaction. Microsoft measured significantly increased user satisfaction for the new returns process in the months that followed its deployment.
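To put the 99.9 percent availability figure from the list above in concrete terms, the following sketch converts an availability percentage into the downtime it permits per year. The function name is hypothetical, and the calculation assumes a 365-day year; it says nothing about how any specific Azure guarantee is measured.

```python
from decimal import Decimal

HOURS_PER_YEAR = Decimal(365 * 24)  # assumes a 365-day year


def allowed_downtime_hours(availability_percent: str) -> Decimal:
    """Hours per year that a service may be unavailable at a given availability."""
    unavailable_fraction = (Decimal(100) - Decimal(availability_percent)) / Decimal(100)
    return unavailable_fraction * HOURS_PER_YEAR


downtime = allowed_downtime_hours("99.9")
```

At 99.9 percent, the permitted downtime works out to 8.76 hours per year.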
The Microsoft online and retail stores were supporting a product return process that required manual data entry and multiple systems. Microsoft teams that were involved with product returns defined and developed an automated solution that uses Azure PaaS. Azure gave the new customer-facing business process high availability, global coverage, and elasticity. The system has reduced the number of manual hours required to support the returns process, reduced errors, and improved the customer experience.
Moving an application to Azure PaaS can pay off in any of several ways. For each application described in this article, Microsoft IT realized some of the following advantages of cloud computing:
Agility. An application that is used for a frequently changing business process requires agility. An example would be a marketing application that must adjust as new marketing campaigns are introduced.
Elasticity. Examples of applications that require elasticity are those that a business uses with more intensity on a cyclical basis, such as many revenue applications at the end of a month, quarter, or fiscal year.
Scalability. This means that the application can easily scale to accommodate large numbers of users. An example would be an application that employees at a large organization would use to enroll in or change healthcare benefits during a limited open enrollment period.
Cost savings. By moving applications to the cloud, organizations can realize cost savings in a number of ways, both during development and after application rollout.
Enabling modern applications. "Modern" often refers to an application that can be accessed by a range of devices inside and outside existing networks; enabling this access is made easier by moving the application to the cloud. (Azure contains a range of mobile services that make it easier to move an application to a phone or a tablet.)
User satisfaction. Microsoft has measured increased user satisfaction for several applications after they were moved to the cloud.
New ways of solving problems. Azure provides new ways to address problems; in particular, a higher level of abstraction from the infrastructure, which provides an easier way to build applications.
New platforms that enable a cohesive ecosystem centered on data or services. An example of this is Paystub, one of several applications in the Employee Data Management framework.
For more information
For more information about Microsoft products or services, call the Microsoft Sales Information Center at (800) 426-9400. In Canada, call the Microsoft Canada Order Centre at (800) 933-4750. Outside the 50 United States and Canada, please contact your local Microsoft subsidiary. To access information via the World Wide Web, go to:
© 2015 Microsoft Corporation. All rights reserved. Microsoft and Windows are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.