Chapter 18 – Stress Testing Web Applications

04/27/2010

Performance Testing Guidance for Web Applications

J.D. Meier, Carlos Farre, Prashant Bansode, Scott Barber, and Dennis Rea
Microsoft Corporation

September 2007

Objectives

Understand the key concepts of stress testing.
Learn how to stress-test a Web application.

Overview

Stress testing is a type of performance testing focused on determining an application’s robustness, availability, and reliability under extreme conditions. The goal of stress testing is to identify application issues that arise or become apparent only under extreme conditions. These conditions can include heavy loads, high concurrency, or limited computational resources. Proper stress testing is useful in finding synchronization and timing bugs, interlock problems, priority problems, and resource loss bugs. The idea is to stress a system to the breaking point in order to find bugs that will make that break potentially harmful. The system is not expected to process the overload without adequate resources, but to behave (e.g., fail) in an acceptable manner (e.g., not corrupting or losing data).

Stress tests typically involve simulating one or more key production scenarios under a variety of stressful conditions. For example, you might deploy your application on a server that is already running a processor-intensive application; in this way, your application is immediately “starved” of processor resources and must compete with the other application for processor cycles. You can also stress-test a single Web page or even a single item such as a stored procedure or class.

This chapter presents a high-level introduction to stress-testing a Web application. Stress testing can help you identify application issues that surface only under extreme conditions.

Examples of Stress Conditions

Examples of stress conditions include:

Excessive volume in terms of either users or data; examples might include a denial of service (DoS) attack or a situation where a widely viewed news item prompts a large number of users to visit a Web site during a three-minute period.
Resource reduction such as a disk drive failure.
Unexpected sequencing.
Unexpected outages/outage recovery.

Examples of stress-related symptoms include:

Data is lost or corrupted.
Resource utilization remains unacceptably high after the stress is removed.
Application components fail to respond.
Unhandled exceptions are presented to the end user.
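
One of the symptoms above, utilization that stays high after the stress is removed, can be checked mechanically once utilization samples have been captured. The following Python sketch is illustrative only: the function name, the sample format, the recovery window, and the acceptable level are all assumptions, not values from this guidance.

```python
from typing import List, Tuple


def recovered_after_stress(
    samples: List[Tuple[float, float]],  # (seconds since test start, utilization percent)
    stress_end_s: float,
    recovery_window_s: float,
    acceptable_pct: float,
) -> bool:
    """Return True if every sample taken once the recovery window has elapsed
    is at or below acceptable_pct. Returns False when no such samples exist,
    because recovery cannot be confirmed without data."""
    deadline = stress_end_s + recovery_window_s
    late = [pct for t, pct in samples if t >= deadline]
    return bool(late) and all(pct <= acceptable_pct for pct in late)


# Hypothetical processor utilization samples; stress is removed at 120 seconds.
cpu_samples = [(0, 15.0), (60, 97.0), (120, 98.0), (180, 60.0), (240, 18.0), (300, 16.0)]
recovered = recovered_after_stress(
    cpu_samples, stress_end_s=120, recovery_window_s=120, acceptable_pct=25.0
)
```

A result of False would suggest a resource leak or a stuck component worth investigating, although the appropriate window and threshold depend on the application.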

How to Use This Chapter

Use this chapter to understand the key concepts of stress testing and the steps involved in stress-testing a Web application. To get the most from this chapter:

Use the “Input” and “Output” sections to understand the key inputs for stress-testing a Web application and the key outcomes of this type of testing.
Use the “Approach for Stress Testing” section to get an overview of the approach for stress-testing a Web application, and as a quick reference guide for you and your team.
Use the various steps sections to understand the details of each step involved in stress-testing a Web application.
Use the “Usage Scenarios for Stress Testing” section to understand various real-world scenarios where stress testing is employed.

Input

To perform stress testing, you are likely to use as reference one or more of the following items:

Results from previous stress tests
Application usage characteristics (scenarios)
Concerns about those scenarios under extreme conditions
Workload profile characteristics
Current peak load capacity (obtained from load testing)
Hardware and network architecture and data
Disaster-risk assessment (e.g., likelihood of blackouts, earthquakes, etc.)

Output

Output from a stress test may include:

Measures of the application under stressful conditions
Symptoms of the application under stress
Information the team can use to address robustness, availability, and reliability

Approach for Stress Testing

The following steps are involved in stress-testing a Web application:

Step 1 - Identify test objectives. Identify the objectives of stress testing in terms of the desired outcomes of the testing activity.
Step 2 - Identify key scenario(s). Identify the application scenario or cases that need to be stress-tested to identify potential problems.
Step 3 - Identify the workload. Identify the workload that you want to apply to the scenarios identified during the “Identify key scenario(s)” step. This is based on the workload profile and peak load capacity inputs.
Step 4 - Identify metrics. Identify the metrics that you want to collect about the application’s performance. Base these metrics on the potential problems associated with the scenarios identified during the “Identify key scenario(s)” step.
Step 5 - Create test cases. Create the test cases in which you define steps for running a single test, as well as your expected results.
Step 6 - Simulate load. Use test tools to simulate the required load for each test case and capture the metric data results.
Step 7 - Analyze results. Analyze the metric data captured during the test.

These steps are graphically represented below; the following sections discuss each step in detail.

Figure 18.1 Stress Testing Steps

Step 1 - Identify Test Objectives

Asking yourself or others the following questions can help in identifying the desired outcomes of your stress testing:

Is the purpose of the test to identify the ways the system can possibly fail catastrophically in production?
Is it to provide information to the team in order to build defenses against catastrophic failures?
Is it to identify how the application behaves when system resources such as memory, disk space, network bandwidth, or processor cycles are depleted?
Is it to ensure that functionality does not break under stress? For example, there may be cases where operational performance metrics meet the objectives, but the functionality of the application is failing to meet them — orders are not inserted in the database, the application is not returning the complete product information in searches, form controls are not being populated properly, redirects to custom error pages are occurring during the stress testing, and so on.

Step 2 - Identify Key Scenario(s)

To get the most value out of a stress test, the test needs to focus on the behavior of the usage scenario or scenarios that matter most to the overall success of the application. To identify these scenarios, you generally start by defining a single scenario that you want to stress-test in order to identify a potential performance issue. Consider these guidelines when choosing appropriate scenarios:

Select scenarios based on how critical they are to overall application performance.
Try to test those operations that are most likely to affect performance. These might include operations that perform intensive locking and synchronization, long transactions, and disk-intensive input/output (I/O) operations.
Base your scenario selection on the specific areas of your application identified as potential bottlenecks by load-testing data. Although you should have fine-tuned and removed the bottlenecks after load testing, you should still stress-test the system in these areas to verify how well your changes handle extreme stress levels.

Examples of scenarios that may need to be stress tested separately from other usage scenarios for a typical e-commerce application include the following:

An order-processing scenario that updates the inventory for a particular product. This functionality has the potential to exhibit locking and synchronization problems.
A scenario that pages through search results based on user queries. If a user specifies a particularly wide query, there could be a large impact on memory utilization. For example, memory utilization could be affected if a query returns an entire data table.

Step 3 - Identify the Workload

The load you apply to a particular scenario should stress the system sufficiently beyond threshold limits to enable you to observe the consequences of the stress condition. One method to determine the load at which an application begins to exhibit signs of stress is to incrementally increase the load and observe the application behavior under various load conditions. The key is to systematically test with various workloads until you create a significant failure. These variations may be accomplished by adding more users, reducing delay times, adding or reducing the number and type of user activities represented, or adjusting test data.
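
The incremental approach described above can be sketched as a simple step ramp. This Python example is a minimal illustration under stated assumptions: `find_breaking_load` and `simulated_error_rate` are hypothetical names, the error-rate threshold is arbitrary, and the simulated function stands in for a real test run that would report the observed error rate at each load level.

```python
from typing import Callable, Optional


def find_breaking_load(
    measure_error_rate: Callable[[int], float],
    start_users: int,
    step_users: int,
    max_users: int,
    error_threshold: float,
) -> Optional[int]:
    """Increase the user count step by step and return the first count
    whose measured error rate exceeds error_threshold."""
    users = start_users
    while users <= max_users:
        if measure_error_rate(users) > error_threshold:
            return users
        users += step_users
    return None  # no failure observed within the tested range


def simulated_error_rate(users: int) -> float:
    """Hypothetical stand-in for a real test run: errors appear past 1,200 users."""
    capacity = 1200
    if users <= capacity:
        return 0.0
    return min(1.0, (users - capacity) / capacity)


breaking_load = find_breaking_load(
    simulated_error_rate,
    start_users=200,
    step_users=200,
    max_users=3000,
    error_threshold=0.10,
)
```

In practice each step would be a full test run, so the step size is a trade-off between test time and how precisely the breaking point is located.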

For example, a stress test could be designed to simulate every registered user of the application attempting to log on during one 30-second period. This would simulate a situation where the application suddenly became available again after a period of downtime and all users were anxiously refreshing their browsers, waiting for the application to come back online. Although this situation does not occur frequently in the real world, it does happen often enough for there to be real value in learning how the application will respond if it does.

Remember to represent the workload with accurate and realistic test data — type and volume, different user logins, product IDs, product categories, and so on — allowing you to simulate important failures such as deadlocks or resource consumption.

The following activities are generally useful in identifying appropriate workloads for stress testing:

Identify the distribution of work. For each key scenario, identify the distribution of work to be simulated. The distribution is based on the number and type of users executing the scenario during the stress test.
Estimate peak user loads. Identify the maximum expected number of users during peak load conditions for the application. Using the work distribution you identified for each scenario, calculate the percentage of user load per key scenario.
Identify the anti-profile. As an alternative, you can start by applying an anti-profile to the normal workload. In an anti-profile, the workload distributions are inverted for the scenario under consideration. For example, if the normal load for the order-processing scenario is 10 percent of the total workload, the anti-profile would be 90 percent of the total workload. The remaining load can be distributed among the other scenarios. Using an anti-profile can serve as a valuable starting point for your stress tests because it ensures that the critical scenarios are subjected to loads beyond the normal load conditions.
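
The anti-profile arithmetic can be worked through in a few lines of Python. This is a sketch, not a prescribed method: the scenario names and shares other than the 10 percent order-processing figure are invented for illustration, and spreading the remaining load across the other scenarios in proportion to their normal shares is an assumption (the text above only says the remainder can be distributed among them).

```python
from typing import Dict


def anti_profile(normal: Dict[str, float], target: str) -> Dict[str, float]:
    """Invert the target scenario's share of the workload and spread the
    remainder over the other scenarios in proportion to their normal shares."""
    if len(normal) < 2:
        raise ValueError("an anti-profile needs at least two scenarios")
    target_share = 1.0 - normal[target]   # e.g. 10 percent becomes 90 percent
    remaining = 1.0 - target_share        # load left for the other scenarios
    others_total = sum(share for name, share in normal.items() if name != target)
    result = {}
    for name, share in normal.items():
        if name == target:
            result[name] = target_share
        else:
            result[name] = remaining * share / others_total
    return result


# Hypothetical normal workload distribution (fractions of total load).
normal_profile = {"order_processing": 0.10, "browse": 0.60, "search": 0.30}
stress_profile = anti_profile(normal_profile, "order_processing")
```

With these inputs, order processing receives 90 percent of the load, and browse and search share the remaining 10 percent in a 2:1 ratio.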

Step 4 - Identify Metrics

When identified and captured correctly, metrics provide information about how well or poorly your application is performing as compared to your performance objectives. In addition, metrics can help you identify problem areas and bottlenecks within your application.

Using the desired performance characteristics identified during the “Identify objectives” step, identify metrics to be captured that focus on potential pitfalls for each scenario. The metrics can be related to both performance and throughput goals as well as providing information about potential problems; for example, custom performance counters that have been embedded in the application.

When identifying metrics, you will use either direct objectives or indicators that are directly or indirectly related to those objectives. The following table describes performance metrics in terms of related performance objectives.

Base set of metrics

Category	Performance metrics
Processor	Processor utilization
Process	Memory consumption, processor utilization, process recycles
Memory	Memory available, memory utilization
Disk	Disk utilization
Network	Network utilization
Transactions/business metrics	Transactions/sec, transactions succeeded, transactions failed, orders succeeded, orders failed
Threading	Contentions per second, deadlocks, thread allocation
Response times	Transaction times

Step 5 - Create Test Cases

Identifying workload profiles and key scenarios generally does not provide all of the information necessary to implement and execute test cases. Additional inputs for completely designing a stress test include performance objectives, workload characteristics, test data, test environments, and identified metrics. Each test design should mention the expected results and/or the key data of interest to be collected, in such a way that each test case can be marked as a “pass,” “fail,” or “inconclusive” after execution.

The following is an example of a test case based on the order-placement scenario.

Test 1 – Place Order Scenario

Workload: 1,000 simultaneous users.
Think time: Use a random think time between 1 and 10 seconds in the test script after each operation.
Test duration: Run the test for two days.

Expected results:

Application hosting process should not recycle because of deadlock or memory consumption.
Throughput should not fall below 35 requests per second.
Response time should not be greater than 7 seconds for 95 percent of total transactions completed.
“Server busy” errors caused by contention-related issues should not exceed 10 percent of the total responses.
Order transactions should not fail during test execution. Database entries should match the “Transactions succeeded” count.
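
Expected results like these can be encoded so that each run is marked “pass,” “fail,” or “inconclusive” automatically. The Python sketch below is one possible encoding, not part of the original guidance: the `StressRunResult` fields and the `evaluate_place_order` function are hypothetical names, and treating a run with no collected data as inconclusive is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StressRunResult:
    response_times_s: List[float]   # one entry per completed transaction
    min_throughput_rps: float       # lowest throughput observed during the run
    server_busy_errors: int
    total_responses: int
    process_recycles: int
    orders_failed: int
    transactions_succeeded: int
    database_order_rows: int        # rows actually found in the database


def evaluate_place_order(run: StressRunResult) -> str:
    """Apply the expected results of Test 1 to the data from a single run."""
    if not run.response_times_s or run.total_responses == 0:
        return "inconclusive"  # nothing was measured, so no verdict is possible
    within_7s = sum(1 for t in run.response_times_s if t <= 7.0)
    checks = [
        run.process_recycles == 0,
        run.min_throughput_rps >= 35.0,
        within_7s / len(run.response_times_s) >= 0.95,
        run.server_busy_errors / run.total_responses <= 0.10,
        run.orders_failed == 0,
        run.database_order_rows == run.transactions_succeeded,
    ]
    return "pass" if all(checks) else "fail"


# Hypothetical run data: 96 of 100 transactions complete within 7 seconds.
sample_run = StressRunResult(
    response_times_s=[2.0] * 96 + [9.0] * 4,
    min_throughput_rps=40.0,
    server_busy_errors=50,
    total_responses=1000,
    process_recycles=0,
    orders_failed=0,
    transactions_succeeded=100,
    database_order_rows=100,
)
verdict = evaluate_place_order(sample_run)
```

A fuller version would report which check failed rather than a single verdict, which helps when analyzing results.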

Step 6 - Simulate Load

After you have completed the previous steps to an appropriate degree, you should be ready to simulate the load by executing the stress test. Typically, test execution follows these steps:

Validate that the test environment matches the configuration that you were expecting and/or designed your test for.
Ensure that both the test and the test environment are correctly configured for metrics collection.
Before running the test, execute a quick “smoke test” to make sure that the test script and remote performance counters are working correctly.
Reset the system (unless your scenario is to do otherwise) and start a formal test execution.

Note: Make sure that the client (a.k.a. load generator) computers that you use to generate load are not overly stressed. Utilization of resources such as processor and memory should remain low enough to ensure that the load-generation environment is not itself a bottleneck.
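
Dedicated load-testing tools normally handle load simulation, but the underlying idea can be shown with a minimal Python sketch using only the standard library. Everything here is illustrative: `simulate_load`, `run_virtual_user`, and `fake_request` are hypothetical names, `fake_request` stands in for a real request to the application under test, and the think times are shortened to milliseconds so the example finishes quickly.

```python
import random
import threading
import time
from typing import Callable, List, Tuple


def run_virtual_user(
    send_request: Callable[[], bool],
    iterations: int,
    think_time_range_s: Tuple[float, float],
    results: List[Tuple[float, bool]],
    lock: threading.Lock,
) -> None:
    """Issue requests in a loop, recording (elapsed seconds, success) for each
    and pausing for a random think time after every operation."""
    for _ in range(iterations):
        started = time.perf_counter()
        ok = send_request()
        elapsed = time.perf_counter() - started
        with lock:
            results.append((elapsed, ok))
        time.sleep(random.uniform(*think_time_range_s))


def simulate_load(
    send_request: Callable[[], bool],
    users: int,
    iterations: int,
    think_time_range_s: Tuple[float, float],
) -> List[Tuple[float, bool]]:
    """Run the given number of concurrent virtual users, one thread each."""
    results: List[Tuple[float, bool]] = []
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=run_virtual_user,
            args=(send_request, iterations, think_time_range_s, results, lock),
        )
        for _ in range(users)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


def fake_request() -> bool:
    """Hypothetical stand-in for a real call to the application under test."""
    time.sleep(0.001)
    return True


results = simulate_load(
    fake_request, users=20, iterations=3, think_time_range_s=(0.001, 0.005)
)
```

A thread per virtual user is adequate for a sketch, but at the scale of the earlier test case (1,000 simultaneous users) the load generator itself could become the bottleneck that the note above warns about.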

Step 7 - Analyze Results

Analyze the captured data and compare the results against the metric’s accepted level. If the results indicate that your required performance levels have not been attained, analyze and fix the cause of the bottleneck. To address observed issues, you might need to do one or more of the following:

Perform a design review.
Perform a code review.
Run stress tests in environments where you can debug the likely causes of failures during test execution.

In situations where performance issues are observed, but only under conditions that are deemed unlikely enough not to warrant tuning at the current time, you may want to consider conducting additional tests to identify an early indicator for the issue in order to avoid unwanted surprises.

Usage Scenarios for Stress Testing

The following are examples of how stress testing is applied in practice:

Application stress testing. This type of test typically focuses on more than one transaction on the system under stress, without the isolation of components. With application stress testing, you are likely to uncover defects related to data locking and blocking, network congestion, and performance bottlenecks on different components or methods across the entire application. Because the test scope is a single application, it is common to use this type of stress testing after a robust application load-testing effort, or as a last test phase for capacity planning. It is also common to find defects related to race conditions and general memory leaks from shared code or components.
Transactional stress testing. Transactional stress tests aim at working at a transactional level with load volumes that go beyond those of the anticipated production operations. These tests are focused on validating behavior under stressful conditions, such as high load with the same resource constraints that apply when testing the entire application. Because the test isolates an individual transaction, or group of transactions, it allows for a very specific understanding of throughput capacities and other characteristics for individual components without the added complication of inter-component interactions that occurs in testing at the application level. These tests are useful for tuning, optimizing, and finding error conditions at the specific component level.
Systemic stress testing. In this type of test, stress or extreme load conditions are generated across multiple applications running on the same system, thereby pushing the boundaries of the applications’ expected capabilities to an extreme. The goal of systemic stress testing is to uncover defects in situations where different applications block one another and compete for system resources such as memory, processor cycles, disk space, and network bandwidth. This type of testing is also known as integration stress testing or consolidation stress testing. In large-scale systemic stress tests, you stress all of the applications together in the same consolidated environment. Some organizations choose to perform this type of testing in a larger test lab facility, sometimes with the hardware or software vendor’s assistance.

Exploratory Stress Testing

Exploratory stress testing is an approach to subjecting a system, application, or component to a set of unusual parameters or conditions that are unlikely to occur in the real world but are nevertheless possible. In general, exploratory testing can be viewed as an interactive process of simultaneous learning, test design, and test execution. Most often, exploratory stress tests are designed by modifying existing tests and/or working with application/system administrators to create unlikely but possible conditions in the system. This type of stress testing is seldom conducted in isolation because its usual purpose is to determine whether a particular failure mode calls for more systematic stress testing. The following are some examples of exploratory stress tests to determine the answer to “How will the system respond if…?”

All of the users logged on at the same time.
The load balancer suddenly failed.
All of the servers started their scheduled virus scan at the same time during a period of peak load.
The database went offline during peak usage.

Summary

Stress testing allows you to identify potential application issues that surface only under extreme conditions. Such conditions range from exhaustion of system resources such as memory, processor cycles, network bandwidth, and disk capacity to excessive load due to unpredictable usage patterns, common in Web applications.

Stress testing centers around objectives and key user scenarios with an emphasis on the robustness, reliability, and stability of the application. The effectiveness of stress testing relies on applying the correct methodology and being able to effectively analyze testing results. Applying the correct methodology is dependent on the capacity for reproducing workload conditions for both user load and volume of data, reproducing key scenarios, and interpreting the key performance metrics.