How To: Scale .NET Applications


Retired Content

This content is outdated and is no longer being maintained. It is provided as a courtesy for individuals who are still using these technologies. This page may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist.

patterns & practices Developer Center

Improving .NET Application Performance and Scalability

J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman
Microsoft Corporation

May 2004

Home Page for Improving .NET Application Performance and Scalability

Chapter 3, Design Guidelines for Application Performance

Summary: There are two main approaches to scaling an application: scaling up and scaling out. This How To helps you to determine which approach is suitable for your application, and gives you guidelines on how to implement your chosen approach.


Applies To
Scale Up vs. Scale Out
Load Balancing
Federated Database Servers
.NET Framework Technologies Scalability Considerations
Process for Scaling .NET Applications
Step 1: Gather New Requirements and Performance Objectives
Step 2: Assess the Current System
Step 3: Choose a Scaling Technique
Step 4: Apply and Validate
Additional Resources

Applies To

  • Microsoft® .NET Framework version 1.1


Scalability refers to the ability of an application to continue to meet its performance objectives with increased load. Typical performance objectives include application response time and throughput. When measuring performance, it is important to consider the cost at which performance objectives are achieved. For example, achieving a sub-second response time objective with prolonged 100% CPU utilization would generally not be an acceptable solution.

This How To is intended to help you make informed design choices and tradeoffs that in turn will help you to scale your application. An exhaustive treatment of hardware choices and features is outside the scope of this document.

After completing this How To, you will be able to:

  • Determine when to scale up versus when to scale out.
  • Quickly identify resource limitations and performance bottlenecks.
  • Identify common scaling techniques.
  • Identify scaling techniques specific to .NET technologies.
  • Adopt a step-by-step process to scale .NET applications.

Scale Up vs. Scale Out

There are two main approaches to scaling:

  • Scaling up. With this approach, you upgrade your existing hardware. You might replace existing hardware components, such as a CPU, with faster ones, or you might add new hardware components, such as additional memory. The key hardware components that affect performance and scalability are CPU, memory, disk, and network adapters. An upgrade could also entail replacing existing servers with new servers.
  • Scaling out. With this approach, you add more servers to your system to spread application processing load across multiple computers. Doing so increases the overall processing capacity of the system.

Pros and Cons

Scaling up is a simple option and one that can be cost effective. It does not introduce additional maintenance and support costs. However, any single points of failure remain, which is a risk. Beyond a certain threshold, adding more hardware to the existing servers may not produce the desired results. For an application to scale up effectively, the underlying framework, runtime, and computer architecture must also scale up.

Scaling out enables you to add more servers in anticipation of further growth, and provides the flexibility to take a server participating in the Web farm offline for upgrades with relatively little impact on the cluster. In general, the ability of an application to scale out depends more on its architecture than on the underlying infrastructure.

When to Scale Up vs. Scale Out

Should you upgrade existing hardware or consider adding additional servers? To help you determine the correct approach, consider the following:

  • Scaling up is best suited to improving the performance of tasks that are capable of parallel execution. Scaling out works best for handling an increase in workload or demand.
  • For server applications to handle increases in demand, it is best to scale out, provided that the application design and infrastructure support it.
  • If your application contains tasks that can be performed simultaneously and independently of one another, and the application runs on a single-processor server, execute the tasks asynchronously. Asynchronous processing is more beneficial for I/O-bound tasks and less beneficial when the tasks are CPU bound and restricted to a single processor. CPU-bound multithreaded tasks on a single CPU perform relatively slowly because of the overhead of thread switching. In this case, you can improve performance by adding a CPU to enable true parallel execution of the tasks.
  • Limitations imposed by the operating system and server hardware mean that you face a diminishing return on investment when scaling up. For example, operating systems have a limitation on the number of CPUs they support, servers have memory limits, and adding more memory has less effect when you pass a certain level (for example, 4 GB).
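
The diminishing return from adding CPUs can be illustrated with Amdahl's law. The original text does not name this law; it is used here only as a common rule of thumb. The sketch below assumes a hypothetical task in which a fixed fraction of the work can run in parallel, and the function name and the 80% figure are illustrative assumptions.

```python
def amdahl_speedup(parallel_fraction: float, cpus: int) -> float:
    """Theoretical speedup when only part of a task can run in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


if __name__ == "__main__":
    # Hypothetical workload: 80% of the work can be parallelized.
    for cpus in (1, 2, 4, 8, 16):
        print(f"{cpus:2d} CPUs -> {amdahl_speedup(0.8, cpus):.2f}x")
```

Under this assumption, going from 1 to 4 CPUs gives a 2.5x speedup, but the speedup can never reach 5x however many CPUs are added, because the serial 20% of the work remains.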

Load Balancing

There are many approaches to load balancing. This section contains a discussion of the most commonly used techniques.

Web Farm

In a Web farm, multiple servers are load balanced to provide high availability of service. This feature is currently only available in Windows® 2000 Advanced Server and Datacenter Server. Figure 1 illustrates this approach.

Figure 1: Load balancing in a Web farm

You can achieve load balancing by using hardware or software. Hardware solutions work by exposing a single IP address; the load balancer then translates each client request to the physical IP address of one of the servers in the farm.

Network Load Balancing (NLB)

Network Load Balancing (NLB) is a software solution for load balancing. NLB is available with Windows 2000 Advanced Server and Datacenter Server. NLB dispatches the client requests (sprays the connection) across multiple servers within the cluster. As the traffic increases, you can add servers to the cluster, up to a maximum of 32 servers.
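
To illustrate the idea of spreading client requests across a cluster, the sketch below maps each client address to one of the hosts by hashing. This is not NLB's actual algorithm; the `Cluster` class, the host names, and the use of CRC32 are assumptions made for illustration. Only the 32-server limit comes from the text above.

```python
import zlib

MAX_HOSTS = 32  # cluster size limit stated in the text above


class Cluster:
    """Hypothetical dispatcher that spreads clients across cluster hosts."""

    def __init__(self, hosts):
        self.hosts = []
        for host in hosts:
            self.add_host(host)

    def add_host(self, host: str) -> None:
        if len(self.hosts) >= MAX_HOSTS:
            raise ValueError(f"a cluster supports at most {MAX_HOSTS} hosts")
        self.hosts.append(host)

    def pick_host(self, client_ip: str) -> str:
        # Deterministic hash: the same client maps to the same host
        # for as long as the cluster membership is unchanged.
        if not self.hosts:
            raise RuntimeError("cluster has no hosts")
        index = zlib.crc32(client_ip.encode("ascii")) % len(self.hosts)
        return self.hosts[index]


if __name__ == "__main__":
    cluster = Cluster(["web1", "web2", "web3"])
    for ip in ("10.0.0.7", "10.0.0.8", "10.0.0.9"):
        print(ip, "->", cluster.pick_host(ip))
```

Because adding a host changes the mapping in this sketch, a client can land on a different server afterward, which is one reason the guidance later in this How To recommends keeping session state off the individual Web servers.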

Cloning

You create a clone by adding another server with all of the same software, services, and content. By cloning servers, you can replicate the same service at many nodes in a Web farm, as shown in Figure 2.

Figure 2: Cloning

Figure 2 shows that you can clone your Web server by copying the same business logic to each Web server.

Federated Database Servers

To support the anticipated growth of a system, a federation of servers running Microsoft SQL Server™ 2000 can be used to host a database. With this approach, the database is installed across all servers, and the tables that need to scale out are horizontally partitioned (split into smaller member tables). Then you create a distributed partitioned view that unifies the member tables to provide location transparency.
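
The partitioning idea described above can be sketched on a small scale. The example below is only an analogy: it uses two member tables in a single in-memory SQLite database instead of a federation of SQL Server instances, and the table names, key ranges, and helper functions are hypothetical. SQLite views are read-only, so inserts are routed to the owning member table in application code, whereas the view still gives readers one logical table.

```python
import sqlite3

# Key ranges for the two hypothetical member tables.
MEMBERS = {
    "customers_1": (1, 49999),
    "customers_2": (50000, 99999),
}


def create_schema(conn: sqlite3.Connection) -> None:
    for table, (low, high) in MEMBERS.items():
        # The CHECK constraint records which key range each member owns.
        conn.execute(
            f"CREATE TABLE {table} ("
            f"customer_id INTEGER PRIMARY KEY "
            f"CHECK (customer_id BETWEEN {low} AND {high}), "
            f"name TEXT NOT NULL)"
        )
    # The view unifies the member tables so readers see one logical table.
    union = " UNION ALL ".join(
        f"SELECT customer_id, name FROM {table}" for table in MEMBERS
    )
    conn.execute(f"CREATE VIEW customers AS {union}")


def insert_customer(conn: sqlite3.Connection, customer_id: int, name: str) -> str:
    """Route the row to the member table that owns its key range."""
    for table, (low, high) in MEMBERS.items():
        if low <= customer_id <= high:
            conn.execute(
                f"INSERT INTO {table} (customer_id, name) VALUES (?, ?)",
                (customer_id, name),
            )
            return table
    raise ValueError(f"no member table owns key {customer_id}")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    insert_customer(conn, 42, "Ann")
    insert_customer(conn, 70000, "Bob")
    print(conn.execute("SELECT * FROM customers ORDER BY customer_id").fetchall())
```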

.NET Framework Technologies Scalability Considerations

When you scale .NET applications, considerations vary depending on the specific .NET technology involved. You will make more informed decisions if you understand the technology considerations from the start.

ASP.NET Applications or Web Services

When scaling an ASP.NET application or Web service, consider the following:

  • Avoid using the in-process session state store, and avoid running the session state server on a local Web server. You need a state store on a server that is accessible from all servers in the Web farm.
  • Use the ASP.NET session state service running on a remote server, or use SQL Server as your session store.
  • Use application state (and the Application object) only as a read-only store to avoid introducing server affinity. ASP.NET application state is server-specific.
  • Avoid using machine-specific encryption keys to encrypt data in a database. Instead, use machine-specific keys to encrypt a shared symmetric key, and use the shared key to encrypt the data that you store in the database. For more information, see Chapter 14, "Building Secure Data Access" in "Improving Web Application Security: Threats and Countermeasures."
  • Impersonating client identities to make database calls reduces the benefits of connection pooling because multiple pools (one per client identity) are maintained instead of a single shared pool. Consider using the trusted subsystem model instead, and use a single, trusted server identity to connect to the database. For more information, see "Data Access Security" in "Building Secure ASP.NET Applications: Authentication, Authorization and Secure Communication."
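
The session state guidance above can be illustrated with a small simulation. This is a sketch, not ASP.NET code: the `WebServer` class and both farm helpers are hypothetical, and a plain dictionary stands in for the remote state service or SQL Server session store.

```python
class WebServer:
    """Hypothetical farm member that keeps session data in a given store."""

    def __init__(self, name, session_store):
        self.name = name
        self.session_store = session_store  # dict: session id -> list of items

    def handle(self, session_id, item=None):
        cart = self.session_store.setdefault(session_id, [])
        if item is not None:
            cart.append(item)
        return list(cart)


def in_process_farm():
    # Each server has its own private store (in-process session state).
    return [WebServer("web1", {}), WebServer("web2", {})]


def shared_store_farm():
    # All servers use one store, standing in for a remote state service
    # or a SQL Server session store.
    shared = {}
    return [WebServer("web1", shared), WebServer("web2", shared)]


if __name__ == "__main__":
    for build in (in_process_farm, shared_store_farm):
        web1, web2 = build()
        web1.handle("session-1", "book")      # first request lands on web1
        print(build.__name__, web2.handle("session-1"))  # next one on web2
```

With private stores, the second request finds an empty session because it reached a different server; with the shared store, either server sees the same data, so requests do not need to return to the server that handled them first.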

Enterprise Services

When scaling serviced components in an Enterprise Services application, consider the following:

  • Avoid impersonating the original client in a remote Enterprise Services application. Instead, authorize the client using COM+ roles and then use a trusted server identity to access downstream databases and systems to take full advantage of connection pooling.
  • Avoid storing state in the Shared Property Manager (SPM) and consider using a shared state store such as a database. The SPM is not scalable and introduces server affinity.
  • Consider Enterprise Services when you are working with transactions that span across multiple databases, or when you need transactions to flow across components. Be aware that using high transaction isolation levels unnecessarily can result in contention, which reduces scalability.
  • Ensure that client code that calls serviced components always calls the Dispose method. Not doing so can quickly increase memory pressure and can increase the chances of activity deadlocks, thus reducing scalability.
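
The effect of impersonation on connection pooling, mentioned in the first item above, can be sketched as follows. The `PoolManager` class is a hypothetical stand-in that keeps one pool per distinct identity, which is the behavior the text describes; it is not the ADO.NET or COM+ pooling implementation, and the caller names are invented.

```python
from collections import defaultdict


class PoolManager:
    """Hypothetical pool manager: one pool per distinct connection identity."""

    def __init__(self):
        self._idle = defaultdict(list)  # identity -> idle connections
        self.opened = 0                 # physical connections created so far

    def acquire(self, identity):
        idle = self._idle[identity]
        if idle:
            return idle.pop()           # reuse an idle connection
        self.opened += 1                # nothing to reuse: open a new one
        return (identity, self.opened)

    def release(self, connection):
        identity = connection[0]
        self._idle[identity].append(connection)

    @property
    def pool_count(self):
        return len(self._idle)


def serve(manager, callers, impersonate):
    """Handle one request per caller, one after another."""
    for caller in callers:
        identity = caller if impersonate else "trusted-service"
        connection = manager.acquire(identity)
        manager.release(connection)


if __name__ == "__main__":
    callers = ["alice", "bob", "carol", "alice"]
    for impersonate in (True, False):
        manager = PoolManager()
        serve(manager, callers, impersonate)
        print(impersonate, manager.pool_count, manager.opened)
```

In this sketch, four sequential requests from three callers open three connections in three pools when the caller is impersonated, but only one connection in one pool when a single trusted identity is used.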

.NET Remoting

When scaling middle-tier remote components that use the .NET remoting infrastructure, be aware that the default TCP channel cannot be load balanced using an NLB solution in a server farm. Therefore, this channel does not provide a good solution for scaling out. Although .NET remoting is not recommended for cross-server communication, if you do use remoting, use the HTTP channel in a server farm to provide the scale-out ability.

More Information

For more information, see "Prescriptive Guidance for Choosing Web Services, Enterprise Services, and .NET Remoting" in Chapter 11, "Improving Remoting Performance."

Process for Scaling .NET Applications

The following steps provide a high-level systematic approach to help you scale your application.

  1. Gather new requirements and performance objectives.
  2. Assess the current system.
  3. Choose a scaling technique.
  4. Apply and validate.

Step 1: Gather New Requirements and Performance Objectives

To achieve new levels of application performance and processing capacity, you have to clearly understand your performance objectives. To achieve scalability, you must continue to meet your performance objectives as demand increases. Make sure that you:

  • Gather new requirements. New requirements usually come from marketing data, past growth, anticipated growth, special events (for example, sales events), or future needs.
  • Quantify your objectives. Common performance objectives for server applications include response time, throughput, and resource utilization.

Step 2: Assess the Current System

Assessing your current application architecture and infrastructure is important for making effective scaling decisions. Make sure that you:

  • Analyze the current system. Start by analyzing your application architecture; understand how the parts of the application interact with each other. Identify your current deployment architecture and analyze the current application infrastructure that supports your application. Understand the current limits of your system in terms of acceptable throughput and response time.

  • Identify components that limit scalability. Identify components that would be most affected if they need to scale up or scale out. These are the components that are most likely to become bottlenecks when the workload increases. Prioritize components that are critical to performance and the overall process handling capacity of your application. Understand the dependencies between system components. The following questions can help identify key issues to consider:

    • Does your design partition your application into logical layers?
    • Do you have logical partitioning and loosely coupled interfaces providing a contract between layers?
    • Does your design consider the impact of resource affinity?
    • Does your implementation manage memory efficiently? Does it minimize hidden allocations; avoid the promotion of short-lived objects; avoid unnecessary boxing; efficiently pass parameters of value types and reference types; avoid excessive allocations and deallocations during string concatenations; choose the appropriate type of collection and array for the functional requirement; and so on?
    • Does your code handle threads efficiently? Having too many threads consumes resources, increases context switching and contention, and decreases concurrency, resulting in a high CPU utilization rate. Having too few threads unnecessarily constrains the throughput, resulting in underutilized CPU.
    • Does your application handle exceptions efficiently? Does it avoid using exceptions for regular application logic? Does it contain defensive code that uses appropriate validations to avoid unnecessary exceptions? Does it use finally blocks to guarantee that resources are cleaned up when exceptions occur?
    • Does your application efficiently manage Web pages? Does your application optimize page size; avoid the unnecessary use of server controls; handle long-running calls efficiently; cache and manage state (session state and view state) across calls; perform efficient data binding; and interoperate with COM?
    • Does your application efficiently manage business components? Does your application avoid client impersonation in the middle tier; avoid thread affinity and thread switches; use the appropriate transaction and isolation levels; free resources quickly and efficiently; and use object pooling where appropriate?
    • Does your application efficiently manage data access? Does it use efficient paging techniques for large record sets? Does it efficiently serialize data, run queries, manipulate BLOBs, handle dynamic SQL and stored procedures, and handle concurrency and transactions appropriately?
  • Identify server configuration and application parameters that limit scalability.

    To optimize server configuration, you must iteratively identify and reduce bottlenecks until you meet your performance and scalability objectives. To achieve this, you need to understand server configuration settings and application tuning options.

Step 3: Choose a Scaling Technique

Characterize the current workload for each of your performance-critical scenarios and document them. Project the workload pattern for your scaling requirements.

Application Considerations

When designing your application for scalability, consider the following:

  • State considerations. Prefer a stateless design where components do not hold state between successive client requests. If you need to maintain state across client requests, store it in a shared data store such as SQL Server to allow shared access across Web servers in a Web farm. For objects that require performance-intensive initialization, consider using Enterprise Services object pooling.
  • Resource considerations. Eagerly release resources. Write code that makes efficient use of the common language runtime (CLR) and garbage collector. Factors that quickly contribute to resource pressure include:
    • A large working set.
    • Retaining unmanaged resources.
    • Excessive boxing and unboxing.
    • Throwing too many exceptions, for example because you use them to control application flow.
    • Inefficient string concatenation.
    • Poor choice and implementation of arrays and collections.
  • Caching considerations. ASP.NET can cache data using the caching API, output caching, or partial page fragment caching. Regardless of the implementation approach, you need to consider an appropriate caching policy that identifies what data to cache, where to cache it, and how frequently to update the cache. To use effective fragment caching, separate the static and dynamic areas of your page and use user controls. You must also make sure to tune the memory limit for the cache to perform optimally.
  • Security considerations. Avoid impersonating the original caller in the middle tier. Doing so prevents efficient connection pooling and severely limits scalability. Consider using a trusted subsystem model and use a single service or process identity to access the downstream database. If necessary, flow the original caller's identity using stored procedure parameters.
  • Threading considerations. Avoid thread affinity by carefully choosing the threading model. Avoid single-threaded apartment (STA) components where possible. If you do have to use them from ASP.NET, make sure that you use the ASPCOMPAT attribute.
  • Database design considerations. Consider the following design techniques to increase database responsiveness and throughput:
    • Optimize your database schema for how your application will use the data.
    • Use normalization for write operations.
    • Consider denormalization for read operations if appropriate.
    • Design for partitioning and distribution if appropriate.
    • Optimize queries and stored procedures.
    • Optimize indexes that are periodically maintained.
    • Use stand-alone techniques or combinations of techniques such as distributed partitioned views, data-dependent routing, and replication.
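
The caching considerations above call for a policy that covers what to cache and how frequently to update it. The sketch below shows one simple policy, time-based expiration. It is not the ASP.NET caching API; the `ExpiringCache` class, its parameters, and the `load_products` function are hypothetical.

```python
import time


class ExpiringCache:
    """Minimal time-based cache; entries are reloaded after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock      # injectable so that expiry can be tested
        self._entries = {}      # key -> (expires_at, value)

    def get(self, key, load):
        """Return the cached value, calling load() only on a miss or expiry."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]
        value = load()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


if __name__ == "__main__":
    def load_products():
        # Stands in for an expensive database query.
        print("loading from the database")
        return ["widget", "gadget"]

    cache = ExpiringCache(ttl_seconds=60)
    cache.get("products", load_products)  # miss: calls load_products
    cache.get("products", load_products)  # hit: served from the cache
```

A short expiration keeps data fresher at the cost of more loads; a long one reduces load on the data store but serves staler data. This sketch does not limit how many entries it holds, so it does not address the memory limit tuning mentioned above.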

Infrastructure Considerations

There are many infrastructure techniques to handle increasing workload and manage resources.

  • Web and application servers

    Common approaches include the following:

    • Scale up your Web server by upgrading to a faster server or by upgrading existing hardware components.
    • Scale out by adding servers to a Web farm to spread the workload across them.
    • Use NLB to scale out your middle-tier application server.
    • Windows 2000 COM+ components are designed to be used in clusters of Windows 2000 application servers to form a clustered business services tier. Each server has identical sets of COM+ components, and Windows 2000 balances the cluster processing load by sending new requests to the server that has the least processing load. This forms an easily administered cluster that can quickly scale out with the addition of a new server.
  • Database servers

    SQL Server 2000 supports federation of servers with updatable distributed partitioned views used to transparently partition data horizontally across a group of servers. For more information, see "Scaling Out on SQL Server."

Step 4: Apply and Validate

The next step is to apply the changes and evaluate whether the updates or additions have met the workload requirements. Do the following:

  • Apply the optimization process as follows: Establish a baseline, collect data, analyze results, and optimize the configuration.
  • Apply the capacity planning process or predictive analysis to plan for current and future usage levels. For more information, see "How To: Perform Capacity Planning for .NET Framework Applications."
  • Apply the scaling technique that you chose in Step 3.
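
Validation comes down to comparing measurements against the objectives that you quantified in Step 1. The sketch below shows one way to express that check. The class names, field names, and numbers are hypothetical; the three kinds of objective (response time, throughput, and resource utilization) are the ones named in this How To.

```python
from dataclasses import dataclass


@dataclass
class Objectives:
    max_response_seconds: float      # response time objective
    min_requests_per_second: float   # throughput objective
    max_cpu_percent: float           # resource utilization budget


@dataclass
class Measurement:
    response_seconds: float
    requests_per_second: float
    cpu_percent: float


def unmet_objectives(measured: Measurement, goals: Objectives) -> list:
    """Return the names of the objectives that the measurement misses."""
    failures = []
    if measured.response_seconds > goals.max_response_seconds:
        failures.append("response time")
    if measured.requests_per_second < goals.min_requests_per_second:
        failures.append("throughput")
    if measured.cpu_percent > goals.max_cpu_percent:
        failures.append("resource utilization")
    return failures


if __name__ == "__main__":
    goals = Objectives(max_response_seconds=1.0,
                       min_requests_per_second=200,
                       max_cpu_percent=75)
    baseline = Measurement(response_seconds=0.8,
                           requests_per_second=220,
                           cpu_percent=100)
    print(unmet_objectives(baseline, goals))
```

The hypothetical baseline meets its sub-second response time objective but only at 100% CPU utilization, so the check reports a resource utilization failure, in line with the example given at the start of this How To.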

Additional Resources

For more information, see the following resources:

patterns & practices Developer Center

© Microsoft Corporation. All rights reserved.