Microsoft Commerce Server 2000: Scalability Planning

This chapter introduces you to concepts for planning how to scale your Microsoft Commerce Server 2000 site. Commerce Server is a highly scalable platform, which makes it easy to add hardware and partition databases or to move processes to enhance performance as your volume of business increases.

A well-designed Web server farm can be expanded (scaled) cost-effectively to accommodate sudden increases in site traffic. Web applications must be scalable because you can't predict customer load at any particular time, but you must handle the load as it occurs. With Commerce Server, you can design and build an affordable, highly scalable server farm that will support thousands of concurrent users. When you scale your Commerce Server site, you:

  • Increase the number of users each server can handle.

  • Increase the number of concurrent users your site can support.

  • Provide faster response times.

The following table describes the elements you should consider when planning how to scale your site.

Scaling element

Description


Throughput (per component)

The maximum number of transactions a particular component can handle in a specified length of time. You measure throughput by rate per second (for example, 400 transactions per second).

Load balancing

The distribution of client requests among multiple servers within a server cluster. If a server fails, the load is dynamically redistributed among the remaining servers.
Of all the scaling elements, load balancing offers some of the most dramatic performance improvements. If you add additional servers and evenly distribute load across them, the throughput of the application can increase in a linear fashion. You can use the Network Load Balancing (NLB) tool, available as part of Windows 2000, to balance the load across all your servers.

Fault tolerance

The ability to deal with failures and faults. The ability to continually handle load is a key component of scalability.


Queuing

The addition of an external component to an application to preserve requests that the application is unable to process immediately. Stored requests are resubmitted when it is possible to process them. If a portion of the application does not need to respond to a request immediately, it is a candidate for a queue.
Queuing does not improve overall throughput, but it is one method of shifting load. By holding on to excess requests submitted during peak processing times and resubmitting them at a later time, the application can handle more requests than it could handle otherwise.


Throttling

The ability to postpone transactions until a later time. A throttling mechanism is similar to a queue in that it prevents the load from exceeding the maximum throughput rate and enables a component to remain functional during a period of load that would normally be more than it could handle. If an application includes a client that has the ability to retry its request at a later time, it is a candidate for a throttling mechanism.


Locking

The ability to handle contention for shared resources. The techniques you use to deal with locking can affect the amount of load an application can handle.
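
As a rough illustration of how queuing shifts load without raising throughput, the following Python sketch simulates a component with a fixed capacity per interval. The function name, the capacity of 400, and the arrival figures are hypothetical values chosen for illustration; they are not Commerce Server measurements.

```python
def simulate_queue(arrivals, capacity):
    """Process at most `capacity` requests per interval and queue the excess.

    arrivals: number of new requests arriving in each interval.
    Returns the number of requests processed in each interval, continuing
    after the last arrival until the backlog is empty.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    backlog = 0            # requests held for later resubmission
    processed = []
    interval = 0
    while interval < len(arrivals) or backlog > 0:
        if interval < len(arrivals):
            backlog += arrivals[interval]
        done = min(capacity, backlog)   # never exceed maximum throughput
        backlog -= done
        processed.append(done)
        interval += 1
    return processed

# Hypothetical load: a peak of 700 requests against a capacity of 400.
print(simulate_queue([200, 700, 300, 100], capacity=400))  # [200, 400, 400, 300]
```

In this sketch the peak interval never exceeds the capacity of 400, yet all 1,300 requests are eventually processed, because the excess is carried into later, quieter intervals.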

The following table describes several techniques for scaling your site.

Technique

Description



Scale hardware vertically

Increase capacity by upgrading hardware, while maintaining the same physical footprint and number of servers. Scaling hardware vertically simplifies system administration, but has a higher hardware cost than scaling horizontally or optimizing software architecture. In addition, once you reach maximum capacity on existing hardware, you must begin to scale horizontally.

Scale hardware horizontally

Increase capacity by adding servers. Scaling hardware horizontally enables you to increase hardware capacity at a lower cost. However, once your site becomes too complex to manage, you must begin to scale it vertically.

Optimize site architecture

Improve server efficiency by identifying operations with similar workload factors and dedicating servers to each type of operation, rather than running a mixed-operation workload on every server. Doing so can significantly improve site capacity and lets you optimize the performance of each server for its workload. You should plan architectural improvements early in the project life cycle so that you can build and operate your site more cost-effectively. You can use Microsoft SQL Server 2000 or SQL Server 7.0 to distribute individual Commerce Server components to separate servers.

The result of using these scaling techniques is a highly scalable server farm that you can grow well beyond its original design limitations. The following sections provide you with detailed information about how to scale your Commerce Server site.

Scaling Hardware Vertically


Performance and capacity planning benchmarks performed by Microsoft show that Commerce Server sites are characteristically CPU-bound, because Commerce Server makes extensive use of data caching to improve site performance. In other words, Active Server Pages (ASP) processing, which is highly CPU-intensive, is the primary bottleneck.

Scaling vertically is the process of adding memory, increasing input/output (I/O), and increasing processing capacity, so you can get additional throughput without changing your application architecture. It is common to find Web servers with large amounts of memory that can cache nearly all of the content of an entire Web site. In addition, N-way symmetric multiprocessing (SMP) hardware is readily available in the marketplace. Thus, many methods are available to scale a Commerce Server site vertically.

One method of vertically scaling hardware is to use a higher-class processor to increase processing power so that a single server can accommodate more traffic. Another way you can vertically scale hardware is to run Microsoft Windows 2000 on 8-way SMP servers. The aggregate throughput is higher, but it comes at the cost of diminishing returns on investment. (In other words, per-processor throughput is less on 8-way SMP hardware than on 4-way SMP hardware. You get higher aggregate throughput at a disproportionate increase in cost.)

If you plan to scale your hardware vertically, you should do the following:


  • Add a large amount of memory to decrease disk access and to improve I/O throughput by enlarging the IIS/ASP content cache and the NTFS file system disk cache buffers

  • Add spindles and use a redundant array of inexpensive disks (RAID)


  • Scale the hardware with processors up to 4-way SMP hardware, then proceed further with horizontal or architectural scaling techniques

  • Add multiple small computer system interface (SCSI) disk controllers to increase disk I/O throughput

  • Add multiple 100-megabits-per-second (Mbps) network cards to increase network I/O throughput

  • Consider Gigabit Ethernet

Figure 5.1 shows how you might scale from a single Pentium II-CPU server to a dual-CPU server with a Pentium III Xeon-class processor.


Figure 5.1 Scaling hardware vertically

If you have exhausted vertical scaling techniques, you can try the horizontal scaling techniques described in the following topic.

Scaling Hardware Horizontally


Scaling hardware horizontally increases capacity by adding servers to the server farm. You can then distribute Commerce Server components across multiple servers, thereby increasing capacity.

When you begin scaling horizontally, you add the complexity of having to distribute the load evenly across multiple servers. You must address distribution by using load-balancing tools, such as NLB, Domain Name System (DNS) round robin (network software), and hardware solutions such as Cisco LocalDirector (network/router hardware). The benefits of load balancing include providing redundancy of services and presenting higher aggregated capacity by directing the load to multiple servers.

To effectively scale hardware horizontally, you should not use IIS session variables and you should disable IIS session management, unless you use Cisco LocalDirector. In cases where an application is coded with IIS session management (makes use of session variables), you can use hardware such as Cisco LocalDirector to balance the load because it directs site traffic and sends a client back to the same server each time. For more information about IIS session management, see the "Disabling IIS Session Management and Removing Session Variables" topic.

You can horizontally scale the following components of a Commerce Server server farm:

  • Web servers. Add more computers to function as Web servers. Externally, you expose the computers by using a common domain name with a single virtual Internet Protocol (IP) address mapped to a load-balancing system. The load-balancing system directs the traffic to multiple servers. Typically, load balancing directs a Transmission Control Protocol (TCP) connection (such as a Hypertext Transfer Protocol (HTTP) request) to a specific server and keeps it directed to the same server until the TCP connection session ends.

  • Active Directory domain controllers. Add more computers to function as Active Directory domain controllers. Externally, you expose the computers using a common domain name.

Scaling hardware horizontally helps the server farm expand to higher capacity. Further scaling requires architectural improvements. The next section describes how to optimize your site architecture to improve scalability.

Optimizing Site Architecture to Improve Scalability


To improve the architecture of your Commerce Server site, you can:

  • Design your site so that static, high-capacity operations (including operations with relatively simple ASP pages) are separated from dynamic operations with heavier load factors but smaller capacity requirements.

  • Dedicate servers to each type of operation.

  • Optimize the performance of each server.

For example, dedicated servers can process dynamic content, such as ASP and COM+, and run Commerce Server pipeline components, so that the entire bandwidth of the server is used efficiently without interfering with the serving of static Hypertext Markup Language (HTML)/Graphics Interchange Format (GIF) content requests.

Figure 5.2 illustrates a typical site showing how workload might be divided among the available servers.


Figure 5.2 Sample site architecture

IIS processes static HTML/GIF content requests many times faster than it processes ASP requests. An IIS server dedicated to processing HTML/GIF content might be able to handle 10,000 concurrent user requests while an IIS server dedicated to processing ASP and Commerce Server pipeline content might be able to handle only up to 1,000 concurrent user requests.

Another example suggests that most Web-based e-commerce sites process user requests that fall into one of the five categories listed in the following table.


Category

Percentage of customer requests

Browse

Search

User registration

Add item to the shopping basket

Check out


This example shows that users browse, search, and register nine times more often than they add items to their shopping baskets and check out. Based on this example, in a population of 100,000 users, there should be approximately 10,000 users adding items to their shopping basket or checking out, while 90,000 users are browsing, searching, or registering.

Given the numbers in this example, servers handling static content (browse, user registration, and search operations) can process approximately 90 percent of the traffic, while servers handling dynamic content (add item and check out operations) can process the remaining 10 percent of the traffic. However, because dynamic operations account for fewer concurrent users, you can decrease the number of servers dedicated to them.

There are many situations in which you can use dedicated servers to divide content, such as static content (HTML/GIF), dynamic content (ASP/Commerce Server pipeline), business rules (COM+ components), and disk I/O (cache most active files). The following architectural improvements can help you to get even higher performance, with better scalability:

  • Disabling IIS session management and removing session variables

  • Separating static content from other types of content

  • Caching static content

  • Caching static lookup data

  • Consolidating business rules on dedicated servers

  • Using Message Queuing or e-mail to update systems

  • Processing requests in batches

  • Optimizing SQL Server databases

The following sections provide detailed information about how to implement these architectural improvements.

Disabling IIS Session Management and Removing Session Variables

You must ensure that your application code disables IIS session management and that it does not use IIS session variables, unless you use Cisco LocalDirector. IIS session management consumes a specific amount of memory for each user, so total memory consumption grows as the number of concurrent users increases and as the application stores more values in session variables. If there are few session variable values, this consumption of memory might not impact performance significantly. On the other hand, if there are a large number of session variable values, such as an object model, memory consumption in IIS session management can impact performance significantly.

For example, if the session variable for each user consumes 1 MB of memory, 1,000 concurrent users consume approximately 1 GB of memory. Based on this example, using session variables severely limits scalability in a case where the computer has 1.5 GB of available memory. Without this memory consumption, it is possible to serve a larger number of concurrent users, up to the limits of the CPU.
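
The arithmetic in this example can be sketched in a few lines of Python. The function name is hypothetical, and the second call uses an assumed 64 KB of session state per user purely for comparison; only the 1 MB and 1.5 GB figures come from the example above.

```python
def max_users_by_session_memory(available_mb, session_mb_per_user):
    """Upper bound on concurrent users imposed by session-state memory alone."""
    if session_mb_per_user <= 0:
        raise ValueError("session_mb_per_user must be positive")
    return int(available_mb // session_mb_per_user)

# Figures from the example: 1 MB of session state per user and
# 1.5 GB (1,536 MB) of available memory.
print(max_users_by_session_memory(1536, 1))        # 1536

# Assumed comparison: session state trimmed to 64 KB (0.0625 MB) per user.
print(max_users_by_session_memory(1536, 0.0625))   # 24576
```

The bound says nothing about CPU capacity; it only shows how quickly per-user session state can become the limiting factor before the CPU does.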

Another disadvantage of using session variables is that they reside only on the local server. In other words, the application requires an affinity between the client and the server on which the session variable started, because the session variable resides only on that one server. To maintain the required affinity between the client and server, you must ensure session stickiness (or persistence). This affinity eliminates on-the-fly redundancy: if a server goes down or needs to be taken offline, the user sessions on that server are destroyed.

You can configure NLB to enable client-to-server affinity. This sends a client back to the same destination server for each request, providing the load-balancing effect you want. Session variables are local to each server, so the client always sees the correct set of variables and state.
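
The difference between plain rotation and client-to-server affinity can be sketched as follows. This is an illustrative Python model only: the hashing scheme, function names, server names, and IP address are assumptions, and the sketch does not describe how NLB or Cisco LocalDirector implement affinity internally.

```python
import itertools
import zlib

def round_robin(servers):
    """Yield servers in rotation, ignoring which client is asking."""
    return itertools.cycle(servers)

def sticky_server(client_ip, servers):
    """Map a client to one server by hashing its IP address (affinity)."""
    index = zlib.crc32(client_ip.encode("ascii")) % len(servers)
    return servers[index]

servers = ["web1", "web2", "web3"]   # hypothetical server names

# Rotation: consecutive requests land on different servers, so session
# state stored on one server is not visible on the next.
rotation = round_robin(servers)
print([next(rotation) for _ in range(4)])   # ['web1', 'web2', 'web3', 'web1']

# Affinity: the same client always reaches the same server.
first = sticky_server("192.0.2.10", servers)
print(first == sticky_server("192.0.2.10", servers))   # True

# If that server is removed, the client is remapped to another server and
# its server-local session state is lost.
remaining = [s for s in servers if s != first]
print(first, "->", sticky_server("192.0.2.10", remaining))
```

The last step shows the redundancy cost described above: affinity keeps session variables consistent only for as long as the chosen server stays in the farm.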

Separating Static Content from Other Types of Content

The following tables compare two server farm methods (non-consolidated and consolidated) of serving 100,000 concurrent users.

Non-Consolidated Server Farm

Operation | Type of content | Percentage of users | Number of Web servers | Number of concurrent users per server | Total number of concurrent users
Browse, search, user registration, add item, checkout | All (static, dynamic, ASP, and so forth) | 100% | 100 | 1,000 | 100,000

Consolidated Server Farm

Operation | Type of content | Percentage of users | Number of Web servers | Number of concurrent users per server | Total number of concurrent users
Browse, user registration, search | Static | 90% | 9 | 10,000 | 90,000
Add item, checkout | Other (dynamic, ASP, and so on) | 10% | 10 | 1,000 | 10,000
Total | | 100% | 19 | Not applicable | 100,000


Based on the information in the tables, the total number of servers drops from 100 front-end Web servers to 19 front-end Web servers, if you separate the static content from other types of content.
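
The server counts can be checked with a short calculation. The Python sketch below uses figures quoted in this chapter (100,000 concurrent users, a 90/10 split between static and dynamic operations, and roughly 10,000 and 1,000 concurrent users per static and ASP server respectively). It assumes that a server running a mixed workload is limited to the lower, dynamic capacity; the function and constant names are illustrative.

```python
def servers_needed(concurrent_users, users_per_server):
    """Smallest whole number of servers that can carry the given load."""
    return (concurrent_users + users_per_server - 1) // users_per_server

TOTAL_USERS = 100_000
STATIC_CAPACITY = 10_000    # concurrent users per static-content server
DYNAMIC_CAPACITY = 1_000    # concurrent users per ASP/pipeline server

# Non-consolidated: every server runs the mixed workload, so each one is
# assumed to be limited to the dynamic capacity.
non_consolidated = servers_needed(TOTAL_USERS, DYNAMIC_CAPACITY)

# Consolidated: 90 percent of users go to static servers, 10 percent to
# dynamic servers.
static_servers = servers_needed(TOTAL_USERS * 90 // 100, STATIC_CAPACITY)
dynamic_servers = servers_needed(TOTAL_USERS * 10 // 100, DYNAMIC_CAPACITY)

print(non_consolidated, static_servers, dynamic_servers)   # 100 9 10
print(static_servers + dynamic_servers)                    # 19
```

Changing the split or the per-server capacities in this sketch is a quick way to see how sensitive the farm size is to the proportion of dynamic requests.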

IIS 5.0 processes static HTML/GIF content very efficiently, but processing ASP content requires a significant amount of CPU time, resulting in reduced performance. To most efficiently use the servers, combine operations that have similar load-factor characteristics or capacity requirements and separate those that differ. The numbers in the previous table suggest that you might benefit by using three different servers, one for each of the following categories:

  • Browse static HTML/GIF content requests

  • Search ASP and user registration requests

  • Add item to basket and checkout purchase ASP requests

Caching Static Content

ASP pages render to HTML many types of data that are neither highly dynamic nor truly static, such as product attributes (description, price, and so forth), site announcements, and sale announcements. You can use a process to render these types of information to static HTML pages and serve them up as static HTML/GIF content. This provides much higher throughput and reduces overhead by avoiding ASP processing and SQL Server data retrieval.

If your information is relatively static but some content (such as product price) is driven by a database lookup (such as pricing by zip code), you can combine this technique with HTML frames, placing the product information in one frame and the product price in another.
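
A minimal sketch of the pre-rendering idea follows, written in Python rather than ASP. The template, field names, and sample products are hypothetical; a real process would read the rows from the catalog database and write the files to the Web server's content directory.

```python
import html
import tempfile
from pathlib import Path
from string import Template

PAGE = Template("<html><body><h1>$name</h1><p>$description</p></body></html>")

def render_static_pages(products, output_dir):
    """Write one static HTML file per product and return the paths written."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for product in products:
        # Escape values so that catalog text cannot inject markup.
        fields = {key: html.escape(str(value)) for key, value in product.items()}
        path = output_dir / f"{product['sku']}.html"
        path.write_text(PAGE.substitute(fields), encoding="utf-8")
        written.append(path)
    return written

# Hypothetical catalog rows standing in for a database query result.
catalog = [
    {"sku": "A100", "name": "Desk lamp", "description": "Adjustable arm."},
    {"sku": "B200", "name": "Bookshelf", "description": "Five shelves."},
]
pages = render_static_pages(catalog, tempfile.mkdtemp())
print([p.name for p in pages])   # ['A100.html', 'B200.html']
```

Running such a process on a schedule, or whenever the catalog changes, keeps the static pages current while the Web servers do no per-request rendering or database work for them.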

Microsoft Scalable Web Cache (SWC) 2.0 provides an excellent caching solution. Another solution is to use an Internet Server Application Programming Interface (ISAPI) filter that reads HTML and performs a lookup to an in-memory database, similar to the way early database integration was accomplished using Internet Database Connector (IDC) and HTML extension (HTX) files. This method avoids full ASP processing and retains high-speed serving of HTML pages.

Caching Static Lookup Data

If your data requires dynamic lookups (such as product price based on zip code or user ID), you can use an in-memory database to cache the lookup table. This helps reduce the overhead associated with retrieving data from the SQL Server database across a network. You can refresh the in-memory database with a nightly process (or as necessary) to ensure that the cached data is up-to-date.
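
The following Python sketch shows one way such a cache might behave: the whole lookup table is held in memory and reloaded only when it is older than a chosen age. The class, the loader function, and the sample prices are hypothetical, and a plain dictionary stands in for the in-memory database.

```python
import time

class LookupCache:
    """Hold a small lookup table in memory and reload it when it is stale."""

    def __init__(self, loader, max_age_seconds, clock=time.monotonic):
        self._loader = loader              # callable that returns a dict
        self._max_age = max_age_seconds
        self._clock = clock
        self._table = {}
        self._loaded_at = None

    def get(self, key, default=None):
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._max_age:
            self._table = dict(self._loader())   # one database round trip
            self._loaded_at = now
        return self._table.get(key, default)

load_calls = []   # records each reload so the example can count them

def load_prices_by_zip():
    # Hypothetical stand-in for a query against the pricing table.
    load_calls.append(1)
    return {"98052": 19.99, "10001": 21.49}

prices = LookupCache(load_prices_by_zip, max_age_seconds=24 * 60 * 60)
print(prices.get("98052"), prices.get("10001"), len(load_calls))  # 19.99 21.49 1
```

Both lookups are answered from memory after a single load, which is the saving described above. The 24-hour age corresponds to the nightly refresh and can be shortened as necessary.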

On many sites, a page contains an HTML list box/combo box (such as product categories or product compartments) rendered from a lookup table. It is much more efficient to render these records once and cache the HTML fragment globally in the ASP application object than to retrieve them from the lookup table each time they are needed.
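
The render-once pattern can be sketched as follows. This Python example uses a module-level dictionary as a stand-in for the ASP application object, and the category names and function names are hypothetical.

```python
import html

_fragment_cache = {}   # stands in for application-wide state

def cached_fragment(name, build):
    """Return the cached HTML fragment, building it on first use only."""
    if name not in _fragment_cache:
        _fragment_cache[name] = build()
    return _fragment_cache[name]

def build_category_list():
    # Hypothetical categories; a real site would read the lookup table here.
    categories = ["Books", "Music", "Home & Garden"]
    options = "".join(f"<option>{html.escape(c)}</option>" for c in categories)
    return f'<select name="category">{options}</select>'

first = cached_fragment("categories", build_category_list)
second = cached_fragment("categories", build_category_list)
print(first is second)   # True: the fragment was rendered only once
```

On a multithreaded server, the first-use check would need a lock (or the fragment would be built at application start) so that two requests do not render it at the same time.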

When you cache static lookup data, small lookup tables work best. However, you can increase hardware memory capacity to help accommodate larger tables, if you need to do so. You can analyze the IIS and SQL Server logs to determine which lookup tables are accessed most frequently and would benefit most from caching.

Using the Caching Technology Provided by Commerce Server

You can use the Commerce Server CacheManager object to set up and use a collection of data caches, in which you can store profile data, catalog data, transaction data, and campaign data for your site.

You can use the LRUCache data cache object to create, store, and retrieve name/value pairs (referred to as elements) from the cache. When the cache is full, the least recently used (LRU) element is automatically removed from the cache to make room for a new element. Each LRUCache object has its own size, which is defined in the Global.asa file. Flushing is performed using an LRU technique, in which each cache is permitted to grow 10 percent larger than its specified size, at which time the cache returns to its specified size by flushing the least recently used items.
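
The flushing behavior described above can be modeled with a short Python class. This is an illustrative sketch, not the Commerce Server LRUCache API: the class and method names are invented, and it assumes that trimming is triggered when an insert pushes the cache past 110 percent of its specified size.

```python
from collections import OrderedDict

class SlackLRUCache:
    """LRU cache that may grow 10 percent past its size before trimming."""

    def __init__(self, size):
        if size < 1:
            raise ValueError("size must be at least 1")
        self.size = size
        self._limit = size + size // 10    # permitted growth: 10 percent
        self._items = OrderedDict()        # least recently used first

    def insert(self, name, value):
        self._items[name] = value
        self._items.move_to_end(name)      # mark as most recently used
        if len(self._items) > self._limit:
            # Flush least recently used elements down to the stated size.
            while len(self._items) > self.size:
                self._items.popitem(last=False)

    def lookup(self, name, default=None):
        if name not in self._items:
            return default
        self._items.move_to_end(name)
        return self._items[name]

    def __len__(self):
        return len(self._items)

cache = SlackLRUCache(size=10)
for i in range(11):
    cache.insert(f"key{i}", i)
print(len(cache))            # 11: still within the 10 percent allowance
cache.lookup("key0")         # touch key0 so it is no longer the oldest
cache.insert("key11", 11)    # 12 elements exceeds the allowance
print(len(cache), cache.lookup("key0"), cache.lookup("key1"))   # 10 0 None
```

Trimming in one step back to the specified size, instead of evicting on every insert, means the flush cost is paid only occasionally.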

You use Commerce Server Business Desk to empty the caches you set up with the CacheManager object. After the caches have been refreshed, the next time Commerce Server receives a request for this data, the updated data is loaded into the caches.

For example, you use the Publish Transactions module to update your site with new transaction information, such as new tax rates and shipping methods. The Publish Transactions module refreshes the caches that store transaction information. The next time Commerce Server receives a request for transaction data, the updated data is loaded into the caches.

Consolidating Business Rules on Dedicated Servers

Because an optimized Commerce Server site is CPU-bound, you can improve ASP and Commerce Server pipeline processing performance by reducing CPU utilization. You can reduce CPU utilization by identifying and placing complex, processor-intensive business rules (such as COM+ components) on dedicated servers.

There is a performance trade-off between in-process execution of components and out-of-process execution of components marshaled through Distributed Component Object Model (DCOM). To determine the exact trade-off, you must measure both methods and determine which method works best for your site. If a business rule is processor-intensive and its processing cost is greater than the cost of marshaling by DCOM, you could develop the component as a COM+ component. Dedicating a separate server to COM+ components frees capacity on the servers that run ASP and Commerce Server pipeline components, thereby increasing the performance of ASP and Commerce Server pipeline processing.

If a business rule consists of only a few lines of code that are not processor-intensive, it is probably not worth having a dedicated server to run it. In this case, either leave it as an ASP function snippet (saving object activation/invocation costs) or, if the code is complicated ASP code, code it as a COM component using Microsoft Visual C++ and Active Template Library (ATL). Activate the COM component locally by using the "Both" threading model of the ATL wizard.

Using Message Queuing or E-mail to Update Systems

You can use Message Queuing or e-mail to update fulfillment, Data Warehouse, reporting, and other systems, rather than using a database transaction. By using Message Queuing or e-mail, you take advantage of asynchronous communication to achieve a high rate of "fire and forget" operations and to avoid the latency of database operations and transactions such as data retrieval or extended computation.

For example, if a department (or an entirely different company) performs the actual order fulfillment at a different geographical location from the department that receives the order (drop ships), the two locations must frequently communicate new orders and shipping status. Instead of using a database operation or transaction (such as a periodic batch database extract) and sending the results to the remote site, the departments within the company can use Message Queuing services or e-mail to send notifications (such as new orders) to and accept status information from the remote site.

The front-end servers accept the request and quickly hand off the information to Message Queuing or to an e-mail server, which then sends the information to the remote location. This results in a higher rate of processing and faster front-end server response time, updating the remote sites more quickly than by using periodic batch database extracts.
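
The hand-off pattern can be sketched with an in-process queue and a background worker. This Python example only models the control flow: the names are hypothetical, a list stands in for the remote site, and unlike Message Queuing the in-memory queue is not durable across restarts.

```python
import queue
import threading

def start_dispatcher(outbox, deliver):
    """Run `deliver` on each queued message in a background thread."""
    def worker():
        while True:
            message = outbox.get()
            try:
                if message is None:        # sentinel: stop the worker
                    return
                deliver(message)
            except Exception as error:     # keep the worker alive on failure
                print("delivery failed:", error)
            finally:
                outbox.task_done()
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

delivered = []                 # stands in for the remote fulfillment site
outbox = queue.Queue()
dispatcher = start_dispatcher(outbox, delivered.append)

def accept_order(order_id):
    """Front-end handler: enqueue the notification and return immediately."""
    outbox.put({"order_id": order_id, "event": "new_order"})
    return "accepted"

print(accept_order(1001), accept_order(1002))   # accepted accepted
outbox.put(None)
outbox.join()                  # wait here only so the example can print
dispatcher.join()
print([m["order_id"] for m in delivered])       # [1001, 1002]
```

The front-end handler returns as soon as the message is queued; delivery to the remote location happens afterward, which is what keeps front-end response times short.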

You can also submit Commerce Server orders and receipts to Microsoft BizTalk Server 2000 asynchronously, rather than using an inline database transaction. Doing this enables the ASP page to avoid transaction latency and continue processing. The disadvantage is that the customer does not see an immediate order confirmation number and must wait for a confirmation e-mail or wait until you process and record the orders and receipts in the database. Asynchronously recording Commerce Server orders and receipts works best at sites with periodic load peaks.

Processing Requests in Batches

You can process operations that can be deferred until a later time, such as credit card processing or tax calculations, in batch mode on a dedicated server. For example, most B2C sites can't defer tax calculations because customers need to know the total amount due at the time of checkout, but they can process credit card transactions at a later time. Many B2B sites can defer processing tax calculations until monthly invoices are generated by the accounts receivable system.

Deferring processing enables the front-end servers to process requests at a higher rate and to respond to requests more quickly. You can send failure and exception reports to users through e-mail. In many cases, systems that perform batch processing operations already exist in your business. For more information about interfacing with existing business systems that perform batch processing operations, see Chapter 10, "Integrating Third-Party ERP Systems with Commerce Server Applications."
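
A deferred batch run might look like the following Python sketch. The order records, the `charge_card` stand-in, and its validation rule are hypothetical; the point is that failures are collected for later reporting instead of interrupting the customer's checkout.

```python
def process_batch(pending, charge):
    """Apply `charge` to each deferred order and collect any failures."""
    succeeded, failed = [], []
    for order in pending:
        try:
            charge(order)
            succeeded.append(order["order_id"])
        except ValueError as error:
            failed.append((order["order_id"], str(error)))
    return succeeded, failed

def charge_card(order):
    # Hypothetical stand-in for a call to a payment processor.
    if order["amount"] <= 0:
        raise ValueError("invalid amount")

deferred = [
    {"order_id": 1, "amount": 25.00},
    {"order_id": 2, "amount": 0},
    {"order_id": 3, "amount": 99.95},
]
ok, errors = process_batch(deferred, charge_card)
print(ok, errors)   # [1, 3] [(2, 'invalid amount')]
```

The `errors` list is the input for the failure and exception reports that are sent to users through e-mail.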