Solutions for Poor Server Performance

By Gary Duthie, MCSE, Microsoft IIS Documentation Team

Poor server performance is usually caused by more than one factor. Finding a solution for poor performance is very much like conducting a scientific experiment. Scientific experimentation, also called the scientific method, is a six-step process that involves observation, preliminary hypothesis, prediction, tests, and controls. The outcome of this method, a theory, proposes a concluding hypothesis supported by the best collection of evidence accumulated by the steps. Optimum server performance is also obtained from a collection of evidence accumulated by the same steps.

If you observe that your server performance is less than desirable, you might hypothesize that, among several possibilities, the number of active threads in your server is tuned too low. You may then predict that increasing the number of active threads will increase your server performance. Suppose, however, that after increasing the threads and testing again, you find that performance has not changed or has become worse.

Your thread setting has now become your control, because you should make only one setting change at a time until you observe a more acceptable change in performance. Upon reaching more satisfactory server performance as a result of several adjustments to thread tuning, you may theorize that a certain thread setting provides the best server performance in combination with all current variables (amount of total required memory, number of applications being run, upgraded software, etc.). Any change in the variables will then call for further experimentation.
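The one-change-at-a-time approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the setting names and the `measure` callback are hypothetical stand-ins for whatever you actually tune and however you actually score a test run (for example, requests per second from a stress test, where higher is better).

```python
from typing import Callable, Dict, Iterable, Tuple

Settings = Dict[str, int]


def tune_one_setting(
    settings: Settings,
    name: str,
    candidates: Iterable[int],
    measure: Callable[[Settings], float],
) -> Tuple[int, float]:
    """Vary a single setting while holding every other setting fixed.

    `measure` is a hypothetical callback that runs a test with the given
    settings and returns a score where higher means better performance.
    Returns the best value found for `name` and the score it achieved.
    """
    best_value = settings[name]
    best_score = measure(settings)  # baseline: the current configuration
    for value in candidates:
        trial = {**settings, name: value}  # copy; only one setting differs
        score = measure(trial)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score
```

If no candidate beats the baseline, the function returns the original value, which mirrors the case above where raising the thread count did not help.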

The article Server Performance and Scalability Killers, written by George Reilly, was inspired by Murali R. Krishnan's list of the Top Ten IIS Performance Killers. Reilly is a developer assigned to Microsoft's Internet Information Server (IIS) performance development team. Krishnan is a previous member of that team. Reilly's article was written to help users of IIS identify the most common problems associated with poor server performance. It continues the discussion of improving server performance by listing the Ten Commandments of Killing Server Performance and adding solutions for avoiding these 'killers.' Solutions for Poor Server Performance expands upon some of Reilly's solutions to the 'Ten Commandments' by informing users what differences in performance to expect when following Reilly's suggestions, which counters to use to measure any change in performance, and what the optimum measurement of each counter should be for their situation. In the spirit of Reilly's article and Krishnan's list, the following is the List of Solutions for Server Performance Killers. Remember, any solution to improve server performance is only applicable to your current set of variables.

List of Solutions for Server Performance Killers

  1. Excessive memory allocation slows performance. (Reilly's suggestion as a solution to Krishnan's #2 performance killer, "Thou shalt allocate and free lots of objects.")

    A general-purpose memory allocator performs a substantial amount of work on every request, and frequent allocating and freeing tends to lead to memory fragmentation.

    On a Windows NT or Windows 2000 server running IIS, memory allocation must be balanced with all applications and other processes running on the server. Allocating too much memory to any one process or application may have consequences to overall system performance.

    Here are counters you can use to monitor your server memory to ensure that memory allocation will not disrupt your server performance:

    Memory:Available Bytes measures the total physical memory available to the operating system. Compare it with Memory:Committed Bytes, the memory required to run all of the processes and applications on your server. The comparison should be tracked over time to allow for periods of peak activity. You should always have at least 4 MB or 5% more of available memory than committed memory.

    Memory:Page Faults/sec measures page faults that occur when an application attempts to read from a virtual memory location that is marked "not present." Zero is the optimum measurement. Any measurement higher than zero delays response time. The Memory:Page Faults/sec counter measures both hard page faults and soft page faults. Hard page faults occur when a page has to be retrieved from a hard disk rather than from physical memory. Soft page faults occur when the faulted page is found elsewhere in physical memory. Soft page faults interrupt the processor but have much less effect on performance.

    Note: Performance Monitor subdivides counters into three sections: the performance object, followed by the counter name, and then the instance. IIS documentation separates Performance Monitor's sections with colons (Process:Thread Count:Inetinfo), while Windows 2000 documentation separates Performance Monitor's sections with forward slashes (Process/Thread Count/Inetinfo).
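    As a small illustration of the two notations in the note above, the following Python sketch splits a colon-separated counter name into its sections and rewrites it with forward slashes. The function names are hypothetical, and the sketch assumes that colons appear only as section separators, as in the examples in this article.

```python
from typing import Optional, Tuple


def parse_counter(path: str) -> Tuple[str, str, Optional[str]]:
    """Split an IIS-style name into (object, counter, instance).

    The instance section is optional, so "Memory:Available Bytes"
    yields an instance of None.
    """
    parts = path.split(":")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"expected Object:Counter[:Instance], got {path!r}")
    instance = parts[2] if len(parts) == 3 else None
    return parts[0], parts[1], instance


def to_slash_notation(path: str) -> str:
    """Rewrite Object:Counter[:Instance] as Object/Counter[/Instance]."""
    return "/".join(part for part in parse_counter(path) if part is not None)
```

    With these helpers, `to_slash_notation("Process:Thread Count:Inetinfo")` produces the slash-separated form shown in the note.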

  2. Use a thread pool model rather than randomly increasing the number of threads. (Reilly's suggestion as a solution to Krishnan's number 3 performance killer, "Thou shalt create threads. The more, the merrier.")

    When maximizing threads per processor, you should always take into account the number of context switches you are generating. A context switch occurs when the kernel, or core of the operating system, switches the processor from one thread to another. Context switches should be avoided, as each context switch causes the processor L1 and L2 caches to be flushed and refilled.

    IIS 5.0 sets the default value of ASP worker threads per processor at 25. A quad-processor would, therefore, have 100 threads. You can change the number of threads per processor value at the command line by setting AspProcessorThreadMax. AspProcessorThreadMax is a metabase property accessible at the following path: /LM/W3SVC/MD_ASP_PROCESORTHREADMAX

    If you are running ISAPI or ASP applications you will want to monitor those applications with the following counters:

    Process:Thread Count:dllhost counts the number of threads created by the pooled out-of-process application and displays the most recent value.

    Process:Thread Count:dllhost#1, #2,…, #N counts the number of threads created by each isolated out-of-process application and displays the most recent value.

    Here are some other counters that will help you monitor threads:

    Process:Thread Count:Inetinfo counts the number of threads created by the process and displays the most recent value.

    Thread:% Processor Time:Inetinfo =>Thread # measures how much processor time each thread of the Inetinfo process is using.

    Thread:Context Switches/sec:Inetinfo =>Thread # measures the rate of context switches for each thread of the Inetinfo process. Monitor this counter when you adjust the maximum number of threads per processor, or thread pool, to make sure you are not creating so many context switches that the resources lost to them supersede the benefit of added threads, at which point your performance will decrease rather than improve.

    For more information on thread pool models see the Microsoft Internet Information Services 5.0 Resource Guide, Chapter 5: Monitoring and Tuning Your Server.
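    The thread pool idea in this item can be sketched in Python. This is a minimal illustration, not how IIS implements its pool: `concurrent.futures.ThreadPoolExecutor` stands in for the server's worker threads, `handle_request` is a hypothetical placeholder for real request processing, and the only figure taken from the text is the IIS 5.0 default of 25 ASP worker threads per processor.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List

# IIS 5.0 default for ASP worker threads per processor, as stated above.
THREADS_PER_PROCESSOR = 25


def pool_size(processors: int, per_processor: int = THREADS_PER_PROCESSOR) -> int:
    """Upper bound on worker threads: processors times threads per processor."""
    if processors < 1 or per_processor < 1:
        raise ValueError("processors and per_processor must be positive")
    return processors * per_processor


def handle_request(request_id: int) -> str:
    """Hypothetical request handler; reports which worker thread ran it."""
    return threading.current_thread().name


def serve(request_ids: Iterable[int], processors: int) -> List[str]:
    """Run every request on a bounded pool instead of one thread per request."""
    with ThreadPoolExecutor(max_workers=pool_size(processors)) as pool:
        return list(pool.map(handle_request, request_ids))
```

    Because the pool is bounded, submitting more requests than `pool_size` allows queues the extra work rather than creating more threads, which keeps the number of threads competing for each processor fixed.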

  3. Measure and analyze. (Reilly's suggestion as a solution to Krishnan's #8 performance killer, "Thou shalt not measure.")

    There are many different types of counters to measure system performance, but for the most part, system performance can be categorized into the following groups: memory management, network capacity, processor capacity, and disk optimization. Kathy Ferguson has written articles describing each of these categories, which counters to use to measure performance associated with the categories, and what the optimum measurement of those counters should be. For more information on performance counters and measurements, see Ferguson's articles, Measuring Hardware Performance of Web Sites and Counters Quick Guide.

  4. Stress test your environment. (Reilly's suggestion as a solution to Krishnan's #9 performance killer, "Thou shalt use single-client, single-request testing.")

    Load testing, or stress testing, lets you test the stability of your environment under a given number of simultaneous users. You can stress test your environment by using the Microsoft Web Application Stress Tool (WAST), a tool designed to simulate multiple browsers requesting pages from a Web application. WAST allows you to add the following counters to help you measure and monitor your Web site:

    Web Service:Get Requests/sec measures the rate at which HTTP requests using the GET method are made. GET requests are generally used for basic file retrievals or image maps, though they can be used with forms.

    Web Service:Post Requests/sec measures the rate at which HTTP requests using the POST method are made. POST requests are generally used for forms or gateway requests.

    Processor:% Processor Time (Windows 2000 only) measures the percentage of elapsed time that all of the threads of this process used the processor to execute instructions. An instruction is the basic unit of execution in a computer, a thread is the object that executes instructions, and a process is the object created when a program is run. Code executed to handle some hardware interrupts and trap conditions is included in this count. On multiprocessor machines, the maximum value of the counter is 100% times the number of processors.

    Active Server Pages:Requests/sec measures the number of requests executed per second.

    For more information on WAST, see the WAST Web site.

    Another stress test tool, the Web Capacity Analysis Tool (WCAT), is recommended only for environments that are heavy in JPEG or static file content.
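    To show the difference between single-client testing and a multiple-client load, here is a minimal Python sketch of a load driver. It is not WAST or WCAT and does not reproduce their features. The function names are hypothetical, and `send_request` is any callable you supply that issues one request and returns True on success.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_load(
    send_request: Callable[[], bool], clients: int, requests_per_client: int
) -> Dict[str, float]:
    """Run `clients` simulated users in parallel, each sending requests back to back."""

    def client() -> int:
        # One simulated user: count how many of its requests succeeded.
        return sum(1 for _ in range(requests_per_client) if send_request())

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=clients) as pool:
        futures = [pool.submit(client) for _ in range(clients)]
        succeeded = sum(future.result() for future in futures)
    elapsed = time.perf_counter() - started

    total = clients * requests_per_client
    return {
        "total": total,
        "succeeded": succeeded,
        "elapsed_s": elapsed,
        "requests_per_sec": total / elapsed if elapsed > 0 else float("inf"),
    }


def make_http_get(url: str, timeout: float = 5.0) -> Callable[[], bool]:
    """Build a sender that issues one HTTP GET; network errors count as failures."""

    def send() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.status == 200
        except OSError:
            return False

    return send
```

    For example, `run_load(make_http_get(url), clients=50, requests_per_client=20)` would drive 1,000 GET requests from 50 simulated users against a test server of your choosing. Run the driver from a separate machine so that it does not compete with the server for processor time.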

  5. Pick a broad spectrum of real-world scenarios. (Reilly's suggestion as a solution to Krishnan's number 10 performance killer, "Thou shalt not use real world scenarios.")

    In a real-world scenario, you test the performance of your server based on your own content and the type of applications you are running. Benchmarks are simulated applications designed to stress an operating environment and the hardware upon which it runs in ways that are similar to some specific application or mix of applications. Because they are artificial, benchmark results seldom match the application profile in any given environment. Do not assume that your server will generate the same performance values in your own environment as were obtained under benchmark conditions. Any change to the simulated environment will affect your server performance.

    Target the critical scenarios that will affect your own environment. For more information on critical, or real-world scenarios, see the Monitoring and Tuning Your Server section of the Internet Information Services 4.0 Resource Guide.

  6. Overprotecting data through the use of global locks can lead to lock contention and low CPU utilization. (Reilly's suggestion as a solution to Krishnan's #5 performance killer, "Thou shalt use global locks for data structures.")

    Lock contention arises when several threads are repeatedly trying to get exclusive access to a shared resource. A high number of context switches is an indicator that you have lock contention. Lock contention can be measured using Performance Monitor's System:Context Switches/sec counter. An optimum measurement for the System:Context Switches/sec counter will depend on the type of operating system you are using and the applications you are running on your server.

For more detailed information, see the Preventing Processor Bottlenecks section in the Internet Information Services 5.0 Resource Guide - Chapter 5, Monitoring and Tuning Your Server.
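One common alternative to a single global lock is to partition the data and give each partition its own lock, sometimes called lock striping. The Python sketch below illustrates the structure only. The class and function names are hypothetical, this technique is not taken from Reilly's article, and in CPython the interpreter lock limits any speedup, so the example shows how the locking is arranged rather than a measured gain.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List


class StripedCounter:
    """Per-key hit counts, with each group of keys guarded by its own lock."""

    def __init__(self, stripes: int = 16) -> None:
        if stripes < 1:
            raise ValueError("stripes must be positive")
        self._locks: List[threading.Lock] = [threading.Lock() for _ in range(stripes)]
        self._tables: List[Dict[str, int]] = [{} for _ in range(stripes)]

    def _stripe(self, key: str) -> int:
        # Keys that hash to different stripes never wait on the same lock.
        return hash(key) % len(self._locks)

    def increment(self, key: str) -> None:
        index = self._stripe(key)
        with self._locks[index]:
            table = self._tables[index]
            table[key] = table.get(key, 0) + 1

    def get(self, key: str) -> int:
        index = self._stripe(key)
        with self._locks[index]:
            return self._tables[index].get(key, 0)


def record_hits(counter: StripedCounter, urls: Iterable[str], workers: int = 8) -> None:
    """Record one hit per URL from several threads at once."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(counter.increment, urls))
```

With one stripe, the class behaves like a structure protected by a global lock. Raising the stripe count reduces how often unrelated keys contend for the same lock.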

In conclusion, remember that performance tuning is an ongoing process. As the contents of your environment evolve, continue to monitor its performance, making one adjustment at a time. Continue to tune and stress test until you find the right combination of adjustments that works best for your Web site.