You spent HOW much on our new server and the app slowed DOWN??!!

I had a support case recently where the customer had moved their server farm onto brand new hardware, each server with lots of CPUs. At the same time they had taken the operating system from Windows Server 2003 to Windows Server 2008 R2. I forget how many CPUs they had but let’s just say that their task manager looked something like this:


(This is not a screenshot from my desktop machine at work I hasten to add. Oh, if only!)

They were understandably disappointed to find that a key web service had degraded from about 0.2ms to 0.4ms response time.

Now what they had noticed was that CPU#0 was getting more than its fair share of the work, peaking near 100% a lot of the time. The other curious thing they had found was that if they disabled CPU#0 for the w3wp.exe hosting the application while the process was running, the problem went away and performance improved. [You can do this by right-clicking the process on the Processes tab in Task Manager and selecting “Set affinity”. This is something I would strongly recommend against doing in the normal course of events, but in this case it was a useful diagnostic step.] But they also found that if they permanently disabled use of CPU#0 for that application pool, by setting the processor affinity mask in the application pool’s advanced properties, the high CPU just shifted onto CPU#1.
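If you have not worked with affinity masks before, the idea is simply one bit per logical CPU, with bit 0 standing for CPU#0. Here is a minimal Python sketch of how such a mask could be worked out. The function name and the 16-CPU figure are my own illustrative assumptions, not anything taken from the customer's configuration.

```python
def affinity_mask(total_cpus, excluded=()):
    """Build a bitmask with one bit per logical CPU (bit n represents CPU#n).

    Hypothetical helper for illustration only: CPUs listed in `excluded`
    have their bit left clear, every other CPU has its bit set.
    """
    mask = 0
    for cpu in range(total_cpus):
        if cpu not in excluded:
            mask |= 1 << cpu
    return mask


if __name__ == "__main__":
    # Assumed example: a 16-CPU box with CPU#0 taken out of the set.
    print(hex(affinity_mask(16, excluded={0})))  # prints 0xfffe
```

As the customer found, clearing bit 0 only moves the hot spot to the next CPU in the mask, so this is a diagnostic aid rather than a fix.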

The other thing that had been observed was that the performance counter “.NET CLR Memory\% Time in GC” was now around 40%, whereas on the old servers it had been around 2%. Not good.

Anyway, in the end we got the debugger attached (of course!). Now the .NET Garbage Collector (GC), when running in server mode, which is what ASP.NET uses on multi-processor machines, creates a dedicated thread per CPU. As most real-world applications tend to allocate objects quite liberally and leave them for the GC to clear up, it is not that uncommon to see the GC threads at the top of the list in !runaway output. For example, on a machine with 4 logical CPUs it might look like this:

0:000> !runaway
User Mode Time
  Thread       Time
  26:56c       0 days 0:05:10.328
  38:488       0 days 0:05:10.750
  37:be4       0 days 0:05:07.328
  39:dc8       0 days 0:04:37.796
  48:acc       0 days 0:00:27.484
  31:1144      0 days 0:00:22.156

You can see the top 4 threads have used up considerably more user mode time than subsequent threads. This is normal and reasonable.

In my customer’s case however it looked more like this:

0:000> !runaway
User Mode Time
  Thread       Time
  26:56c       0 days 0:07:10.328
  38:488       0 days 0:00:10.750
  37:be4       0 days 0:00:07.328
  39:dc8       0 days 0:00:37.796
  48:acc       0 days 0:00:27.484
  31:1144      0 days 0:00:22.156

The top thread (which was indeed a GC thread) was chewing up much more time than any other.
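If you want to put a number on that kind of skew rather than eyeball it, a small script over the pasted !runaway text will do. The following is only a sketch: it assumes the "N:tid  D days H:MM:SS.mmm" layout shown in this post, the function names are made up for the example, and the share is worked out across just the threads listed in the text you feed it.

```python
import re

# Matches lines such as "  26:56c       0 days 0:07:10.328" (layout assumed from this post).
ROW = re.compile(
    r"^\s*(\d+):([0-9a-fA-F]+)\s+(\d+) days (\d+):(\d+):(\d+\.\d+)\s*$"
)


def parse_runaway(text):
    """Return (thread_number, user_mode_seconds) pairs from !runaway text."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header and prompt lines simply do not match
            num, _tid, days, hours, minutes, seconds = match.groups()
            total = (int(days) * 86400 + int(hours) * 3600
                     + int(minutes) * 60 + float(seconds))
            rows.append((int(num), total))
    return rows


def top_thread_share(rows):
    """Fraction of the listed user-mode time consumed by the busiest thread."""
    total = sum(seconds for _, seconds in rows)
    return max(seconds for _, seconds in rows) / total if total else 0.0


# The illustrative figures from the customer's case above.
SAMPLE = """\
0:000> !runaway
User Mode Time
  Thread       Time
  26:56c       0 days 0:07:10.328
  38:488       0 days 0:00:10.750
  37:be4       0 days 0:00:07.328
  39:dc8       0 days 0:00:37.796
  48:acc       0 days 0:00:27.484
  31:1144      0 days 0:00:22.156
"""

if __name__ == "__main__":
    share = top_thread_share(parse_runaway(SAMPLE))
    print(f"busiest thread share: {share:.0%}")
```

With balanced server GC threads you would expect the busiest thread to hold a share roughly comparable to its peers; a single thread holding most of the time, as here, is the pattern to look for. Where exactly to draw the line is a judgment call.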

This rang some bells at the back of my mind and I remembered this fix:

A hotfix is available that resolves the System.InsufficientMemoryException exception and enhances the heap balancing on a computer that has over 8 processors for the .NET Framework 2.0 Service Pack 2

Don’t be fooled by the title. If you look at the article you’ll see the hotfix release addresses two issues at the same time [hotfixes are cumulative, so when you install one you always get lots of earlier fixes anyway; it is just that each hotfix release usually adds only one new fix to the mix]. Here is the description of the one that matters here:

Issue 2

You run a .NET Framework 2.0-based application on a computer that has more than 8 logical processors. The computer uses the server garbage collector. In this case, you may experience a memory issue caused by an unbalanced workload in different processors. For example, the application runs slower than when you run the application on a computer that has 8 logical processors.

That certainly seemed to fit the bill.

We installed the fix on the customer’s server and the issue was resolved!