High CPU in .NET app using a static Generic.Dictionary

A couple of weeks ago I helped out on a high CPU issue in an ASP.NET application.

Problem description

Every so often they saw very slow response times, and in some cases the app didn’t respond at all. At the same time, the w3wp.exe process was sitting at very high CPU usage (80-90%). This started happening under high load, and to get the application to respond again they needed to restart IIS.

Debugging the problem

They gathered a few memory dumps during the high CPU situation for us to review. When we ran ~* e !clrstack in windbg (with sos.dll loaded) to see what all the threads were doing, we found that they were all stuck in call stacks similar to this one:

 OS Thread Id: 0x27dc (124) 
  ESP       EIP     
2f77ed24 795b3c5c System.Collections.Generic.Dictionary`2[[System.Int32, mscorlib],[System.__Canon, mscorlib]].FindEntry(Int32) 
2f77ed3c 795b3835 System.Collections.Generic.Dictionary`2[[System.Int32, mscorlib],[System.__Canon, mscorlib]].ContainsKey(Int32) 
2f77ed40 209f1932 MyComponent.Settings.get_Current() 
... 
SOME STACK FRAMES REMOVED AS THEY ARE NOT IMPORTANT FOR THIS ISSUE 
... 
2f77f0a4 209f7545 ASP.MyApp_default_aspx.ProcessRequest(System.Web.HttpContext) 
2f77f0a8 65fe6bfb System.Web.HttpApplication+CallHandlerExecutionStep.System.Web.HttpApplication.IExecutionStep.Execute() 
2f77f0dc 65fe3f51 System.Web.HttpApplication.ExecuteStep(IExecutionStep, Boolean ByRef) 
2f77f11c 65fe7733 System.Web.HttpApplication+ApplicationStepManager.ResumeSteps(System.Exception) 
2f77f16c 65fccbfe System.Web.HttpApplication.System.Web.IHttpAsyncHandler.BeginProcessRequest(System.Web.HttpContext, System.AsyncCallback, System.Object) 
2f77f188 65fd19c5 System.Web.HttpRuntime.ProcessRequestInternal(System.Web.HttpWorkerRequest) 
2f77f1bc 65fd16b2 System.Web.HttpRuntime.ProcessRequestNoDemand(System.Web.HttpWorkerRequest) 
2f77f1c8 65fcfa6d System.Web.Hosting.ISAPIRuntime.ProcessRequest(IntPtr, Int32) 
2f77f3d8 79f047fd [ContextTransitionFrame: 2f77f3d8] 
2f77f40c 79f047fd [GCFrame: 2f77f40c] 
2f77f568 79f047fd [ComMethodFrame: 2f77f568]

In other words, the method MyComponent.Settings.get_Current() was calling ContainsKey on a Generic.Dictionary object and for some reason it was getting stuck when trying to find the entry.

Looking at the MyComponent.Settings.get_Current() method, we found that the Generic.Dictionary it was calling ContainsKey on was a static dictionary and that all threads were working on the same dictionary.

The MSDN documentation about Generic.Dictionary has the following information about the thread safety of Dictionary objects:

A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

What is happening here, and what causes the high CPU, is that the FindEntry method walks through the entries in the dictionary, trying to find the key. As the documentation says, concurrent readers on their own are fine. But if another thread modifies the dictionary while FindEntry is running, the reader can see the internal structure in an inconsistent state and end up in an infinite loop. Each thread stuck this way keeps spinning, which gives the high CPU behavior, and the process may hang.
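To make the hazard concrete, here is a minimal sketch of the kind of access pattern that can lead to this. It is not the customer's code: the class name UnsafeSettings, the SiteId field and the GetCurrent(int) method are hypothetical, chosen only to match the int key and reference-type value visible in the stack above.

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration only; not the code from the dump.
public class UnsafeSettings
{
    public int SiteId;

    // One dictionary shared by every request thread in the process.
    private static readonly Dictionary<int, UnsafeSettings> cache =
        new Dictionary<int, UnsafeSettings>();

    // NOT thread safe: ContainsKey (and therefore FindEntry) can run on one
    // thread while another thread is in the middle of Add on the same dictionary.
    public static UnsafeSettings GetCurrent(int siteId)
    {
        if (!cache.ContainsKey(siteId))
        {
            UnsafeSettings settings = new UnsafeSettings();
            settings.SiteId = siteId;
            cache.Add(siteId, settings);   // unsynchronized write
        }
        return cache[siteId];
    }
}
```

On a single thread this behaves as expected. The problem only shows up when several threads call it at the same time, which is why it tends to appear under load and not in testing.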

Resolution

This type of timing issue with static collections is fairly common in ASP.NET apps under high load.

To resolve this timing issue, synchronize (lock) around access to the dictionary whenever there is a possibility of multiple writers working at the same time, or of a write happening while another thread is reading or enumerating the same dictionary. Note that the readers need to take the same lock as the writers; locking only the writes is not enough.
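As a rough sketch of what that synchronization can look like, the example below guards every read and write of a static dictionary with one private lock object. The names SynchronizedSettings, cacheLock and GetCurrent(int) are made up for illustration, and this is one simple way to do it, not the fix the customer applied.

```csharp
using System.Collections.Generic;

// Hypothetical type for illustration only.
public class SynchronizedSettings
{
    public int SiteId;

    // Private object used only for locking, so no outside code can lock on it.
    private static readonly object cacheLock = new object();

    private static readonly Dictionary<int, SynchronizedSettings> cache =
        new Dictionary<int, SynchronizedSettings>();

    public static SynchronizedSettings GetCurrent(int siteId)
    {
        // The lookup and the insert happen under the same lock, so no thread
        // can be reading the dictionary while another thread modifies it.
        lock (cacheLock)
        {
            SynchronizedSettings settings;
            if (!cache.TryGetValue(siteId, out settings))
            {
                settings = new SynchronizedSettings();
                settings.SiteId = siteId;
                cache.Add(siteId, settings);
            }
            return settings;
        }
    }
}
```

A single lock serializes all access, which is usually fine for a small settings lookup. If reads vastly outnumber writes and the lock becomes a bottleneck, a reader/writer lock is an option, at the cost of more complex code.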

In general, I recommend always reading the thread safety information carefully when using static collections, as many of them require you to implement synchronization around concurrent read/write operations. Doing so avoids this type of issue, as well as issues with, for example, Hashtable, where you may get exceptions like InvalidOperationException “Load Factor Too High”.

Have a good one,

Tess