Debugging a hung application with WinDbg
This article has been moved to its new home here: https://benperk.github.io/msdn/2013/2013-01-debugging-a-hung-application-with-windbg.html
There are many reasons that threads can be blocked causing an application to hang. This is an example of one of those situations. I loaded a memory dump into WinDbg and loaded the PSSCOR2 extension. When I executed !syncblk, I received the following result shown in Table 1.
|0:000> !syncblkIndex SyncBlock MonitorHeld Recursion Owning Thread Info SyncBlock Owner546 16e2f11c 9 1 0a1df058 23a78 18 02b133e4 System.ObjectWaiting threads: 0 24 77 80|
Table 1, !syncblk result
To me, the result means that Thread 18 is blocking threads 0, 24, 77 and 80. I execute !threads and get the result shown in Table 2. For this example, I restricted the output just to those threads.
|0:018> !threadsID PreEmptive GC Domain Lock Count APT0 Enabled 0066ec18 0 MTA (Threadpool Worker)18 Enabled 0066ec18 1 MTA (Threadpool Worker)24 Enabled 0066ec18 0 MTA (Threadpool Worker)77 Enabled 0066ec18 0 MTA (Threadpool Worker)80 Enabled 0066ec18 0 MTA (Threadpool Worker)|
Table 2, !threads result, reduced for this example
Next I want to see what is happening on one of the threads that is being blocked. I might be able to find what the thread is waiting on. So I enter a ~80s command which gets that thread in focus and then look at the call stack as illustrated in Table 3. What I notice is a Generic Dictionary that is being searched using the ContainsKey() method.
|0:080> kChildEBP RetAddr2ca8ebfc 76c315e9 ntdll!NtWaitForMultipleObjects+0x152ca8ec98 76061a2c KERNELBASE!WaitForMultipleObjectsEx+0x1002ca8ece0 59d445a2 kernel32!WaitForMultipleObjectsExImplementation+0xe02ca8ed48 59d441cf mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f2ca8ed68 59d442d8 mscorwks!Thread::DoAppropriateAptStateWait+0x3c2ca8edec 59d4436d mscorwks!Thread::DoAppropriateWaitWorker+0x13c2ca8ee3c 59d444f1 mscorwks!Thread::DoAppropriateWait+0x402ca8ee98 59be5402 mscorwks!CLREvent::WaitEx+0xf72ca8eeac 59d198ea mscorwks!CLREvent::Wait+0x172ca8ef38 59d436f0 mscorwks!AwareLock::EnterEpilog+0x8c2ca8ef54 59d19e78 mscorwks!AwareLock::Enter+0x612ca8efbc 59d19c57 mscorwks!AwareLock::Contention+0x199 2ca8f05c 085c0707 mscorwks!JITutil_MonContention+0xa3 2ca8f098 025324e6 ***_Collections!***.Generic.Dictionary`2.ContainsKey(System.__Canon)+0x4f…|
Table 3, view the call stack using k
Next I want to look at the JIT generated code for mscorwks!JITutil_MonContention+0xa3. Using the !u command, I can view this, as shown in Figure 4. I reduced the output, but you can see the ContainsKey() method is contained within a Monitor.Enter() and a Monitor.Exit() method.
|0:080> !u 085c0707….085c06ff 8b4dcc mov ecx,dword ptr [ebp-34h]085c0702 e89e246251 call mscorwks!JIT_MonEnterWorker (59be2ba5) >>> 085c0707 8b45d4 mov eax,dword ptr [ebp-2Ch]085c070a 8b4804 mov ecx,dword ptr [eax+4]…085c0712 e8e9402670 call Collections.Generic.Dictionary`2.ContainsKey(…)+..0f(…) …085c0737 8b4dcc mov ecx,dword ptr [ebp-34h]085c073a e8e0266251 call mscorwks!JIT_MonExitWorker (59be2e1f) 085c073f 58 pop eax085c0740 ffe0 jmp eax|
Table 4, JIT generated code
Now let’s go check what is happening on the thread blocking 80. Execute ~18s and then k to get the stack shown in Table 5.
|0:018> kChildEBP RetAddr0b1ee410 76c315e9 ntdll!NtWaitForMultipleObjects+0x150b1ee4ac 76061a2c KERNELBASE!WaitForMultipleObjectsEx+0x1000b1ee4f4 59d445a2 kernel32!WaitForMultipleObjectsExImplementation+0xe00b1ee55c 59d441cf mscorwks!WaitForMultipleObjectsEx_SO_TOLERANT+0x6f0b1ee57c 59d442d8 mscorwks!Thread::DoAppropriateAptStateWait+0x3c0b1ee600 59d4436d mscorwks!Thread::DoAppropriateWaitWorker+0x13c0b1ee650 59ce4af8 mscorwks!Thread::DoAppropriateWait+0x400b1ee754 788069df mscorwks!WaitHandleNative::CorWaitOneNative+0x1560b1ee770 78806995 mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int64, Boolean)+0x2f0b1ee788 7b4b2231 mscorlib_ni!System.Threading.WaitHandle.WaitOne(Int32, Boolean)+0x250b1ee7a4 7af94c20 ***_ni!***.Control.WaitForWaitHandle(System.Threading.WaitHandle)+0x89…0b1ee960 04e32a2f ***_Collections!***.Generic.Dictionary`2.OnItemChanged.(…)+0xad0b1ee99c 02531d10 ***_Collections!***.Generic.Dictionary`2.set_Item(…)+0x6f|
Table 5, JIT generated code
Hmmm, interesting, it appears that this thread is writing something to the Generic.Dictionary.
Further analysis of the code resulted in identifying the fact that the Generic.Dictionary was static and all the threads in this scope where accessing the same dictionary.
Check out this article where it discusses the thread safety of a Dictionary. Basically, a dictionary can support concurrent readers are long as the collection is not modified. Which we see in this case, that the dictionary is being modified which is blocking the other threads that are trying to read from it.
The solution is to synchronize your threads, making sure that there will not be a case where the dictionary is written to and read from at the same time.