Implications of using a helper thread for debugging
What does it mean?
I mentioned in a previous post (http://blogs.msdn.com/jmstall/archive/2004/10/10/240452.aspx) that the CLR debugging services use an “in-process model”, which means a helper thread runs in the same process as the execution engine (EE) and provides debugging information at runtime. Contrast this with an “out-of-process” model, where the debugger operates entirely from a separate process with little cooperation from the debuggee.
The helper thread spends most of its life sitting in a message loop waiting for requests from the debugger. When it gets a request, it can directly access the CLR’s data structures to compute an answer for the request, and then it sends a response back to the debugger.
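To make the request/response loop concrete, here is a minimal sketch in Python. It is only an illustration of the pattern, not the CLR's actual code: the names `RUNTIME_STATE`, `helper_thread`, and `ask` are hypothetical, and a pair of in-memory queues stands in for the real channel between the debugger and the debuggee.

```python
import queue
import threading

# Hypothetical stand-in for runtime data structures that live inside the debuggee.
RUNTIME_STATE = {"thread_count": 3, "loaded_modules": ["app.exe", "mscorlib.dll"]}


def helper_thread(requests: queue.Queue, responses: queue.Queue) -> None:
    """Message loop: block until a request arrives, answer it, repeat."""
    while True:
        request = requests.get()  # idle here until the debugger asks for something
        if request == "shutdown":
            break
        # Same process as the data, so the helper can read it directly.
        responses.put(RUNTIME_STATE.get(request, "unknown request"))


def ask(requests: queue.Queue, responses: queue.Queue, what: str):
    """Debugger side: send one request and wait for the reply."""
    requests.put(what)
    return responses.get(timeout=1.0)


requests_q: queue.Queue = queue.Queue()
responses_q: queue.Queue = queue.Queue()
worker = threading.Thread(
    target=helper_thread, args=(requests_q, responses_q), daemon=True
)
worker.start()

thread_count = ask(requests_q, responses_q, "thread_count")
modules = ask(requests_q, responses_q, "loaded_modules")

requests_q.put("shutdown")
worker.join(timeout=1.0)
```

The point to notice is that the debugger side never touches `RUNTIME_STATE` itself. Every answer depends on the helper thread being alive and able to run.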
Having a helper thread and an in-process model has several advantages:
- Maintainability: The helper thread can reuse the same code that the rest of the EE uses to manipulate the data structures. It’s also considerably easier to update EE data structures from a helper thread than from out-of-process.
- Performance: The helper thread is running in the same process as the EE, so there’s less overhead compared to obtaining the same information from out of process.
- Cooperating with the GC: It’s easier to synchronize with the Garbage Collector (e.g., take locks, manipulate synchronization objects) from within the same process.
Having a helper-thread model also has many disadvantages:
1) Can’t debug managed dumps (mini/heap/full). The helper thread requires code to run in the debuggee. Code can only run in a live process, not in a dump. Thus the helper thread is unavailable for dump-debugging.
2) Makes interop-debugging very problematic. Interop-debugging becomes much less stable and some scenarios become completely broken. This is because interop-debugging combines native debugging (out-of-process model) with managed debugging (in-process model), and the two don’t mix. This is the top reason why interop-debugging in VS 2002 / 2003 is both very slow and prone to deadlock. (GreggM gave a nice overview of the problems at http://blogs.msdn.com/greggm/archive/2004/01/23/62455.aspx.)
3) Larger Heisenberg effect. Even when it looks like the debuggee is stopped (at a breakpoint, for example), the helper thread is still running inside of the debuggee to service managed requests. This can become problematic when trying to debug stress runs.
4) Less stable. If the debuggee is sufficiently corrupted, the functionality needed by the helper thread may not work properly. For example:
a. A memory corruption in the debuggee may break key data structures needed by the helper thread.
b. Debugging out-of-memory scenarios can be compromised because the helper thread may not have enough memory to execute.
c. There are many random corner-case scenarios that affect threads running in a process. For example, if the helper thread gets suspended for whatever reason, the debugger will hang. Also, the helper will run arbitrary code for all DllMain routines in the debuggee. If any of those DllMain routines call managed code, then there won’t be a helper thread available to help debug the managed code running on the real helper.
d. If the helper blocks on anything (or calls anything, like an OS API, that blocks on something), the debugger will deadlock.
5) Extra thread in every managed app, even when you’re not debugging. Launching the helper thread lazily, only once debugging starts, wouldn’t solve this: the helper thread would still be missing in the attach scenario.
6) Can’t debug the runtime itself (e.g., mscorwks.dll). For example, you can’t step-in from managed code into mscorwks because you can’t debug any code in the runtime that the helper thread would need to run. Since the runtime is unmanaged code, this is really a subset of interop-debugging. See here for details.
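The hang described in 4c and 4d can be sketched with the same kind of toy model. Everything here is hypothetical (`debug_session`, `runtime_lock`, the `"get_stack"` request), and the timeout exists only so the sketch terminates. A real debugger waiting on a blocked helper would simply hang.

```python
import queue
import threading


def debug_session(helper_is_blocked: bool, timeout: float = 1.0) -> str:
    """Send one request to a toy helper thread and report what comes back."""
    runtime_lock = threading.Lock()  # hypothetical lock guarding runtime data
    requests: queue.Queue = queue.Queue()
    responses: queue.Queue = queue.Queue()

    def helper() -> None:
        requests.get()
        # The helper needs the lock to compute its answer. If a frozen
        # debuggee thread owns it, the helper waits here forever.
        with runtime_lock:
            responses.put("stack trace")

    if helper_is_blocked:
        # Models a debuggee thread that was stopped while holding the lock.
        runtime_lock.acquire()

    threading.Thread(target=helper, daemon=True).start()
    requests.put("get_stack")
    try:
        return responses.get(timeout=timeout)
    except queue.Empty:
        return "debugger hung: helper never replied"


healthy = debug_session(helper_is_blocked=False)
blocked = debug_session(helper_is_blocked=True)
```

In the healthy case the helper answers immediately. In the blocked case nothing is wrong with the debugger itself, yet it gets no answer, because its only source of information is a thread that cannot make progress.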
Why did the CLR choose an inproc model?
There’s a long list of problems with the inproc model, so why did we do it that way? It was actually a hotly contested decision with some very strong resistance. Some of the reasons for going with inproc were:
- Hindsight is 20/20. This decision was made over 6 years ago, and the key scenarios where inproc breaks down weren’t nearly as important at that time. Dump debugging wasn’t nearly as popular. Interop-debugging wasn’t even on the table yet.
- Other similar debugging services, such as script debugging, were successfully inproc.
- There were serious concerns about the maintainability of an out-of-proc solution.