Debugging Multiple CLRs (InProc SxS)

Introduction

In the previous post, we looked at how multiple CLRs can be loaded in the same process (CLR InProc SxS).

In this post, we’ll build on the example from that post to look at how we can do debugging of processes that use multiple CLRs. We’ll focus on CLR4 with CLR2, because CLR 1.x has been largely replaced by new versions (though the same considerations should apply to CLR4 with CLR 1.x).

We’ll focus on the two most popular ways to debug managed processes: Visual Studio and WinDbg + the SOS debugger extension. Visual Studio has  friendlier user interface and is easier to use, WinDbg + SOS allows more in-depth debugging, and is especially versed towards dump analysis.

Visual Studio 2012

With reference to the sample code, if we start ConsoleApplication3 under the debugger, and we let it execute until both CLRs are loaded (Console.ReadKey() statement in Main()), everything seems to work fine as we debug.

If we look at the modules window, however, we see the following:

Modules

The modules in the window are assemblies loaded by the CLR2. ClassLibrary3.dll, loaded by the CLR4, is notably missing here.

If you set breakpoints inside ClassLibrary3.dll, for example in Class1.DoNothing()), the breakpoint won’t be hit. All these symptoms have the same common root: only the CLR2 debugger is active. This is because the process started as a CLR2 process (ConsoleApplication3 binds to CLR2) and, consequently, Visual Studio loaded the “right” CLR debugger.

If we attach Visual Studio to a running process, we do have a way to manually choose the Visual Studio debuggers, by using the Debug…Attach to Process menu item.

AttachToProcess

So we run the process again, outside of Visual Studio (Ctrl+F5), and we attach later on and select the debuggers. Note however that you can’t select both CLR2 and CLR4:

MultipleDebuggers

Note however that, once you attached to a process, you can’t further change the debugger during the same debugging session: the user interface of the Attach To Process dialog grays out the currently debugged process.

If we restart the process and attach using the CLR4 debugger this time, we have a different managed view of the debuggee. For instance, this is the set of managed modules:

Modules CLR4

Along the same lines: a breakpoint in ClassLibrary3.dll (using CLR4) will be hit, but a breakpoint in ConsoleApplication3.exe (using CLR2) will not.

Note: you may have noticed that mscorlib.dll shows up in the list of modules when the CLR2 is being debugged, and when CLR4 is being debugged. Note however that they are different modules: version and timestamp are different.

Note: you can also enable native debugging (step into native code, setting breakpoints on native code, etc.) by selecting the “Enable Native Code Debugging” check box in the Debug tab of the startup project (ConsoleApplication3 in this case). With this sample, this may be handy if you want to follow the thread of execution from ConsoleApplication3 to ATLProject1, down to Classlibrary3. With Visual Studio 2012, this works for 32-bit builds, not for 64-bit builds (mixed-mode debugging is not supported for 64-bit with earlier versions of the .Net Framework).

Exceptions

The same story applies to exceptions: only exceptions raised by a debugged CLR are recognized by Visual Studio. Since you can debug at most one CLR at a time, this means that you don’t have a way to break the debugger on every managed exception. This is a limitation especially if you are troubleshooting a complex issue and any exception can put you on the right track.

WinDbg with SOS

If we attach WinDbg to a running process using both CLR2 and CLR4, we’ll see that both CLR dlls are loaded:

 0:008> lm v mclr
start             end                 module name
00007ffc`6e020000 00007ffc`6e9b8000   clr        (deferred)             
    Image path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\clr.dll
    Image name: clr.dll
    Timestamp:        Tue Aug 06 01:43:57 2013 (520038BD)
    CheckSum:         0099DECA
    ImageSize:        00998000
    File version:     4.0.30319.33440
    Product version:  4.0.30319.33440
    File flags:       8 (Mask 3F) Private
    File OS:          4 Unknown Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® .NET Framework
    InternalName:     clr.dll
    OriginalFilename: clr.dll
    ProductVersion:   4.0.30319.33440
    FileVersion:      4.0.30319.33440 built by: FX45W81RTMREL
    PrivateBuild:     DDBLD106
    FileDescription:  Microsoft .NET Runtime Common Language Runtime - WorkStation
    LegalCopyright:   © Microsoft Corporation.  All rights reserved.
    Comments:         Flavor=Retail
 0:008> lm v mmscorwks
start             end                 module name
00007ffc`72010000 00007ffc`729af000   mscorwks   (deferred)             
    Image path: C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscorwks.dll
    Image name: mscorwks.dll
    Timestamp:        Wed Aug 14 06:55:49 2013 (520B0DD5)
    CheckSum:         0098EB16
    ImageSize:        0099F000
    File version:     2.0.50727.7905
    Product version:  2.0.50727.7905
    File flags:       0 (Mask 3F)
    File OS:          4 Unknown Win32
    File type:        2.0 Dll
    File date:        00000000.00000000
    Translations:     0409.04b0
    CompanyName:      Microsoft Corporation
    ProductName:      Microsoft® .NET Framework
    InternalName:     mscorwks.dll
    OriginalFilename: mscorwks.dll
    ProductVersion:   2.0.50727.7905
    FileVersion:      2.0.50727.7905 (win9rel.050727-7900)
    FileDescription:  Microsoft .NET Runtime Common Language Runtime - WorkStation
    LegalCopyright:   © Microsoft Corporation.  All rights reserved.
    Comments:         Flavor=Retail

SOS is the debugger extension used by native debuggers like WinDbg to debug managed code. Each version of the CLR ships with its own SOS.dll (in the same folder of the CLR dll), and it can be loaded with the .loadby command (for user dumps coming from a different machine,  .loadby may not always work, depending on the path of the .Net Framework on the target machine and on the bitness of the target process, see this post for more details):

 0:008> .loadby sos clr<br>0:008> .loadby sos mscorwks<br>0:008> .chain
...
Extension DLL chain:

C:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos: image 2.0.50727.7905, API 1.0.0, built Wed Aug 14 07:06:07 2013
[path: C:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos.dll]
C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos: image 4.0.30319.33440, API 1.0.0, built Tue Aug 06 01:33:29 2013
[path: C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.dll]

     dbghelp: image 6.13.0015.1725, API 6.3.6, built Sat Feb 16 04:34:59 2013
        [path: C:\dbginternal\dbghelp.dll]
    ext: image 6.13.0015.1825, API 1.0.0, built Sat May 04 11:27:44 2013
        [path: C:\Users\carcolo\AppData\Local\DBG\ExtLibrary\x64\ext\6.13.15.1825\winext\ext.dll]
    exts: image 6.13.0015.1725, API 1.0.0, built Sat Feb 16 04:37:51 2013
        [path: C:\dbginternal\WINXP\exts.dll]
    uext: image 6.13.0015.1725, API 1.0.0, built Sat Feb 16 04:37:47 2013
        [path: C:\dbginternal\winext\uext.dll]
    ntsdexts: image 6.3.9321.0, API 1.0.0, built Sat Feb 16 04:37:58 2013
        [path: C:\dbginternal\WINXP\ntsdexts.dll]

However, once you start issuing commands, something strange happens (note that we are using the full path of the debugger extension, in order to distinguish SOS for CLR2 from SOS for CLR4:

 0:008> !C:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos.threads
ThreadCount: 2
UnstartedThread: 0
BackgroundThread: 1
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
                                              PreEmptive                                                Lock
       ID OSID        ThreadOBJ     State   GC     GC Alloc Context                  Domain           Count APT Exception
   0    1 1d80 000000000118b750      a020 Enabled  00000000030e0160:00000000030e1fd0 0000000001181a80     0 MTA
   2    2 3008 0000000001199c80      b220 Enabled  0000000000000000:0000000000000000 0000000001181a80     0 MTA (Finalizer)
 0:008> !C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.threads
Failed to load data access DLL, 0x80004005
Verify that 1) you have a recent build of the debugger (6.2.14 or newer)
            2) the file mscordacwks.dll that matches your version of clr.dll is 
                in the version directory or on the symbol path
            3) or, if you are debugging a dump file, verify that the file 
                mscordacwks_<arch>_<arch>_<version>.dll is on your symbol path.
            4) you are debugging on supported cross platform architecture as 
                the dump file. For example, an ARM dump file must be debugged
                on an X86 or an ARM machine; an AMD64 dump file must be
                debugged on an AMD64 machine.

You can also run the debugger command .cordll to control the debugger's
load of mscordacwks.dll.  .cordll -ve -u -l will do a verbose reload.
If that succeeds, the SOS command should work on retry.

If you are debugging a minidump, you need to make sure that your executable
path is pointing to clr.dll as well.

In this case, the !threads command works fine for CLR2, but it gives the “failed to load data access DLL” error for CLR4. A bing search can show more about the data access (DAC) dll, mscordacwks.dll. With multiple CLRs, we need to take into consideration that we also have multiple DAC dlls and, when using the .cordll command as suggested by the error message above, we need to specify for which CLR we are loading the DAC dll. Fortunately, .cordll has a “-I” option for that. So we need to load the DAC dll for CLR4:

 0:008> .cordll-I clr -ve -u -l
Automatically loaded SOS Extension
CLRDLL: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v4.0.30319\mscordacwks.dll
CLR DLL status: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v4.0.30319\mscordacwks.dll

Good, we loaded the DAC dll for CLR4, so commands work now:

 0:008> !C:\Windows\Microsoft.NET\Framework64\v4.0.30319\sos.threads
ThreadCount:      2
UnstartedThread:  0
BackgroundThread: 2
PendingThread:    0
DeadThread:       0
Hosted Runtime:   no
                                                                                                        Lock  
       ID OSID ThreadOBJ           State GC Mode     GC Alloc Context                  Domain           Count Apt Exception
   0    1 1d80 000000000120a360    28220 Preemptive  000000001C6D5308:000000001C6D5FD0 00000000011fd530 0     MTA 
   7    2 15b0 0000000001219060    2b220 Preemptive  0000000000000000:0000000000000000 00000000011fd530 0     MTA (Finalizer) 

Note the output of the !threads commands against CLR2 and CLR4: they reflect the different view that each CLR has on the process: thread 0 is known to both CLR2 and CLR4, because this thread has executed code in both. CLR2 also knows thread 2, which is unknown to CLR4. This is because this thread so far has executed CLR2 code only. Conversely, thread 7 has executed CLR4 code only.

Also note that, for thread 0, the thread object is different: 000000000118b750 for CLR2, 000000000120a360 for CLR4. This is because each CLR keeps its own runtime data structures, distinct from each other: therefore each CLR has its own thread object, both referencing the same OS thread. The same applies to other runtime data structures: AppDomains, Types, Method Tables, Method Descriptors, and so on.

If we enter again the !threads command for CLR2, however, we face the “unexpected”:

 0:008> !C:\Windows\Microsoft.NET\Framework64\v2.0.50727\sos.threads
Failed to request ThreadStore

Loading the CLR4 DAC caused unload of the CLR2 DAC, hence the SOS commands directed at CLR2 no longer work. We can reload the CLR2 DAC manually:

 0:008> .cordll -I mscorwks -ve -u -l
Automatically loaded SOS Extension
CLRDLL: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscordacwks.dll
CLR DLL status: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscordacwks.dll

This will make the SOS for CLR2 work, but the SOS for CLR4 will stop working.

The reason of this behavior is that only one DAC dll can be loaded in the debugger at a given time.

You can load the DAC that you need at a given time. For instance, with the CLR4 DAC loaded, the call stack of thread 0 looks like this:

 0:000> kL
Child-SP          RetAddr           Call Site
00000000`0109e878 00007ffc`7b3d525e ntdll!ZwDeviceIoControlFile+0xa
00000000`0109e880 00007ffc`7b485400 KERNELBASE!ConsoleCallServerGeneric+0xe6
00000000`0109e9e0 00007ffc`7b49c318 KERNELBASE!GetConsoleInput+0x38820
00000000`0109ea70 00007ffc`722ca7d7 KERNELBASE!ReadConsoleInputA+0x14
00000000`0109eab0 00007ffc`7191a727 mscorwks!DoNDirectCall__PatchGetThreadCall+0x7b

00000000`0109eb50 00007ffc`7197007d mscorlib_ni+0x8da727
00000000`0109ec90 00007ffc`7196fee4 mscorlib_ni+0x93007d
00000000`0109ed40 00007ffc`12a001c6 mscorlib_ni+0x92fee4
00000000`0109ed80 00007ffc`722cafb2 0x00007ffc`12a001c6

 
00000000`0109ee30 00007ffc`721e0ec3 mscorwks!CallDescrWorker+0x82
00000000`0109ee80 00007ffc`726aa131 mscorwks!CallDescrWorkerWithHandler+0xd3
00000000`0109ef20 00007ffc`72239643 mscorwks!MethodDesc::CallDescr+0x2b1
00000000`0109f160 00007ffc`72103e84 mscorwks!ClassLoader::RunMain+0x22b
00000000`0109f3c0 00007ffc`7279693d mscorwks!Assembly::ExecuteMainMethod+0xbc
00000000`0109f6b0 00007ffc`720f7a03 mscorwks!SystemDomain::ExecuteMainMethod+0x47d
00000000`0109fc80 00007ffc`721143c4 mscorwks!ExecuteEXE+0x47
00000000`0109fcd0 00007ffc`729b7ced mscorwks!_CorExeMain+0xac
00000000`0109fd30 00007ffc`72a5ea5b mscoreei!_CorExeMain+0xe0
00000000`0109fd80 00007ffc`7be81611 MSCOREE!_CorExeMain_Exported+0xcb
00000000`0109fdb0 00007ffc`7dd164ad KERNEL32!BaseThreadInitThunk+0xd
00000000`0109fde0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

The highlighted frames are not properly resolved into symbolic information because they are relative to CLR2 assemblies (this can be inferred by the fact that the caller is code in mscorwks).

Once we reload the CLR2 DAC, the same command gives a different (better Smile) outcome:

 0:000> kL
Child-SP          RetAddr           Call Site
00000000`0109e878 00007ffc`7b3d525e ntdll!ZwDeviceIoControlFile+0xa
00000000`0109e880 00007ffc`7b485400 KERNELBASE!ConsoleCallServerGeneric+0xe6
00000000`0109e9e0 00007ffc`7b49c318 KERNELBASE!GetConsoleInput+0x38820
00000000`0109ea70 00007ffc`722ca7d7 KERNELBASE!ReadConsoleInputA+0x14
00000000`0109eab0 00007ffc`7191a727 mscorwks!DoNDirectCall__PatchGetThreadCall+0x7b

00000000`0109eb50 00007ffc`7197007d mscorlib_ni!DomainNeutralILStubClass.IL_STUB(IntPtr, InputRecord ByRef, Int32, Int32 ByRef)+0xa7
00000000`0109ec90 00007ffc`7196fee4 mscorlib_ni!System.Console.ReadKey(Boolean)+0xfd
00000000`0109ed40 00007ffc`12a001c6 mscorlib_ni!System.Console.ReadKey()+0x14
00000000`0109ed80 00007ffc`722cafb2 ConsoleApplication3!ConsoleApplication3.Program.Main(System.String[])+0xa6

 00000000`0109ee30 00007ffc`721e0ec3 mscorwks!CallDescrWorker+0x82
00000000`0109ee80 00007ffc`726aa131 mscorwks!CallDescrWorkerWithHandler+0xd3
00000000`0109ef20 00007ffc`72239643 mscorwks!MethodDesc::CallDescr+0x2b1
00000000`0109f160 00007ffc`72103e84 mscorwks!ClassLoader::RunMain+0x22b
00000000`0109f3c0 00007ffc`7279693d mscorwks!Assembly::ExecuteMainMethod+0xbc
00000000`0109f6b0 00007ffc`720f7a03 mscorwks!SystemDomain::ExecuteMainMethod+0x47d
00000000`0109fc80 00007ffc`721143c4 mscorwks!ExecuteEXE+0x47
00000000`0109fcd0 00007ffc`729b7ced mscorwks!_CorExeMain+0xac
00000000`0109fd30 00007ffc`72a5ea5b mscoreei!_CorExeMain+0xe0
00000000`0109fd80 00007ffc`7be81611 MSCOREE!_CorExeMain_Exported+0xcb
00000000`0109fdb0 00007ffc`7dd164ad KERNEL32!BaseThreadInitThunk+0xd
00000000`0109fde0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

The same concept applies to references to managed objects on the stack, to the managed heap and to appdomains: commands like !dso, !dumpheap and !dumpdomain will only show managed references, managed heaps and appdomains of the CLR whose DAC dll is currently loaded in the debugger.

Exceptions

All managed exceptions are implemented in terms of SEH, and the SEH exception code for all clr exceptions is 0xe0434f4d, regardless the CLR version. This is good news, because this means that WinDbg knows about all the CLR exceptions, regardless which DAC dll (if any at all) is loaded.

The following command:

 0:000> sxe clr

causes WInDbg to break the debugged process whenever a managed exception occurs.

Further investigation of the exception requires to load the right DAC dll. For instance:

 0:000> g
(570.1d80): CLR exception - code e0434f4d (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
KERNELBASE!RaiseException+0x68:
00007ffc`7b3d8384 488b8c24c0000000 mov     rcx,qword ptr [rsp+0C0h] ss:00000000`0109eac0=000058c9ee5dddb4
 0:000> kbL
RetAddr           : Args to Child                                                           : Call Site
00007ffc`7220326f : 00000000`0118b750 00000000`0118b750 00000000`0109eb30 00000000`0118b750 : KERNELBASE!RaiseException+0x68
00007ffc`72711cf0 : 

00000000`030e18a8

  00000000`00000000 00000000`00000000 00000000`00000001 : mscorwks!RaiseTheExceptionInternalOnly+0x2ff
00007ffc`12a0020b : 00000000`70000001 00007ffc`128b33d8 00000000`00472054 00000000`011c8a20 : mscorwks!JIT_Throw+0x130
00007ffc`722cafb2 : 00000000`030dce80 00000000`00000000 00007ffc`128b39b8 00007ffc`721980f9 : 0x00007ffc`12a0020b
00007ffc`721e0ec3 : 00000000`00000002 00000000`0000000a 00007ffc`720874f0 00000000`00000000 : mscorwks!CallDescrWorker+0x82
00007ffc`726aa131 : 00000000`0109efb8 00000000`00000000 00000000`0109f1c0 00000000`00000008 : mscorwks!CallDescrWorkerWithHandler+0xd3
00007ffc`72239643 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`0109f420 : mscorwks!MethodDesc::CallDescr+0x2b1
00007ffc`72103e84 : 00000000`00000001 00000000`00000000 00000000`00000030 00000000`00000000 : mscorwks!ClassLoader::RunMain+0x22b
00007ffc`7279693d : 00000000`0109fa10 00000000`00000000 00000000`011a6f58 00000000`00000200 : mscorwks!Assembly::ExecuteMainMethod+0xbc
00007ffc`720f7a03 : 00000000`00000000 00000000`00000000 00000000`00000000 00007ffc`722623d2 : mscorwks!SystemDomain::ExecuteMainMethod+0x47d
00007ffc`721143c4 : ffffffff`fffffffe 00000000`00000000 000049e5`00000000 00000000`00000000 : mscorwks!ExecuteEXE+0x47
00007ffc`729b7ced : ffffffff`ffffffff 00000000`0118b750 00000000`00000000 00000000`0109fcd8 : mscorwks!_CorExeMain+0xac
00007ffc`72a5ea5b : 00000000`00000000 00007ffc`72114318 00000000`00000000 00000000`00000000 : mscoreei!_CorExeMain+0xe0
00007ffc`7be81611 : 00007ffc`729b0000 00000000`00000000 00000000`00000000 00000000`00000000 : MSCOREE!_CorExeMain_Exported+0xcb
00007ffc`7dd164ad : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : KERNEL32!BaseThreadInitThunk+0xd
00000000`00000000 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : ntdll!RtlUserThreadStart+0x1d
 0:000> !do 00000000`030e18a8
<Note: this object has an invalid CLASS field>
Invalid object
 0:000> .cordll -I mscorwks -ve -u -l
Automatically loaded SOS Extension
CLRDLL: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscordacwks.dll
CLR DLL status: Loaded DLL C:\Windows\Microsoft.NET\Framework64\v2.0.50727\mscordacwks.dll
 0:000> !do 00000000`030e18a8
Name: System.ApplicationException
MethodTable: 00007ffc7147c618
EEClass: 00007ffc710804c8
Size: 136(0x88) bytes
 (C:\WINDOWS\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
Fields:
              MT    Field   Offset                 Type VT     Attr            Value Name
00007ffc71477df0  40000b5        8        System.String  0 instance 0000000000000000 _className
00007ffc71476198  40000b6       10 ...ection.MethodBase  0 instance 0000000000000000 _exceptionMethod
00007ffc71477df0  40000b7       18        System.String  0 instance 0000000000000000 _exceptionMethodString
00007ffc71477df0  40000b8       20        System.String  0 instance 00000000030e1870 _message
00007ffc7146f410  40000b9       28 ...tions.IDictionary  0 instance 0000000000000000 _data
00007ffc714780b8  40000ba       30     System.Exception  0 instance 0000000000000000 _innerException
00007ffc71477df0  40000bb       38        System.String  0 instance 0000000000000000 _helpURL
00007ffc714776e0  40000bc       40        System.Object  0 instance 0000000000000000 _stackTrace
00007ffc71477df0  40000bd       48        System.String  0 instance 0000000000000000 _stackTraceString
00007ffc71477df0  40000be       50        System.String  0 instance 0000000000000000 _remoteStackTraceString
00007ffc7147f060  40000bf       70         System.Int32  1 instance                0 _remoteStackIndex
00007ffc714776e0  40000c0       58        System.Object  0 instance 0000000000000000 _dynamicMethods
00007ffc7147f060  40000c1       74         System.Int32  1 instance      -2146232832 _HResult
00007ffc71477df0  40000c2       60        System.String  0 instance 0000000000000000 _source
00007ffc7147a7f8  40000c3       68        System.IntPtr  1 instance                0 _xptrs
00007ffc7147f060  40000c4       78         System.Int32  1 instance       -532459699 _xcode

Summary

It is possible to debug multiple CLRs, but this can be done “one CLR at a time” in WinDbg + SOS. If you know about it and you switch among them, this can nonetheless provide for a good debugging experience. It’s important, however, to be aware that, at any given time during debugging, you have a partial view of what’s going on in the process, relative to one CLR only.

Visual Studio 2012 has the additional limitation of not allowing to switch the debugged CLR within the same debugging session.