I don't know about you, but in my day job I'm bouncing back and forth so much between .NET and Win32® that my head is spinning. In this month's installment of Bugslayer, I want to discuss some very cool advances that Microsoft® has developed to make debugging your Win32-based applications easier. Anything you can do to stamp out those Win32 bugs faster means you can spend more time playing with your XML Web Services!
I'll start out this column by covering the hot new symbol server technology that will revolutionize how you deal with symbols and stack traces. After a tour of the symbol heaven, I'll discuss the new crash dump handling in Visual Studio® .NET as well as the new WinDBG so you can debug crashes after the fact just as if you were there. The last part of the column will be devoted to a utility that will quickly pull the important information out of crash dumps so you don't even have to open them in the debuggers!
Getting the correct symbols lined up between your application and the operating system is the secret to debugging faster. You know what happens when you don't get them coordinated; you get that beautiful call stack that has exactly one item in it. The reason symbols are so vital is that the Frame Pointer Omission (FPO) data is included as part of the PDB file. While you might think you have it tough messing with symbols, imagine how hard the Windows® operating system developers' lives must be. Whereas you have an application you might think is pretty big, the operating system developers have the largest commercial application in the world. I know people on the operating system team at Microsoft and I've asked if they get any help from the users debugging their applications. They all have laughed and told me that they get as much help with the operating system as I got when I was writing developer tools for a living. In other words, none.
Of course, they have many more versions of the operating system running at any given time than you could ever imagine. During a development cycle they might have anywhere up to 10,000 different builds running around the world. If you think you have trouble getting symbols to match, you have nothing on them!
Developers at Microsoft realized they had to do something to make life easier for themselves as well as their customers. Thus was born Symbol Servers. The concept is simple: store all the symbols for all public builds in a known location and make the debuggers smarter so they load the correct symbols without any user interaction. The beauty is that the reality is nearly that simple as well! There are a few small issues, which I'll point out in this column, but with the Symbol Server properly set up, you'll never want for symbols again.
The first step towards symbol nirvana is to download the latest version of WinDBG from https://www.microsoft.com/ddk/debugging as the Symbol Server binaries are developed by the WinDBG team. You will want to check back for updated versions of WinDBG, as the team is on a fairly quick release schedule, with updated versions appearing every few months. After installing WinDBG, add the installation directory to the master PATH environment variable. The two key binaries, SYMSRV.DLL and SYMSTORE.EXE, must be on the path so that you can read from and write to your Symbol Servers.
Figure 1 SYMSRV
The Symbol Store itself is simply a database that happens to use the file system to find the files. Figure 1 shows a partial listing from Explorer of the tree for the Symbol Server on one of my computers. The root directory is WebSymbols, and each symbol file, such as ADVAPI32.PDB, is listed at the first level. Under each symbol file name is a directory that corresponds to the date/time stamp, signature, and other information necessary to completely recognize a particular version of that symbol file. Keep in mind that if you have multiple versions of a file such as ADVAPI32.PDB for different operating system builds, you'll have multiple directories under ADVAPI32.PDB for each unique version you have accessed. In the signature directory, you'll probably have the particular symbol file for that version. There are provisions for having special text files to point to other locations in the Symbol Store, but by following my recommendation, you'll have the actual symbol files.
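To make the layout concrete, here is a minimal sketch of how the path to a stored executable image could be composed. It rests on an assumption you should check against your own store: that the index directory for a PE image is the header's TimeDateStamp as eight uppercase hex digits followed by SizeOfImage in hex. PDB files are keyed on their signature and age instead, which this sketch does not cover. The function name and the sample values are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: builds <root>\<file>\<index>\<file> for a PE image.
// Assumption: the index is TimeDateStamp (%08X) followed by SizeOfImage (%x).
std::string SymbolStorePathForImage(const std::string& root,
                                    const std::string& fileName,
                                    std::uint32_t timeDateStamp,
                                    std::uint32_t sizeOfImage)
{
    assert(!root.empty() && !fileName.empty());

    char index[32];
    std::snprintf(index, sizeof(index), "%08X%x",
                  static_cast<unsigned>(timeDateStamp),
                  static_cast<unsigned>(sizeOfImage));

    // The file name appears twice: once as the first-level directory and
    // once as the actual file inside the version-specific directory.
    return root + "\\" + fileName + "\\" + index + "\\" + fileName;
}
```

Two builds of the same binary differ in their time stamp, so each lands in its own directory under the shared file name, which is why several versions can sit side by side.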
Actually creating your Symbol Server takes two excruciatingly difficult steps. First, create a folder on a server giving everyone in the development team read and write access and ensure that you have plenty of disk space available. Second, share that folder for all developers. You'll probably want the server and share name to be something like \\Symbols\Symbols or something easily remembered.
The absolute beauty of the Symbol Server reveals itself when you populate it with operating system symbols. If you've been a good bugslayer over the years, you are probably already installing the operating system symbols on your machine. That's always been a little frustrating as you probably have a few hot fixes installed and certain operating system symbols never include the hot fix symbols. The great news with Symbol Servers is that you can be guaranteed of always getting the right operating system symbols with no work whatsoever! This is a huge boon.
The magic here is that Microsoft has made the symbols for all released operating systems, from Windows NT® 4.0 through the latest beta release of Windows Server 2003, including all operating system hot fixes, ready for download. To experience the magic, you need to set your _NT_SYMBOL_PATH environment variable to SRV*\\Symbols\Symbols*https://msdl.microsoft.com/download/symbols. Please note that I am assuming that your symbol store will be on a server called \\Symbols in a shared folder called Symbols. If yours are different, just substitute your values.
When you next start debugging, the debuggers will see that _NT_SYMBOL_PATH is set, automatically start downloading the operating system symbols from Microsoft over HTTP, and put them in your Symbol Store if the symbol file has not already been downloaded. Remember, the Symbol Server will only download the symbols it needs, not every single operating system symbol. That's why putting the Symbol Store in a shared directory is so important; if one of your teammates has already downloaded the symbol, you avoid a potentially long download.
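The symbol path is just three fields separated by asterisks: the SRV keyword, your shared store, and the server to fall back on. As a small sketch, the helper below composes that string; the function name is hypothetical, and the share and URL in the usage note are the ones from this column.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds "SRV*<local store>*<upstream server>".
// If upstream is empty, only the local store is searched.
std::string MakeSymbolServerPath(const std::string& localStore,
                                 const std::string& upstream)
{
    // An asterisk inside either field would be read as a separator.
    assert(!localStore.empty());
    assert(localStore.find('*') == std::string::npos);
    assert(upstream.find('*') == std::string::npos);

    std::string path = "SRV*" + localStore;
    if (!upstream.empty())
    {
        path += "*" + upstream;
    }
    return path;
}
```

Calling MakeSymbolServerPath with \\Symbols\Symbols and https://msdl.microsoft.com/download/symbols yields the value to assign to _NT_SYMBOL_PATH. Note that there is no whitespace anywhere in the result.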
That takes care of the appropriate operating system symbols, so let's turn to getting your product symbols into the Symbol Server. SYMSTORE.EXE is a command-line utility that lets you add to your Symbol Store whole directory trees that contain symbols. SYMSTORE.EXE has a number of command-line switches (see Figure 2).
The best way to use SYMSTORE.EXE is to have it automatically add your complete build tree at the end of a daily build or milestone build. You probably do not want to have developers adding their local builds unless you are really into chewing up tons of disk space. For example, the following command will store all PDB and binary files in your symbol store for all directories found under D:\BUILD\RELEASE, inclusive:
symstore add /r /f d:\build\release\*.* /s \\Symbols\Symbols /t "MyApp" /v "Build 632"
It's nice to have the binaries stored in your Symbol Store so your crash dumps can automatically line up the binaries, but you can eat up a lot of disk space doing that. If you only want to include the PDB files, you can use the following:
symstore add /r /f d:\build\release\*.PDB /s \\Symbols\Symbols /t "MyApp" /v "Build 632"
There's lots more to read about SYMSTORE.EXE and Symbol Servers in the WinDBG documentation under Symbols; what I have discussed here are the steps that I have found to work well for me. I've been amazed how well the Symbol Server works and have been able to debug much faster because I nearly always have perfect call stacks.
What's really fantastic about Symbol Servers is that both WinDBG and Visual Studio .NET will use them if you are reading crash dumps as well. Just in case you are coming from one of those other operating systems, crash dumps are what Microsoft calls the user mode dump of the process when it crashes. Dr. Watson, the default debugger, writes crash dumps if you check the Create Crash Dump button shown in Figure 3. As you can guess, crash dumps are almost the next best thing to sitting there watching the application crash.
Figure 3 Create Crash Dump
As most folks realize, WinDBG has been able to read and process crash dumps for quite a while. What might be news though is that Visual Studio .NET can also handle crash dumps perfectly. That's great, because the UI of WinDBG takes minimalism to a new level.
Handling a crash dump is quite easy in Visual Studio .NET, but getting one opened is a little confusing. Start with a fresh instance of Visual Studio .NET and select Open Solution from the File menu. In the File Open dialog, select the fifth item down in the Files of Type combo box, Dump Files (*.dmp; *.mdmp). Navigate to the directory with your crash dump file and open it. That will create a new solution which you'll need to save. To start viewing the crash dump, simply press one of the debugging keys such as F5 (Go) or F10 (Step). You'll see the message box pop up reporting the error and, if you have all the appropriate symbols and source, you'll be dropped right on the line where you had the crash. It's that simple!
Both debuggers can write out crash dumps at any point during debugging. I do this frequently when tracking down tough problems so I can quickly look at the various stages I saw when debugging. This saves huge amounts of time.
Writing a dump from Visual Studio .NET is as simple as clicking on the Debug menu while debugging and selecting the last item on the menu, Save Dump As. Visual Studio .NET can write out two types of crash dumps. The minidump contains module information, such as name and date/time stamp, and the call stacks of all the threads. Minidumps are very small, on the order of 3-10KB. A minidump with heap, on the other hand, writes out the same information but also writes out all the memory marked as allocated memory. This way you can look at what pointer variables point to. Minidumps with heap are quite a bit larger; for simple "Hello World!" programs they're on the order of 2.5MB.
In WinDBG, creating crash dumps is done with the .dump command. One additional feature of WinDBG's crash dumps is that you can also store all the handle data for the process in the crash dump with the .dump /mh command. With the !handle command you can then see the exact state of your handles right from the crash dump. This is invaluable for tracking down deadlocks.
You can even write out your own crash dumps at any time by calling the MiniDumpWriteDump API function from DBGHELP.DLL. Keep in mind that you must use the latest version of DBGHELP.DLL from the WinDBG installation in order for this function to work correctly. The only gotcha is that if you call MiniDumpWriteDump on yourself, your crash dump will start in the middle of MiniDumpWriteDump, which might mean you can't walk the stack back to your own code. Thus, BugslayerUtil.DLL contains a function called CreateCurrentProcessMiniDump that will properly wrap the call to MiniDumpWriteDump so you can get the best crash dumps possible just when you need them.
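For reference, here is a minimal sketch of the direct call. It is not the BugslayerUtil.DLL implementation; WriteSelfMiniDump is a hypothetical name, and because it calls MiniDumpWriteDump on the current thread it has exactly the gotcha described above. It assumes a Microsoft compiler for the #pragma that links DBGHELP.LIB, and the non-Windows branch is only a stub so the sketch compiles elsewhere.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "dbghelp.lib")
#endif

// Hypothetical helper: writes a plain minidump of the calling process to
// dumpPath. Returns true only if the dump file was written.
bool WriteSelfMiniDump(const char* dumpPath)
{
    assert(dumpPath != nullptr);
#ifdef _WIN32
    HANDLE file = CreateFileA(dumpPath, GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE)
    {
        return false;
    }

    // No exception information is supplied, so the dump records every
    // thread as it is right now, including this one inside the call.
    BOOL ok = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(),
                                file, MiniDumpNormal,
                                nullptr, nullptr, nullptr);
    CloseHandle(file);
    return ok != FALSE;
#else
    return false;   // Stub: minidumps are a Win32 feature.
#endif
}
```

When you open a dump written this way, expect the calling thread's stack to start inside DBGHELP.DLL rather than in your own code.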
The Debugging Engine
While it's wonderful to have crash dumps, you always do the same thing when you load them up; you enumerate the threads so you can see where each one is. Since I am basically lazy, I wanted a tool that would just give me the information I always looked for so I didn't even have to start the debugger. I started poking through the docs looking for a way to read dump files and eventually ran across a mention that WinDBG is really a shell on top of a debugging engine. I figured if I could get the interface to that debugging engine, I could easily write a tool to dump the cool stuff. Hidden in the WinDBG installation is a node that says SDK, but is not set to install by default. I set it to "Will be installed on local hard drive" and got the header files and libraries for DBGENG.DLL, the debugger engine.
Figure 4 Setting Up the WinDBG SDK
If you look at Figure 4, which shows what you need to do to install the WinDBG SDK, you'll notice there's not an installation node for Documentation. What makes using DBGENG.DLL fun is that the only documentation is the comment section in the header file DBGENG.H. For the most part, the comments can get you going, but until there's full documentation, you are going to have to spend some time playing with parameters to figure out what some APIs expect (see Figure 5). Oddly, the interface looks as if it's all COM-based. While it uses interfaces, it does not use OLE32.DLL at all. Think of the API as pseudo-COM. It's also pseudo-COM in the sense that you get all the pain of reference counting, but none of the benefits of enumerators and the like.
Another issue with the interface is that it is essentially the internal interface to WinDBG. Some of the interfaces and methods return items in what is obviously internal WinDBG format. Additionally, the engine outputs lots of text messages that could make your application look just like the WinDBG Command window if you don't suppress them. All in all, the fact that there is a debugging engine more than makes up for the quirks in the interface. In Figure 5, I list only the most derived interfaces as it looks like the "2" interfaces are the latest and most complete. Since you can't call CoCreateXxx on the debugging engine interfaces, DBGENG.DLL exports two functions, DebugConnect and DebugCreate, to create the specific interfaces for you.
The best way to get started with the debugging engine is to compile and carefully step through the DUMPSTK sample included with the SDK installation. The only problem is that it doesn't work. DUMPSTK is supposed to dump the call stacks for a dump file. I nearly drove myself nuts wondering why the code did not work as expected.
The key method to get the debugging engine cranking is IDebugControl::WaitForEvent. Whenever DUMPSTK calls it, it always returns E_INVALIDARG. Since it only takes two unsigned longs, the flags to indicate what you are waiting on, such as the initial breakpoint, and the time to wait, I was completely confused. It slowly dawned on me that DBGENG.DLL was complaining that the image path and symbols path were not both set. I set the environment variable _NT_IMAGE_PATH, thinking it might get picked up, and all of a sudden IDebugControl::WaitForEvent started working. There's nothing like returning values that have no relationship at all to the actual error!
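The sequence I ended up with can be sketched as follows. Rather than relying on environment variables, this version sets both paths on IDebugSymbols before waiting; whether other releases of the engine still need that is not something the sketch checks. CanLoadDumpFile is a hypothetical name, the code assumes a Microsoft compiler for __uuidof and the #pragma, and the non-Windows branch is only a stub so the sketch compiles elsewhere.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#include <dbgeng.h>
#pragma comment(lib, "dbgeng.lib")
#endif

// Hypothetical helper: returns true if the engine can open the dump and
// finish processing it with the given symbol and image paths.
bool CanLoadDumpFile(const char* dumpPath,
                     const char* symbolPath,
                     const char* imagePath)
{
    assert(dumpPath != nullptr && symbolPath != nullptr && imagePath != nullptr);
#ifdef _WIN32
    IDebugClient* client = nullptr;
    if (FAILED(DebugCreate(__uuidof(IDebugClient), (void**)&client)))
    {
        return false;
    }

    IDebugControl* control = nullptr;
    IDebugSymbols* symbols = nullptr;
    HRESULT hr = client->QueryInterface(__uuidof(IDebugControl), (void**)&control);
    if (SUCCEEDED(hr))
        hr = client->QueryInterface(__uuidof(IDebugSymbols), (void**)&symbols);

    // Set BOTH paths before waiting on the engine.
    if (SUCCEEDED(hr)) hr = symbols->SetSymbolPath(symbolPath);
    if (SUCCEEDED(hr)) hr = symbols->SetImagePath(imagePath);

    if (SUCCEEDED(hr)) hr = client->OpenDumpFile(dumpPath);

    // Nothing about the dump is available until this returns.
    if (SUCCEEDED(hr)) hr = control->WaitForEvent(DEBUG_WAIT_DEFAULT, INFINITE);

    // Pseudo-COM still means real reference counting.
    if (symbols != nullptr) symbols->Release();
    if (control != nullptr) control->Release();
    client->Release();
    return SUCCEEDED(hr);
#else
    return false;   // Stub: DBGENG.DLL is Windows only.
#endif
}
```

A real tool would keep the interfaces alive after WaitForEvent returns and go on to query threads, modules, and stacks; releasing them here keeps the sketch short.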
Once I got DUMPSTK limping along, it proved useful. It's small enough to get your head around but actually does something handy. Also, I recommend you spend some time reading the complete DBGENG.H header file. As you can see from the list in Figure 5, the information you might need to solve a problem with the debugging engine is scattered across multiple interfaces.
When I first started looking at the debugging engine, I could see all sorts of very cool debugging and analysis utilities that I would like to write when my commercial programs crash at the customer's site. The good news is that DBGENG.DLL is part of the Windows XP and Windows Server 2003 operating systems. To use it legally on Windows 2000, your customers must download the complete WinDBG package and install it on their machines.
The Crash Dump Information Dumper
Now that I've covered the debugging engine's interfaces, I want to describe the DMPINFO program I wrote. I have always wanted a program that could tell me the important information from a user mode crash dump. When I open a user mode crash dump in Visual Studio .NET or WinDBG, I always do the same operations, so I wanted to automate them. DMPINFO is also a much more complete sample of how to use the debugging engine's interfaces.
Using DMPINFO is trivial; just type DMPINFO in a command prompt followed by the user-mode crash dump file you want to dump. DMPINFO outputs the system information from the user-mode crash dump, the loaded modules in the crash dump, the registers of the crashing thread, a disassembly for the crashing thread, and the call stack with all local variables. If you want to see all threads, pass -a on the command line. You can also pass in the specific source paths, symbols paths, and image paths. When looking at the DMPINFO output, you might notice that module symbol types are Debug Interface Access (DIA) even though you have PDB files. DBGENG.H defines the DIA symbol type and DIA appears to be the new symbol format for Visual Studio .NET. However, all PDB symbols are reported as DIA.
I wrote DMPINFO with release 4.00.0018.0 of DBGENG.DLL. There are two bugs in DBGENG.DLL that you might see from DMPINFO. The system information values don't look right and occasionally the locals are not displayed for a stack scope. If you are running a debug build of DMPINFO, you will see an assertion message box. For some reason, DBGENG.DLL stops calling the IDebugOutputCallbacks interface so DMPINFO can't display locals. I'll discuss this problem in more detail later.
It actually took me quite a while to write DMPINFO because I had to spend so much time in trial and error development. The documentation is not bad in DBGENG.H; it's just not complete. Consequently, I had to try passing different parameters in all the time to get the results I wanted. You will see more assertions in DMPINFO.CPP than in any program you have ever seen because I needed to know instantly when something failed.
The first issue I ran into was that the debugging engine spews quite a bit of output, which gets in the way. I set up my own interface, IBetterDebugOutputCallbacks, derived from IDebugOutputCallbacks, so that I could filter out the debugging engine output that I didn't want to see. You can see the work in OUTPUT.H and OUTPUT.CPP available from my downloadable source sample. Fortunately, the output all seems to occur when you load a crash dump, so I could just turn off output until I was finished getting everything loaded. Pass the -v switch on DMPINFO's command line to see all output.
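The idea of gating the engine's output can be sketched in a few lines. This is not the code from OUTPUT.H and OUTPUT.CPP; OutputGate and GatedOutputCallbacks are hypothetical names. The gate itself is plain C++, and the Windows-only adapter (which assumes a Microsoft compiler for __uuidof) simply forwards whatever text the engine hands to IDebugOutputCallbacks::Output.

```cpp
#include <cassert>
#include <string>

// Portable part: keeps engine text only while the gate is open.
class OutputGate
{
public:
    void Open()  { m_open = true; }
    void Close() { m_open = false; }

    void Write(const char* text)
    {
        assert(text != nullptr);
        if (m_open)
        {
            m_text += text;
        }
    }

    const std::string& Text() const { return m_text; }

private:
    bool m_open = false;    // Closed until the dump has finished loading.
    std::string m_text;
};

#ifdef _WIN32
#include <windows.h>
#include <dbgeng.h>

// Windows-only adapter. The caller owns the object, so AddRef and
// Release do no real counting.
class GatedOutputCallbacks : public IDebugOutputCallbacks
{
public:
    explicit GatedOutputCallbacks(OutputGate& gate) : m_gate(gate) {}

    STDMETHOD(QueryInterface)(REFIID iid, PVOID* out)
    {
        *out = nullptr;
        if (IsEqualIID(iid, __uuidof(IUnknown)) ||
            IsEqualIID(iid, __uuidof(IDebugOutputCallbacks)))
        {
            *out = static_cast<IDebugOutputCallbacks*>(this);
            return S_OK;
        }
        return E_NOINTERFACE;
    }
    STDMETHOD_(ULONG, AddRef)()  { return 1; }
    STDMETHOD_(ULONG, Release)() { return 0; }

    STDMETHOD(Output)(ULONG /*mask*/, PCSTR text)
    {
        m_gate.Write(text);
        return S_OK;
    }

private:
    OutputGate& m_gate;
};
#endif
```

You would register the adapter with IDebugClient::SetOutputCallbacks before opening the dump, leave the gate closed while the engine loads, and open it once IDebugControl::WaitForEvent has returned.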
The next issue I ran into was that there does not seem to be a way to determine programmatically if loaded symbols are mismatched with the binary. The debugging engine will output the mismatch when you load the crash dump, so the engine clearly knows about it. I hope that Microsoft will add a method to IDebugSymbols or a new field to the DEBUG_MODULE_PARAMETERS structure so you can find the mismatches.
My goal for DMPINFO was to show how to do all the work without using some of the easy methods of some of the interfaces. That way you would have a stronger sample and would have an idea how to apply the techniques yourself. When it came time for me to do the disassembly part of DMPINFO, I have to admit I wimped out. It's impossible to disassemble backwards in IA32 assembly language because the instructions are variable length, so I was not looking forward to grinding through an algorithm to get everything lined up so I could show 15 instructions before the instruction pointer. The output from IDebugControl->OutputDisassemblyLines wasn't what I wanted because it is just a blob of text, and I couldn't stick in a little pointer prefix to indicate the instruction pointer. However, OutputDisassemblyLines does do all the work to find the instruction starts in the disassembly and returns them as an array. When I saw that OutputDisassemblyLines would do the work for me, I punted! I turned off output, called OutputDisassemblyLines so I could get the offsets of all the instruction starts, then called IDebugControl->Disassemble so I could format the lines as I wanted.
I spent what seemed like forever wrestling with the final part of DMPINFO: getting the local symbols. The first problem was that I could not figure out how to get the local symbols loaded after I set the scope. After calling IDebugSymbols->SetScope, I could see that I needed to call IDebugSymbols->GetScopeSymbolGroup. When I called IDebugSymbolGroup->GetNumberSymbols, I always got back that there were zero symbols. After nearly giving up, I finally asked Microsoft how to get local symbols. You have to pass the "*" string to IDebugSymbolGroup->AddSymbol to get the locals loaded into the IDebugSymbolGroup. You can take a look at all of this in action in the OutputScopeSymbols function in DMPINFO.CPP.
Once I got the locals loaded, I thought I was on my way. That's when I ran into the biggest problem of the current IDebugSymbolGroup interface: there's no way to enumerate local symbol values! You can call IDebugSymbolGroup->GetSymbolName to get the name of a symbol index. What's missing are two methods, GetSymbolType and GetSymbolValue. You can get the type in a roundabout way by calling IDebugSymbolGroup->GetSymbolParameters to get the DEBUG_SYMBOL_PARAMETERS structure for a symbol. In there is a TypeId field which you can pass to IDebugSymbols->GetTypeName (notice it's a different interface). That's two thirds of the information, but it doesn't have the all-important value. I called IDebugSymbolGroup->OutputSymbols and that did output all the symbol information, but in this very strange format:
The debugging engine outputs all symbols in this format packed end-to-end in a giant string. I especially liked the fact that a common value "*" (think pointer) was used as a delimiter. Since there is no other way to get values, I had to trap the string and parse it up to show them. I certainly hope that future releases of the debugging engine will fix this oversight.
Getting a Symbol Server set up is so important I urge you to stop reading right now and get one set up for your organization! It will make your debugging life so much easier. Also, armed with the new crash dump handling in Visual Studio .NET and WinDBG, getting rid of bugs should be even easier. Finally, I hope I was able to help you get over some of the same hurdles I ran into when I started with DBGENG.DLL. While it might have a few quirks, it's still a work in progress and will only get better with time. I encourage you to think about the possibilities and start creating some of those debugging tools you've always wanted!
The sweet smell of flowers in the spring should help you think of even more tips. Send your tips to me at firstname.lastname@example.org.
Tip 53 If you have any really tough debugging problems, the new WinDBG documentation has a couple of excellent discussions in the Debugging Techniques section.
Tip 54 John Maver reports a cool trick with the Visual Studio .NET debugger. If you have a line like this
HeapFree ( GetProcessHeap ( ) , 0 , lpdwPIDs ) ;
and if you want to step into HeapFree, but not GetProcessHeap, put your cursor on HeapFree, right-click, and choose Step Into HeapFree. The text changes based on where you place your cursor. I like this one so much I assigned the shortcut Ctrl-Alt-F11 to it.
Send questions and comments for John to email@example.com.