[Newsletters Archive ^] [< Volume 4, Number 3] [Volume 5, Number 2 >]

The Systems Internals Newsletter Volume 5, Number 1

http://www.sysinternals.com
Copyright (C) 2003 Mark Russinovich


February 19, 2003 - In this issue:

  1. EDITORIAL

  2. WHAT'S NEW AT SYSINTERNALS

    • Filemon v5.01
    • DebugView v4.2
    • NewSID v4.02
    • PsShutdown v2.01
    • Autoruns v2.02
    • ShareEnum v1.3
    • TCPView v2.31
    • Bluescreen v3.0
    • Sysinternals at Microsoft
  3. INTERNALS INFORMATION

    • New XP/Server 2003 Internals Video
    • Mark and David Solomon Teach Internals and Troubleshooting in Seattle
    • Windows 2000 SP3 Common Criteria Certified
    • Visual Studio: Put a Watch on LastError
    • The LameButtonText Registry Value Explained
    • Windows Development History
    • Introduction to Crash Dump Analysis

The Sysinternals Newsletter is sponsored by Winternals Software, on the Web at http://www.winternals.com. Winternals Software is the leading developer and provider of advanced systems tools for Windows NT/2K/XP. Winternals Software products include ERD Commander 2002, NTFSDOS Professional Edition (a read/write NTFS driver for DOS), and Remote Recover.

Winternals is proud to announce Defrag Manager version 2.10, the fastest, most thorough enterprise-defragmenter available. Now you can manage defragmentation schedules across your entire Windows enterprise from a simple MMC snapin - without even having to install any client software on your NT, Windows 2000 or Windows XP systems. Visit http://www.winternals.com/es for more information or to request a free 30-day trial version.

Hello everyone,

Welcome to the Sysinternals newsletter. The newsletter currently has 36,000 subscribers.

I'm pleased to inform you that David Solomon is the guest author of this month's editorial, where he describes some of his real-world troubleshooting experiences with several Sysinternals utilities.

Please pass the newsletter to friends you think might be interested in its content.

Thanks!

-Mark

EDITORIAL - by David Solomon

I have a new motto: "When in doubt, run Filemon and Regmon (and Process Explorer)".

Before I explain, let me first say thanks to Mark for inviting me to write this guest editorial (of course, since this a glowing report on how useful his tools are, it's not like he's doing me a big favor or anything!).

As many of you know, Mark and I work together in helping to educate people on the internals of Windows. Our latest project was an update to the Windows 2000 internals video tutorial we created last year to cover the kernel changes in Windows XP and Windows Server 2003, and our next public Windows internals class is April 21-23 in Bellevue, Washington- see details on both in the relevant sections of this newsletter. And, as many have asked us, we're in the process of our book Inside Windows 2000 for XP & Server 2003 (tentative release date is late summer).

And now, why I am so enthused about the Sysinternals tools? Because in the last year or so, they've helped me troubleshoot and solve a wide variety of application and system issues that would otherwise be unsolvable. In fact, I can't begin to describe the number of totally different, unrelated issues I've been able to troubleshoot with these tools. Even in cases where I didn't think they would help, they did. Hence my new motto, "When in doubt, run Filemon and Regmon".

There are two basic techniques I've found to apply these tools:

  1. Look at the last thing in the Filemon/Regmon trace that the application did before it failed. This may point to the problem.
  2. Compare a Filemon/Regmon trace of the failing application with a trace from a working system.

In the first approach, run Filemon and Regmon, then run the application. At the point the failure occurs, go back to Filemon and Regmon and stop the logging (press CONTROL+E). Then, go to the end of the log and find the last operations performed by the application before it failed (crashed, hung, or whatever). Starting with the last line, work your way backwards examining the files and/or registry keys referenced-often this will help pinpoint the problem.

Use the second approach when the application fails on one system but works on another. Capture a Filemon and Regmon trace of the application on the working and failing system and save the output to a log file. Then, open the good and bad log files with Excel (take the defaults on the import wizard) and delete the first 3 columns (otherwise, the compare will show up every line as different, since the first 3 columns contain information that is different from run to run, such as the time and the process id). Finally, compare the resulting log files (for example with WinDiff, which on Windows XP is included in the free Support Tools you can install off the XP CD, or for Windows NT4 and Windows 2000 you can find it in the Resource Kit).

Now, a few real life examples.

On a Windows 2000 workstation with Microsoft Office 97 installed, Word would get a Dr. Watson shortly after starting. You could actually type a few characters before the Dr. Watson occurred, but whether you typed anything or not, within a few seconds of starting, Word would crash. Of course, the user had tried uninstalling and reinstalling Office, but the problem remained. So, I ran Filemon and Regmon and looked at the last thing done by Word before it died. The Filemon trace showed the last thing Word did was open an HP printer DLL. Turns out, the workstation had no printer, but apparently did at one time. So, I deleted the HP printer from the system and the problem went away!

Apparently, Word's enumerating the printers upon startup caused this DLL to load, which in turn caused the process to die (why this was happened I don't know-perhaps the user had installed a bogus version; but since the system no longer had a printer it really didn't matter).

In another example, the Regmon saved a user from doing a complete reinstall of his Windows XP desktop system. The symptom was that Internet Explorer (IE) would hang on startup if the user did not first manually dial the internet connection. This internet connection was set as the default connection for the system, so starting IE should have caused an automatic dialup to the internet (because IE was set to display a default home page upon startup). Following my new motto, I ran Filemon and Regmon and looked backwards from the point in the log where IE hung. Filemon didn't show anything unusual, but the Regmon log showed a query to a key HKEY_CURRENT_USER\Software\Microsoft\RAS Phonebook\ATT. The user had told me he had the AT&T dialer program installed at one time, but had uninstalled it and manually created the dialup connection. Since the dialup connection's name was not "ATT", I suspected this was left over registry junk from the uninstall that was causing IE to choke. So, I renamed the key and the problem went away!

Using the "compare the logs" technique helped solve why Access 2000 was hanging on a programmer's XP workstation trying to import an Excel file. Importing the same file worked fine on other users' workstations, but failed on this one workstation. So, a capture was made of Access on the working and failing system. After appropriately massaging the log files they were compared with Windiff. The first several differences were due to names of temporary files being different and due to some filenames being different due to case differences, but of course these were not "relevant differences" between the two systems.

The first difference that was not ok was that an Access DLL was being loaded from \Windows\System32 on the failing system but from the \Program Files\Microsoft Office\Office folder on the working system. Comparing the DLLs revealed that the version in \Windows\System32 was from a previous version of Access. So, the user renamed that DLL to .bad and re-ran Access and the problem went away!

One class of problems Filemon is incredibly useful for is uncovering file permission issues. Many applications do a poor job of reporting access denied errors. However, running Filemon reveals clearly failures of this type since the result column shows "ACCESS DENIED" for failures to open files due to rights issues (and the most recent version even shows the username that failed to access the file). Two specific examples where this was the case:

  1. A user was getting a strange Macro error upon starting Word; turns out the permissions on a .DOT file referenced by a macro were changed to disallow this user access. Filemon clearly showed Word getting an access denied error on the .DOT file. Once the permissions were fixed, the problem went away.
  2. An Outlook application popped up a message box that said Application defined or object-defined error-Message ID: [Connect].[LoadGlobalVariables].[LN:?].[EN:287]-another example of how many applications generate useless error messages on random I/O failures. Again, running Filemon revealed an access denied error (this time to a folder that Outlook needed to access). The permissions were adjusted on the folder and the problem went away.

These are just a few examples-I have many other success stories where Filemon and Regmon (and Process Explorer, which I didn't discuss here) have saved the day. It's no wonder Microsoft Product Support uses these tools on a daily basis to help resolve customer issues (at last count, some 40 Knowledge Base articles point to Mark's tools-see http://www.sysinternals.com/ntw2k/info/mssysinternals.shtml for a list).

So, when in doubt, run Filemon and Regmon!

David Solomon David Solomon Expert Seminars http://www.solsem.com

WHAT'S NEW AT SYSINTERNALS

FILEMON V5.01

Filemon, one of the utilities David highlights in his editorial, has undergone its first major revision in several years. The new release brings a new level of usability to a tool that already had an accessible user interface. The most significant enhancement is the change to how file system activity is presented in Filemon's default setting when run on Windows NT, 2000, XP or Server 2003, something I'd been thinking about for a while and that I finally implemented based on real user feedback from David.

Previous versions of Filemon display file system operations with the textual names of the internal I/O requests that execute the operations. While technically precise in its presentation, many users are not familiar with the inner-workings of the Windows I/O subsystem and find operations like FASTIO_CHECK_IF_POSSIBLE meaningless and others, like a reported failure of a FASTIO_READ operation, confusing. There are numerous other examples of operations that most would classify as "noise" and operation names that aren't self-explanatory.

Filemon version 5.01's default viewing mode now has a filtering mechanism to remove the activity that is useless in most troubleshooting scenarios and that presents intuitive names for all I/O operations. FASTIO_CHECK_IF_POSSIBLE is filtered out, FASTIO_READ failures aren't shown, and FASTIO_READ's that succeed are reported as READ operations. In addition, the default view omits file system activity in the System process, which is the process from which the Memory and Cache Manager's perform background activity, and all Memory Manager paging activity, including that to the system's paging file. The Options|Advanced menu item will satisfy users, such as file system filter driver developers, that want the "raw" view of file system activity shown by previous Filemon versions.

Several users, including Microsoft employees, requested that Filemon show the account in which "access denied" errors occur in order to aid security-settings debugging in Terminal Services environments. In response version 5.01 displays that information as well as the access mode (read, write, delete, etc) a process wants when it opens a file and how a file is being opened, for instance whether it's being overwritten or opened only if it exists.

Many troubleshooting sessions focus on identifying the files a process accesses or attempts to access, in which case operations like reads, writes and closes are just noise. In recognition of this fact I've added a new "log opens" filtering option enables you to isolate just open operations.

Another major change is in the way Filemon v5.01 handles network-mapped shares. In previous versions each mapping shows up as a drive letter in the Drives menu. Now, all such mappings are encompassed in the "Network" selection of the Volumes menu (which is the renamed Drives menu). Selecting Network has Filemon monitor all network shares, as well as report UNC-type network activity of the type that occurs when you access remote files using the "\\computer\share\directory" naming convention. This change makes it possible for you to view network file activity even when you don't have a mapped network share, as was required by previous Filemon versions. There are numerous other minor changes to the latest Filemon, including an updated menu structure that mirrors the more usable menus I introduced in Regmon several months ago.

Download Filemon v5.01 at
http://www.sysinternals.com/ntw2k/source/filemon.shtml

ABOUT FILEMON AND REGMON SOURCE CODE

Developers of software, hardware, and networking products support Sysinternals by purchasing licenses to redistribute our code. However, during the past year we've found a range of software, from Trojans to commercial products from some multi-billion dollar corporations, containing unlicensed Sysinternals source code. In an effort to keep Sysinternals growing and our products legally licensed we've discontinued publishing the source code for some of our products, including the latest Filemon and Regmon releases. We will continue to make source code available to commercial licensees. If you discover mirrors for Sysinternals source code please let us know.

DEBUGVIEW V4.2

DebugView is a very popular Sysinternals utility that software developers use to capture debug output generated by their software. Version v4.2 reflects a number of user-requested enhancements and features. A Microsoft-requested option allows you to capture debug output of processes executing within the console session of a Terminal Services environment when you run DebugView in a non-console session. V4.2 supports expanded command-line options that allow you to specify a log file to load, history depth, and other startup behavior. Several users requested more and longer filters, filtering on process IDs, and the ability to insert comments into the output, all of which are possible with the latest version. The new release is rounded out with several bug fixes, better support for extracting kernel-debug output from crash dump files, and better balloon windows for text that exceeds the width of its output column and even the screen.

Download DebugView v4.2 at
http://www.sysinternals.com/ntw2k/freeware/debugview.shtml

NEWSID V4.02

The SID (Security ID) duplication problem is one that you'll encounter if you use a preinstall Windows image to deploy more than one system. Each computer that shares the image has the same internal Windows SID, which is an identifier that the Windows security subsystem uses as the basis for local group and account identifiers. Because of the security issues the sharing can cause most administrators take steps to post-apply a unique SID to each computer using a SID changing tool.

NewSID, Sysinternals' SID-changer, is popular because unlike other changers that rely on DOS or require that a system be free of add-on software, NewSID is a Win32 program that you can use to assign a new SID to computers that have installed applications. Version 4.02 is major update that has a new Wizard interface, adds support for Windows XP, and lets you rename a computer.

A feature requested by many administrators is NewSIDs ability to apply a SID that you specify, something that might be useful for migrating an installation's settings to a different computer or for reinstalling. As NewSID runs it causes the Registry to grow when it applies temporary security settings to portions of the Registry in order to make them accessible. This bloating can cause the Registry to exceed its size quota so a new v4.02 function compresses the Registry to its minimum size as its last step of operation.

Download NewSID v4.02 at
http://www.sysinternals.com/ntw2k/source/newsid.shtml

PSSHUTDOWN V2.01

Shutdown is a tool that Microsoft has long included in the Windows Resource Kit and that is included in Windows XP installations. Prior to v2.01 PsShutdown, a member of Sysinternals PsTools command-line administration toolkit, was simply a clone Shutdown, but this latest release expands its capabilities far beyond those of Shutdown's. For example, you can shutdown and power-off if a system supports power management, lock the desktop, and logoff the interactive user, all on the local or a remote machine, without manually installing any client software.

Download PsShutdown v2.01 at
http://www.sysinternals.com/ntw2k/freeware/psshutdown.shtml
Download the entire PsTools suite at
http://www.sysinternals.com/ntw2k/freeware/pstools.shtml

AUTORUNS V2.02

We've all been annoyed by the installation of unwanted applets that run when we log in and been frustrated in our search for their startup command. It's no wonder when Windows has close to 2-dozen mechanisms for such activation. The MsConfig utility included with Windows Me and XP can sometimes help, but it misses roughly half of the possible startup locations.

Autoruns, a Sysinternals tool written by Bryce Cogswell and myself, shows you the whole picture. Its display shows a list of all the possible Registry and file locations where an application can enable itself to run at system startup or logon. The latest version displays icon and version information for each startup-configured image for easy identification and adds user-interface enhancements like a context menu. In addition, the new version identifies more startup locations, including logon and logoff scripts, task scheduler tasks that execute upon logon, and Explorer add-on launch points.

Download Autoruns v2.01 at
http://www.sysinternals.com/ntw2k/source/misc.shtml

SHAREENUM V1.3

Systems administrators often overlook a critical part of local network security: shared folders. Users within a corporate environment frequently create shares to folders containing documents in order to provide easy access to coworkers in their group. Unfortunately, many users fail to lock down their shares with settings that prevent unauthorized access to potentially sensitive information by other employees.

ShareEnum is a Sysinternals utility written by Bryce Cogswell that helps you to identify rogue shares and tighten security on valid ones. When you start ShareEnum it uses NetBIOS enumeration to locate computers on your network and reports the shares they export along with details on the security settings applied to the shares. Within seconds you can spot open shares and double-click on a share to open it in Explorer so that you can modify its settings. You can also use ShareEnum's export feature to save scans and compare a current scan to one you've previously saved.

Download ShareEnum v1.3 at
http://www.sysinternals.com/ntw2k/source/shareenum.shtml

TCPVIEW V2.31

TCPView is a graphical netstat-type utility that displays a list of a system's active TCP and UDP endpoints. On Windows NT, 2000, XP and Server 2003 installations it shows you the process that owns each endpoint. Version 2.31 displays the icon of a process' image file for easier identification.

Download TCPView v2.31 at
http://www.sysinternals.com/ntw2k/source/tcpview.shtml

BLUESCREEN V3.0

The Sysinternals Bluescreen of Death Screensaver has been a favorite download for several years and version 3.0 adds Windows XP compatibility. The screensaver displays an authentic-looking blue screen of death, complete with formatting and random details appropriate to the operating system on which its run (e.g. Windows NT, 2000, or XP), and after a pause simulates a reboot cycle and subsequent repeat of a different crash screen. It's so convincing that David Solomon has fooled me with it and I've fooled him with it. Use it as your own screen saver or to fool your friends and coworkers, but make sure that your boss has a sense of humor before you install it on a production system.

Download Bluescreen v3.0 at
http://www.sysinternals.com/ntw2k/freeware/bluescreensaver.shtml

SYSINTERNALS AT WWW.MICROSOFT.COM

Here's the latest installment of Sysinternals references in Microsoft Knowledge Base (KB) articles released since the last newsletter. I'm honored to report that this brings to 41 the total number of KB references to Sysinternals.

  • ACC2000: Error Message: ActiveX Component Can't Create Object http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q319841&

  • HOW TO: Troubleshoot ASP in IIS 5.0 http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q309051&

  • OL2002: How to Create Trusted Outlook COM Add-ins http://support.microsoft.com/default.aspx?scid=KB;en-us;327657&

  • PRB: Error 80004005 "The Microsoft Jet Database Engine Cannot Open the File '(Unknown)'" http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q306269&

  • User Profile Unload Failure When You Start, Quit, or Log Off NetMeeting http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q327612&

  • XADM: Error Message: Error 123: The Filename, Directory Name, or Volume Label Syntax Is Incorrect http://support.microsoft.com/default.aspx?scid=KB;EN-US;Q318746&

INTERNALS INFORMATION

NEW XP/SERVER 2003 INTERNALS VIDEO

Our new video update on Windows XP/Server 2003 internal changes is available for pre-release ordering! As an adjunct to our existing video tutorial, INSIDE Windows 2000, or as a standalone product in its own right, this new video provides training on the kernel changes in Windows XP and Microsoft's new product Windows Server 2003, which will launch in April. Topics covered include performance, scalability, 64 bit support, file systems, reliability, and recovery.

In the same interactive style as its predecessor, the Windows XP/Server 2003 Update puts you across the desk from David Solomon and Mark Russinovich for 76 minutes of highly focused, intensive training. It includes review questions, lab exercises and a printed workbook, and is available in DVD video and Windows Media on CD-ROM.

Because these videos were developed with full access to the Windows source code and development team, you know you're getting the real story. As the ultimate compliment, Microsoft has licensed this video for their internal training worldwide.

SPECIAL PRE-RELEASE PRICING IF YOU PURCHASE BEFORE MARCH 15! Buy INSIDE Windows 2000 for $950 and get the Windows XP/Server 2003 Update FREE! That's almost 40% off the combined retail value of $1,390. Or, buy the standalone Windows XP/Server 2003 Update video for only $169 ($195 retail value). Other license configurations available from the web site. To take advantage of this limited time offer, order now at http://www.solsem.com/vid_purchase.html

MARK AND DAVID SOLOMON TEACH INTERNALS AND TROUBLESHOOTING IN BELLEVUE, WA

Hear me and David Solomon present our 3-day Windows 2000/XP/.NET Server internals class in Bellevue, WA (near Seattle) April 21-23. Based on "Inside Windows 2000, 3rd Edition", it covers the kernel architecture & interrelationship of key system components & mechanisms such as system threads, system call dispatching, interrupt handling, & startup & shutdown. Learn advanced troubleshooting techniques using the Sysinternals tools and how to use Windbg for basic crash dump analysis. The internals of key subsystems covered include processes & threads, thread scheduling, memory management, security, the I/O system, and the cache manager. By understanding the inner workings of the OS, you can take advantage of the platform more effectively and more effectively debug and troubleshoot problems.

To register or for more information see http://www.sysinternals.com/seminar.shtml

WINDOWS 2000 SP3 COMMON CRITERIA CERTIFIED

Many of you are probably familiar with the terms "Orange Book" and C2, both of which are related to an outdated security evaluation standard used throughout the 1980's and 90's by the US government to rate the security capabilities of software, including operating systems. Since 1999 the Orange Book ratings, which were part of the Department of Defense Trusted Computer System Evaluation Criteria (TCSEC), have been subsumed by the newer Common Criteria (CC) system. The CC was agreed upon by multiple nations as an international security ratings standard that is richer than TSCEC and England's obsolete Information Technology Security (ITSEC) ratings.

When a vendor has their software certified against the CC standard they specify a "protection profile", which is a set of security features, and the evaluation reports a level of assurance, known as an Evaluation Assurance Level (EAL), that the software meets the requirements of the protection profile. There are 7 EALs with higher assurance levels indicating greater confidence in the reliability of the evaluated software's security features.

Microsoft submitted Windows 2000 for CC rating against the Controlled Access Protection Profiles, which is roughly the CC equivalent of the TCSEC C2 rating, several years ago and in October of 2002 its evaluation completed. Science Applications International Corporation (SAIC), the independent company that performed the evaluation, found Windows 2000 with Service Pack 3 to meet the Control Access Protection Profile with an Evaluation Assurance Level (EAL) of 4 plus Flaw Remediation. An EAL of 4 is considered the highest level achievable by general purpose software, and Flaw Remediation refers to the Windows Update mechanism for timely application of security fixes. This rating is the highest level achieved thus far under the CC by an operating system.

VISUAL STUDIO: PUT A WATCH ON LASTERROR

If you develop applications that rely on the Win32 API then you've almost certainly have written code that executes a Win32 function, but for whatever reason doesn't report specific errors. If so, you'll find this tip useful. By adding the expression @ERR,hr to the watch window you'll see the numeric and textual representation of the value stored as the current thread's LastError variable, which is the value returned by the GetLastError() Win32 function.

THE LAMEBUTTONTEXT REGISTRY VALUE EXPLAINED

If you've examined the Regmon traces on a Windows 2000 or XP system of a Windows application startup you've probably seen references to the Registry value HKCU\Control Panel\Desktop\LameButtonText, usually with a NOTFOUND error. Someone at Microsoft obviously has a sense of humor, but what's this value for? It turns out it stores the text that you see in Windows beta and release candidate releases on window title bars that directs you to click on a link in order to report feedback. It would be cool if you could enable it on non-pre-release versions of Windows to place custom text, but its functionality is unfortunately disabled on production releases.

WINDOWS DEVELOPMENT HISTORY

Paul Thurrott has a nice 3-part article series going on the history of the Windows NT development process. Check it out at http://www.winsupersite.com/reviews/winserver2k3_gold1.asp

A QUICK INTRODUCTION TO CRASH DUMP ANALYSIS

When a system crashes immediately after you've installed new hardware or software diagnosing the cause is obvious. Sometimes, however, system crashes occur occasionally and there's no apparent reason. In such cases the only way to determine the cause of the crash is to analyze a crash dump. In this tutorial I'll describe how Microsoft's Online Crash Analysis (OCA) works and how it may give you the answer to a crash puzzle, and then tell you how to set up your own crash analysis environment so that you can take a look at crashes that OCA won't or cannot successfully analyze.

Microsoft introduced OCA with the release of Windows XP to be an automated analysis service based on a centralized repository of crash-related information. After an XP system reboots from a crash it prompts you to send in crash information to the OCA site (http://oca.microsoft.com/en/Welcome.asp). If you agree, XP uploads an XML file that describes your basic system configuration along with a 64 KB minidump crash file. A minidump contains a small amount of data immediately relevant to a crash such as the crash code, the stack of the thread that was executing at the time of the crash, list of drivers loaded on the system, and the data structures managing the process running when the crash happened.

Once OCA receives the information it proceeds to analyze it and stores the analysis summary in a database. If you follow the prompt after the upload and visit the OCA site you're offered the opportunity to track the analysis. This requires that you sign in with a Passport account. You then enter a name for the crash and some text describing the nature of the crash. If the OCA engine correlates the crash with others in the database for which Microsoft has identified a cause the site notifies you by e-mail and when you revisit the site and lookup the crash you submitted the resolution tells you where to get a driver or operating system update. Unfortunately, while OCA support is built into XP and it accepts Windows 2000 crash dump files, it doesn't support NT 4 and cannot identify the cause of most crashes (at least in my experience).

In order to perform crash analysis yourself, you'll need the appropriate tools, which Microsoft provides in the form of the Debugging Tools for Windows package that you can download from http://www.microsoft.com/ddk/Debugging/. The package includes, among other things, the Windbg analysis tool. After you've downloaded and installed the tools, run Windbg and open the File|Symbol File Path dialog. There you tell Windbg where to find symbol files for the version of the operating system from which a crash you're analyzing generated. You can enter a path to a directory where you've installed symbols, however that requires you to obtain symbol files for the exact operating system, service pack, and hot fixes installed on the crashing system. Manually keeping up with symbol files is tedious and if you want to analyze crashes from different systems you have to worry about different sets of symbol files for each different installation.

You can avoid symbol file hassle by pointing the Windbg at Microsoft's symbol server. When you configure it to use the symbol server Windbg automatically downloads symbol files on demand based on the crash dump you open. The symbol server stores symbols for NT 4 through Server 2003 betas and release candidates, including service packs and hot fixes. The syntax for directing Windbg to the symbol server is srv*c:\symbols*http://msdl.microsoft.com/download/symbols. Replace c:\symbols with the directory where you want to store symbol files. For more information on symbols, see http://www.microsoft.com/ddk/debugging/symbols.asp

There's one more step you have to perform before you're ready to analyze crash dumps: configure your systems to generate them. Do this by opening the System applet in the control panel and, on Win2k and higher, clicking on the Startup/Shutdown button of the Advanced page. On NT 4, go to the Startup/Shutdown tab of the applet. The only crash dump option on NT 4 systems is a full memory dump, where the entire contents of physical memory at the time of a crash are saved to the file you specify. On Win2k and higher there are three options: mini, kernel, and full. Win2K & XP Professional and Home default to mini; Server systems default to full. For workstation/client machines, change the setting from mini to kernel dump, which saves only parts of physical memory owned by the operating system (as opposed to applications), since that minimizes the size of the crash dump file and still provides full information on kernel data structures that Windbg needs to effectively analyze a crash. For Server systems, full dumps are fine, but kernel dumps are a safe choice (and possibly your only choice if you have a very large memory system).

Now you're ready to analyze a crash. When one occurs simply load the resulting dump file into Windbg by selecting the File|Open Crash Dump menu option. As the dump loads Windbg begins to process it and you'll see messages about the operating system version and symbol loading. Then you'll see a message with the text "Bugcheck Analysis". The output following the message reports the crash code and crash code's parameters, as well as a "probably caused by".

In some cases the basic analysis Windbg performs here is enough to identify the faulting driver or kernel component. I recommend, however, always entering the following command: !analyze -v. This command results in the same analysis, but with more information. For example, text will explain the meaning of the crash code and tell you what the optional parameters represent, sometimes with advice on what to try next. You'll also see a stack trace, which is a record of function execution leading to the code in which the crash occurred. If a driver passes faulty data to the kernel or a driver pinpointed by the analysis you might see its name in the trace and can identify it as a possible root cause.

If you want to dig deeper into the state of the system at the time of the crash there are numerous Windbg commands that let you see the list of processes running, drivers loaded, memory usage, and more. The Windbg help file also contains a bugcheck reference that I recommend you follow up with for more information and guidance, and if you still end up stumped I recommend that you perform a search in Microsoft's Knowledge Base (KB) for the crash code. Microsoft creates KB articles for common crashes and conducts you to vendor sites or hot fixes that remedy particular problems.

If you want to see me present this information live, with examples, then check me out at any of the following conferences:

  • The internals and troubleshooting seminar David and I are delivering in April in Bellevue, WA
  • Windows and .NET Magazine Connections in Scottsdale, AZ in May: http://www.winconnections.com/win
  • TechEd US (Dallas) or TechEd Europe (in Barcelona) this summer

Thank you for reading the Sysinternals Newsletter.

Published Wednesday, February 19, 2003 4:47 PM by ottoh

[Newsletters Archive ^] [< Volume 4, Number 3] [Volume 5, Number 2 >]