What is a Crash (technically)... in ASP.NET and what to do if it happens?

Many times while troubleshooting performance-related issues in ASP.NET/IIS, customers come in saying that the ASP.NET process crashes two or more times a day. The question then is: is it really a crash, or just a yellow error page (or a plain "Page cannot be displayed" error) shown in the client's browser? Well, as they say, all that glitters is not gold. In the same way, not every error shown in the browser is a crash. What caused the error matters, and before we decide to call it a crash, we should know what *exactly* a crash is! In the forthcoming series under the "Performance" category of my blog, I will discuss the basics of CRASH, HANG, HIGH CPU, HIGH MEMORY, DEADLOCK and OUT OF MEMORY issues.

Symptoms

Prior to IIS6, a crash brought down IIS while client connections were still open. So the client browsers typically saw the connection drop and showed "Page cannot be found" or some similar error.

Now, in IIS6 Worker Process Isolation Mode, HTTP.SYS takes care of the connections in kernel mode itself. So even if w3wp.exe crashes for some reason, the connection remains and IIS spawns a new process to handle future requests. As a result, client browsers will not show a "Page cannot be found" error. However, any unsent response from the server will be lost.

If you open your Event Viewer, you should find an Error entry in the Application log similar to one of the following.

Event Type: Error
Event Source: ASP.NET 1.1.4322.0
Event Category: None
Event ID: 1000 <-------- CRASH
Date: 2/8/2006
Time: 4:45:23 AM
User: N/A
Computer: RAHULSONI
Description:
aspnet_wp.exe (PID: 956) stopped unexpectedly.

Or

A process serving application pool '<The Application Pool Name>' suffered a fatal communication error.

Or

Faulting application w3wp.exe, version 6.0.3790.1830, faulting module <module name>, version <some version number>, fault address <some address>.
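
To tell these entries apart from ordinary page errors, a small script can check an event description against the messages quoted above. This is only a rough sketch: the patterns come straight from the sample messages in this post, the function name `looks_like_crash` is made up for illustration, and real event descriptions may be worded differently on your server.

```python
import re

# Patterns built from the event descriptions quoted in this post.
# They are illustrative, not an exhaustive list of crash messages.
CRASH_PATTERNS = [
    re.compile(r"aspnet_wp\.exe \(PID: \d+\) stopped unexpectedly", re.IGNORECASE),
    re.compile(r"suffered a fatal communication error", re.IGNORECASE),
    re.compile(r"Faulting application w3wp\.exe", re.IGNORECASE),
]


def looks_like_crash(description: str) -> bool:
    """Return True if the description matches one of the crash messages."""
    return any(pattern.search(description) for pattern in CRASH_PATTERNS)


if __name__ == "__main__":
    samples = [
        "aspnet_wp.exe (PID: 956) stopped unexpectedly.",
        "The page cannot be displayed",
    ]
    for text in samples:
        print(looks_like_crash(text), "-", text)
```

You would paste the Description text of an event (copied from Event Viewer) into `looks_like_crash` to see whether it resembles one of the crash entries.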

Why does it happen?

Well, it depends! But to give you a brief idea, it could be many things: a stack overflow, an access violation, a *bad* DLL, a *bad* filter, and so on. The bottom line is that ASP.NET *crashed* because something really bad happened and it found it impossible to continue. BUT... believe me, there is always a reason for it, and more often than not, we can fix it!

What to do?

When something goes wrong while coding, we DEBUG... "LIVE". When something goes wrong on your production box, we DEBUG too, but this time we do it after the issue has happened. It is called Post-Production debugging. For that, we typically require a set of memory dumps (a memory dump is nothing but a snapshot of the memory of ASPNET_WP.exe or W3WP.exe, captured when the issue happens). In a *CRASH* scenario, the process goes down on its own without warning, so we must set up something in advance that captures the data as soon as the issue manifests itself again.

1) As you may guess, we need a tool to capture the memory dumps. Let's visit the following link... http://www.microsoft.com/windowsserver2003/iis/diagnostictools/default.mspx
2) Download the IIS Diagnostics Toolkit that matches your OS version.
3) Install the software on your production box. Rest assured, the install doesn't require a reboot.
4) Open Debug Diagnostics Tool 1.0 (Start -> All Programs -> IIS Diagnostics (32bit) -> Debug Diagnostics Tool)
5) On the Rules tab, click "Add Rule"
6) Select "Crash" option and click "Next"
7) Select "All IIS Processes" option and click "Next"
8) Under "Advanced Settings", click "Breakpoints"
9) Click "Add Breakpoint"
10) Select "KERNEL32!ExitProcess" in the "Offset Expression" list.
11) Change "Action Type" to "Full UserDump"
12) Click "Add Breakpoint"
13) Select "Kernel32!TerminateProcess" in the "Offset Expression" list.
14) Change "Action Type" to "Full UserDump"
15) Click "Save and Close"
16) Click "Next", "Next" and "Finish"
17) Ensure that the status of the rule in the Debug Diagnostic Tool says "Active"

18) Wait for the issue to occur.
19) If the issue happens again, you should be ready with a dump. The Userdump Path in the screenshot (DebugDiagCrash.GIF) shows the path where the dump files are generated.
20) Now, the major part of collecting the dump is done. To analyze it, go to the Debug Diagnostic Tool once again, and click the "Advanced Analysis" tab.
21) Click "Add Data Files" and browse to the dump file (.dmp extension, in the format aspnet_wp__PIDxxxxxxxx.dmp or w3wp__PID__2716xxxxxxxx.dmp)
22) In the "Available Analysis Scripts" list, select "Crash/Hang Analyzers" and click "Start Analysis"
23) It might take some time to analyze your dump file, since the tool will try to download symbols (more on this later) from the internet.
24) Once the analysis is complete, it will create a .mht file and automatically open a browser showing the analysis.
25) Take a look at the report, and if you are fortunate enough, you will find the issue right away!!
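
If the rule has been active for a while, the dump folder can fill up with several files. As a small convenience, the sketch below lists the worker-process dumps in a folder, newest first, so you know which file to add in step 21. It assumes only what is stated above: that the dumps have a .dmp extension and names starting with aspnet_wp or w3wp. The function name `find_dumps` is hypothetical, and the folder you pass in must be your own Userdump Path.

```python
import sys
from pathlib import Path


def find_dumps(dump_dir):
    """Return aspnet_wp/w3wp .dmp files under dump_dir, newest first."""
    root = Path(dump_dir)
    if not root.is_dir():
        # Nothing to list if the folder does not exist.
        return []
    dumps = [
        path
        for path in root.rglob("*.dmp")
        if path.name.lower().startswith(("aspnet_wp", "w3wp"))
    ]
    # Sort by last-modified time so the most recent crash comes first.
    return sorted(dumps, key=lambda path: path.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    # Pass your own Userdump Path as the first argument.
    if len(sys.argv) > 1:
        for dump in find_dumps(sys.argv[1]):
            print(dump)
```

Run it with the Userdump Path as the only argument. It only reads file names and timestamps; it does not open or analyze the dumps.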

If the report does not point you to the cause, please visit the following link and choose the support option best suited to your needs. We will be more than happy to assist you!
http://www.microsoft.com/services/microsoftservices/srv_support.mspx

Hope that helps!

DebugDiagCrash.GIF