Exception Management Best Practices

Applies to: Windows Communication Foundation

Published: June 2011

Author: Alex Culp


This topic contains the following sections.

  • Introduction
  • Faults
  • Exception Management Best Practices

Introduction

This article focuses on practical approaches to exception management that will reduce development and troubleshooting time. The foundation for good exception management is the Dependency Injection (DI) design pattern. The techniques in this article rely on that pattern, as discussed in the following article: www.microsoft.com.

If you rely on DI to manage exceptions, SOAP faults, and validation, the developers on your team do not have to write complex exception logic in their service operations. When DI is implemented correctly, the service code has little need for try/catch blocks. (Data access code is an exception to this rule. There, you should capture all of the parameters that are passed to stored procedures, SQL statements, and so on.)

The following figure illustrates how, with DI, validation and exception management for the MyService implementation are performed before the service operation is called.

[Figure: With DI, validation and exception management run before the MyService operation is called.]

The following code is an example of how complex the code for exception management and validation can be without DI.

public void SomeOperationWithExceptionHandling(MyRequest request)
{
    //validate fields on request object and 
    //throw fault if invalid request data 
    try
    {
        //Do some work
    }
    catch (Exception ex)
    {
        //Write a bunch of exception logic here
        //and duplicate across every operation
    }
}

The following code shows how much simpler the same operation is when DI handles validation and exception management.

public void SomeOperationWithExceptionHandling(MyRequest request)
{
    //Do some work
}

Faults

This article assumes that you have a basic understanding of how faults and fault contracts work in WCF. For more information about faults in WCF, see "Specifying and Handling Faults in Contracts and Services" at https://msdn.microsoft.com/en-us/library/ms733721.aspx. The two basic types of faults that can occur in a service implementation are receiver (service-side) faults and sender (client-side) faults. A receiver fault signals that an unexpected failure occurred in the service. Receiver faults fall under the category of exception management. A sender fault tells the client that the client itself caused the problem, for example by sending invalid data. Sender faults fall under the category of validation. This article explains how to handle both kinds of fault.
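The distinction between the two fault types can be sketched with the WCF FaultCode factory methods. This is a minimal illustration, not the approach used later in this series: the ServiceFaults class, the fault names, and the urn:example:faults namespace are hypothetical.

```csharp
using System.ServiceModel;

// Hypothetical helper; the class name, fault names, and namespace
// are illustrative assumptions and do not come from the article.
public static class ServiceFaults
{
    private const string FaultNamespace = "urn:example:faults";

    // Sender fault: the client sent a request that failed validation.
    public static FaultException InvalidRequest(string reason)
    {
        return new FaultException(
            reason,
            FaultCode.CreateSenderFaultCode("InvalidRequest", FaultNamespace));
    }

    // Receiver fault: the service failed unexpectedly while processing.
    public static FaultException InternalError(string reason)
    {
        return new FaultException(
            reason,
            FaultCode.CreateReceiverFaultCode("InternalError", FaultNamespace));
    }
}
```

A validation component would throw ServiceFaults.InvalidRequest(...), while an exception handler at the service boundary would throw ServiceFaults.InternalError(...).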

Exception Management Best Practices

Capture the Details

The more detail that you can capture when an error occurs, the easier it will be to resolve the issue. In a production environment, you may not be able to attach a debugger to the service to see what is going on. Instead, you must rely entirely on the data that you capture when the error occurs. Log all of the parameters to your service when an error occurs. Because it is a service, all of its parameters can be serialized to XML.

Note

If you capture all of the parameters to your service, consider whether any of them contain sensitive data. For example, you probably should not capture credit card numbers or social security numbers when an error occurs.
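As a rough sketch of capturing a request for the error log while keeping sensitive values out of it, the following code serializes a masked copy of a request with DataContractSerializer. The PaymentRequest type, its members, and the RequestLogFormatter class are hypothetical names chosen for illustration.

```csharp
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical request type; not part of the original article.
[DataContract]
public class PaymentRequest
{
    [DataMember] public string CustomerId { get; set; }
    [DataMember] public string CardNumber { get; set; }
}

public static class RequestLogFormatter
{
    // Returns the request as XML with the card number masked,
    // so that the real value is never written to a log.
    public static string ToLoggableXml(PaymentRequest request)
    {
        var safeCopy = new PaymentRequest
        {
            CustomerId = request.CustomerId,
            CardNumber = "****"
        };

        var serializer = new DataContractSerializer(typeof(PaymentRequest));
        var output = new StringBuilder();
        var settings = new XmlWriterSettings { Indent = true };

        using (XmlWriter writer = XmlWriter.Create(output, settings))
        {
            serializer.WriteObject(writer, safeCopy);
        }

        return output.ToString();
    }
}
```

Masking a copy, rather than the original object, leaves the request that the operation works with unchanged.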

Log Errors Only at Service Boundaries

It is a good practice to log an exception at the service boundary only. In other words, log the error just before you send the fault exception back to the client. There are two reasons for this. First, if you log exceptions in other places, the end result may be error-handling logic that is scattered throughout the code. If you log at the service boundary, you need only one try/catch block. Code that is in a single place is less complex and easier to test. Second, you can take advantage of techniques such as aspect-oriented programming (AOP) or Unity Interception to perform error handling. This modular approach again means that you can contain the error-handling logic in a single place in your service. (For information on capturing database information, which is an exception to this advice, see later in this article.) For information on AOP, see http://en.wikipedia.org/wiki/Aspect-oriented_programming. For information on Unity Interception, see "Using Interception with Unity" at https://msdn.microsoft.com/en-us/library/ff647107.aspx.
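The idea of a single try/catch block at the boundary can be sketched without an interception framework. The following wrapper is only an illustration of the principle; the ServiceBoundary class, the log callback, and the fault name and namespace are assumptions, and a real implementation would more likely live in an interceptor or behavior.

```csharp
using System;
using System.ServiceModel;

// Hypothetical wrapper; names are illustrative and not from the article.
public static class ServiceBoundary
{
    // Runs an operation, logs any unexpected exception exactly once,
    // and converts it to a receiver fault for the client.
    public static TResult Execute<TResult>(
        Func<TResult> operation, Action<Exception> log)
    {
        try
        {
            return operation();
        }
        catch (FaultException)
        {
            // Already a fault (for example, a validation fault); pass it on.
            throw;
        }
        catch (Exception ex)
        {
            log(ex);
            throw new FaultException(
                "An unexpected error occurred.",
                FaultCode.CreateReceiverFaultCode(
                    "InternalError", "urn:example:faults"));
        }
    }
}
```

With a wrapper like this, the operation body contains only business logic, and the client never sees internal exception details.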

Logging Errors in the Database

In addition to writing errors to the event log, it is also very helpful to write errors to a database. There are two reasons for this.

  • Reporting and trend analysis - If errors are logged to a database, management can track error rates, analyze trends, or monitor service-level agreements (SLAs).

  • Troubleshooting - Often, one error can cascade into a series of errors. If the errors are in a database, you can write queries to put the pieces of the puzzle together. It can be just as important to understand the sequence of events that occurred during a failure as it is to understand the individual errors. The relation of one event to another can be difficult to discover from an event log.

Note

It is a good idea to write errors to a different database cluster than the one that hosts the application's transactional databases. If the error is caused by problems with a transactional database (such as SQL timeouts from a missing index), writing errors to the same database can compound the problem.
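A minimal sketch of writing an error record to a separate logging database might look like the following. The ServiceErrorLog table, its columns, and the ErrorDatabaseLogger class are hypothetical; the connection string is assumed to point at the logging database rather than the transactional one.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical logger; the table and column names are assumptions.
public static class ErrorDatabaseLogger
{
    private const string InsertSql =
        "INSERT INTO ServiceErrorLog " +
        "(OccurredUtc, Operation, ExceptionType, Message, Detail) " +
        "VALUES (@occurredUtc, @operation, @exceptionType, @message, @detail)";

    // Builds a parameterized command that describes the error.
    public static SqlCommand BuildCommand(string operation, Exception error)
    {
        var command = new SqlCommand(InsertSql);
        command.Parameters.AddWithValue("@occurredUtc", DateTime.UtcNow);
        command.Parameters.AddWithValue("@operation", operation);
        command.Parameters.AddWithValue("@exceptionType", error.GetType().FullName);
        command.Parameters.AddWithValue("@message", error.Message);
        command.Parameters.AddWithValue("@detail", error.ToString());
        return command;
    }

    // Writes the error to the logging database.
    public static void Write(
        string loggingConnectionString, string operation, Exception error)
    {
        using (var connection = new SqlConnection(loggingConnectionString))
        using (SqlCommand command = BuildCommand(operation, error))
        {
            command.Connection = connection;
            connection.Open();
            command.ExecuteNonQuery();
        }
    }
}
```

Storing a UTC timestamp with each row makes it possible to order rows and reconstruct the sequence of events during a cascading failure.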

Management Tools

When an error occurs at three in the morning, how do you know if someone needs to get out of bed and resolve the issue, or if it is something that can wait until normal working hours? In any large enterprise implementation, there are going to be many errors. Some are benign, while others may indicate serious problems, such as database failures. Even with proper testing, production errors are inevitable. Management tools such as System Center Operations Manager (SCOM) enable you to categorize errors and to take the appropriate actions. For benign errors, you may simply create a defect and monitor it. For more serious errors, someone might be paged in the middle of the night. SCOM and other management tools can also filter out some of the noise. For example, if the same error occurs five thousand times, you do not want to send five thousand emails and bring down your mail server. For more information about SCOM, see the Microsoft® System Center site at https://www.microsoft.com/systemcenter/en/us/operations-manager.aspx.

Unique Event IDs

Depending on your organization, you might have a first-tier support level that monitors the event logs. If you do not want to receive unnecessary pages, define what constitutes a critical issue and what is less important. Although SCOM can be configured with rules that are based on the content of an individual event log entry, your support team will still route every error message directly to the development team if clear rules are not in place. The easiest way to establish criteria is to use unique event IDs for specific errors or operations. These IDs, in combination with error messages that are marked with the appropriate severity level, make it much easier to manage errors in SCOM. You can use the Microsoft Enterprise Library configuration tool to control the event IDs according to the exception type or the exception policy. For more flexibility, you can also configure Enterprise Library programmatically. The following code shows how to configure Enterprise Library to associate a specific event ID with a specific exception type.

Visual C# Example of Configuration of Enterprise Library for Specific Event ID

// policy, eventId, and ex are assumed to be supplied by the calling code.
var builder = new ConfigurationSourceBuilder();
builder.ConfigureExceptionHandling()
    .GivenPolicyWithName(policy.ToString())
    .ForExceptionType<Exception>()
    .LogToCategory("General")
        .UsingEventId(eventId)
        .UsingTitle(ex.Message);

Visual Basic Example of Configuration of Enterprise Library for Specific Event ID

Dim builder = New ConfigurationSourceBuilder()
builder.ConfigureExceptionHandling().GivenPolicyWithName(policy.ToString()).
ForExceptionType(Of Exception)().LogToCategory("General").UsingEventId(eventId).UsingTitle(ex.Message)
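Independently of Enterprise Library, the mapping from exception types to unique event IDs can be sketched as a simple lookup. The EventIds class and the ID values below are hypothetical; an actual numbering scheme should be agreed on with the support team.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mapping; the class name and ID values are assumptions.
public static class EventIds
{
    // Used for any exception type that has no dedicated ID.
    public const int Unclassified = 9000;

    private static readonly Dictionary<Type, int> Map =
        new Dictionary<Type, int>
        {
            { typeof(TimeoutException), 9001 },
            { typeof(UnauthorizedAccessException), 9002 }
        };

    // Returns the event ID registered for the exact exception type,
    // or the Unclassified ID if the type is not in the map.
    public static int For(Exception error)
    {
        int id;
        return Map.TryGetValue(error.GetType(), out id) ? id : Unclassified;
    }
}
```

The returned value could then be passed as the event ID when the error is written to the event log, so that monitoring rules can key on the ID rather than on message text.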
