October 2013

Volume 28 Number 10

.NET Framework - Adaptive Access Layers + Dependency Injection = Productivity

By Ulrik Born | October 2013

One of the toughest challenges when managing and developing the source code of a complex enterprise solution is to ensure the codebase remains consistent, intuitive, and highly testable while being maintained and extended by multiple teams within a large software development department. The core problem is that developers typically have a number of rules and guidelines to follow and a set of utility libraries to use, which tend to cause their solutions to become bigger and more complex. This is because they end up tweaking the ideal, intuitive business logic implementation to make it adhere to rules and fit fixed APIs. This generally means more work for the developers, more bugs, less standardization, less reuse, and reduced overall productivity and quality.

I’m a senior developer for a leading online investment bank and have observed how these challenges can limit productivity. This article is a case study that presents how our development team has analyzed and overcome the challenges by innovative use of runtime code generation and Dependency Injection (DI). You might not agree with some of our team’s design choices, but I believe you’ll agree they represent a fresh and efficient way of addressing some common architectural challenges.

My company has a big in-house software development department (working on two continents) that constantly maintains and extends our huge Microsoft .NET Framework codebase. Our codebase is focused on numerous mission-critical Windows services that make up the high-performance, low-latency trading system hosted in our datacenters. We have a number of platform teams that guard the codebase and runtime environment, plus many project teams that continuously (and in parallel) improve and extend the system.

I’ve worked on platform teams for several years and experienced the downside of an inconsistent and overly complex codebase during numerous reviews and support cases. Two years ago, we decided to address these issues, and I found the following problems:

  • We had way too many solutions to the same fundamental problems. A good example is that most of our Windows services had their own unique way of combining the various APIs into a simple service with proper support for logging, tracing, database access and so on.
  • Our business logic implementations were either simple (but not unit-testable, too naïve and not adhering to guidelines) or overly complex due to a lot of plumbing code. A common example: simple code that worked directly on the .NET SQL Server API versus complex code that spent far more lines on trivial plumbing to support automatic retries, caching and so on than on the real business logic.
  • We had utility libraries supporting most of our architectural principles and coding guidelines, but they were implemented in several different styles and evolved independently. So even when using them as dictated by the guidelines, each feature solution ended up having a huge footprint in terms of referenced assemblies and exposure toward API changes. This in turn made it a complex task in itself to put a new feature in production and also made it difficult to update the utility libraries.
  • The overall set of guidelines and rules to apply and utilities to use was simply so big that only our most experienced developers had a fair chance of understanding everything, and the entry barrier for new developers was extremely high. This meant a lot of nonstandard code was written, either to be discarded later or to reach production with increased inconsistency.
  • Several of our core Windows services had central “registration points” where all project teams had to touch the same code—for example, a big switch statement dispatching commands or jobs. This made it nontrivial to merge code into our main branch.

Naturally, these problems were neither new nor unique to us, and a number of well-established design patterns describe how to address such problems:

  • The façade pattern hides all the details of accessing a complex resource behind a simple access layer interface. This facilitates clean and testable business logic implementations where external resources can be easily mocked out in tests.
  • DI—or Inversion of Control (IoC) containers—allows components to be loosely coupled and therefore easier to extend, maintain and combine. This technique also makes it easy to mock out selected components and thereby increase testability.
  • Well-designed utility library APIs don’t force the consuming code to be tweaked; instead, they support intuitive implementation.

We’d been aware of these patterns for years and had also applied them all in various forms throughout our codebase. But a few fundamental problems had significantly limited the success of these patterns. First, the façade pattern doesn’t eliminate the need for a lot of plumbing code—it just moves it into another class and generally just means more work for the developer. Second, unless the DI container automatically discovers its components at run time (for example, via attributes), it will still require a central registration and will in reality just introduce an additional layer into the implementation. And, finally, it’s costly and extremely difficult to design and implement APIs that are intuitive, flexible and useful at the same time.

Why We Created Adaptive Access Layers

After a number of brainstorming sessions, we came up with a single flexible and powerful solution to all of these problems. The basic idea is to hide all APIs behind attributed access-layer interfaces and build an implementation engine that can implement such interfaces at run time in a way that follows all rules and guidelines. We call this technique Adaptive Access Layers (AAL) because each solution defines the access-layer interfaces it needs in a highly flexible way. We’ve combined the AAL implementation engine with the open source, attribute-driven Autofac DI container and achieved a flexible Windows service framework that makes clean, intuitive and testable implementations the easiest option. Figure 1 illustrates how the size, footprint and complexity of a solution are reduced dramatically when AAL is used to decouple the core logic implementation from all the surrounding libraries and APIs. The blue boxes represent the footprints of a single solution, implemented with AAL (on the left) and without (on the right).

Figure 1 Adaptive Access Layers (Depicted on the Left) Dramatically Reduce Solution Complexity and Footprint

A key feature of the AAL technique is that it gives us a common central place to implement our best practices and guidelines without polluting the business logic code. In that respect, AAL is similar to aspect-oriented programming (AOP), various interception features and proxy techniques. The main difference is that AAL hides the underlying APIs from the consuming business logic code, whereas the other techniques still expose them and thus increase solution footprint significantly.

To illustrate the idea, I’ll discuss a simple access layer between some business logic and the standard Windows event log. Consider an order registration service that registers incoming orders in a database. If the database call fails, the service must write an error to the event log.

In the classic approach, this could involve a call to the .NET EventLog.WriteEntry method and could look like the code in Figure 2. This approach isn’t optimal for two reasons. First, it isn’t well-suited for unit testing, as the test would have to inspect the event log on the machine running the unit tests to validate that an entry with the correct text was actually written. And second, four lines of trivial plumbing code will “pollute” a core part of the business logic.

Figure 2 Classic Event Log Access

public OrderConfirmation RegisterOrder(Order order)
{
  try
  {
    // Call database to register order and return confirmation
  }
  catch (Exception ex)
  {
    string msg = string.Format("Order {0} not registered due to error: {1}",
      order.Id,
      ex.Message);
    _eventLog.WriteEntry(msg, EventLogEntryType.Error, 1000);
  }
}

Both of these problems are addressed by introducing an AAL interface between the business logic and the underlying EventLog class. Such a layer is illustrated in the following code:

[EventLogContract("OrderService")]
public interface IOrderServiceEventLog
{
    [EventEntryContract(1000, EventLogEntryType.Error,
        "Order {0} not registered due to error: {1}")]
    void OrderRegistrationFailed(int orderId, string message);
}

The layer is defined by the attributed interface IOrderServiceEventLog that’s implemented via a dynamic class by the implementation engine at run time. The interface itself has the [EventLogContract] attribute to allow the implementation engine to recognize it as an event log access layer. The single parameter is the name of the event log to target. There are no restrictions on the name of the interface or the number of methods on it. Each method must return void (there’s no meaningful return value when writing information to the event log) and have the [EventEntryContract] attribute. The attribute takes all the fixed metadata input (id, severity and formatting) as parameters so that this no longer needs to be in the business logic.

Using the access-layer interface, the business logic from Figure 2 becomes a lot smaller and clearer:

public OrderConfirmation RegisterOrder(Order order)
{
    try  
    {
        // Call database to register order and return confirmation
    }
    catch (Exception ex)  
    {
        _logLayer.OrderRegistrationFailed(order.Id, ex.Message);   
    }
}

The sample RegisterOrder method is now simple, highly readable and a lot more testable, because validation no longer requires inspection of the event log but instead just a small mock class implementing the access-layer interface. Another advantage is that the IOrderServiceEventLog interface can map the full event log interaction by the entire order service and thus provide a simple yet complete overview of what event log entries the system writes.
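To make the testability claim concrete, here is a minimal sketch of such a mock. It is an illustration under assumptions, not the production code: the AAL attributes are left out so the sketch compiles without the framework, the database call is replaced by a hypothetical delegate, and returning null after a failure is an assumption, because the listing above does not show what the method returns in that case.

```csharp
using System;
using System.Collections.Generic;

public class Order { public int Id { get; set; } }
public class OrderConfirmation { public int OrderId { get; set; } }

// Same shape as the access-layer interface above, minus the AAL attributes.
public interface IOrderServiceEventLog
{
    void OrderRegistrationFailed(int orderId, string message);
}

// Hand-written mock: records calls instead of writing to the event log.
public sealed class RecordingEventLog : IOrderServiceEventLog
{
    public List<Tuple<int, string>> Failures { get; } = new List<Tuple<int, string>>();

    public void OrderRegistrationFailed(int orderId, string message)
    {
        Failures.Add(Tuple.Create(orderId, message));
    }
}

public sealed class OrderService
{
    private readonly IOrderServiceEventLog _logLayer;

    // Hypothetical stand-in for the database call.
    private readonly Func<Order, OrderConfirmation> _registerInDatabase;

    public OrderService(IOrderServiceEventLog logLayer,
        Func<Order, OrderConfirmation> registerInDatabase)
    {
        _logLayer = logLayer;
        _registerInDatabase = registerInDatabase;
    }

    public OrderConfirmation RegisterOrder(Order order)
    {
        try
        {
            return _registerInDatabase(order);
        }
        catch (Exception ex)
        {
            _logLayer.OrderRegistrationFailed(order.Id, ex.Message);
            return null; // Assumption: no confirmation is available on failure.
        }
    }
}
```

A unit test can pass a delegate that throws, call RegisterOrder, and then assert that RecordingEventLog.Failures holds exactly one entry with the expected order id and message. No event log on the test machine is touched.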

(Note: As a brief aside, the recent Semantic Logging Application Block [SLAB] over the new .NET 4.5 EventSource class embraces the same ideas of moving metadata from code into attributes and exposing customized, strongly typed logging methods instead of a few general-purpose methods. To use SLAB, developers must implement a custom class derived from the EventSource class and use this class throughout the codebase. I believe our approach is as powerful as SLAB but easier to use, as it only requires developers to define the interface and not the class implementation. A key feature of the EventSource class is that it supports structured event logging via a configurable set of sinks. Our access-layer implementation doesn’t currently support structured logging but could easily be extended to do so because it has access to the structured information via the access-layer method parameters.)

I haven’t yet considered the real body of the RegisterOrder method, namely the call to some SQL Server database stored procedure to persist the order for further processing. If my team implemented this using the .NET SqlClient API, it would be at least 10 lines of trivial code to create a SqlConnection instance and SqlCommand instance, populate the command with parameters from the Order properties, execute the command and read back the result set. If we were to meet additional requirements such as automatic retries in the case of database deadlocks or time-outs, we could easily end up with 15 to 20 lines of code just to make a fairly simple call. And all of this would be required just because the target of the call happened to be a stored procedure rather than an in-process .NET method. From a business logic perspective, there’s absolutely no reason why our core implementation should be so cluttered and complex just because the processing crosses from one system to another.

By introducing an adaptive database access layer similar to the event log access layer, we can make the body simple and testable:

public OrderConfirmation RegisterOrder(Order order)
{
  try
  {
    return _ordersDbLayer.RegisterOrder(order);
  }
  catch (Exception ex)
  {
    _logLayer.OrderRegistrationFailed(order.Id, ex.Message);
  }
}

So far, I’ve illustrated the ideas, flexibility and power of AAL. I’ll now move on with a more detailed walk-through of the access layers we’ve developed and found useful. I’ll start with the aforementioned database access layer.

Database Access Layers

Database access is a central part of most enterprise systems, including ours. Being a key facilitator of serious online trading of financial instruments, we have to meet some strict performance and security requirements demanded by our clients and the financial authorities, and are therefore forced to safeguard our databases carefully. We generally do this by only doing database access via stored procedures, as that lets us apply fine-grained security rules and review all database queries for performance and server load before they hit our production systems.

We’ve carefully evaluated whether object-relational mapping (ORM) tools such as the Entity Framework could help us achieve simpler and more testable code without moving away from stored procedures. Our conclusion was that the Entity Framework is an extremely appealing solution, but it relies heavily on being able to compose and execute complex SQL statements at run time. It can map stored procedures, but when limited to mapping only stored procedures, it loses most of its benefits. For that reason, we decided to implement our own database-access framework as an adaptive database access layer.

Our implementation supports stored procedure calls, selects on views and effective mass inserts via the SQL Server bulk copy functionality. It can map input data directly from Data Transfer Object (DTO) class properties to stored procedure parameters and can likewise map result set columns into class properties. This facilitates a clear and direct syntax when doing database access in .NET code.

The following code shows a simple layer that suits the sample order registration service:

[DatabaseContract("Orders")]
public interface IOrdersDatabase
{
  [StoredProcedureContract("dbo.RegisterOrder",
    Returns=ReturnOption.SingleRow)]
  OrderConfirmation RegisterOrder(Order order);
}

This code maps a single stored procedure and turns the single row in the result set into an OrderConfirmation instance initialized from the result set columns. The parameters of the mapped stored procedure are set from the properties of the given Order instance. This mapping behavior is defined in the [StoredProcedureContract] attribute and thus is no longer required in the business logic implementation, which keeps that implementation clear and readable.
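The property-to-parameter mapping can be pictured with a small reflection sketch. This is an assumption about one way such a mapping might work, not the actual AAL code: the ParameterMapper class, the "@PropertyName" naming convention and the sample Order properties are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ParameterMapper
{
    // Maps every public, readable instance property of the DTO to a
    // parameter named "@PropertyName" (a hypothetical naming convention).
    public static IDictionary<string, object> ToParameters(object dto)
    {
        if (dto == null) throw new ArgumentNullException(nameof(dto));

        var parameters = new Dictionary<string, object>();
        PropertyInfo[] properties =
            dto.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance);

        foreach (PropertyInfo property in properties)
        {
            // Skip write-only properties and indexers.
            if (!property.CanRead || property.GetIndexParameters().Length > 0) continue;

            // Database APIs expect DBNull.Value rather than a null reference.
            parameters["@" + property.Name] = property.GetValue(dto, null) ?? DBNull.Value;
        }
        return parameters;
    }
}

// Hypothetical DTO used only for illustration.
public class Order
{
    public int Id { get; set; }
    public string Instrument { get; set; }
    public decimal? LimitPrice { get; set; }
}
```

Each entry of the returned dictionary could then be copied onto a SqlCommand as a parameter. A real engine would more likely inspect the stored procedure once and cache the mapping instead of reflecting on every call.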

We’ve implemented some quite advanced features in the database access layer because we concluded it’s a simple and efficient way of offering standard functionality to our developers without restricting their freedom to implement their business logic in the most natural and intuitive way.

One of the supported features is seamless support for bulk inserting rows via the SQL bulk copy functionality. Our support allows our developers to define a simple method that takes an enumerable collection of a DTO class representing the rows to insert as input. The access layer handles all the details and thereby relieves the business logic of 15 to 20 lines of complex, database-centric code. This bulk copy support is a perfect example of a conceptually simple operation—to insert rows into a table in an efficient way—that normally ends up being rather complex to implement simply because the underlying .NET Framework SqlBulkCopy class happens to work on an IDataReader rather than directly on our DTO class.
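To give a feel for the plumbing that such a layer hides, the following sketch converts an enumerable of DTOs into a DataTable, which SqlBulkCopy.WriteToServer also accepts. It deliberately uses a DataTable rather than the IDataReader mentioned above because that keeps the example short; the BulkCopyHelper and TradeRow names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Reflection;

public static class BulkCopyHelper
{
    // Builds a DataTable with one column per public property of T
    // and one row per DTO instance.
    public static DataTable ToDataTable<T>(IEnumerable<T> rows)
    {
        PropertyInfo[] properties =
            typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance);
        var table = new DataTable(typeof(T).Name);

        foreach (PropertyInfo property in properties)
        {
            // DataTable columns cannot be Nullable<T>; use the underlying type.
            Type columnType =
                Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType;
            table.Columns.Add(property.Name, columnType);
        }

        foreach (T row in rows)
        {
            var values = new object[properties.Length];
            for (int i = 0; i < properties.Length; i++)
            {
                values[i] = properties[i].GetValue(row, null) ?? DBNull.Value;
            }
            table.Rows.Add(values);
        }
        return table;
    }
}

// Hypothetical DTO representing the rows to insert.
public class TradeRow
{
    public int TradeId { get; set; }
    public string Instrument { get; set; }
    public decimal? Price { get; set; }
}
```

The resulting table could be handed to a SqlBulkCopy instance whose DestinationTableName is set to the target table. Buffering everything in a DataTable costs memory for large batches, which is one reason a streaming IDataReader adapter is the more scalable choice.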

The database access layer was the first one we implemented, and it’s been a huge success from the beginning. Our experience is that we write fewer and simpler lines of code with it and our solutions naturally become highly unit testable. Based on these positive results, we quickly realized we could benefit from introducing AAL between our business logic code and several other external resources.

Service Access Layers

Our trading system implementation is highly service-oriented, and robust inter-service communication is essential for us. Our standard protocol is Windows Communication Foundation (WCF) and we have a lot of code focused on making WCF calls.

Most of these implementations follow the same overall pattern. First, the addresses of the endpoints are resolved (we typically run our services either in active-active or active-passive setups). Then the ChannelFactory .NET class is used to create a channel class implementation on which the desired method is invoked. If the method succeeds, the channel is closed and disposed, but if it fails, the exception has to be inspected. In some cases it makes sense to retry the method on the same endpoint, while in other scenarios it’s better to do an automatic failover and retry on one of the other available endpoints. On top of this, we often want to quarantine a failed endpoint for a short time so as to not overload it with connection attempts and failing method calls.

It’s far from trivial to write a correct implementation of this pattern, and it can easily take 10 to 15 lines of code. And again, this complexity is introduced only because the business logic we need to call happens to be hosted in another service and not in-process. We’ve implemented an adaptive service access layer to eliminate this complexity and make it as simple and safe to call a remote method as it is to call an in-process method.
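The failover-and-quarantine part of the pattern can be sketched without any WCF types. The FailoverInvoker class below is hypothetical and simplified: it treats every exception as a reason to fail over (a real implementation would inspect the exception first), it does not retry on the same endpoint, and it is not thread-safe.

```csharp
using System;
using System.Collections.Generic;

public sealed class FailoverInvoker
{
    private readonly IList<string> _endpoints;
    private readonly TimeSpan _quarantine;
    private readonly Func<DateTime> _clock;
    private readonly Dictionary<string, DateTime> _quarantinedUntil =
        new Dictionary<string, DateTime>();

    // The clock is injectable so that tests can control time.
    public FailoverInvoker(IList<string> endpoints, TimeSpan quarantine,
        Func<DateTime> clock = null)
    {
        _endpoints = endpoints;
        _quarantine = quarantine;
        _clock = clock ?? (() => DateTime.UtcNow);
    }

    // Tries each endpoint in order, skipping quarantined ones, and
    // returns the first successful result.
    public T Invoke<T>(Func<string, T> call)
    {
        Exception lastError = null;
        foreach (string endpoint in _endpoints)
        {
            DateTime until;
            if (_quarantinedUntil.TryGetValue(endpoint, out until) && until > _clock())
            {
                continue; // Still quarantined after an earlier failure.
            }

            try
            {
                return call(endpoint);
            }
            catch (Exception ex)
            {
                lastError = ex;
                _quarantinedUntil[endpoint] = _clock() + _quarantine;
            }
        }
        throw new InvalidOperationException("No endpoint available.", lastError);
    }
}
```

In a WCF setting, the delegate passed to Invoke would create a channel for the given endpoint address, call the operation, and close or abort the channel. Even this reduced version shows why the pattern does not belong in every piece of business logic.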

The principles and workflow are identical to those of the database access layer. The developer writes an attributed interface that maps only the methods she needs to call, and our implementation engine creates a runtime type that implements the interface with the best practice behavior as specified in the attributes.

The following code shows a small service access layer that maps a single method:

[ServiceAccessLayer(typeof(IStatisticsSvc),
  "net.tcp", "STATISTICS_SVC")]
public interface ISalesStatistics
{
  [ServiceOperationContract("GetTopSellingItems")]
  Product[] GetTopProducts(int productCategory);
}

The interface attribute identifies the underlying plain [ServiceContract] attributed interface (to be used with the inner call to ChannelFactory), the protocol to use and the id of the service to call. The latter is used as a key into our service locator to resolve the actual endpoint addresses at call time. The access layer will by default use the default WCF binding for the given protocol, but this can be customized by setting additional properties on the [ServiceAccessLayer] attribute.

The single parameter to the [ServiceOperationContract] attribute is the action verb that identifies the mapped method in the underlying WCF service contract. Other optional parameters to the attribute specify whether service call results should be cached and whether it’s always safe to automatically fail over the operation even if the first WCF endpoint call fails after code has been executed on the target service.

Other Access Layers

We’ve also built similar AAL for trace files, performance counters and our message bus. They’re all based on the same principles illustrated by the preceding examples: to allow the business logic to express its resource access in the simplest possible way by moving all metadata into attributes.

Dependency Injection Integration

With access layers, our developers no longer need to implement a lot of trivial plumbing code, but there still must be a way to invoke the AAL implementation engine to obtain an instance of the runtime implemented type whenever the mapped external resource has to be called. The implementation engine can be called directly, but that would go against our principles of keeping business logic clean and testable.

We’ve addressed this problem by registering our implementation engine as a dynamic registration source with Autofac such that it gets called whenever Autofac can’t resolve a dependency with any of the static registrations. In this case, Autofac will ask the implementation engine whether it can resolve a given combination of type and id. The engine will inspect the type and provide an instance of the type if the type is an attributed access-layer interface.

With this in place, we’ve established an environment where business logic implementations simply can declare their access-layer interface types and take them as parameters (for example, in class constructors) and then trust that the DI container will be able to resolve these parameters by invoking the implementation engine behind the scenes. Such implementations will naturally work on interfaces and be easy to test as it only takes a few mock classes to implement these interfaces.
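The fallback behavior described above can be illustrated with a toy resolver. This is not Autofac and does not use its API (Autofac exposes the real mechanism through its registration source extension point); FallbackResolver, AccessLayerAttribute, StubEngine and NullEventLog are hypothetical names, and the stub engine returns a hand-written class where the real engine would emit one.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical marker standing in for attributes such as [EventLogContract].
[AttributeUsage(AttributeTargets.Interface)]
public sealed class AccessLayerAttribute : Attribute { }

[AccessLayer]
public interface IOrderServiceEventLog
{
    void OrderRegistrationFailed(int orderId, string message);
}

public sealed class NullEventLog : IOrderServiceEventLog
{
    public void OrderRegistrationFailed(int orderId, string message) { }
}

public static class StubEngine
{
    // Stands in for the implementation engine: it only answers for
    // attributed access-layer interfaces and returns null otherwise.
    public static object TryImplement(Type type)
    {
        bool isAccessLayer =
            type.IsInterface && type.IsDefined(typeof(AccessLayerAttribute), false);
        return isAccessLayer && type == typeof(IOrderServiceEventLog)
            ? new NullEventLog()
            : null;
    }
}

public sealed class FallbackResolver
{
    private readonly Dictionary<Type, Func<object>> _registrations =
        new Dictionary<Type, Func<object>>();
    private readonly List<Func<Type, object>> _sources = new List<Func<Type, object>>();

    public void Register<T>(Func<T> factory) where T : class
    {
        _registrations[typeof(T)] = () => factory();
    }

    // A source returns an instance, or null if it cannot supply the type.
    public void AddSource(Func<Type, object> source)
    {
        _sources.Add(source);
    }

    public T Resolve<T>() where T : class
    {
        // Static registrations win; sources are consulted only as a fallback.
        Func<object> factory;
        if (_registrations.TryGetValue(typeof(T), out factory)) return (T)factory();

        foreach (Func<Type, object> source in _sources)
        {
            object instance = source(typeof(T));
            if (instance != null) return (T)instance;
        }
        throw new InvalidOperationException("Cannot resolve " + typeof(T).FullName);
    }
}
```

After a call to AddSource(StubEngine.TryImplement), resolving IOrderServiceEventLog succeeds without any explicit registration, while an explicit Register call for the same interface (for example, with a mock in a test) takes precedence.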

Implementation

All of our access layers are implemented using the same techniques. The overall idea is to implement all of the functionality in plain C# code in an abstract base class and then use Reflection.Emit only to generate a thin class that derives from the base class and implements the interface. The emitted body of each interface method simply forwards the execution to a general Execute method in the base class.

The signature of this general method is:

object Execute(Attribute, MethodInfo, object[], TAttributeData)

The first parameter is the method attribute of the access-layer interface method from which the Execute method is called. It generally holds all the metadata (for example, stored procedure name, retry specification and so on) needed for the Execute method to provide the correct runtime behavior.

The second parameter is the reflected MethodInfo instance for the interface method. It holds complete information about the implemented method, including the types and names of method parameters, and is used by the Execute method to interpret the third parameter. The third parameter holds the values of all arguments to the current interface method call. The Execute method typically forwards these values into the underlying resource API, for example, as stored procedure parameters.

The fourth parameter is a custom type that holds fixed data to be used at every invocation of the method in order to make it as efficient as possible. The fixed data are initialized once (by a method in the abstract base class) when the engine implements the runtime class. Our database access layers use this feature to inspect stored procedures once only and prepare a SqlCommand template ready for use when the method is invoked.

The Attribute and MethodInfo parameters passed to the Execute method are also reflected only once and reused in every method invocation, again to minimize the per-call overhead.

The return value of Execute is used as the return value for the implemented interface method.
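The forwarding technique can be sketched with System.Reflection.Emit as follows. This is a reduced illustration, not the AAL engine: Execute takes only the MethodInfo and the argument array (the attribute and the cached per-method data are omitted), the emitter does not handle ref/out parameters or generic interfaces, and the AccessLayerEmitter, EchoLayerBase and IEcho names are hypothetical. The interface and base class must be public so the dynamic assembly can see them.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public abstract class AccessLayerBase
{
    // Simplified stand-in for the general Execute method described above.
    public abstract object Execute(MethodInfo method, object[] args);
}

public static class AccessLayerEmitter
{
    private static readonly ModuleBuilder DynamicModule = AssemblyBuilder
        .DefineDynamicAssembly(new AssemblyName("EmittedAccessLayers"), AssemblyBuilderAccess.Run)
        .DefineDynamicModule("Main");

    // Emits a thin class that derives from TBase and implements TInterface.
    public static TInterface Implement<TInterface, TBase>()
        where TInterface : class
        where TBase : AccessLayerBase
    {
        TypeBuilder type = DynamicModule.DefineType(
            typeof(TInterface).Name + "_Impl_" + Guid.NewGuid().ToString("N"),
            TypeAttributes.Public | TypeAttributes.Sealed | TypeAttributes.Class,
            typeof(TBase), new[] { typeof(TInterface) });
        type.DefineDefaultConstructor(MethodAttributes.Public);

        MethodInfo execute = typeof(AccessLayerBase).GetMethod("Execute");
        MethodInfo fromHandle = typeof(MethodBase).GetMethod(
            "GetMethodFromHandle", new[] { typeof(RuntimeMethodHandle) });

        foreach (MethodInfo method in typeof(TInterface).GetMethods())
        {
            Type[] parameterTypes =
                Array.ConvertAll(method.GetParameters(), p => p.ParameterType);
            MethodBuilder body = type.DefineMethod(method.Name,
                MethodAttributes.Public | MethodAttributes.Virtual | MethodAttributes.Final |
                MethodAttributes.HideBySig | MethodAttributes.NewSlot,
                method.ReturnType, parameterTypes);
            ILGenerator il = body.GetILGenerator();

            il.Emit(OpCodes.Ldarg_0);                        // this
            il.Emit(OpCodes.Ldtoken, method);                // interface MethodInfo
            il.Emit(OpCodes.Call, fromHandle);
            il.Emit(OpCodes.Castclass, typeof(MethodInfo));
            il.Emit(OpCodes.Ldc_I4, parameterTypes.Length);  // new object[n]
            il.Emit(OpCodes.Newarr, typeof(object));
            for (int i = 0; i < parameterTypes.Length; i++)
            {
                il.Emit(OpCodes.Dup);
                il.Emit(OpCodes.Ldc_I4, i);
                il.Emit(OpCodes.Ldarg, (short)(i + 1));      // argument 0 is "this"
                if (parameterTypes[i].IsValueType) il.Emit(OpCodes.Box, parameterTypes[i]);
                il.Emit(OpCodes.Stelem_Ref);
            }
            il.Emit(OpCodes.Callvirt, execute);              // forward to Execute
            if (method.ReturnType == typeof(void)) il.Emit(OpCodes.Pop);
            else il.Emit(OpCodes.Unbox_Any, method.ReturnType);
            il.Emit(OpCodes.Ret);

            type.DefineMethodOverride(body, method);
        }
        return (TInterface)Activator.CreateInstance(type.CreateType());
    }
}

// Hypothetical base class and interface used to exercise the emitter.
public abstract class EchoLayerBase : AccessLayerBase
{
    public override object Execute(MethodInfo method, object[] args)
    {
        return method.Name + "(" + string.Join(", ", args) + ")";
    }
}

public interface IEcho
{
    string Describe(int orderId, string message);
}
```

Calling AccessLayerEmitter.Implement<IEcho, EchoLayerBase>() yields an object whose Describe method lands in EchoLayerBase.Execute with the reflected MethodInfo and the boxed arguments. All real behavior stays in ordinary C# in the base class, and only the trivial forwarding stub is written in IL.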

This structure is quite simple and has turned out to be both flexible and powerful. We’ve reused it in all our access layers via an abstract, common base AccessLayerBase class. It implements all the required logic to inspect an attributed interface and drive the process of emitting a new runtime class. Each access-layer category has its own specialized abstract base class derived from AccessLayerBase. It holds the actual implementation of accessing the external resource, for example, making a stored procedure call according to all our best practices. Figure 3 shows the implementation class hierarchy for a sample database access-layer interface. The blue section is the AAL framework; the red section is the attributed interface defined by the business logic feature solution; and the green section is the runtime class emitted by the AAL implementation engine.

Figure 3 An Access-Layer Implementation Outline

Figure 3 also illustrates how we’ve let the base classes implement a set of public interfaces (deriving from IAccessLayer) to expose key behavioral information. This isn’t intended to be used by business logic implementations but rather by infrastructure logic—for example, to track whenever a stored procedure invocation fails.

These access-layer interfaces are also useful in the few special cases where business or technical requirements demand that access to the underlying resource behind the access layer is done in a way not fully supported by AAL. With these interfaces, our developers can use AAL but intercept and adjust the underlying operations to meet special requirements. A good example of this is the IDatabaseAccessLayer.ExecutingCommand event. This is raised right before a SqlCommand is executed and allows us to customize it by altering things such as time-out values or parameters.

Reporting and Checking

The attributed AAL interfaces of a business logic solution also allow us to reflect the compiled binaries at build time and extract a number of useful reports. Our team has incorporated this in our Team Foundation Server (TFS) builds such that every build output now includes a few informative, small XML files.

Build-Time Reporting

The database access layer reports the complete list of all stored procedures, views and bulk inserts being accessed. We use this to simplify reviews and to check that all required database objects have been properly deployed and configured before we release the business logic.
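A report of this kind takes little more than reflection over the attributed interfaces. The sketch below is an assumption about how such a step might look, not the actual build task: the attribute is a simplified stand-in for the framework’s [StoredProcedureContract], and the XML element names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using System.Xml.Linq;

// Simplified stand-in for the framework attribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class StoredProcedureContractAttribute : Attribute
{
    public StoredProcedureContractAttribute(string name) { Name = name; }
    public string Name { get; private set; }
}

// Hypothetical access-layer interface to report on.
public interface IOrdersDatabase
{
    [StoredProcedureContract("dbo.RegisterOrder")]
    int RegisterOrder(int orderId);

    [StoredProcedureContract("dbo.CancelOrder")]
    void CancelOrder(int orderId);
}

public static class AccessLayerReport
{
    // Collects the distinct stored procedure names referenced by any
    // attributed interface method in the given assembly.
    public static IList<string> StoredProcedures(Assembly assembly)
    {
        return assembly.GetTypes()
            .Where(t => t.IsInterface)
            .SelectMany(t => t.GetMethods())
            .SelectMany(m => m.GetCustomAttributes(typeof(StoredProcedureContractAttribute), false))
            .Cast<StoredProcedureContractAttribute>()
            .Select(a => a.Name)
            .Distinct()
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToList();
    }

    // Renders the list as a small XML document (element names are made up).
    public static XElement ToXml(IEnumerable<string> procedureNames)
    {
        return new XElement("storedProcedures",
            procedureNames.Select(n => new XElement("procedure", new XAttribute("name", n))));
    }
}
```

A post-build step could load the compiled service assembly, call StoredProcedures on it and save the ToXml result next to the build output.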

Likewise, our event log access layer reports the full list of event log entries a service can generate. Our post-build steps take this information and transform it into a management pack for our Microsoft System Center Operations Manager production environment surveillance. This is smart because it ensures that Operations Manager is always up-to-date with proper information on how to best handle production issues.

Automated Microsoft Installer Packages

We’ve applied the same reflection techniques to harvest valuable input to the Microsoft Installer (MSI) packages we generate for our Windows services as the final step in our TFS builds. A key point for these packages is to install and configure event log and performance counters to ensure they match the business logic being deployed. The build extracts the event log names and performance counter definitions from the binaries and automatically generates an MSI package that installs these names and definitions.

Runtime Checking

One of the most common errors reported from our production environment used to be that a service had attempted to call a stored procedure that didn’t exist or existed with the wrong signature on the production database. These kinds of errors happened because we missed the deployment of all the required database objects when we deployed a Windows service to production. The critical issue here wasn’t the missing deployment itself, as that could be fixed fairly easily, but more the fact that the error didn’t happen during deployment but typically later on when the stored procedure was first called during business hours. We’ve used the reflection-based list of all accessed database objects to address this problem by letting our Windows services validate the existence and validity of all objects during service startup. The service simply runs through the list of objects, then queries the database for each one to check that it will be able to access the object when needed. This way, we’ve moved all such errors from business hours to deployment time, when it’s a lot safer and easier to fix them.
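The startup check itself can be very small once the list of object names exists. The following sketch assumes a hypothetical objectExists probe supplied by the caller and covers existence only, not signature validation; the StartupCheck class name is invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StartupCheck
{
    // objectExists is a hypothetical probe. Against SQL Server it could run
    // "SELECT OBJECT_ID(@name)" and report whether the result is non-NULL.
    public static IList<string> FindMissing(
        IEnumerable<string> objectNames, Func<string, bool> objectExists)
    {
        return objectNames.Where(name => !objectExists(name)).ToList();
    }

    // Fails fast during service startup if any required object is missing.
    public static void EnsureAllPresent(
        IEnumerable<string> objectNames, Func<string, bool> objectExists)
    {
        IList<string> missing = FindMissing(objectNames, objectExists);
        if (missing.Count > 0)
        {
            throw new InvalidOperationException(
                "Missing database objects: " + string.Join(", ", missing));
        }
    }
}
```

Calling EnsureAllPresent from the service’s startup path turns a missing stored procedure into an immediate startup failure instead of a call that fails later during business hours.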

I’ve listed these additional usages to illustrate a key benefit of AAL. With almost-complete information about service behavior easily available via reflection, a whole new dimension of intelligent reporting, building, automation and monitoring opens up. Our team has harvested a few of these benefits so far, but we see a number of additional interesting applications ahead.

Productivity and Quality

AAL, designed and implemented over the past two years, has proven to be an extremely powerful facilitator of higher developer productivity and increased solution quality for our company’s dev team. We’ve reduced our costs for preparing a new Windows service from weeks to hours and for extending existing services from days to minutes. This has improved our agility and thereby made it cheaper for us to develop our customer offerings.

Our access layers are suitable when implementing the vast majority of our business solutions. However, we do have a few special cases where they don’t fit—typically in highly configurable scenarios, such as when the name of the stored procedure to call is read from a configuration table and not known at compile time. Our team has deliberately chosen not to support such cases to avoid the additional framework complexity that would be introduced. Instead, we allow our developers to use the plain .NET APIs in such isolated cases.

The AAL solution itself isn’t large and has been developed within a few man-months over a two-year period. Thus, our initial investment hasn’t been very high and has already reached break-even status via lots of saved development and support hours.

Of course, the challenge of having a widely used and highly versatile platform is that it can become a single point of failure. We’ve mitigated this by having complete unit-test coverage of the AAL solution and by rolling out new versions of it in a governed service-by-service way. You could also argue that the AAL approach in itself introduces additional complexity into our system and forces our developers to learn a new abstraction layer. But we believe this is more than compensated for by the increased overall productivity and quality.

Another concern that we’re aware of is that we must keep focus on overall design and architecture and not just do Windows services for everything simply because that approach has become so cheap and easy. Sometimes a third-party solution or an IIS-hosted Windows Process Activation Service (WAS) offers a better overall solution, even though it adds to the diversity of our production environment.


Ulrik Born holds a Master of Science degree in Information Technology from the Technical University of Denmark and has been working for the past 15 years as developer and architect on the Windows platform. He’s a lead developer for Saxo Bank, an Internet-based investment bank.

Thanks to the following technical experts for reviewing this article: Jonas Gudjonsson (Saxo Bank), James McCaffrey (Microsoft), Grigori Melnik (Microsoft) and Fernando Simonazzi (Microsoft).

Jonas Gudjonsson, VP and enterprise architect, is part of the CTO Office responsible for the overall IT architecture of Saxo Bank, including the strategy, principles and guidelines of the four architectural domains.

Dr. James McCaffrey works for Microsoft at the Redmond, Wash., campus. He has worked on several Microsoft products including Internet Explorer and MSN Search. He’s the author of “.NET Test Automation Recipes” (Apress, 2006), and can be reached at jammc@microsoft.com.

Dr. Grigori Melnik is a Principal Program Manager on the Microsoft patterns & practices team. These days he drives the Microsoft Enterprise Library, Unity, CQRS Journey and NUI patterns projects. He also promotes the design for IT efficiency. Prior to that, he was a researcher and software engineer long enough ago to remember the joy of programming in Fortran. Grigori speaks around the world on the topics of code reuse, cloud computing, agile methods and software testing. He also serves on the IEEE Software Advisory board. He blogs at http://blogs.msdn.com/agile

Fernando Simonazzi is a software developer and architect with over 15 years of professional experience. He has been a contributor to Microsoft patterns & practices’ projects, including several releases of the Enterprise Library, Unity, CQRS Journey and Prism. Fernando is an Associate at Clarius Consulting.