August 2010

Volume 25 Number 08

Cutting Edge - Don’t Worry, Be Lazy

By Dino Esposito | August 2010

In software development, the term laziness has less to do with idleness than with delaying an expensive activity for as long as possible. Software laziness is still about doing things, but it means that any action takes place only when it’s needed to complete a certain task. In this regard, laziness is an important pattern in software development and can be successfully applied to a variety of scenarios, including design and implementation.

For example, one of the fundamental coding practices of the Extreme Programming methodology is summarized simply as “You Aren’t Gonna Need It,” which is an explicit invitation to be lazy and incorporate in the codebase only the features you need—and only when you need them.

On a different note, during the implementation of a class, you might want to be lazy when it comes to loading data from a source that isn’t so cheap to access. The lazy loading pattern, in fact, illustrates the commonly accepted solution of having a member of a class defined but kept empty until its content is actually required by some other client code. Lazy loading fits perfectly in the context of object-relational mapping (ORM) tools such as the Entity Framework and NHibernate. ORM tools are used to map data structures between the object-oriented world and relational databases. In this context, lazy loading refers to a framework’s ability to load, for example, Orders for a customer only when some code is trying to read the exposed Orders collection property on a Customer class.

Lazy loading, though, isn’t limited to specific implementation scenarios such as ORM programming. More generally, lazy loading is about not creating an instance of some data before it actually becomes useful. Lazy loading, in other words, is about having special factory logic that tracks what has to be created and creates it silently when the content is eventually requested.

In the Microsoft .NET Framework, we developers have long had to implement any lazy behavior manually in our classes. There’s never been built-in machinery to help with this task. Not until the advent of the .NET Framework 4, that is, where we can start leveraging the new Lazy<T> class.
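To make the "manual" approach concrete, here’s a minimal sketch of the kind of hand-written lazy property many of us used before Lazy<T> existed. The ManualCustomer class, its LoadOrders method and the LoadCount counter are hypothetical names made up for illustration; the null check is the whole trick, and note that this simple version makes no attempt at thread safety.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: names and the "expensive" loader are illustrative only.
public class ManualCustomer
{
    private IList<string> orders;               // stays null until first requested

    public int LoadCount { get; private set; }  // how many times the loader ran

    public IList<string> Orders
    {
        get
        {
            if (orders == null)
            {
                // Stand-in for an expensive call (database, Web service and so on)
                orders = LoadOrders();
            }
            return orders;
        }
    }

    private IList<string> LoadOrders()
    {
        LoadCount++;
        return new List<string> { "order-1", "order-2" };
    }
}

public static class ManualLazyDemo
{
    public static void Main()
    {
        var customer = new ManualCustomer();
        Console.WriteLine(customer.LoadCount);     // 0: nothing loaded yet
        Console.WriteLine(customer.Orders.Count);  // 2: first access triggers the load
        Console.WriteLine(customer.Orders.Count);  // 2: cached list is reused
        Console.WriteLine(customer.LoadCount);     // 1: the loader ran only once
    }
}
```

Every class that wants this behavior has to repeat the same check-then-create logic, which is exactly the boilerplate a built-in wrapper can take over.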

Meet the Lazy<T> Class

Lazy<T> is a special factory you use as a wrapper around an object of a given type T. The Lazy<T> wrapper represents a live proxy for a class instance that doesn’t yet exist. There are many reasons for using such lazy wrappers, the most important of which is improving performance. With lazy initialization of objects, you avoid any computation that isn’t strictly needed, and you don’t allocate memory for objects that are never used. If applied appropriately, lazy instantiation of objects can also be a formidable tool to make applications start much faster. The following code shows how to initialize an object in a lazy manner:

var container = new Lazy<DataContainer>();

In this example, the DataContainer class indicates a plain data container object that references arrays of other objects. Right after invoking the new operator on a Lazy<T> instance, all you have back is a live instance of the Lazy<T> class; in no case will you have an instance of the specified type T. If you need to pass an instance of DataContainer to members of other classes, you must change the signature of these members to use Lazy<DataContainer>, like this:

void ProcessData(Lazy<DataContainer> container);

When does the actual instance of DataContainer get created so that the program can work with the data it needs? Let’s have a look at the public programming interface of the Lazy<T> class. The public interface is fairly slim as it includes only two properties: Value and IsValueCreated. The property Value returns the current value of the instance associated with the Lazy type, if any. The property is defined as follows:

public T Value 
{
  get { ... }
}

The property IsValueCreated returns a Boolean value and indicates whether or not the instance wrapped by the Lazy type has been created. Here’s an excerpt from its source code:

public bool IsValueCreated
{
  get
  {
    return ((m_boxed != null) && (m_boxed is Boxed<T>));
  }
}

The m_boxed member is an internal private and volatile member of the Lazy<T> class that contains the actual instance of the T type, if any. Therefore, IsValueCreated simply checks whether a live instance of T exists and returns a Boolean answer. As mentioned, the m_boxed member is private and volatile, as shown in this snippet:

private volatile object m_boxed;

In C#, the volatile keyword indicates a member that can be modified by a concurrently running thread. The volatile keyword is used for members that are available in a multithread environment but lack (essentially for performance reasons) any protection against possible concurrent threads that could access such members at the same time. I’ll return to the threading aspects of Lazy<T> later. For now, suffice it to say that public and protected members of Lazy<T> are thread-safe by default. The actual instance of the type T is created the first time any code attempts to access the Value member. The details of the object creation depend on the threading attributes optionally specified via the Lazy<T> constructor. It should be clear that implications of the threading mode are only important for when the boxed value is actually initialized or accessed for the first time.

In the default case, an instance of the type T is obtained via reflection by placing a call to Activator.CreateInstance. Here’s a quick example of the typical interaction with the Lazy<T> type:

var temp = new Lazy<DataContainer>();
Console.WriteLine(temp.IsValueCreated);
Console.WriteLine(temp.Value.SomeValue);

Note that you are not strictly required to check IsValueCreated before invoking Value. You typically resort to checking the value of IsValueCreated only if—for whatever reason—you need to know whether a value is currently associated with the Lazy type. There’s no need for you to check IsValueCreated to avoid a null reference exception on Value. The following code works just fine:

var temp = new Lazy<DataContainer>();
Console.WriteLine(temp.Value.SomeValue);

The getter of the Value property checks whether a boxed value already exists; if not, it triggers the logic that creates an instance of the wrapped type and returns that.
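The following sketch makes the timing visible. TrackedContainer is a hypothetical stand-in for DataContainer that counts how many times its constructor runs; everything else uses only the Value and IsValueCreated members described above.

```csharp
using System;

// Hypothetical stand-in for DataContainer that records constructor calls.
public class TrackedContainer
{
    public static int InstancesCreated;

    public TrackedContainer()
    {
        InstancesCreated++;
    }

    public string SomeValue
    {
        get { return "some value"; }
    }
}

public static class LazyValueDemo
{
    public static void Main()
    {
        var temp = new Lazy<TrackedContainer>();

        // Creating the wrapper doesn't create the wrapped object.
        Console.WriteLine(temp.IsValueCreated);                 // False
        Console.WriteLine(TrackedContainer.InstancesCreated);   // 0

        // The first read of Value runs the constructor of the wrapped type.
        Console.WriteLine(temp.Value.SomeValue);                // some value
        Console.WriteLine(temp.IsValueCreated);                 // True

        // Later reads return the same instance; no new object is built.
        Console.WriteLine(object.ReferenceEquals(temp.Value, temp.Value)); // True
        Console.WriteLine(TrackedContainer.InstancesCreated);   // 1
    }
}
```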

The Process of Instantiation

Of course, if the constructor of the wrapped type (DataContainer in the previous example) throws an exception, your code is responsible for handling that exception. The exception you catch is of type TargetInvocationException, the typical exception you get when .NET reflection fails to create an instance of a type indirectly.

The Lazy<T> wrapper logic simply ensures that an instance of type T is created; in no way does it also guarantee that you won’t get a null reference exception as you access any of the public members on T. For example, consider the following code snippet:

public class DataContainer
{
  public DataContainer()
  {
  }
  public IList<String> SomeValues { get; set; }
}

Imagine now that you attempt to call the following code from a client program:

var temp = new Lazy<DataContainer>();
Console.WriteLine(temp.Value.SomeValues.Count);

In this case, you’ll get an exception because the SomeValues property of the DataContainer object is null, not because the DataContainer is null itself. The exception is raised because the DataContainer’s constructor doesn’t properly initialize all of its members; the error has nothing to do with the implementation of the lazy approach.

The Value property of Lazy<T> is a read-only property, meaning that once initialized, a Lazy<T> object always returns the same instance of the type T or the same value if T is a value type. You can’t modify the instance but you can access any public properties the instance may have.

Here’s how you can configure a Lazy<T> object to pass ad hoc parameters to the T type:

var orders = new Lazy<Orders>(() => new Orders(10));

One of the Lazy<T> constructors takes a delegate through which you can specify any action required to produce proper input data for the T constructor. The delegate isn’t run until the Value property of the Lazy<T> wrapper is accessed for the first time.
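Here’s a small sketch of that deferred execution. OrderBatch is a hypothetical type whose constructor needs an argument, and the FactoryCalls counter exists only to show when the delegate actually runs.

```csharp
using System;

// Hypothetical type with a constructor that requires an argument.
public class OrderBatch
{
    public int Size { get; private set; }

    public OrderBatch(int size)
    {
        Size = size;
    }
}

public static class LazyFactoryDemo
{
    public static int FactoryCalls;   // counts runs of the initialization delegate

    public static Lazy<OrderBatch> Create(int size)
    {
        // The lambda captures 'size' but doesn't execute here.
        return new Lazy<OrderBatch>(() =>
        {
            FactoryCalls++;
            return new OrderBatch(size);
        });
    }

    public static void Main()
    {
        int before = FactoryCalls;
        var batch = Create(10);

        Console.WriteLine(FactoryCalls - before);  // 0: delegate hasn't run yet
        Console.WriteLine(batch.Value.Size);       // 10: first access runs the delegate
        Console.WriteLine(batch.Value.Size);       // 10: cached instance
        Console.WriteLine(FactoryCalls - before);  // 1: delegate ran exactly once
    }
}
```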

Thread-Safe Initialization

By default, Lazy<T> is thread-safe, meaning that multiple threads can access an object and all threads will receive the same instance of the T type. Let’s look at aspects of threading that are important only for the first access to a Lazy object.

The first thread to hit the Lazy<T> object will trigger the instantiation process for type T. All following threads that gain access to Value receive the response generated by the first, whatever that is. In other words, if the initialization delegate you passed to the constructor throws an exception on the first thread, then all subsequent calls, regardless of the thread, will receive the same exception. By design, different threads can’t get different responses from the same instance of Lazy<T>. This is the behavior you get with the default threading mode of Lazy<T>. One caveat: when you supply no delegate and Lazy<T> invokes the default constructor of T itself, a failure in that constructor isn’t cached, so a later access to Value tries to create the instance again.
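The sketch below illustrates exception caching in the default threading mode when an initialization delegate is supplied. The failing delegate and the TimeoutException are arbitrary choices for the demo; the point is only to count how many times the delegate runs across two reads of Value.

```csharp
using System;

public static class CachedExceptionDemo
{
    // Reads Value twice from a Lazy<string> whose delegate always throws
    // and returns how many times the delegate actually ran.
    public static int CountFactoryRuns()
    {
        int runs = 0;
        var lazy = new Lazy<string>(() =>
        {
            runs++;
            throw new TimeoutException("data source unavailable");
        });

        for (int i = 0; i < 2; i++)
        {
            try
            {
                Console.WriteLine(lazy.Value);
            }
            catch (TimeoutException ex)
            {
                // Both reads end up here with the same cached exception.
                Console.WriteLine("Failed: " + ex.Message);
            }
        }

        return runs;
    }

    public static void Main()
    {
        // Prints "Failed: data source unavailable" twice, then "Delegate runs: 1".
        Console.WriteLine("Delegate runs: " + CountFactoryRuns());
    }
}
```

The second read never re-executes the delegate; it simply rethrows what the first attempt produced.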

The Lazy<T> class, however, also features an additional constructor:

public Lazy(bool isThreadSafe)

The Boolean argument indicates whether or not you want thread safety. As mentioned, the default value is true, which will offer the aforementioned behavior.

If you pass false instead, the Value property is expected to be accessed from just one thread: the one that initializes the Lazy type. The behavior is undefined if multiple threads attempt to access the Value property.

The Lazy<T> constructor that accepts a Boolean value is a special case of a more general signature where you pass the Lazy<T> constructor a value from the LazyThreadSafetyMode enumeration. Figure 1 explains the role of each value in the enumeration.

Figure 1 The LazyThreadSafetyMode Enumeration

Value Description
None The Lazy<T> instance isn’t thread-safe and its behavior is undefined if it’s accessed from multiple threads.
PublicationOnly Multiple threads are allowed to concurrently try to initialize the Lazy type. The first thread to complete wins and the results generated by all others are discarded.
ExecutionAndPublication Locks are used to ensure that only a single thread can initialize a Lazy<T> instance in a thread-safe manner.

You can set the PublicationOnly mode using either of the following constructors:

public Lazy(LazyThreadSafetyMode mode)
public Lazy(Func<T> valueFactory, LazyThreadSafetyMode mode)

The values in Figure 1 other than PublicationOnly are implicitly set when you use the constructor that accepts a Boolean value:

public Lazy(bool isThreadSafe)

In that constructor, if the argument isThreadSafe is false, then the selected threading mode is None. If the argument isThreadSafe is set to true, then the threading mode is set to ExecutionAndPublication. ExecutionAndPublication is also the working mode when you choose the default constructor.

The PublicationOnly mode falls somewhere in between the full thread safety guaranteed by ExecutionAndPublication and the lack thereof you get with None. PublicationOnly allows concurrent threads to try to create the instance of the type T but ensures that only one thread is the winner. The T instance created by the winner is then shared among all other threads regardless of the instance that each may have computed.

There’s an interesting difference between PublicationOnly and the other two modes regarding a possible exception thrown during the initialization. When PublicationOnly is set, an exception generated during the initialization isn’t cached; consequently, each thread that attempts to read Value will have a chance to re-initialize if an instance of T isn’t available. Another difference is that no exception is thrown in PublicationOnly mode if the constructor of T attempts to recursively access Value. That situation raises an InvalidOperationException when the Lazy<T> class works in None or ExecutionAndPublication mode.
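To see the lack of exception caching in PublicationOnly mode, consider this single-threaded sketch. The delegate is contrived to fail on its first run and succeed on the second; the method name and the messages are made up for the demo.

```csharp
using System;
using System.Threading;

public static class PublicationOnlyDemo
{
    // Reads Value twice. The first read fails, the second succeeds because
    // PublicationOnly doesn't cache the exception and runs the delegate again.
    public static string ReadAfterOneFailure(out int factoryRuns)
    {
        int attempts = 0;
        var lazy = new Lazy<string>(() =>
        {
            attempts++;
            if (attempts == 1)
            {
                throw new TimeoutException("first attempt fails");
            }
            return "loaded on attempt " + attempts;
        }, LazyThreadSafetyMode.PublicationOnly);

        try
        {
            Console.WriteLine(lazy.Value);
        }
        catch (TimeoutException ex)
        {
            Console.WriteLine("Failed: " + ex.Message);
        }

        string value = lazy.Value;   // the delegate runs again and succeeds
        factoryRuns = attempts;
        return value;
    }

    public static void Main()
    {
        int runs;
        Console.WriteLine(ReadAfterOneFailure(out runs));  // loaded on attempt 2
        Console.WriteLine(runs);                           // 2
    }
}
```

Swap the mode for ExecutionAndPublication and the second read would rethrow the original exception instead of retrying.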

Dropping thread safety gives you a raw performance benefit, but you need to be careful to prevent nasty bugs and race conditions. Thus, I recommend you use the option LazyThreadSafetyMode.None only when performance is extremely critical.

If you use LazyThreadSafetyMode.None, it remains your responsibility to ensure the Lazy<T> instance will never be initialized from more than one thread. Otherwise, you may incur unpredictable results. If an exception is thrown during the initialization, the same exception is cached and raised for each subsequent access to Value within the same thread.

ThreadLocal Initialization

By design, Lazy<T> doesn’t let different threads manage their own personal instance of type T. However, if you want to allow that behavior, you must opt for a different class: the ThreadLocal<T> type. Here’s how you use it:

var counter = new ThreadLocal<Int32>(() => 1);

The constructor takes a delegate and uses it to initialize the thread-local variable. Each thread holds its own data that’s completely out of reach of other threads. Unlike Lazy<T>, the Value property on ThreadLocal<T> is read-write. Each access is therefore independent from the next and may produce different results, including throwing (or not) an exception. If you don’t provide an initialization delegate via the ThreadLocal<T> constructor, the embedded object is initialized using the default value for the type (null if T is a class).
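The following sketch shows two threads working with the same ThreadLocal<int> variable without seeing each other’s changes. The values 10 and 99 are arbitrary, and the Run helper is a hypothetical name used to keep the demo testable.

```csharp
using System;
using System.Threading;

public static class ThreadLocalDemo
{
    // Returns { value first seen by the worker thread, value seen by the caller afterward }.
    public static int[] Run()
    {
        var counter = new ThreadLocal<int>(() => 1);
        counter.Value = 10;               // changes only the calling thread's copy

        int seenByWorker = 0;
        var worker = new Thread(() =>
        {
            // The worker gets its own copy, initialized by the delegate.
            seenByWorker = counter.Value;
            counter.Value = 99;           // invisible to the calling thread
        });
        worker.Start();
        worker.Join();

        int seenByCaller = counter.Value; // unaffected by the worker's write
        counter.Dispose();
        return new[] { seenByWorker, seenByCaller };
    }

    public static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0]);  // 1
        Console.WriteLine(result[1]);  // 10
    }
}
```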

Implementing Lazy Properties

Most of the time, you use Lazy for properties within your own classes, but which classes, exactly? ORM tools offer lazy loading on their own, so if you’re using these tools, the data access layer probably isn’t the segment of the application where you’ll find likely candidate classes to host lazy properties. If you aren’t using ORM tools, the data access layer is definitely a good fit for lazy properties.

Segments of the application where you use dependency injection might be another good fit for laziness. In the .NET Framework 4, the Managed Extensibility Framework (MEF) implements extensibility and inversion of control using Lazy<T>. Even if you’re not using the MEF directly, management of dependencies is a great fit for lazy properties.

Implementing a lazy property within a class doesn’t require any rocket science, as Figure 2 demonstrates.

Figure 2 Example of a Lazy Property

public class Customer
{
   private readonly Lazy<IList<Order>> orders;
   public Customer(String id)
   {
      orders = new Lazy<IList<Order>>( () =>
      {
         return new List<Order>();
      }
      );
   }
   public IList<Order> Orders
   {
      get
      {
         // Orders is created on first access
         return orders.Value;
      }
   }
}

Filling a Hole

Wrapping up, lazy loading is an abstract concept that refers to loading data only when it’s really needed. Until the .NET Framework 4, developers needed to take care of developing lazy initialization logic themselves. The Lazy<T> class extends the .NET Framework programming toolkit and gives you a great chance to avoid wasteful computation by instantiating your expensive objects only when strictly needed and just a moment before their use begins.


Dino Esposito is the author of Programming ASP.NET MVC from Microsoft Press and has coauthored Microsoft .NET: Architecting Applications for the Enterprise (Microsoft Press, 2008). Based in Italy, Esposito is a frequent speaker at industry events worldwide. You can follow his blog at weblogs.asp.net/despos.

Thanks to the following technical expert for reviewing this article: Greg Paperin