June 2010

Volume 25 Number 06

SOA Tips - Address Scalability Bottlenecks with Distributed Caching

By Iqbal Khan | June 2010

After the explosion of Web applications to accommodate high-traffic usage, the next big wave has become service-oriented architecture (SOA). SOA is destined to become a standard way for developing extremely scalable applications, and cloud computing platforms like Windows Azure represent a giant leap in moving SOA toward achieving this goal.

SOA allows users to distribute applications to multiple locations, multiple departments within an organization, and multiple businesses across the Internet. Plus, it permits reuse of existing code within an organization and, more importantly, collaboration among different business units.

A SOA application is usually deployed in a server farm in a load-balanced environment. The goal is to allow the application to handle as much load as you throw at it. The question thus becomes: What are some of the considerations you should have in mind for improving both performance and scalability of your SOA application?

Although SOA, by design, is intended to provide scalability, there are many issues you must address before you can truly achieve scalability. Some of these issues involve how you code your SOA application, but the most important bottlenecks often relate to how you store and access your data. I’ll explore those issues and provide some solutions in this article.

Find Scalability Bottlenecks

A true SOA application should scale easily as far as the application architecture is concerned. A SOA application has two components: service components and client applications. The client application may be a Web application, another service or any other application that relies on the SOA service components to do its job.

One of the key ideas behind SOA is to break up the application into small chunks so these components can be run on multiple servers as separate services.

Ideally, these services should be stateless as much as possible. Stateless means they don’t retain any data with them across multiple calls, allowing you to run the services on multiple computers. There’s no dependence on where the data was the last time, so there’s no data being kept on any particular server across multiple service calls.

As a result, the architecture of SOA applications is inherently scalable. It can easily grow onto multiple servers and across datacenters. However, as with every other application, SOA applications do have to deal with the data, and that can be a problem. This data access becomes the scalability bottleneck. Bottlenecks typically involve the application data, which is stored in some database, usually a relational database. If the SOA application is using session data, the storage of that data is also another potential scalability bottleneck.

One SOA application relying on other SOA applications is another likely area of poor performance and scalability. Say your application calls one service to do its job, but that service calls out to other services. Those services may be on the same intranet or across the WAN in other locations. Such a data trip can be costly. You can’t scale the application effectively if you’re making those calls over and over again, and these are areas where scalability bottlenecks occur, as shown in Figure 1.

Figure 1 SOA Architecture with Potential Scalability Bottlenecks
Figure 1 SOA Architecture with Potential Scalability Bottlenecks

Code for Performance

There are a number of programming techniques that can help improve your SOA application performance.

One thing you can do is design your application to use “chunky” Web method calls. Don’t make frequent calls between the SOA client application and the SOA service layer. There’s usually a great distance between those because they’re not running on the same computer or even in the same datacenter. The fewer calls you make from the client application to the service layers, the better the performance. Chunky calls do more work in one call than multiple calls to do the same work.

Another useful technique is to employ the asynchronous Web method calls supported by the Microsoft .NET Framework. This allows your SOA client application to continue doing other things while the Web method of the service layer is being called and is executing.

The cost of serialization is another aspect to factor in so you don’t serialize any unnecessary data. You should only send data that is required back and forth, allowing you to be highly selective about the type of serialization you want to perform.

Choose the Right Communication Protocol

For SOA applications developed in Windows Communication Foundation (WCF), there are three different protocols that let SOA clients talk to SOA services. These are HTTP, TCP and named pipes.

If both your client and your service are developed in WCF and are running on the same machine, named pipes offer the best performance. A named pipe uses shared memory between client and server processes.

TCP is good if both SOA client and server are developed in WCF, but are running on different computers in the same intranet. TCP is faster than HTTP, but a TCP connection stays open across multiple calls and therefore you can’t automatically route each WCF call to a different server. By employing the NetTcpBinding option that uses connection pools, you can expire TCP connections frequently to restart them so they get routed to a different server, thereby giving you a form of load balancing.

Please note that TCP can’t work reliably across the WAN because socket connections tend to break frequently. If your SOA client and service are not based on WCF or they’re hosted in different locations, then HTTP is your best option. Although HTTP is not as fast as TCP, it offers great scalability due to load balancing.

Use Caching to Improve Client Performance

Thoughtful use of caching can really improve SOA client performance. When a SOA client makes a Web method call to the service layer, you can cache the results at the client application’s end. Then, the next time this SOA client needs to make the same Web method call, it gets that data from the cache instead.

By caching data at the client end, the SOA client application reduces the number of calls it’s going to make to the service layer. This step boosts performance because it didn’t have to make an expensive SOA service call. It also reduces overall pressure on the service layer and improves scalability. Figure 2 shows a WCF client using caching.

Figure 2 WCF Client Caching

using System;
using Client.EmployeeServiceReference;

using Alachisoft.NCache.Web.Caching;

namespace Client {
  class Program {
    static string _sCacheName = "mySOAClientCache";
    static Cache _sCache = NCache.InitializeCache(_sCacheName);

    static void Main(string[] args) {
      EmployeeServiceClient client = 
        new EmployeeServiceClient("WSHttpBinding_IEmployeeService");

      string employeeId = "1000";
      string key = "Employee:EmployeeId:" + employeeId;
      // first check the cache for this employee
      Employee emp = _sCache.Get(key);

      // if cache doesn't have it then make WCF call
      if (emp == null) {
        emp = client.Load("1000");

        // Now add it to the cache for next time
       _sCache.Insert(key, emp);

In many situations, your client is physically removed from the service layer and is running across the WAN. In that case, you have no way of knowing whether the data you have cached has been updated. Therefore, you have to identify only those data elements for caching that you feel will not change for at least a few minutes to perhaps a few hours, depending on your application. You can then specify expiration for these data elements in the cache so the cache will automatically remove them at that time. This helps ensure that cached data is always fresh and correct.

Distributed Caching for Service Scalability

The real scalability gains through caching are found in the SOA service layer. Scalability bottlenecks are not always removed despite many of the programming techniques mentioned already because the major scalability bottlenecks are with data storage and access. Services often live in a load-balanced server farm, allowing the service itself to scale quite nicely—except the data storage can’t scale in the same manner. Data storage thus becomes the SOA bottleneck.

You can scale the service layer by adding more servers to the server farm, increasing the computing capacity through these additional application servers. But all those SOA transactions still deal with some data. That data has to be stored somewhere, and that data storage can easily become the bottleneck.

This data storage barrier to scalability can be improved at multiple levels. SOA services deal with two types of data. One is session-state data and the other is application data that resides in the database (see Figure 3). Both cause scalability bottlenecks.

Figure 3 How Distributed Caching Reduces Pressure on a Database
Figure 3 How Distributed Caching Reduces Pressure on a Database

Storing Session State in a Distributed Cache

One of the limitations of the default session-state storage is that it does not support Web farms because it is in-memory storage living inside the WCF service process. A much better alternative is to use ASP.NET compatibility mode and the ASP.NET session state in WCF services. This allows you to specify OutProc storage including StateServer, SqlServer, or a distributed cache as session state storage.

Enabling ASP.NET compatibility mode is a two-step process. First, you have to specify ASP.NET compatibility in your class definition, as shown in Figure 4. Then you have to specify this in your app.config file, as shown in Figure 5. Notice that Figure 4 also demonstrates how to specify a distributed cache as your SessionState storage in the same web.config file.

Figure 4 Specifying ASP.NET Compatibility for WCF Services in Code

using System;
using System.ServiceModel;
using System.ServiceModel.Activation;

namespace MyWcfServiceLibrary {
  public interface IHelloWorldService {
    string HelloWorld(string greeting);

  [ServiceBehavior (InstanceContextMode = 
  [AspNetCompatibilityRequirements (RequirementsMode = 

  public class HelloWorldService : IHelloWorldService {
    public string HelloWorld(string greeting) {
      return string.Format("HelloWorld: {0}", greeting);

Figure 5 Specifying ASP.NET Compatibility for WCF Services in Config

<?xml version="1.0" encoding="utf-8"?>
    <sessionState cookieless="UseCookies"
        <add name="DistCacheSessionProvider" 
    <identity impersonate="true"/>

    <!-- ... -->

StateServer and SqlServer session storage options do not scale well and, in the case of StateServer, it is also a single point of failure. A distributed cache is a much better alternative because it scales nicely and replicates sessions to multiple servers for reliability.

Caching Application Data

Application data is by far the heaviest data usage in a WCF service, and its storage and access is a major scalability bottleneck. To address this scalability-bottleneck problem, you can use distributed caching in your SOA service-layer implementation. A distributed cache is used to cache only a subset of the data that is in the database based on what the WCF service needs in a small window of a few hours.

Additionally, a distributed cache gives a SOA application a significant scalability boost because this cache can scale out as a result of the architecture it employs. It keeps things distributed across multiple servers—and still gives your SOA application one logical view so you think it’s just one cache. But the cache actually lives on multiple servers and that’s what allows the cache to really scale. If you use distributed caching in between the service layer and the database, you’ll improve performance and scalability of the service layer dramatically.

The basic logic to implement is that, before going to the database, check to see if the cache already has the data. If it does, take it from the cache. Otherwise, go to the database to fetch the data and put it in the cache for next time. Figure 6 shows an example.

Figure 6 WCF Service Using Caching

using System.ServiceModel;
using Vendor.DistCache.Web.Caching;

namespace MyWcfServiceLibrary {
  public class EmployeeService : IEmployeeService {
    static string _sCacheName = "myServiceCache";
    static Cache _sCache = 

    public Employee Load(string employeeId) {
      // Create a key to lookup in the cache.
      // The key for will be like "Employees:PK:1000".
      string key = "Employee:EmployeeId:" + employeeId;

      Employee employee = (Employee)_sCache[key];
      if (employee == null) {
        // item not found in the cache. 
        // Therefore, load from database.

        // Now, add to cache for future reference.
       _sCache.Insert(key, employee, null,

      // Return a copy of the object since 
      // ASP.NET Cache is InProc.
      return employee;

By caching application data, your WCF service saves a lot of expensive database trips and instead finds the frequently used transactional data in a nearby in-memory cache.

Expiring Cached Data

Expirations let you specify how long data should stay in the cache before the cache automatically removes it. There are two types of expirations you can specify: absolute-time expiration and sliding- or idle-time expiration.

If the data in your cache also exists in the database, you know that this data can be changed in the database by other users or applications that may not have access to your cache. When that happens, the data in your cache becomes stale, which you do not want. If you’re able to make a guess as to how long you think it’s safe for this data to be kept in the cache, you can specify absolute-time expiration. You can say something like “expire this item 10 minutes from now” or “expire this item at midnight today.” At that time, the cache expires this item:

using Vendor.DistCache.Web.Caching;
// Add an item to ASP.NET Cache with absolute expiration
_sCache.Insert(key, employee, null, 
  CacheItemPriority.Default, null);

You can also use idle-time or sliding-time expiration to expire an item if nobody uses it for a given period. You can specify something like “expire this item if nobody reads or updates it for 10 minutes.” This is useful when your application needs the data temporarily and when your application is done using it, you want the cache to automatically expire it. ASP.NET compatibility-mode session state is a good example of idle-time expiration.

Notice that absolute-time expiration helps you avoid situations where the cache has an older or stale copy of the data than the master copy in the database. On the other hand, idle-time expiration serves a totally different purpose. It’s meant really to simply clean up the cache once your application no longer needs the data. Instead of having your application keep track of this clean up, you let the cache take care of it.

Managing Data Relationships in the Cache

Most data comes from a relational database, and even if it’s not coming from a relational database, it’s relational in nature. For example, you’re trying to cache a customer object and an order object and both objects are related. A customer can have multiple orders.

When you have these relationships, you need to be able to handle them in a cache. That means the cache should know about the relationship between a customer and an order. If you update or remove the customer from the cache, you may want the cache to automatically remove the order object from the cache. This helps maintain data integrity in many situations.

If a cache can’t keep track of these relationships, you’ll have to do it yourself—and that makes your application more cumbersome and complex. It’s a lot easier if you just tell the cache when you add the data about this relationship. The cache then knows if that customer is ever updated or removed, the order also has to be removed.

ASP.NET has a useful feature called CacheDependency that allows you to keep track of relationships between different cached items. Some commercial caches also have this feature. Here’s an example of how ASP.NET lets you keep track of relationships among cached items:

using Vendor.DistCache.Web.Caching;
public void CreateKeyDependency() {
  Cache["key1"] = "Value 1";

  // Make key2 dependent on key1.
  String[] dependencyKey = new String[1];
  dependencyKey[0] = "key1";
  CacheDependency dep1 = 
    new CacheDependency(null, dependencyKey);

  _sCache.Insert("key2", "Value 2", dep2);

This is multi-layer dependency, meaning A can depend on B and B can depend on C. So, if your application updates C, both A and B have to be removed from the cache.

Synchronizing the Cache with a Database

The need for database synchronization arises because the database is really being shared across multiple applications, and not all of those applications have access to your cache. If your WCF service application is the only one updating the database and it can also easily update the cache, you probably don’t need the database-synchronization capability.

But, in a real-life environment, that’s not always the case. Third-party applications update data in the database and your cache becomes inconsistent with the database. Synchronizing your cache with the database ensures that the cache is always aware of these database changes and can update itself accordingly.

Synchronizing with the database usually means invalidating the related cached item from the cache so the next time your application needs it, it will have to fetch it from the database because the cache doesn’t have it.

ASP.NET has a SqlCacheDependency feature that allows you to synchronize the cache with SQL Server 2005, SQL Server 2008 or Oracle 10g R2 and later—basically any database that supports the CLR. Some of the commercial caches also provide this capability. Figure 7 shows an example of using SQL dependency to synchronize with a relational database.

Figure 7 Synchronizing Data via SQL Dependency

using Vendor.DistCache.Web.Caching;
using System.Data.SqlClient;

public void CreateSqlDependency(
  Customers cust, SqlConnection conn) {

  // Make cust dependent on a corresponding row in the
  // Customers table in Northwind database

  string sql = "SELECT CustomerID FROM Customers WHERE ";
  sql += "CustomerID = @ID";
  SqlCommand cmd = new SqlCommand(sql, conn);
  cmd.Parameters.Add("@ID", System.Data.SqlDbType.VarChar);
  cmd.Parameters["@ID"].Value = cust.CustomerID;

  SqlCacheDependency dep = new SqlCacheDependency(cmd);
  string key = "Customers:CustomerID:" + cust.CustomerID;
_  sCache.Insert(key, cust, dep);

One capability that ASP.NET does not provide, but some commercial solutions do, is polling-based database synchronization. This is handy if your DBMS doesn’t support the CLR and you can’t benefit from SqlCacheDependency. In that case, it would be nice if your cache could poll your database at configurable intervals and detect changes in certain rows in a table. If those rows have changed, your cache invalidates their corresponding cached items.

Enterprise Service Bus for SOA Scalability

Enterprise Service Bus (ESB) is an industry concept where many technologies are used to build it. An ESB is an infrastructure for Web services that mediates communication among components. Put plainly, an ESB is a simple and powerful way for multiple applications to share data asynchronously. It is not meant to be used across organizations or even across a WAN, however. Usually SOA applications are by design broken up into multiple pieces, so when they need to share data with each other, ESB is a powerful tool.

There are many ways to build an ESB. Figure 8 shows an example of an ESB created with a distributed cache. Multiple loosely coupled applications or service components can use it to share data with each other in real time and across the network.

Figure 8 An ESB Created with a Distributed Cache

A distributed cache by its nature spans multiple computers. This makes it highly scalable, which meets the first criterion of an ESB. In addition, a good distributed cache replicates all its data intelligently to ensure that no data loss occurs if any cache server goes down. (I’ll discuss this later.) Finally, a good distributed cache provides intelligent event-propagation mechanisms.

There are two types of events that a distributed cache must provide to be fit for an ESB. First, any client application of the ESB should be able to register interest in a data element on the ESB so if anybody modifies or deletes it, the client application is notified immediately. Second, the cache should allow client applications to fire custom events into the ESB so all other applications connected to the ESB that are interested in this custom event are immediately notified, no matter where they are on the network (of course, within the intranet).

With the help of an ESB, a lot of data exchange that would otherwise require SOA calls from one application to another can be done very easily through the ESB. Additionally, asynchronous data sharing is something a simple WCF service is not designed to do easily. But the ESB makes this job seamless. You can easily create situations where data is even pushed to the clients of the ESB if they have shown interest in it up front.

Cache Scalability and High Availability

Caching topology is a term used to indicate how data is actually stored in a distributed cache. There are various caching topologies that are designed to fit different environments. I’ll discuss three here: partitioned cache, partitioned-replicated cache and replicated cache.

Partitioned and partitioned-replicated are two caching topologies that play major roles in the scalability scenario. In both topologies, the cache is broken up into partitions, then each partition stored in different cache servers in the cluster. Partitioned-replicated cache has a replica of each partition stored on a different cache server.

Partitioned and partitioned-replicated caches are the most scalable topology for transactional data caching (where writes to the cache are as frequent as reads) because, as you add more cache servers to the cluster, you’re not only increasing the transaction capacity, you’re also increasing the storage capacity of the cache because all those partitions together form the entire cache.

A third caching topology, replicated cache, copies the entire cache to each cache server in the cache cluster. This means the replicated cache provides high availability and is good for read-intensive usage. It is not good for frequent updates, however, because updates are done to all copies synchronously and are not as fast as with other caching topologies.

As shown in Figure 9, partitioned-replicated cache topology is ideal for a combination of scalability and high availability. You don’t lose any data because of the replicas of each partition.

Figure 9 Partitioned-Replicated Caching Topology for Scalability

High availability can be further enhanced through dynamic cache clustering, which is the ability to add or remove cache servers from the cache cluster at run time without stopping the cache or the client applications. Because a distributed cache runs in a production environment, high availability is an important feature requirement.

Next Steps

As you’ve seen, an SOA application can’t scale effectively when the data it uses is kept in a storage that is not scalable for frequent transactions. This is where distributed caching really helps.

Distributed caching is a new concept but rapidly gaining acceptance among .NET developers as a best practice for any high-transaction application. The traditional database servers are also improving but without distributed caching, they can’t meet the exploding demand for scalability in today’s applications.

The techniques I’ve described should help take your SOA apps to new levels of scalability. Give them a try today. For more discussion on distributed caching, see the MSDN Library article by J.D. Meier, Srinath Vasireddy, Ashish Babbar, and Alex Mackman at msdn.microsoft.com/library/ms998562.          

Iqbal Khan  is president and technology evangelist at Alachisoft. Alachisoft provides NCache, an industry-leading .NET distributed cache for boosting performance and scalability in enterprise applications. Khan has a master’s in computer science from Indiana University, Bloomington. You can reach him at iqbal@alachisoft.com.

Thanks to the following technical experts for reviewing this article:  Kirill Gavrylyuk and Stefan Schackow