June 2009

Volume 24 Number 06

Velocity - Build Better Data-Driven Apps With Distributed Caching

By Aaron Dunnington | June 2009

This article is based on a prerelease version of Microsoft Velocity. All information is subject to change.

This article discusses:

  • Getting started with Velocity
  • Classifying data for distributed caching
  • A basic Velocity client application
  • Integrating with ASP.NET session state

This article uses the following technologies:
Microsoft Velocity, ASP.NET

Contents

Current Data-Centric Applications
How Velocity Helps
Data Classification
Logical Hierarchy
Partitioned and Local Cache
Configuration Store
Getting Started with Velocity
Cluster Deployment
A Basic Velocity Client Application
Designing the Cache
Named Caches
Organizing the Cache
Organizing the Reports
Organizing the Session
Programming a Cache Layer
Integrating with ASP.NET Session State
Going Forward

Data-driven applications have become pervasive. Data flows from a myriad of sources including relational sources, service-oriented applications, syndicated feeds, and data-centric documents and messages. Application architectures continue to evolve to leverage this wealth of data accessibility. Additionally, the underlying hardware trends and technological advances across processing, storage, memory, and connectivity have enabled application architectures to harness inexpensive, commodity hardware to effectively scale out these applications.

A new Microsoft project, code-named Velocity, provides a distributed, in-memory cache. Velocity enables you to create scalable, available, high-performance applications by exposing a unified view of distributed memory for client application consumption. Through Velocity, client applications can enhance performance by bringing data closer to the logic that consumes it, thus reducing pressure on the underlying data store. The Velocity cluster offers high availability to further insulate applications from data loss as well as increased load on the data tier. By leveraging distributed memory, Velocity enables these high-performance, available applications to flexibly scale out as application demand increases.

This article focuses on how key features of Velocity can be leveraged to deliver the next level of scalability, availability, and performance in new and existing distributed .NET applications.

Current Data-Centric Applications

Your typical current data-powered application could be something along the lines of a simple storefront over a database. This application enables users to browse a product catalog and purchase products through a shopping cart experience. Additionally, aggregated order data is utilized to display top-selling products as well as products commonly purchased together. This application architecture takes on the familiar separation of user interface, business logic, and data access as shown in Figure 1. This architecture, unfortunately, faces several key challenges in the areas of scalability, performance, and availability.

Figure 1 Typical Data-based Web App Architecture

First, there is a major scalability challenge inherent in that the application performs database operations upon every page load in the form of retrieval operations for catalog and top-seller data. As application load increases, this heavy frequency in database operations can pose severe scalability limitations and performance bottlenecks due to an increased contention for resources and the physical limitations of retrieving data from disk. Further, this architecture can dramatically impact the latency of the application as each page load results in operations that travel completely through to the data tier.

If the application utilizes traditional caching approaches in the Web tier to reduce constant pressure on the database, a second key scalability challenge that arises is rooted in coupling of state to specific server resources. Without decoupling cache state from specific servers in order to scale, the application must employ techniques such as sticky routing to tie user requests to respective server nodes. This can potentially lead to an uneven distribution of processing if particular areas of user requests result in a disproportionally greater amount of activity.

Lastly, the Web tier nodes responsible for caching do not deliver high availability to the overarching application as they are not acting in concert through an availability substrate.

How Velocity Helps

As mentioned earlier, Velocity is a distributed, in-memory cache. More specifically, Velocity exposes a unified cache tier to client applications by fusing together memory across servers. The Velocity architecture consists of a ring of cache servers running the Velocity Windows service as well as client applications that utilize the Velocity client library to communicate with the unified cache view (see Figure 2).

Figure 2 Velocity Architecture

Client applications can access the Velocity cache tier to store and retrieve any serializable CLR object through simple put and get operations as shown here:

CacheFactory factory = new CacheFactory();
Cache cache = factory.GetCache("catalog");
cache.Put("Product-43", new Product("Coffee"));
Product p = (Product) cache.Get("Product-43");

Velocity enables the example Web app architecture to leverage a new cache tier as shown in Figure 3. By leveraging Velocity in the mid-tier, the application can achieve a new level of scalability, performance, and availability. Scalability improves because contention for database resources is minimized, and it becomes more flexible because the architecture no longer couples clients to specific server nodes on account of state. Performance improves because data is brought closer to the logic that consumes it, which improves response time and reduces latency. High availability is achieved by insulating the cluster with redundancy, which mitigates both data loss and spikes in load on the data tier in the event of a node failure.

Figure 3 Using the Velocity Cache Tier

By interweaving a caching tier as shown, the application can now explicitly leverage Velocity through various layers and patterns. For example, the business logic could consume Velocity directly in the cache-aside pattern. A read-through/write-behind layer could be introduced under the business layer, thus allowing the consuming business logic to transparently communicate with Velocity. The Web tier can transparently leverage Velocity for session state management through the Velocity ASP.NET session store provider integration.
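As a rough illustration of the cache-aside pattern mentioned above, the following sketch checks a cache first, falls through to the data tier on a miss, and then populates the cache. It is not Velocity code: InMemoryCache is a hypothetical stand-in that only mirrors the Get and Put operations shown earlier, and ProductService and its loadFromDatabase delegate are names invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a cache client. Only Get and Put mirror the
// operations shown in the article; everything else is illustrative.
public class InMemoryCache
{
    private readonly Dictionary<string, object> items = new Dictionary<string, object>();

    public object Get(string key)
    {
        object value;
        return items.TryGetValue(key, out value) ? value : null;
    }

    public void Put(string key, object value)
    {
        items[key] = value;
    }
}

// Hypothetical business-logic class that applies cache-aside.
public class ProductService
{
    private readonly InMemoryCache cache;
    private readonly Func<int, string> loadFromDatabase; // assumed data access call

    // Counts how often the data tier was actually hit.
    public int DatabaseReads { get; private set; }

    public ProductService(InMemoryCache cache, Func<int, string> loadFromDatabase)
    {
        this.cache = cache;
        this.loadFromDatabase = loadFromDatabase;
    }

    public string GetProductName(int productId)
    {
        string key = "Product-" + productId;

        // 1. Try the cache first.
        string cached = (string)cache.Get(key);
        if (cached != null)
        {
            return cached;
        }

        // 2. On a miss, fall through to the data tier.
        DatabaseReads++;
        string loaded = loadFromDatabase(productId);

        // 3. Populate the cache so later requests avoid the database.
        if (loaded != null)
        {
            cache.Put(key, loaded);
        }
        return loaded;
    }
}
```

In this arrangement the business logic owns the caching decision, which is what distinguishes cache-aside from a read-through layer that hides the cache behind the data access code.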

Data Classification

To fully take advantage of the Velocity feature set in your applications, it is important to understand the types of data commonly cached. These data types can be classified as reference data, activity data, and resource data.

Reference data refers to data that is primarily read-only, such as product catalog or user data. This data is written infrequently, perhaps once per day or week. However, the scale requirements on reference data demand a large number of read requests on these smaller pieces of data.

For example, in a storefront, as increasing loads of users browse the catalog, a tremendous number of product data store retrieval and list building operations are generated. As the product catalog reference data changes fairly infrequently and is highly reusable across clients, it is conducive to caching in that considerable pressure can be offloaded from the data tier by caching this data closer to the application.

Top-selling and commonly purchased-together product data can be described as reference data as well. As real-time requirements are not a necessity for these reports, the application would benefit greatly from eliminating the cost of analyzing orders upon every request to determine the current products associated with these groupings.

Activity data refers to data that is part of a business activity or transaction. The data described as part of the shopping cart experience in the storefront represents activity data. Activity data is exclusively read and written, and thus is usually easily partitioned. After the lifetime of the activity (in this case, upon completion of checkout), the activity data is usually retired from the cache and persisted to the underlying store for subsequent processing.

Currently, to maintain shopping cart state, the example application needs to either employ sticky routing to tie user sessions to specific server resources or persist state to the underlying data store. By caching this activity data in Velocity, you could eliminate the need for sticky routing and alleviate the underlying database from the responsibility of storing this primarily transient shopping cart data until the activity ultimately produces an order.

Resource data refers to data that is concurrently read and written, such as product inventory. During the ordering process, inventory levels may need to be frequently monitored to enforce levels or back-fill policies. However, as orders are processed, this data needs to be concurrently updated to account for the change in levels. Frequently, when dealing with resource data, to deliver performance and scale, coherence of data is relaxed. For instance, the ordering process may oversell items wherein separate processes such as back ordering could be triggered to reconcile inventory conflicts.

Before I get started integrating Velocity into the example application, it might be helpful to quickly review some core Velocity concepts.

Logical Hierarchy

The Velocity logical hierarchy consists of machines, hosts, named caches, regions, and cache items (see Figure 4). Machines can run multiple Velocity services, each service considered a cache host. Each cache host can run multiple named caches. These named caches can span across machines and are defined in configuration.

Figure 4 Logical Hierarchy of Velocity Elements

Each named cache stores a logical group of data, such as a product catalog. Named caches also set policies around availability, eviction, and expiration. Explicitly created regions—physically co-located data containers—may also exist within each named cache. As items are retrieved in bulk, regions become useful if the application consistently needs to address a group of cache items together. Additionally, these regions enhance usability by providing tag-based searching and enumeration of region cache items. Explicitly created regions are, however, an optional concept. Velocity will implicitly create default regions behind the scenes if no region is explicitly specified.

Lastly, within the implicitly and explicitly created regions are the cache items themselves, which are responsible for maintaining keys, objects, tags, timestamps, versions, and expiry data.

Partitioned and Local Cache

Velocity currently supports two caching types: partitioned and local. Partitioned caches distribute cache items across the hosts for particular named caches. Each overarching named cache can enable high availability, effectively causing Velocity to maintain a configurable number of backups for items within a named cache across machines in the Velocity tier.

Local caching runs on the client to speed up access to data retrieved from Velocity. When local caching is used in concert with the Velocity partitioned caches, as data is retrieved from the Velocity server ring, cache items are stored in object form within the process running the Velocity client. Further, as with server-side expiration policy, local caches can also each configure an expiration time for items.

Configuration Store

The Velocity configuration store consists of cache policies as well as the global partition map, which details the distribution of cache items across the cluster. Velocity uses the policies and global partition map defined within the configuration store to enforce the policies defined by named caches and to run the global partition manager, respectively.

The global partition manager is a component of Velocity that is responsible for conveying the landscape of cache items to the rest of the cluster. Clients ultimately utilize this information when communicating with the cluster to create routing tables that define correlations between data and Velocity service hosts. The global partition manager runs within one Velocity service host. If the host running the global partition manager goes down, another host within the Velocity cluster will assume responsibility by starting the global partition manager from the global partition map stored in the configuration store. Velocity enables the configuration store to be placed on a network share or stored within SQL Server.

Getting Started with Velocity

To get started with Velocity, let's begin with an overview of the Velocity installation process, the initial cluster deployment, and a basic Velocity client application.

Upon proceeding through the Velocity installer on a cache server, you are presented with the cache host configuration screen shown in Figure 5.

Figure 5 Velocity Cache Host Configuration

The cache host configuration screen captures the cluster and service settings required to run your Velocity host. First, this screen allows you to choose whether the configuration store is placed on a network share or stored within SQL Server. For this sample, I'll use the XML Based File option to store the Velocity configuration on a network share.

Next, the cluster is assigned a name as well as an approximate node size. The cluster size configuration is used to optimize performance within Velocity by tuning internal data structures according to your cluster size. It is possible to reset the cluster size at a later time as your cluster evolves.

Finally, the service configuration values section captures the service port that the current host runs on, the port the cluster itself runs on, as well as the maximum amount of memory this Velocity host is to consume.

Click the Test Connection button to validate the connectivity of the configuration store. Click Save & Close to finish the installation. If the machine is running a firewall, make sure to allow the Velocity service through by adding an exception for DistributedCache.exe, located in the Velocity installation directory.

To install additional cache hosts, repeat these installation steps on new Velocity machines. When installing additional hosts, the cluster related settings will be detected automatically from the configuration store on the network share used during the installation of the first cluster host. For this example, I installed a three-node cluster.

Cluster Deployment

Once Velocity is installed, it can be administered through Windows PowerShell. To start the Velocity command window, run the Velocity administration shortcut, which is placed on the desktop after installation, as administrator. To start the cache cluster, run the Start-CacheCluster command. The Get-CacheHost command will display the status of hosts across the cluster. Get-CacheStatistics will display cache item statistics for a particular named cache. For more details on Velocity administration through Windows PowerShell, please see the documentation included with the Velocity installation.

After joining cache hosts to the Velocity cluster, the XML policy configuration shown in Figure 6 is produced on the network share in the ClusterConfig.xml file.

Figure 6 Velocity XML Policy Configuration

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="dcache" type="System.Data.Caching.DCacheSection, CacheBaseLibrary, Version=1.0.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91" />
  </configSections>
  <dcache cluster="velsample" size="Small">
    <caches>
      <cache type="partitioned" consistency="strong" name="default">
        <policy>
          <eviction type="lru" />
          <expiration defaultTTL="10" isExpirable="true" />
        </policy>
      </cache>
    </caches>
    <hosts>
      <host clusterPort="22234" hostId="1916351620" size="2048" quorumHost="true" name="VELSAMPLE01" cacheHostName="DistributedCacheService" cachePort="22233" />
      <host clusterPort="22234" hostId="1249604550" size="2048" quorumHost="false" name="VELSAMPLE02" cacheHostName="DistributedCacheService" cachePort="22233" />
      <host clusterPort="22234" hostId="1016778186" size="2048" quorumHost="false" name="VELSAMPLE03" cacheHostName="DistributedCacheService" cachePort="22233" />
    </hosts>
    <advancedProperties>
      <partitionStoreConnectionSettings providerName="System.Data.SqlServerCe.3.5" connectionString="\\velsample01\velocity\ConfigStore.sdf" />
    </advancedProperties>
  </dcache>
</configuration>

The dcache configuration element represents the velsample cluster created during installation. This configuration section holds the policy configuration information that I set during the host installations, namely named caches, hosts, and configuration store settings.

In the caches element, Velocity automatically created a default, partitioned, named cache. This named cache configuration also holds the policy settings Velocity uses to enforce expiration and eviction settings that control how long cache items within this named cache should remain in memory as well as how Velocity removes these items once they expire. In this default configuration, the defaultTTL value indicates that cache items will remain in memory for 10 minutes. Subsequently, Velocity will use the Least Recently Used (LRU) algorithm to evict objects from memory as they expire.

In certain circumstances, it may be desirable to turn these expiration and eviction settings off, for example when the data set size is known to be relatively fixed, as with the product catalog in the example application. In this case, the eviction type would be set to none while the isExpirable setting would be set to false.

Next, in the hosts configuration element, each of the Velocity hosts specifies naming, size, and address information. Additionally, each host can act as a quorum host, or lead host. Lead hosts in Velocity are responsible for maintaining the stability of the cluster. These lead hosts collaborate to prevent split brain syndrome, a circumstance where a loss of communication within the cluster causes independently operating sub-clusters. A majority of Velocity lead hosts must be running for the entire cluster to run, so it is critical to insulate your cluster with enough lead hosts to ensure a majority of these lead hosts do not fail simultaneously.

There is some additional overhead involved in lead host communication. It is a good practice to run your cluster with as few lead hosts as necessary to maintain a quorum of leads. For small clusters, ranging from one to three nodes in size, it is acceptable to run all nodes as lead nodes as the amount of additional overhead generated by a small grouping of lead hosts will be relatively low. For larger clusters, however, to minimize overhead involved in ensuring cluster stability, it is recommended to use a smaller percentage of lead hosts—for example, 5 lead hosts for a 10-node cluster.

For my example app, to enable all three hosts to run as lead hosts, I can run a Stop-CacheCluster command, update the remaining quorumHost values to true, and then run Start-CacheCluster.
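The majority requirement described above can be made concrete with a little arithmetic. The sketch below assumes that "majority" means a strict majority (more than half) of the configured lead hosts; the class and method names are hypothetical and are not part of Velocity.

```csharp
using System;

// Hypothetical helper for reasoning about lead host counts.
// Assumes "majority" means strictly more than half of the lead hosts.
public static class LeadHostQuorum
{
    // Smallest number of running lead hosts that forms a strict majority.
    public static int MajorityNeeded(int leadHosts)
    {
        if (leadHosts < 1)
        {
            throw new ArgumentOutOfRangeException("leadHosts");
        }
        return leadHosts / 2 + 1; // integer division rounds down
    }

    // Number of lead hosts that can fail while a majority keeps running.
    public static int TolerableFailures(int leadHosts)
    {
        return leadHosts - MajorityNeeded(leadHosts);
    }
}
```

Under that assumption, the three lead hosts in the example cluster need two running and can lose one. A single lead host can lose none.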

Finally, the advancedProperties configuration element specifies the storage provider for and location of the partition map. Here, the configuration specifies the network location of the partition map database. This partition map database is used by the host currently running the global partition manager.

A Basic Velocity Client Application

To set up your development environment for creating Velocity applications, first create a new console project. Add references to the following assemblies in your Velocity installation directory:

  • CacheBaseLibrary.dll
  • CASBase.dll
  • CASMain.dll
  • ClientLibrary.dll

Next, add an app.config file to your console project with the following configuration:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <configSections>
    <section name="dcacheClient" type="System.Data.Caching.DCacheClientSection, CacheBaseLibrary" allowLocation="true" allowDefinition="Everywhere"/>
  </configSections>
  <dcacheClient deployment="routing">
    <localCache isEnabled="false"/>
    <hosts>
      <host name="VELSAMPLE01" cachePort="22233" cacheHostName="DistributedCacheService"/>
    </hosts>
  </dcacheClient>
</configuration>

The dcacheClient section provides the Velocity client configuration. These settings specify the client deployment type, local cache options, as well as cluster address information.

The deployment can be set to either routing or simple. A routing deployment allows Velocity clients to communicate with any Velocity host in the cluster. A simple deployment directs clients to communicate with specific Velocity hosts, requiring an additional network hop within the cluster to communicate with the target host. Routing is the recommended deployment as it minimizes network overhead. However, if there are networking constraints allowing clients to only communicate with a subset of the Velocity hosts in your cluster, a simple deployment would be suitable.

The localCache element in this example specifies that local caching is disabled for the client application. If isEnabled were set to true here, then as this console client application retrieved cache items from the Velocity cluster, these items would be preserved on the client side in object form within the application process. As subsequent trips to the cluster would no longer be required for those items, local caching could further enhance performance.

With local caching enabled, a ttlValue can specify the number of seconds objects should remain valid within the local cache. Subsequent requests for invalidated objects will cause the client to retrieve updates from the cluster.

It is important to note that clients utilizing local caching will retrieve objects locally even if these items have been updated on the cluster. As such, it is best to use local caching for infrequently changing data. In subsequent releases, the Velocity team is targeting a notifications feature that will allow the Velocity cluster to emit notifications of updated cache items to clients, thus allowing clients to refresh the local cache accordingly.
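The staleness trade-off described above can be sketched with a small time-to-live wrapper. This is not the Velocity local cache: TtlLocalCache, its fetchFromCluster delegate, and the injectable clock are all hypothetical names used to show how a locally held object keeps being served until its validity window elapses, even if the cluster copy has changed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a client-side cache with a time-to-live.
public class TtlLocalCache
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, Entry> local = new Dictionary<string, Entry>();
    private readonly Func<string, object> fetchFromCluster; // assumed remote lookup
    private readonly Func<DateTime> clock;                  // injectable for testing
    private readonly TimeSpan ttl;

    public TtlLocalCache(Func<string, object> fetchFromCluster, TimeSpan ttl, Func<DateTime> clock)
    {
        this.fetchFromCluster = fetchFromCluster;
        this.ttl = ttl;
        this.clock = clock;
    }

    public object Get(string key)
    {
        DateTime now = clock();
        Entry entry;
        if (local.TryGetValue(key, out entry) && now < entry.ExpiresAt)
        {
            // Still valid locally: served without a trip to the cluster,
            // even if the cluster copy has been updated in the meantime.
            return entry.Value;
        }

        // Missing or no longer valid: go back to the cluster and refresh.
        object value = fetchFromCluster(key);
        local[key] = new Entry { Value = value, ExpiresAt = now + ttl };
        return value;
    }
}
```

A shorter validity window narrows the period in which stale data can be served, at the cost of more trips to the cluster, which is why this style of caching suits infrequently changing data.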

The hosts configuration settings specify addresses for the various hosts in your cluster. Multiple hosts may be specified here, though only one host is required for the client to discover the other hosts in the cluster. Each host in the client configuration specifies name, port, and cacheHostName settings, which correspond to the host settings shown previously in the cluster configuration.

Figure 7 shows a simple program that puts an object into the Velocity unified cache and then retrieves that object from the cache.

Figure 7 Using the Velocity Cache

using System;
using System.Data.Caching;

namespace VelocityClientConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            // Create the cache factory
            CacheFactory factory = new CacheFactory();

            // Get the default named cache
            Cache cache = factory.GetCache("default");

            // Add a string to the cache
            cache.Put("MyCacheKey", "Hello World");

            // Get the data from the cache
            string s = (string) cache.Get("MyCacheKey");

            Console.WriteLine(s);
            Console.ReadLine();
        }
    }
}

First, in the using statement, I reference the types within the System.Data.Caching namespace. In the Main method, I create a CacheFactory and retrieve the named cache entitled default. Then I use the Cache client instance to put the string object into the default cache and retrieve it.

The CacheFactory also provides additional constructors to create Cache objects that communicate with Velocity clusters not referenced in the client configuration. The CacheFactory will create the appropriate concrete Cache client implementation respective of the deployment and local cache settings that are specified either through configuration or programmatically via the CacheFactory constructor.

After running the console application, I can use Windows PowerShell to verify that the cache item has been created by running the Get-Cache and Get-CacheStatistics commands (see Figure 8). Get-Cache shows that an implicit region has been created to hold the item on the VELSAMPLE03 host. Additionally, Get-CacheStatistics shows size, region, and access information for the default named cache.

Figure 8 Viewing Cache Statistics in Windows PowerShell

Designing the Cache

Now that Velocity is up and running, let's begin integrating Velocity into the storefront application by designing the cache storage. In this example, the storefront application utilizes the Northwind database for product catalog and report reference data as well as sales order activity data. This data model is illustrated in Figure 9.

Figure 9 Storefront Data Model

The application utilizes transfer object classes to represent the underlying data store. The current storefront data access layer utilizes System.Data.SqlClient to populate transfer objects. If your application leverages object-relational mapping (O/RM) technologies for more powerful type mappings, objects produced by these frameworks could be cached in Velocity just the same. As long as your data access layer produces serializable CLR objects, Velocity will be capable of caching these objects.

Named Caches

As the properties that govern these types of data vary among reference and activity classifications, an organization strategy that demarcates these logical units into distinct named caches would benefit from caching policies suitable to the needs of each data type. In particular, by creating separate named caches for the catalog, report, and shopping cart data, you can tune the policies around expiration, eviction, and availability accordingly.

Since the product catalog can be classified as reference data, or shared, read-only data, it is very conducive to caching. This data, moreover, represents a logical unit of reference data, so it is a good candidate for defining a separate named cache.

To create this new named cache, use the New-Cache command.

New-Cache -CacheName catalog -Eviction none -NotExpirable

For the catalog cache, as the data changes fairly infrequently and the catalog size is predictable, you can turn off eviction as well as disable expiration. However, in scenarios where the cache size may grow unpredictably, it is necessary to utilize eviction and expiration to prevent out-of-memory conditions.

As the product reference reports, such as top-selling products, may fluctuate more frequently than the catalog itself, aggregating identifier listings for reports into a separate named cache has benefit in that these product listings can be decoupled from the underlying catalog named cache policies. Specifically, for these product reference related reports, you could create a separate named cache called reports with a more aggressive expiration and eviction strategy.

New-Cache -CacheName reports -Eviction LRU -TTL 1440

The reports named cache is created such that product identifier report listings will expire after one day (1,440 minutes), and Velocity will use the LRU algorithm to evict objects from the cache. Using a separate named cache for aggregated product identifier listings will allow the report-related data and the product catalog data to vary independently according to different expiration and eviction policies. As such, top-selling products can be retrieved from the catalog cache using product identifier listings from the reports named cache. The top-selling report listing object will expire daily, whereas the product catalog remains cached, perhaps until a background process updates it at a regular or triggered interval.

As session-related data is traditionally read and written, it can be classified as activity data. Since shopping cart data is critical to the completion of orders, there is also a strong requirement for high availability across this data. In the original application, to mitigate the loss of shopping cart data, this state would have been persisted through to the underlying data store until the completion of respective orders. In a Velocity-enabled storefront, as users progress through the shopping cart experience, activities based on cart data can be exclusively read from and written to the Velocity cluster.

Through a newly created session named cache, you can tune policies around the properties of activity-related data. In particular, for applications that previously cached this data in default InProc ASP.NET session state, more flexible scaling is enabled through Velocity SessionStoreProvider integration. Sticky routing is no longer needed to couple user requests to specific servers according to state. Further, by enabling high availability on the session named cache, data loss is mitigated without persisting data to the database. Velocity insulates the cluster by redundantly storing cart data.

The New-Cache command below creates the session named cache.

New-Cache -CacheName session -Secondaries 1 -TTL 1440

The -Secondaries switch specifies the number of backups for each item stored within this named cache. In this example, Velocity will maintain one additional backup copy across the cluster for each shopping cart stored in the session named cache.

Organizing the Cache

The application needs to be able to address categories and products both independently as well as in category and product lists. There are several approaches you could take to organize these objects in the catalog named cache, including region-based organization and list-based organization. Both approaches focus on a degree of object distribution across the cluster. However, depending on the needs of your application, another organization approach might be more appropriate. For example, you could cache object graphs by key or you could employ page caching.

The product catalog could be organized using Velocity Regions. Regions are advantageous in the example scenario because they allow the application to address particular products in a category all together or independently. For instance, the application may need to present an entire product category listing in some cases, while it may need to display only a product detail page in other cases. Organizing the catalog cache using regions for product category associations would yield the hierarchy shown in Figure 10.

Figure 10 Region-based Organization

In this cache storage model, the category objects themselves would be stored independently whereas product objects would be stored in regions, relative to category membership. Another potential benefit of a region-based approach for the catalog involves co-location: as all objects in a region are guaranteed to reside on the same Velocity host machine, the entire product category region will be retrieved in bulk. A list-based approach requires multiple cache requests. Regions also enable the retrieval of items according to tags, which are secondary indexes for cache items.

Although a region-based approach presents benefit in bulk retrieval optimization, individual access for products, and tag-based retrieval, the co-location of items could also present a bottleneck. If the client application consumes a particular category region more actively than others, this region hot spot would prevent the inherent benefits in throughput provided by scaling out the data through the distribution of products across the cluster.

Using a list-based approach for product category organization would promote even distribution of items across the cluster, thus fully realizing the scalability benefits possible through the distribution of data. Where a region-based approach could potentially develop a hot spot of activity in a physically co-located grouping of products, a list-based approach allows you to distribute the load of any particular product category listing across the cluster. This organizational approach requires the storage of category objects, product objects, and category list objects that identify the independently stored product objects. The cache organization for this list-based approach is shown in Figure 11.

Figure 11 List-based Organization

Although a list-based approach to organizing the catalog distributes the load of catalog activity evenly across the cluster, to address a product category all together, multiple cache requests are necessary. On this topic, bulk read/write is a feature that the Velocity team is targeting for future releases.
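The request pattern of the list-based layout can be sketched as follows. This is only an illustration under stated assumptions: `CountingCache` is a hypothetical in-memory stand-in rather than the Velocity client API, its `Requests` counter stands in for round trips to the cluster, and the product names are invented. The key formats match those used by the cache layer later in this article.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a named cache; not the Velocity client API.
public class CountingCache
{
    private readonly Dictionary<string, object> _items = new Dictionary<string, object>();

    // Number of Get calls, standing in for round trips to the cluster.
    public int Requests { get; private set; }

    public void Put(string key, object value) { _items[key] = value; }

    public object Get(string key)
    {
        Requests++;
        object value;
        return _items.TryGetValue(key, out value) ? value : null;
    }
}

public static class ListLayoutSketch
{
    // Reads a category listing: one request for the ID list, then one per product.
    public static List<string> ReadCategory(CountingCache cache, int categoryId)
    {
        var names = new List<string>();
        var productIds = (List<int>)cache.Get("ProductCategoryList-" + categoryId);
        if (productIds == null) return names;
        foreach (int productId in productIds)
        {
            names.Add((string)cache.Get("Product-" + productId));
        }
        return names;
    }

    // Stores a three-product category, reads it back, and reports the request count.
    public static int RequestsForCategoryOfThree()
    {
        var cache = new CountingCache();
        cache.Put("ProductCategoryList-7", new List<int> { 1, 2, 3 });
        cache.Put("Product-1", "Kettle");
        cache.Put("Product-2", "Toaster");
        cache.Put("Product-3", "Blender");
        ReadCategory(cache, 7);
        return cache.Requests; // one list lookup plus three product lookups
    }
}
```

For a category of N products, the listing costs N + 1 requests in this sketch. That is the price paid for letting each product land on whichever host its key maps to.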

Organizing the Reports

You can employ a similar list-based organization for the reports named cache. Decoupling these reference reports from the catalog named cache allows the policies that govern report and catalog data to vary independently, so you can tune expiration and eviction for each data type. The catalog named cache, which holds fairly static reference data, disables expiration and eviction, while the reports named cache, which holds more dynamic data, enforces a more aggressive expiration and eviction policy. Organizationally, product key list objects within the reports named cache identify product objects within the catalog named cache. This is the same method used for product category listings, but employed across named caches.

To retrieve products based on these aggregated reference reports, the list-based method is used across the reports and catalog named caches.
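A minimal sketch of this cross-cache lookup follows. It assumes plain dictionaries as stand-ins for the two named caches, and the report key `Report-TopSellers` and the product names are hypothetical, chosen only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ReportLookupSketch
{
    // Resolves a product key list held in the reports cache into
    // product objects held in the catalog cache.
    public static List<string> ResolveReport(
        IDictionary<string, object> reportsCache,
        IDictionary<string, object> catalogCache,
        string reportKey)
    {
        var results = new List<string>();
        object listObj;
        if (!reportsCache.TryGetValue(reportKey, out listObj))
        {
            return results; // report expired, evicted, or never cached
        }
        foreach (int productId in (List<int>)listObj)
        {
            object product;
            if (catalogCache.TryGetValue("Product-" + productId, out product))
            {
                results.Add((string)product);
            }
        }
        return results;
    }

    // Builds both stand-in caches and resolves the hypothetical report.
    public static List<string> Demo()
    {
        var catalog = new Dictionary<string, object>
        {
            { "Product-1", "Kettle" },
            { "Product-2", "Toaster" },
            { "Product-3", "Blender" }
        };
        var reports = new Dictionary<string, object>
        {
            { "Report-TopSellers", new List<int> { 2, 3 } }
        };
        return ResolveReport(reports, catalog, "Report-TopSellers");
    }
}
```

Because the report stores only keys, an expired report costs one small list to rebuild, while the product objects themselves remain governed by the catalog cache's policy.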

Organizing the Session

For activity-based scenarios that are exclusively read/write, approaching cache organization through the storage of object hierarchies may be an optimal choice. For example, in the session named cache, you can organize shopping cart objects that contain information about each product a particular user has selected: product, price, and quantity.

Whereas the catalog organization promoted a high degree of scalability by distributing items across the cache, the shopping cart cache items are not as highly reusable across requests. Performance can be increased by caching each user's shopping cart under a single key rather than distributing shopping cart items across the named cache as top-level cache items. More specifically, as the storefront application generates requests to the session named cache, each shopping cart can be retrieved in bulk through a single request. Additionally, since the session named cache was created using high availability, Velocity will protect the session data by maintaining copies of each shopping cart across the cluster. The session organizational approach is illustrated in Figure 12.

Figure 12 Session Cache Organization
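The single-key organization might look like the following sketch. The `CartSketch` and `CartLineSketch` types, the key `ShoppingCart-user42`, and the prices are hypothetical, and a plain dictionary stands in for the session named cache. The types are marked serializable on the assumption that objects stored in a distributed cache must be serialized to travel between hosts.

```csharp
using System;
using System.Collections.Generic;

[Serializable]
public class CartLineSketch
{
    public int ProductID { get; set; }
    public int Quantity { get; set; }
    public decimal UnitPrice { get; set; }
}

[Serializable]
public class CartSketch
{
    private readonly Dictionary<int, CartLineSketch> _lines =
        new Dictionary<int, CartLineSketch>();

    // Returns null when the product is not in the cart.
    public CartLineSketch this[int productId]
    {
        get
        {
            CartLineSketch line;
            return _lines.TryGetValue(productId, out line) ? line : null;
        }
        set { _lines[productId] = value; }
    }

    public decimal Total()
    {
        decimal total = 0m;
        foreach (CartLineSketch line in _lines.Values)
        {
            total += line.Quantity * line.UnitPrice;
        }
        return total;
    }
}

public static class SessionCartSketch
{
    public static decimal Demo()
    {
        var sessionCache = new Dictionary<string, object>(); // stand-in for the session named cache
        var cart = new CartSketch();
        cart[1] = new CartLineSketch { ProductID = 1, Quantity = 2, UnitPrice = 10.00m };
        cart[2] = new CartLineSketch { ProductID = 2, Quantity = 1, UnitPrice = 5.50m };

        sessionCache["ShoppingCart-user42"] = cart;                    // one write stores the whole hierarchy
        var loaded = (CartSketch)sessionCache["ShoppingCart-user42"]; // one read returns it
        return loaded.Total();
    }
}
```

The whole cart travels as one cache item, so reading or updating it costs a single request regardless of how many products it contains.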

Applications may maintain activity-based items directly through Velocity APIs, and ASP.NET applications may leverage Velocity's integration with the session store provider framework to transparently realize Velocity as a cache tier through the ASP.NET session API.

Programming a Cache Layer

The current architecture tiers are demarcated as the user interface (Web), the business logic layer (BLL), and the data access layer (DAL). As the current application deals primarily with data transfer objects produced by the data tier, caching data in these respective shapes facilitates delivering the application tiers with representations they understand. The storefront example introduced a cache layer in between the business and data tiers. By interweaving caching logic under the business tier, the layers that depend on the business logic uniformly benefit from validation of business rules before generating calls to the caching logic, while the business logic itself transparently consumes the cache tier. Additionally, the Web tier utilizes the Velocity ASP.NET session store provider integration.

To see this layering in action, let's walk through the cache layer, starting with the product category listing view. The category listing view calls into the business logic layer, which validates the request and then consumes the cache layer by calling the GetProductsByCategory method on VelocityCatalogCache (see Figure 13). This method coordinates marshaling data between the Velocity cache and the data store in a manner transparent to the business logic.

Figure 13 GetProductsByCategory

```csharp
public IList<Product> GetProductsByCategory(int id)
{
    IList<Product> products = null;

    // Look up the category's list of product IDs in the catalog cache.
    IList<int> productIDs =
        (IList<int>)_catalogCache.Get("ProductCategoryList-" + id);

    if (productIDs != null)
    {
        // Cache hit: fetch each independently stored product by its key.
        products = new List<Product>(productIDs.Count);
        foreach (int productID in productIDs)
        {
            products.Add((Product)_catalogCache.Get("Product-" + productID));
        }
    }
    else
    {
        // Cache miss: read from the data store, then populate the cache.
        products = DataLayer.CatalogProvider.GetProductsByCategory(id);
        if (products != null && products.Count > 0)
        {
            productIDs = new List<int>(products.Count);
            foreach (Product p in products)
            {
                productIDs.Add(p.ProductID);
                _catalogCache.Put("Product-" + p.ProductID, p);
            }
            _catalogCache.Put("ProductCategoryList-" + id, productIDs);
        }
    }
    return products;
}
```

In this product category listing, the cache layer will act as a read-through provider to simplify the business layer. This is only one layering approach to programming caching logic using Velocity. For example, for fine-grain control at the business level, leveraging Velocity in a cache-aside pattern directly in the business layer would present a viable approach as well. In future releases, the Velocity team is targeting read-through/write-behind support wherein callbacks may be registered to coordinate marshaling data among the cache and underlying store, thus allowing clients to utilize this behavior through the Velocity cache APIs rather than a wrapper layer.
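For comparison, a cache-aside lookup written directly in a business-layer method could follow the general shape below. This is a sketch under stated assumptions: the `GetOrLoad` helper is hypothetical, a plain dictionary stands in for the named cache, and the loader delegate stands in for a call into the data access layer.

```csharp
using System;
using System.Collections.Generic;

public static class CacheAsideSketch
{
    // Returns the cached item if present. Otherwise loads it from the
    // backing source and stores it in the cache before returning it.
    public static T GetOrLoad<T>(
        IDictionary<string, object> cache, string key, Func<T> load)
        where T : class
    {
        object cached;
        if (cache.TryGetValue(key, out cached) && cached != null)
        {
            return (T)cached;
        }
        T loaded = load();
        if (loaded != null)
        {
            cache[key] = loaded;
        }
        return loaded;
    }

    // Reads the same key twice and reports how often the loader ran.
    public static int LoadsForTwoReads()
    {
        var cache = new Dictionary<string, object>();
        int loads = 0;
        Func<string> loader = () => { loads++; return "Kettle"; };
        GetOrLoad(cache, "Product-1", loader); // miss: invokes the loader
        GetOrLoad(cache, "Product-1", loader); // hit: served from the cache
        return loads;
    }
}
```

In the cache-aside form the caller decides, call by call, whether and how to consult the cache. The read-through wrapper shown in Figure 13 makes that decision once, on behalf of every caller.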

The GetProductsByCategory implementation has implications for concurrency that are worth noting. The method currently applies no concurrency model. Because the catalog is primarily read-only reference data that changes infrequently, the impact is not severe. However, multiple threads or clients that arrive at a cache miss simultaneously could each read from the data store and perform identical cache updates, one after another. For scenarios where concurrency is critical, such as caching various types of resource data, Velocity supports both optimistic and pessimistic concurrency models.
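The idea behind an optimistic model can be sketched with version checks. The `VersionedStoreSketch` class below is a hypothetical in-memory illustration, not the Velocity API, and the `Inventory-1` key is invented. Each write succeeds only if the item's version still matches the version the writer originally read.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical versioned store; illustrates optimistic concurrency only.
public class VersionedStoreSketch
{
    private readonly Dictionary<string, KeyValuePair<long, object>> _items =
        new Dictionary<string, KeyValuePair<long, object>>();
    private readonly object _gate = new object();

    // Returns the value and its current version (version 0 means "not present").
    public object Get(string key, out long version)
    {
        lock (_gate)
        {
            KeyValuePair<long, object> entry;
            if (_items.TryGetValue(key, out entry))
            {
                version = entry.Key;
                return entry.Value;
            }
            version = 0;
            return null;
        }
    }

    // Writes only if the stored version still matches what the caller read.
    public bool TryPut(string key, object value, long expectedVersion)
    {
        lock (_gate)
        {
            KeyValuePair<long, object> entry;
            long current = _items.TryGetValue(key, out entry) ? entry.Key : 0;
            if (current != expectedVersion) return false;
            _items[key] = new KeyValuePair<long, object>(current + 1, value);
            return true;
        }
    }
}

public static class OptimisticSketch
{
    // Two writers read the same version; only the first update is accepted.
    public static bool[] TwoWritersSameVersion()
    {
        var store = new VersionedStoreSketch();
        store.TryPut("Inventory-1", 10, 0);

        long versionA, versionB;
        store.Get("Inventory-1", out versionA);
        store.Get("Inventory-1", out versionB);

        bool first = store.TryPut("Inventory-1", 9, versionA);
        bool second = store.TryPut("Inventory-1", 9, versionB); // stale version
        return new[] { first, second };
    }
}
```

A writer whose update is rejected would typically re-read the item and retry. A pessimistic model instead locks the item before it is modified, so that competing writers wait or fail up front.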

Integrating with ASP.NET Session State

Velocity lets you transparently leverage the cache cluster through the ASP.NET SessionStoreProvider framework. This capability allows Web applications to access items in a specified named cache through the HttpContext.Session property.

This integration greatly benefits scalability by decoupling session state from specific servers. Storing session, or activity-related, data in Velocity frees the architecture from coupling clients to the specific server nodes that hold their state, which allows more flexible scaling.

For scenarios that require high availability, such as the shopping activity data, Velocity enables you to easily insulate the cluster with redundancy as shown during the creation of the session named cache, thus delivering availability closer to the consuming logic.

To enable Velocity SessionStoreProvider integration in ASP.NET, the following configuration is added to the web.config file in the storefront ASP.NET application. This configuration associates the storefront ASP.NET session state with the previously created session named cache.

```xml
<system.web>
  <sessionState mode="Custom" customProvider="SessionStoreProvider">
    <providers>
      <add name="SessionStoreProvider"
           type="System.Data.Caching.SessionStoreProvider, ClientLibrary"
           cacheName="session" />
    </providers>
  </sessionState>
</system.web>
```

Then, to access the session named cache through ASP.NET, the storefront maintains user shopping cart activity data as in the Product.aspx page shown in Figure 14.

Figure 14 Accessing Cache Through ASP.NET

```csharp
protected void AddToCartButton_Click(object sender, EventArgs e)
{
    // Retrieve the user's cart from session state, creating it on first use.
    ShoppingCart cart = (ShoppingCart)Session["ShoppingCart"];
    if (cart == null)
    {
        cart = new ShoppingCart();
        Session["ShoppingCart"] = cart;
    }

    int productID = Convert.ToInt32(Request.Params["id"]);
    ShoppingCartItem item = cart[productID];
    if (item != null)
    {
        // Product already in the cart: increase its quantity.
        item.Quantity++;
    }
    else
    {
        // New product: look up its price and add a cart item.
        Product p = ServiceLayer.CatalogService.GetProduct(productID);
        cart[productID] = new ShoppingCartItem()
        {
            ProductID = productID,
            Quantity = 1,
            UnitPrice = p.UnitPrice
        };
    }
    Response.Redirect("~/Cart.aspx");
}
```

Upon order completion, activity-oriented shopping carts can be removed from the session named cache and propagated to the underlying data store as shown here from the Purchase.aspx page.

```csharp
protected void CheckoutButton_Click(object sender, EventArgs e)
{
    ShoppingCart cart = (ShoppingCart)Session["ShoppingCart"];

    // Propagate the completed order to the underlying data store.
    ServiceLayer.OrderService.ProcessOrder(
        cart,
        NameTextBox.Text,
        AddressTextBox.Text,
        CityTextBox.Text,
        RegionTextBox.Text,
        PostalCodeTextBox.Text,
        CountryTextBox.Text);

    // Remove the cart from the session named cache.
    Session["ShoppingCart"] = null;
    Response.Redirect("~/Congrats.aspx");
}
```

Through the ASP.NET session API, the application now seamlessly leverages Velocity to realize activity-oriented data closer to the consuming logic, eliminate the need for techniques such as sticky routing, and protect shopping cart activity data through the high-availability support previously enabled across the session named cache.

Going Forward

Velocity harnesses the evolution of distributed architectures and the underlying hardware trends to bring pervasive data closer to the applications that consume it. This article has provided only a first look at Velocity's features and the rationale behind them.

The example storefront application now leverages Velocity to cache reference catalog and report data as well as shopping cart activity data, using policies suitable to each data type. Whereas scale was formerly inhibited by repeating catalog data operations on every page load, the application now minimizes load on the underlying data store and brings catalog reference data closer to the logic that consumes it, enabling a high degree of scalability and performance. Response time is also improved, and you get more flexible scaling by freeing the underlying architecture from coupling resources to requests based on state. Data critical to the completion of orders is protected through high-availability support in Velocity, without the trips to the underlying data store that were needed before.

If you would like additional insight into the rationale for distributed, in-memory caching, an overview of the entire Velocity feature set, and information about downloads, see the whitepaper "Microsoft Project Code Named Velocity" and the Velocity Developer Center.

Aaron Dunnington is a Program Manager on the Data Programmability team at Microsoft. He can be reached at http://blogs.msdn.com/velocity.