January 2010

Volume 25 Number 01

Cloud Storage - Fueling Your Application's Engine with Microsoft Azure Storage

By Kevin Hoffman | January 2010

Developers tend to cling to their physical, tangible infrastructure like a safety blanket. They know how to use it, they know how to operate it, and when something goes wrong, they know where it is. This often creates a barrier that slows down developer adoption of newer technologies, such as cloud computing.

One of the biggest questions skeptical developers ask is how they can still run background processes in the cloud—how will their engine continue to work. This article aims to dispel myths about lack of background processing in the cloud by showing you how you can build an application engine as well as implement asynchronous messaging and processing using Azure Storage.

To prove that developers can shed the safety blanket of their physical infrastructure and put their application engines in the cloud, we’re going to walk through the implementation of a small subset of an e-commerce application, Hollywood Hackers, where you can buy all of the magical technology that Hollywood uses to completely ignore the laws of physics and old-fashioned common sense.

The two main scenarios we’ll cover are:

  • Sending asynchronous text messages (“toasts”) to users of the application to notify them of important events such as their cart being submitted, or to send messages between employees. This scenario uses Azure Queue, Azure Table and an Azure Worker Role.
  • Submitting a shopping cart to a fulfillment engine using Azure Queue and an Azure Worker Role.

Intra-Application Messaging with Queue Storage

Before we get to the specific scenarios, we need to cover some basics about Azure Queue. Queues in the cloud don’t work quite like queues in your plain-vanilla .NET application. When you work with data in an AppDomain, you know there’s only one copy of that data and it’s sitting comfortably within a single managed process.

In the cloud, one piece of your data might be in California and another might be in New York, and you might have a worker role doing processing on that data in Texas and another worker role doing processing in North Dakota.

Adjusting to this kind of distributed computing and distributed data brings up issues many developers are unfamiliar with, such as coding for potential failure, building in the concept of multiple retries for data commits and, finally, the idea of idempotence.

The way Azure Queues work is fairly straightforward, so long as you don’t treat them like regular in-process CLR queues. First, your application asks the queue for some number of messages (never more than 32 in a single request; keep that in mind) and supplies a timeout. This timeout governs how long those messages will be hidden from other queue-processing clients. When your application has successfully completed whatever processing needs to be done on the queue message, it should delete the message.

If your application throws an exception or otherwise fails to process the queue message, the message will become visible to other clients again after the timeout period. This allows additional worker roles to continue processing when one fails. Submitting a message to a queue is very straightforward: your application forms the appropriate HTTP POST message (either directly or with the aid of a client library) and submits either a string or an array of bytes. Queues are designed specifically for intra-application messaging and not permanent storage, so the messages need to be kept small (at the time of writing, no more than 8KB each).
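To make the visibility timeout concrete, here is a minimal sketch of the get, hide and delete cycle just described. It is an in-memory stand-in, not the Azure StorageClient API: SimulatedQueue, SimulatedMessage and their members are hypothetical names invented for this illustration, and the current time is passed in explicitly so the behavior is easy to follow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a cloud queue (not the Azure API).
public class SimulatedMessage
{
    public Guid Id = Guid.NewGuid();
    public string Body;
    public DateTime VisibleAt;   // the message is hidden until this instant
}

public class SimulatedQueue
{
    private readonly List<SimulatedMessage> messages = new List<SimulatedMessage>();

    public void Add(string body, DateTime now)
    {
        messages.Add(new SimulatedMessage { Body = body, VisibleAt = now });
    }

    // Returns up to 'count' visible messages and hides each one for 'timeout'.
    public List<SimulatedMessage> Get(int count, TimeSpan timeout, DateTime now)
    {
        if (count < 1) throw new ArgumentOutOfRangeException("count");
        var batch = new List<SimulatedMessage>();
        foreach (SimulatedMessage m in messages)
        {
            if (batch.Count == count) break;
            if (m.VisibleAt <= now)
            {
                m.VisibleAt = now + timeout;
                batch.Add(m);
            }
        }
        return batch;
    }

    // Deleting is how a worker signals that processing succeeded.
    public void Delete(SimulatedMessage m)
    {
        messages.Remove(m);
    }
}

public static class VisibilityDemo
{
    public static void Main()
    {
        DateTime t0 = new DateTime(2010, 1, 1, 0, 0, 0, DateTimeKind.Utc);
        TimeSpan timeout = TimeSpan.FromSeconds(60);
        var queue = new SimulatedQueue();
        queue.Add("cart-1", t0);

        // Worker A takes the message but crashes before deleting it.
        Console.WriteLine(queue.Get(1, timeout, t0).Count);                 // 1

        // Thirty seconds later the message is still hidden from worker B.
        Console.WriteLine(queue.Get(1, timeout, t0.AddSeconds(30)).Count);  // 0

        // After the timeout it reappears, and worker B finishes the job.
        List<SimulatedMessage> retry = queue.Get(1, timeout, t0.AddSeconds(61));
        queue.Delete(retry[0]);
        Console.WriteLine(queue.Get(1, timeout, t0.AddSeconds(200)).Count); // 0
    }
}
```

The point to notice is that nothing is removed by reading. A message only leaves the queue when a worker explicitly deletes it, which is what lets a second worker pick up where a failed one left off.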

As previously mentioned, you could conceivably have multiple worker roles all attempting to process the same messages. The invisibility timeout that hides messages that are currently being processed is helpful, but it isn’t a guarantee. To completely avoid conflict, you should design your engine processing so that it’s idempotent. In other words, the same queue message should be able to be processed multiple times by one or more worker roles without putting the application into an inconsistent state.

Ideally, you want the worker role to be able to detect if work has already been completed on a given message. As you write your worker roles to process queue messages, keep in mind that there is a chance your code could be attempting to process a message that already has been processed, however slim that chance may be.
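One way to get this behavior, sketched below under some assumptions, is to key the stored result on something that is stable across retries (here, an ID carried by the message) rather than on a value generated fresh on every attempt. NotificationStore, InsertIfAbsent and IdempotentWorker are hypothetical names, and the dictionary stands in for whatever durable storage your engine writes to.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store standing in for durable storage such as a table.
public class NotificationStore
{
    private readonly Dictionary<string, string> rows = new Dictionary<string, string>();

    // Inserts the row only if the key is new; returns false for a duplicate.
    public bool InsertIfAbsent(string key, string text)
    {
        if (rows.ContainsKey(key)) return false;
        rows.Add(key, text);
        return true;
    }

    public int Count { get { return rows.Count; } }
}

public static class IdempotentWorker
{
    // Safe to call any number of times for the same message, because the
    // message ID (not a freshly generated GUID) is used as the storage key.
    public static bool Process(NotificationStore store, string messageId, string text)
    {
        return store.InsertIfAbsent(messageId, text);
    }

    public static void Main()
    {
        var store = new NotificationStore();
        Console.WriteLine(Process(store, "msg-42", "Your cart was submitted.")); // True
        Console.WriteLine(Process(store, "msg-42", "Your cart was submitted.")); // False
        Console.WriteLine(store.Count);                                          // 1
    }
}
```

Processing the same message twice leaves exactly one row behind, so a redelivered message is harmless. A real store would need the check and the insert to happen atomically; the sketch ignores concurrency to keep the idea visible.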

The code snippet in Figure 1 shows how to create and submit a message to an Azure Queue using the StorageClient assembly that’s provided with the Azure SDK. The StorageClient library is really just a wrapper around the Azure Storage HTTP interface.

Figure 1 Creating and Submitting a Message to an Azure Queue

// Assumes queueName already holds the name of the target queue
string accountName;
string accountSharedKey;
string queueBaseUri;
StorageCredentialsAccountAndKey credentials;

if (RoleEnvironment.IsAvailable)
{
  // We are running in a cloud - INCLUDING LOCAL!
  accountName =
    RoleEnvironment.GetConfigurationSettingValue("AccountName");
  accountSharedKey =
    RoleEnvironment.GetConfigurationSettingValue("AccountSharedKey");
  queueBaseUri =
    RoleEnvironment.GetConfigurationSettingValue("QueueStorageEndpoint");
}
else
{
  // Not hosted in a role: read the same settings from app/web.config
  accountName = ConfigurationManager.AppSettings["AccountName"];
  accountSharedKey =
    ConfigurationManager.AppSettings["AccountSharedKey"];
  queueBaseUri =
    ConfigurationManager.AppSettings["QueueStorageEndpoint"];
}
credentials =
  new StorageCredentialsAccountAndKey(accountName, accountSharedKey);
CloudQueueClient client =
  new CloudQueueClient(queueBaseUri, credentials);
CloudQueue queue = client.GetQueueReference(queueName);
CloudQueueMessage m = new CloudQueueMessage(
  /* string or byte[] representing message to enqueue */);
queue.AddMessage(m);

For other samples throughout this article, we’ve used some wrapper classes (available on the CodePlex site for Hollywood Hackers: hollywoodhackers.codeplex.com/SourceControl/ListDownloadableCommits.aspx) that simplify this process.

Asynchronous Messaging (Toasts)

Interactive Web sites aren’t just the rage these days, they’re a requirement. Users have become so accustomed to fully interactive Web sites that they think something’s wrong when they encounter a static, non-interactive page. With that in mind, we want to be able to send notifications to our users as they are using the site.

To do this, we’ll utilize Azure Queue and Table storage mechanisms to build a message delivery framework. The client side will use jQuery combined with the jQuery Gritter plugin to display notifications in the user’s browser as a toast, similar to the messages that fade in above the Windows system tray when you receive a new Outlook e-mail, instant message or tweet.

When a user needs to be sent a notification, it will be inserted into the queue. As the worker role processes each item in the queue, it will dynamically determine how to handle each one. In our case, there is only one thing for the engine to do, but in a complex CRM Web site or helpdesk site, the possibilities are endless.

When the worker role comes across a user notification in the queue, it will store the notification in table storage and delete it from the queue. This allows messages to be persisted long-term and wait for the user to log in. Messages in queue storage have a limited lifespan: at the time of writing, a message that is not deleted expires after at most seven days. When the user accesses the Web site, our jQuery script will asynchronously pull any messages out of the table and display them in the browser by invoking a method on a controller that returns JavaScript Object Notation (JSON) in a well-known shape.

Although the queue only handles strings or byte arrays, we can store any type of structured data in the queue by serializing it to binary and then converting it back when we need to use it. This becomes a powerful technique for passing strongly typed objects to the queue. We’ll build this into the base class for our queue messages. Then our system message class can contain our data and the entire object can be submitted into the queue and utilized as needed (see Figure 2).

Figure 2 Storing Structured Data in the Queue

using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;
using Microsoft.WindowsAzure.StorageClient;

namespace HollywoodHackers.Storage.Queue
{
    [Serializable]
    public class QueueMessageBase
    {
        // Serializes this message, including derived-class state, to bytes
        public byte[] ToBinary()
        {
            BinaryFormatter bf = new BinaryFormatter();
            using (MemoryStream ms = new MemoryStream())
            {
                bf.Serialize(ms, this);
                // ToArray copies only the bytes written; GetBuffer would
                // also return the unused tail of the internal buffer
                return ms.ToArray();
            }
        }
        // Rebuilds a strongly typed message from a dequeued message
        public static T FromMessage<T>(CloudQueueMessage m)
        {
            // AsBytes is a property on CloudQueueMessage, not a method
            byte[] buffer = m.AsBytes;
            using (MemoryStream ms = new MemoryStream(buffer))
            {
                BinaryFormatter bf = new BinaryFormatter();
                return (T)bf.Deserialize(ms);
            }
        }
    }
    [Serializable]
    public class ToastQueueMessage : QueueMessageBase
    {
        public ToastQueueMessage()
            : base()
        {
        }
        public string TargetUserName { get; set; }
        public string MessageText { get; set; }
        public string Title { get; set; }
        public DateTime CreatedOn { get; set; }
    }
}

Keep in mind that in order to use the BinaryFormatter class, your Azure worker role needs to be running in full-trust mode (you can enable this through your service definition file).

Now we’ll need a simple wrapper to interact with our queue. At its core, we need the ability to insert a message into the queue, get the next pending message, and delete a message once it has been processed (see Figure 3).

Figure 3 Wrapper to Interact with Queue

namespace HollywoodHackers.Storage.Queue
{
    public class StdQueue<T> : 
    StorageBase where T : QueueMessageBase, new()
    {
        protected CloudQueue queue;
        protected CloudQueueClient client;

        public StdQueue(string queueName)
        {
            client = new CloudQueueClient
            (StorageBase.QueueBaseUri, StorageBase.Credentials);
            queue = client.GetQueueReference(queueName);
            queue.CreateIfNotExist();
        }
        public void AddMessage(T message)
        {
            CloudQueueMessage msg = 
            new CloudQueueMessage(message.ToBinary());
            queue.AddMessage(msg);
        }
        public void DeleteMessage(CloudQueueMessage msg)
        {
            queue.DeleteMessage(msg);
        }
        public CloudQueueMessage GetMessage()
        {
            return queue.GetMessage(TimeSpan.FromSeconds(60));
        }
    }
    public class ToastQueue : StdQueue<ToastQueueMessage>
    {
        public ToastQueue()
            : base("toasts")
        {
        }
    }
}

We also need to set up a wrapper for our table storage so notifications can be stored until the user logs in to the site. Table data is organized using a PartitionKey, which is the identifier for a collection of rows, and a RowKey, which uniquely identifies each individual row in a certain partition. The choice of what data you use for a PartitionKey and a RowKey could be one of the most important design decisions you make when using table storage.

Together, these keys allow load-balancing across storage nodes and provide built-in scalability options in your application. Regardless of the data-center affinity of your data, rows in table storage with the same partition key will be kept within the same physical data store. Because messages are stored for each user, the partition key will be the UserName and the RowKey will be a GUID that identifies each row (see Figure 4).
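As a rough mental model of that key design, the sketch below groups rows by partition key in memory and looks them up by user. It is only an illustration: NotificationRow, InMemoryTable and PartitionDemo are hypothetical names, and a real entity would be stored through the storage client library rather than in a dictionary.

```csharp
using System;
using System.Collections.Generic;

// Plain class used for illustration only (hypothetical, not an Azure type).
public class NotificationRow
{
    public string PartitionKey;  // user name: groups all rows for one user
    public string RowKey;        // GUID: unique within the partition
    public string MessageText;
}

public class InMemoryTable
{
    // partition key -> (row key -> row)
    private readonly Dictionary<string, Dictionary<string, NotificationRow>> partitions =
        new Dictionary<string, Dictionary<string, NotificationRow>>();

    public void Insert(NotificationRow row)
    {
        Dictionary<string, NotificationRow> partition;
        if (!partitions.TryGetValue(row.PartitionKey, out partition))
        {
            partition = new Dictionary<string, NotificationRow>();
            partitions.Add(row.PartitionKey, partition);
        }
        partition.Add(row.RowKey, row);  // throws if the key pair already exists
    }

    // Reading one user's notifications touches a single partition.
    public List<NotificationRow> GetPartition(string partitionKey)
    {
        Dictionary<string, NotificationRow> partition;
        return partitions.TryGetValue(partitionKey, out partition)
            ? new List<NotificationRow>(partition.Values)
            : new List<NotificationRow>();
    }
}

public static class PartitionDemo
{
    public static NotificationRow NewRow(string userName, string text)
    {
        return new NotificationRow
        {
            PartitionKey = userName,
            RowKey = Guid.NewGuid().ToString(),
            MessageText = text
        };
    }

    public static void Main()
    {
        var table = new InMemoryTable();
        table.Insert(NewRow("alice", "Cart submitted"));
        table.Insert(NewRow("alice", "Cart processed"));
        table.Insert(NewRow("bob", "Welcome"));
        Console.WriteLine(table.GetPartition("alice").Count); // 2
        Console.WriteLine(table.GetPartition("carol").Count); // 0
    }
}
```

Because every notification for a user shares one partition key, fetching that user's toasts is a lookup in a single partition instead of a scan across all of them.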

Figure 4 Wrapper for Table Storage

namespace HollywoodHackers.Storage.Repositories
{
    public class UserTextNotificationRepository : StorageBase
    {
        public const string EntitySetName = 
        "UserTextNotifications";
        CloudTableClient tableClient;
        UserTextNotificationContext notificationContext;
        public UserTextNotificationRepository()
            : base()
        {
            tableClient = new CloudTableClient
            (StorageBase.TableBaseUri, StorageBase.Credentials);
            notificationContext = new UserTextNotificationContext 
            (StorageBase.TableBaseUri,StorageBase.Credentials);

            tableClient.CreateTableIfNotExist(EntitySetName);
        }
        public UserTextNotification[] 
        GetNotificationsForUser(string userName)
        {
            var q = from notification in 
                    notificationContext.UserNotifications
                    where notification.TargetUserName == 
                    userName select notification;
            return q.ToArray();
        }
        public void AddNotification 
       (UserTextNotification notification)
        {
            notification.RowKey = Guid.NewGuid().ToString();
            notificationContext.AddObject
           (EntitySetName, notification);
            notificationContext.SaveChanges();
        }
    }
}

Now that our storage mechanisms are in place, we need a worker role that acts as our engine, processing messages in the background of our e-commerce site. To do this, we define a class that inherits from the Microsoft.WindowsAzure.ServiceRuntime.RoleEntryPoint class and associate it with the worker role in our cloud service project (see Figure 5).

Figure 5 Worker Role Acting as Engine

public class WorkerRole : RoleEntryPoint
{
    ShoppingCartQueue cartQueue;
    ToastQueue toastQueue;
    UserTextNotificationRepository toastRepository;

    public override void Run()
    {
        // This is a sample worker implementation. 
        //Replace with your logic.
        Trace.WriteLine("WorkerRole1 entry point called", 
        "Information");
        toastRepository = new UserTextNotificationRepository();
        InitQueue();
        while (true)
        {
            Thread.Sleep(10000);
            Trace.WriteLine("Working", "Information");

            ProcessNewTextNotifications();
            ProcessShoppingCarts();
        }
    }
    private void InitQueue()
    {
        cartQueue = new ShoppingCartQueue();
        toastQueue = new ToastQueue();
    }
    private void ProcessNewTextNotifications()
    {
        CloudQueueMessage cqm = toastQueue.GetMessage();
        while (cqm != null)
        {
            ToastQueueMessage message = 
            QueueMessageBase.FromMessage<ToastQueueMessage>(cqm);

            toastRepository.AddNotification(new 
            UserTextNotification()
            {
                MessageText = message.MessageText,
                MessageDate = DateTime.Now,
                TargetUserName = message.TargetUserName,
                Title = message.Title
            });
            toastQueue.DeleteMessage(cqm);
            cqm = toastQueue.GetMessage();
        }
    }
    private void ProcessShoppingCarts()
    {
        // We will add this later in the article!
    }
    public override bool OnStart()
    {
        // Set the maximum number of concurrent connections 
        ServicePointManager.DefaultConnectionLimit = 12;

        DiagnosticMonitor.Start("DiagnosticsConnectionString");
        // For information on handling configuration changes
        // see the MSDN topic at 
        //http://go.microsoft.com/fwlink/?LinkId=166357.
        RoleEnvironment.Changing += RoleEnvironmentChanging;
        return base.OnStart();
    }
    private void RoleEnvironmentChanging(object sender, RoleEnvironmentChangingEventArgs e)
    {
        // If a configuration setting is changing
        if (e.Changes.Any(change => change is RoleEnvironmentConfigurationSettingChange))
        {
            // Set e.Cancel to true to restart this role instance
            e.Cancel = true;
        }
    }
}

Let’s walk through the worker role code. After initializing and setting up the required queue and table storage, the code will enter a loop. Every 10 seconds, it will process messages in the queue. Each time we pass through the processing loop, we keep getting messages from the queue until GetMessage finally returns null, indicating that we’ve emptied the queue.

Remember that a single request can never retrieve more than 32 messages from the queue. Anything that does processing on a queue has a limited amount of time to do something meaningful with each queue message before the message is considered timed out and shows back up in the queue, making itself available for processing by other workers. Each message gets added as a user notification in table storage. An important thing to remember about worker roles is that once the entry point method finishes, that worker role is done. This is why you need to keep your logic running inside a loop.
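The drain-until-null pattern can be factored into a small helper. The following is a sketch under our own assumptions, not code from the sample: QueueDrainer and Drain are hypothetical names, and the delegates stand in for the queue wrapper's get and delete calls. It also adds one thing the worker role above does not have, a try/catch so that a single bad message does not take the whole worker down.

```csharp
using System;
using System.Collections.Generic;

public static class QueueDrainer
{
    // Pulls messages until getNext returns null. A message is deleted only
    // after its handler succeeds; a failed message is left alone so the
    // queue can make it visible again after the invisibility timeout.
    public static int Drain<T>(Func<T> getNext, Action<T> handle, Action<T> delete)
        where T : class
    {
        int processed = 0;
        T message = getNext();
        while (message != null)
        {
            try
            {
                handle(message);
                delete(message);
                processed++;
            }
            catch (Exception ex)
            {
                Console.WriteLine("Leaving message for retry: " + ex.Message);
            }
            message = getNext();
        }
        return processed;
    }

    public static void Main()
    {
        // A plain in-memory queue stands in for the cloud queue here, so the
        // failed message is simply dropped instead of reappearing later.
        var pending = new Queue<string>(new[] { "toast-1", "bad", "toast-2" });
        var deleted = new List<string>();

        int count = Drain<string>(
            () => pending.Count > 0 ? pending.Dequeue() : null,
            m => { if (m == "bad") throw new InvalidOperationException("cannot parse"); },
            m => deleted.Add(m));

        Console.WriteLine(count);                      // 2
        Console.WriteLine(string.Join(",", deleted));  // toast-1,toast-2
    }
}
```

Keeping the delete call after the handler, inside the same try block, is the design choice that matters: a message is removed only when the work it describes has actually been done.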

From the client side, we need to be able to return the messages as JSON so jQuery can asynchronously poll and display new user notifications. To do this, we’ll add some code to the message controller so we can access the notifications (see Figure 6).

Figure 6 Returning Messages as JSON

public JsonResult GetMessages()
{
    // Anonymous callers have no notifications to show
    if (!User.Identity.IsAuthenticated)
        return Json(null, JsonRequestBehavior.AllowGet);

    UserTextNotification[] userToasts =
        toastRepository.GetNotificationsForUser(User.Identity.Name);
    // Project each notification into the shape the client script expects
    object[] data =
        (from UserTextNotification toast in userToasts
         select new { title = toast.Title ?? "Notification",
                      text = toast.MessageText }).ToArray();
    return Json(data, JsonRequestBehavior.AllowGet);
}

In ASP.NET MVC 2 under Visual Studio 2010 beta 2 (the environment we used to write this article), a controller action cannot return JSON data in response to an HTTP GET request, whether from jQuery or any other client, unless you pass the JsonRequestBehavior.AllowGet option. In ASP.NET MVC 1, this option is not necessary. Now we can write the JavaScript that will call the GetMessages method every 15 seconds and display the notifications as toast-style messages (see Figure 7).

Figure 7 Notifications as Toast-Style Messages

$(document).ready(function() {
    // Poll the server for new notifications every 15 seconds
    setInterval(function() {
        $.ajax({
            contentType: "application/json; charset=utf-8",
            dataType: "json",
            url: "/SystemMessage/GetMessages",
            success: function(data) {
                // data may be null when the user is not signed in
                $.each(data || [], function(i, toast) {
                    $.gritter.add({
                        title: toast.title,
                        text: toast.text,
                        sticky: false
                    });
                });
            }
        });
    }, 15000);
});

Submitting and Processing a Shopping Cart

For our sample application, another key scenario that we wanted to enable using queue storage was submitting shopping carts. Hollywood Hackers has a third-party fulfillment system (they can’t keep all those gadgets in their little warehouse) so the engine needs to do some processing on the cart. Once the engine is finished doing its processing, it will submit a message to the user notification queue to let that user know that the shopping cart has been processed (or that something went wrong). If the user is online when the cart is processed, he will receive a pop-up toast message from the system. If he isn’t online, he will receive that pop-up message the next time he logs on to the site, as shown in Figure 8.

Figure 8 Sample User Notification

What we need first are some wrapper classes to allow us to interact with the shopping cart queue. These wrappers are fairly simple and if you want to see the source code for them, you can check them out on the CodePlex site.

Unlike a standard CRUD (create, read, update, delete) repository, read operations on a queue aren’t simple read operations. Remember that whenever you get a message from a queue, you’ve got a limited amount of time during which to process that message and either fail the operation or delete the message to indicate completed processing. This pattern doesn’t translate well into the repository pattern so we’ve left that abstraction off the wrapper class.

Now that we have the code to interact with the shopping cart queue, we can put some code in the cart controller to submit the cart contents to the queue (see Figure 9).

Figure 9 Submitting Shopping Cart to Queue

public ActionResult Submit()
    {
        ShoppingCartMessage cart = new ShoppingCartMessage();
        cart.UserName = User.Identity.Name;
        cart.Discounts = 12.50f;
        cart.CartID = Guid.NewGuid().ToString();
        List<ShoppingCartItem> items = new List<ShoppingCartItem>();
        items.Add(new ShoppingCartItem() 
             { Quantity = 12, SKU = "10000101010", 
             UnitPrice = 15.75f });
        items.Add(new ShoppingCartItem() 
             { Quantity = 27, SKU = "12390123j213", 
             UnitPrice = 99.92f });
        cart.CartItems = items.ToArray();
        cartQueue.AddMessage(cart);
        return View();
    }

In a real-world scenario, you would get the shopping cart from some out-of-process state like a session store, a cache or from a form post. To keep the article code simple, we’re just fabricating the contents of a cart.

Finally, with the shopping cart contents sitting in the queue, we can modify our worker role so that it periodically checks the queue for pending carts. It will pull each cart out of the queue one at a time, allow itself a full minute for processing, and then record a user notification telling the user that the shopping cart has been processed (see Figure 10). To keep the sample short, the code in Figure 10 writes that notification straight to the notification table rather than routing it through the toast queue.

Figure 10 Checking the Queue for Pending Shopping Carts

private void ProcessShoppingCarts()
{
    CloudQueueMessage cqm = cartQueue.GetMessage();            

    while (cqm != null)
    {             
        ShoppingCartMessage cart =  
        QueueMessageBase.FromMessage<ShoppingCartMessage>(cqm);

        toastRepository.AddNotification(new UserTextNotification()
        {
            MessageText = String.Format
            ("Your shopping cart containing {0} items has been processed.",   
            cart.CartItems.Length),
            MessageDate = DateTime.Now,             
            TargetUserName = cart.UserName
        });
        cartQueue.DeleteMessage(cqm);
        cqm = cartQueue.GetMessage();
    }
}

With the queue message being pulled off and put into the user notification table, the jQuery Gritter code sitting in the master page will then detect the new message on the next 15-second poll cycle and display the shopping cart toast notification to the user.

Summary and Next Steps

The goal of this article is to get developers to shed the safety blanket of their physical datacenters and realize that you can do more with Azure than create simple “Hello World” Web sites. With the power of Azure Queues and Azure table storage, and using this power for asynchronous messaging between the application and its worker role(s), you truly can fuel your application’s engine with Azure.

To keep the article clear and easy to read, we left quite a bit of the code as is, without refactoring. As an exercise to flex your new Azure muscles, try to refactor some of the code in this article to make the use of queues more generic, and even create a stand-alone assembly that contains all the code necessary to do asynchronous messaging and notifications for any ASP.NET MVC Web site.

The main thing is to roll up your sleeves, create some sites and see what you can do. The code for this article can be found on the CodePlex site for Hollywood Hackers: hollywoodhackers.codeplex.com.


You can find the authors blogging and ranting nonstop about new technology at http://www.caffeinedi.com. Kevin Hoffman and Nate Dudek are the co-founders of Exclaim Computing, a company specializing in developing solutions for the cloud.