February 2014

Volume 29 Number 2

Windows Azure Insider: The Windows Azure Service Bus and the Internet of Things

Bruno Terkaly and Ricardo Villalobos | February 2014

Machine-to-machine (M2M) computing is fast becoming a technology that all developers and architects need to embrace. Numerous studies suggest a coming world of tens of billions of devices (a half-dozen for every human on earth) by 2020 (bit.ly/M2qBII). One reason is that it has never been easier for garage tinkerers and hobbyists to prototype a consumer or commercial device for later manufacture and sale. Getting started with both the hardware and software is incredibly inexpensive. For less than $100, you can order an Arduino or a Raspberry Pi.

But you do get what you pay for, and these devices, especially the Arduino, represent the low end of the device-capability spectrum. With 32KB of Flash storage and 2KB of RAM, they lack the power to run anything but the simplest aspects of a Web stack. The Arduino is an open source microcontroller board fitted with a small chip, memory and storage. The software consists of a standard programming-language compiler and a boot loader that executes on the microcontroller. You can add expansion boards that plug into the device, hooking up motor controls, GPS, Ethernet, an LCD display, sensors, actuators and more. Pay a little more money and you can get the more powerful Raspberry Pi, which comes with a version of the GNU/Linux OS and up to 512MB of RAM.

These devices rarely have value unless they’re connected to something else, perhaps a cloud back end that receives data and sends commands. But connecting to, communicating with and managing these devices from an application running in the cloud presents some special challenges. The sheer number of devices, as well as their limited battery life and bandwidth, forces a cloud developer to carefully consider all options.

In this article, we’ll take a look at how developers are trying to overcome the key challenges of addressability, bandwidth and security. We will debunk the myth that IPv6 and virtual private networks (VPNs) are simple, efficient, and secure, and will propose that the Windows Azure Service Bus is the perfect product to elegantly overcome these challenges. To drive home some of the concepts, we’ll present four patterns you can use on your client device when communicating with the cloud back end. Finally, we’ll present a brief introduction to Service Bus support for the Advanced Message Queuing Protocol (AMQP) 1.0—an interoperable and efficient protocol for device-to-cloud communication.

For many years, secure connectivity meant using TCP/IP with IPv4, combined with VPNs. This worked reasonably well, but is now showing signs of age. For starters, it’s difficult to get a unique IP address out on the public Internet to use with a device—we’ve pretty much run out of IP addresses. Diehard fans of this approach promise that IPv6 will come to the rescue. The conventional wisdom is that if you give the device a unique IP address, all your difficult problems are solved. Unfortunately, this solves only a small part of the overall problem. Giving each device its own unique IP address is definitely not the silver bullet many had hoped.

Just to be clear, IPv6 and VPNs are fraught with problems in a crowded, connected-device world. Bandwidth, in particular, is a challenge. Chatty connectivity between device and network can lead to excessive traffic. Moreover, using typical HTTP request/response approaches for all messaging drains battery life on many devices. Perhaps most important is that security can’t be guaranteed. VPNs are definitely insecure in some scenarios. Before presenting a solution, let’s dive a little deeper and explore these problems.

Devices that create excessive network traffic communicating with the cloud back end are problematic. Bandwidth costs money, potentially a lot of money if there are many chatty devices. Furthermore, more data means more CPU use, which means more energy, a precious resource on a mobile device. Most battery-powered devices equipped with a Wi-Fi transmitter and a SIM card need to enter a low-power “sleep” mode during periods when they aren’t transmitting or receiving data. The IEEE 802.11 standard defines this power-save polling feature. The data gets buffered when the device is sleeping. Once it awakens, the buffered data is delivered. Chatty networks fill buffers and prematurely awaken sleeping devices.

HTTP request/reply approaches can be ridiculously wasteful, given the size of the payload relative to the overall HTTP request-response infrastructure. Suppose a device simply needs to report a number to the cloud, such as temperature, pressure or GPS coordinate. That binary data is probably only a few bytes in size, but the HTTP POST is generally at least 500 to 1,000 bytes, with the request header alone ranging from 200 to 2,000 bytes. Sure, some developers use tricks, like stuffing everything into the HTTP header to avoid the overhead of the body part of the HTTP request. But that isn’t sufficient, and the size of the HTTP request only gets bigger when you have to transmit security credentials.

Here’s some simple math to convince you. Imagine your device has to send temperature data every 5 seconds and the payload for the temperature data is a generous 20 bytes. In a 24-hour period, that is 17,280 transmissions, so the temperature data by itself amounts to about 350,000 bytes sent from the device to the cloud. If you add in the HTTP request/response envelope, each transmission grows by 800 bytes to 820 bytes, a factor of 41, sending more than 14MB to the cloud instead of just the 350KB of temperature data. This can get prohibitively expensive if you’re supporting thousands of devices.
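The arithmetic above can be checked with a few lines of Python. This is only a sketch of the calculation in the text; the function name daily_bytes and the constant names are ours, and the 800-byte envelope is the article's working figure rather than a measured value.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SEND_INTERVAL_S = 5          # one reading every 5 seconds (from the text)
PAYLOAD_BYTES = 20           # temperature payload (from the text)
HTTP_OVERHEAD_BYTES = 800    # request/response envelope per transmission (from the text)


def daily_bytes(payload_bytes, overhead_bytes=0, interval_s=SEND_INTERVAL_S):
    """Bytes one device sends in 24 hours at a fixed reporting interval."""
    transmissions = SECONDS_PER_DAY // interval_s
    return transmissions * (payload_bytes + overhead_bytes)


payload_only = daily_bytes(PAYLOAD_BYTES)
with_http = daily_bytes(PAYLOAD_BYTES, HTTP_OVERHEAD_BYTES)

print(payload_only)               # 345600   (about 350KB)
print(with_http)                  # 14169600 (about 14MB)
print(with_http // payload_only)  # 41
```

Multiplying the second figure by the size of the fleet shows how quickly the envelope dominates: at 1,000 devices, the same reporting schedule comes to roughly 14GB per day.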

Perhaps the biggest misconception is that VPNs are inherently safe. The reality is that VPN networks can be risky, especially when devices connected to a VPN are outside the manufacturer’s or operator’s immediate physical control. Once a single device is breached, all devices connected to the same VPN are vulnerable. Once an untrusted user gets access to a connected device, he or she can use the device to explore and attack your internal resources. Despite these shortcomings, VPNs are often the only option offered by many carriers.

The Windows Azure Service Bus Approach

Windows Azure Service Bus offers some great solutions to these challenges. Leveraging the Service Bus is more secure, because the device is only an endpoint on the Internet where it can place messages into a queue. The device can’t reach other protected network resources inside cloud services. In addition, using the Service Bus for device connectivity costs less in terms of power, because the device can sleep more often, waking up periodically to pull any waiting messages from the queue.

The Service Bus provides even more value because it can:

  • Decouple device communication and interaction from your cloud service
  • Enable load leveling and load balancing among several instances of your back-end service
  • Identify duplicate messages
  • Gather messages into logical groups (called Sessions)
  • Implement transactional behavior and atomicity
  • Support ordered delivery of messages and provide a time-to-live for each message
  • Extend to publish-subscribe scenarios easily using Topics and Subscriptions

To get a more concrete idea of how a device connects to and communicates with the cloud back end, take a look at Figure 1, which depicts how a special-purpose device might fit into a bigger architecture. We’ll use the canonical example of OpenSprinkler, an open source, Internet-based sprinkler/irrigation valve controller that’s capable of checking the weather before deciding to turn on the water. It’s built using Arduino parts. Note in Figure 1 that the Arduino controls the sprinkler system using a home network as the Internet connection and communicates with a cloud back end.

Figure 1 Arduino Sprinkler System Reference Architecture

Solving Connectivity Problems

Windows Azure Service Bus does a great job of solving the addressability and network connectivity challenges. The Arduino device would probably sit behind some NAT layer, making it difficult to reach from a cloud service. Fortunately, the Service Bus dramatically simplifies connectivity by acting as a relay service and serving as a proxy for a cloud back end. Moreover, it can provide Queues, Topics and Subscriptions, which enables it to act as an event hub for messages sent between the cloud and the device. The decoupled nature of a queue acting as a relay lets the device asynchronously send and receive messages to and from the cloud, even if only occasionally connected. For security, the device can be authenticated with SharedAccessSignature, SharedSecret, SAML or SimpleWebToken.

Notice in Figure 1 that one or more worker roles may be reading from the Service Bus message queue. Worker roles can make decisions and issue commands back to the device. Other worker roles might be getting weather information from other systems, such as the National Weather Service. Worker roles may also be saving all the unprocessed incoming events into a NoSQL database, such as MongoDB.

Figure 1 also shows mobile users interacting with Web roles to schedule watering. Mobile users can receive push notifications from Windows Azure Mobile Services (WAMS), which supports all the major notification networks, such as Windows Notification Services (WNS), Microsoft Push Notification Service (MPNS), Apple Push Notification Service (APNS) and Google Cloud Messaging for Android (GCM). WAMS makes it easy to support Windows, iOS and Android.

You can even envisage a machine-learning part of the architecture. Windows Azure can support Linux VMs and it’s quite simple to configure PyMongo (a Python driver for MongoDB) to read the event stream produced by various devices and use machine-learning techniques in PyML to find patterns or make predictions about the event stream data. Based on certain predictions or patterns, the cloud service can choose to send commands to the device, such as turning on or shutting off the water.

A messaging system that’s the primary endpoint for sending and receiving data is extensible, because devices can continue to send a single message stream while new Subscriptions can be added to a Service Bus Topic for each new system that will consume that message stream. These systems can be for real-time analytics and machine-learning as well as other scenarios described earlier.

Communicating with the Cloud

There are four patterns that can be used on the client to communicate with the cloud service. These are outlined in Figure 2.

Figure 2 Four Patterns for Device-Cloud Service Communication

Pattern: Telemetry
  • Summary: A client device sends data (one way) to a cloud service.
  • Example: A device publishes messages about the temperature to Topics. The cloud service subscribes to some or all of these temperature messages.

Pattern: Inquiry
  • Summary: A client device sends a query to the cloud service and receives a response.
  • Example: A device inquires about upcoming weather conditions by posting a weather inquiry to a topic. The cloud service subscribes to inquiries and posts a message response to a topic of its own to which the device subscribes.

Pattern: Command
  • Summary: A cloud service issues a command to a client device and the client device returns a success or failure response.
  • Example: The cloud service publishes a temperature message/command to a topic to which a device subscribes. The device then turns water on or off and sends a reply back to the cloud service by posting a response to a topic.

Pattern: Notification
  • Summary: A cloud service issues a one-way out-of-band notification to a client device that’s important for the device’s operation.
  • Example: The cloud service sends a time-reset message to a device by publishing the message to a topic to which that device subscribes.

All four patterns leverage Service Bus Topics and Subscriptions. Depending on the direction of the communication (device to cloud service or cloud service to device), the device can either subscribe to topics or publish to topics. Topics are simply a mechanism to send messages, while subscriptions are used to consume messages. You can create filter rules for subscriptions to allow more fine-grained control over which messages are retrieved. Worker roles in the cloud service can be used to publish messages to topics or to consume messages from subscriptions.
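To make the relationship between topics, subscriptions and filter rules concrete, here is a minimal in-memory model in Python. It is a toy sketch, not the Service Bus SDK: the Topic and Subscription classes, the rule callables and the device names are all assumptions made for illustration, and real Service Bus filter rules are declared on the subscription rather than passed as Python functions.

```python
from collections import deque


class Subscription:
    """Holds the messages that matched this subscription's filter rule."""

    def __init__(self, rule=None):
        # With no rule, the subscription receives every message on the topic.
        self.rule = rule if rule is not None else (lambda message: True)
        self.pending = deque()

    def receive(self):
        """Return the oldest waiting message, or None if nothing is waiting."""
        return self.pending.popleft() if self.pending else None


class Topic:
    """Publishers send to the topic; each subscription gets its own copy."""

    def __init__(self):
        self.subscriptions = {}

    def subscribe(self, name, rule=None):
        subscription = Subscription(rule)
        self.subscriptions[name] = subscription
        return subscription

    def publish(self, message):
        for subscription in self.subscriptions.values():
            if subscription.rule(message):
                subscription.pending.append(dict(message))


telemetry = Topic()
all_readings = telemetry.subscribe("archive")
hot_readings = telemetry.subscribe("alerts", rule=lambda m: m["temperature_c"] > 35)

telemetry.publish({"device": "sprinkler-1", "temperature_c": 22.5})
telemetry.publish({"device": "sprinkler-2", "temperature_c": 38.0})

alert = hot_readings.receive()
print(len(all_readings.pending))  # 2
print(alert["device"])            # sprinkler-2
```

The device that publishes the readings never learns how many subscriptions exist, which is why a new consumer can be added later without touching device code.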

Due to space constraints, we can’t illustrate all of the patterns here in this article, so we’ll delve into just one. Figure 3 shows a reference implementation of the Command pattern. It demonstrates that devices from Buildings 1 and 2 can subscribe to messages (Commands) and post responses back to topics. Note that worker roles in the cloud service can publish messages to a Temperature and Shade Command Topic and that specific devices can separately subscribe to Temperature or Shade Control messages. Service Bus Topics and Subscriptions can be used in a wide variety of combinations to partition the message flow appropriately.

Figure 3 Architecture for the Command Pattern
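The message flow of the Command pattern can be sketched in a few lines of Python. Everything here is hypothetical: the two deques stand in for a device's subscription on a command topic and the cloud's subscription on a response topic, and the field names (command_id, correlation_id, action) are our own choices, not names taken from the reference implementation or the Service Bus API.

```python
import uuid
from collections import deque

# Stand-ins for Service Bus entities (assumptions, not SDK objects).
command_subscription = deque()   # what the device reads from the command topic
response_subscription = deque()  # what the cloud reads from the response topic


class SprinklerDevice:
    def __init__(self, device_id):
        self.device_id = device_id
        self.water_on = False

    def handle(self, command):
        """Apply one command and build the success/failure reply."""
        actions = {"water_on": True, "water_off": False}
        success = command["action"] in actions
        if success:
            self.water_on = actions[command["action"]]
        return {
            "correlation_id": command["command_id"],
            "device_id": self.device_id,
            "success": success,
        }


def publish_command(action):
    """Cloud side: publish a command and return its id for later matching."""
    command_id = str(uuid.uuid4())
    command_subscription.append({"command_id": command_id, "action": action})
    return command_id


def device_wake_and_process(device):
    """Device side: drain waiting commands and post one reply per command."""
    while command_subscription:
        response_subscription.append(device.handle(command_subscription.popleft()))


device = SprinklerDevice("building-1-valve")
sent_id = publish_command("water_on")
bad_id = publish_command("self_destruct")  # unknown action, should fail
device_wake_and_process(device)

replies = {r["correlation_id"]: r["success"] for r in response_subscription}
print(replies[sent_id], replies[bad_id], device.water_on)  # True False True
```

Carrying the command's id back in the reply lets the cloud service match each success or failure to the command that caused it, even when the device answers long after the command was published.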

AMQP and Interconnectivity

The Windows Azure Service Bus team recently announced support for the Advanced Message Queuing Protocol (AMQP) 1.0, an open standard with a binary application layer for message-oriented middleware. Its main value is that it’s highly interoperable and that it uses a binary format on the wire to minimize payload size.
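A quick way to see why a binary format matters for small readings is to encode the same value both ways. The Python sketch below compares a JSON text encoding with a fixed binary layout built with the standard struct module. It does not use AMQP's own type encoding; the field names and the "<Hf" layout are assumptions chosen only to illustrate the size difference.

```python
import json
import struct

reading = {"device_id": 17, "temperature_c": 22.5}

# Text encoding: self-describing, but field names travel with every message.
text_payload = json.dumps(reading).encode("utf-8")

# Binary encoding: little-endian unsigned 16-bit id followed by a 32-bit float.
binary_payload = struct.pack("<Hf", reading["device_id"], reading["temperature_c"])

print(len(text_payload))    # size of the JSON form in bytes
print(len(binary_payload))  # 6

# The receiver needs to know the layout in advance to decode it.
device_id, temperature_c = struct.unpack("<Hf", binary_payload)
print(device_id, temperature_c)  # 17 22.5
```

The trade-off is that both sides must agree on the layout, which is the role a standardized wire protocol plays.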

AMQP supports reliable message transfer, queuing, routing, pub/sub and more. Because it’s a wire-level protocol targeting data streaming across the network, any compliant tool can interact with the data regardless of the implementation language. This enables cross-platform, hybrid applications using an open, standard protocol. Client libraries let you mix and match languages, frameworks and OSes, with support for .NET, Java, Python and PHP. You’ll find more information on the Windows Azure Samples page at aka.ms/G3izk8. Designed to be light and interoperable, AMQP is a good fit for many of today’s devices needing connectivity to a cloud back end.

However, AMQP is too much software for today’s Arduino, which lacks the necessary memory, storage and processing power. Running AMQP requires support for Transport Layer Security (TLS), a cryptographic protocol that provides communication security over TCP. TLS uses X.509 certificates (asymmetric cryptography) to validate the identity of communicating parties across the wire. In addition, the Apache Qpid Proton client-based messaging library is often used to integrate with AMQP and simplify communications across routers, bridges and proxies. All of this raises the question: How do you support low-end devices connecting to cloud back ends while enjoying the benefits of the Service Bus messaging infrastructure?

One option is to pay more money and get a Raspberry Pi. If you don’t want to do that, you’ll need to be more creative. You can start by leveraging Clemens Vasters’ code at bit.ly/1acvLdS, which lets an Arduino receive a command to blink an LED light on the microcontroller. The code implements a device gateway, providing a TCP endpoint to which the Arduino connects. To maintain the connection through NATs and the Windows Azure load balancer, the cloud service needs to ping the Arduino every 235 seconds (just less than 4 minutes). See Vasters’ C# project, LedBlinkerServer.
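The keep-alive idea can be sketched independently of the gateway code. The Python fragment below is not taken from LedBlinkerServer; it only illustrates a timer that calls a caller-supplied function every 235 seconds. The names start_keepalive and send_ping are hypothetical, and in a real gateway send_ping would write a small ping frame to the device's open TCP connection.

```python
import threading

KEEPALIVE_INTERVAL_S = 235  # from the text: just less than 4 minutes


def start_keepalive(send_ping, interval_s=KEEPALIVE_INTERVAL_S):
    """Call send_ping() every interval_s seconds on a background thread.

    Returns (stop_event, thread); set the event to end the loop.
    """
    stop_event = threading.Event()

    def loop():
        # Event.wait returns False when the timeout elapses and True once
        # the event is set, so the loop pings on each timeout and then exits.
        while not stop_event.wait(interval_s):
            send_ping()

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return stop_event, thread
```

A caller would start the loop once per connected device and set the returned event when the device disconnects. Waiting on the event rather than sleeping means shutdown does not have to wait out the remainder of a 235-second interval.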

In our next column, we’ll take a deeper dive to explain how the code works and how you can get the Arduino to send and receive messages to and from the Service Bus.

Wrapping Up

In this month’s column we presented four patterns that can be used to build a reliable message exchange between a device and cloud services. We introduced AMQP, the open standard message-queuing protocol that helps to increase interoperability and minimize bandwidth and is completely supported by the Windows Azure Service Bus. Finally, we began discussing how to support low-end devices connecting to cloud back ends while using the Service Bus messaging infrastructure, which we’ll continue in our next article.

We’d like to thank Clemens Vasters and Abhishek Lal for helping us understand the brave new world of connected devices. Clearly, the world of special-purpose devices connected to cloud services is growing rapidly. Traditional approaches to communicating with a cloud service need to be reevaluated. Security, bandwidth, network reliability, and interoperability are just some of the challenges that architects and developers face with special-purpose devices in the M2M world. Using the Windows Azure Service Bus makes those challenges far less daunting.

Bruno Terkaly is a developer evangelist for Microsoft. His depth of knowledge comes from years of experience in the field, writing code using a multitude of platforms, languages, frameworks, SDKs, libraries and APIs. He spends time writing code, blogging and giving live presentations on building cloud-based applications, specifically using the Windows Azure platform. You can read his blog at blogs.msdn.com/b/brunoterkaly.

Ricardo Villalobos is a seasoned software architect with more than 15 years of experience designing and creating applications for companies in multiple industries. Holding different technical certifications, as well as a master’s degree in business administration from the University of Dallas, he works as a cloud architect in the DPE Globally Engaged Partners team for Microsoft, helping companies worldwide to implement solutions in Windows Azure. You can read his blog at blog.ricardovillalobos.com.

Terkaly and Villalobos jointly present at large industry conferences. They encourage readers of Windows Azure Insider to contact them for availability. Terkaly can be reached at bterkaly@microsoft.com and Villalobos can be reached at Ricardo.Villalobos@microsoft.com.

Thanks to the following Microsoft technical experts for reviewing this article: Abhishek Lal and Clemens Vasters