02/08/2017

April 2015

Volume 30 Number 4

OData - Visualize Streaming Data the Easy Way with OData

The Open Data Protocol (OData) is a RESTful protocol designed to allow powerful query and modification operations on data in a backing store, typically a SQL database. Both resource expressions and queries are formed through the URL of an HTTP request, with results returned in the HTTP response. Sophisticated support for queries that can shape, order, filter and page data requests is built in through a query language. Because OData is an OASIS standard, it’s widely implemented and can be consumed across all popular client platforms, such as Web browsers, as well as phones and devices based on iOS, Android and Windows. OData is frequently viewed as a good way to provide a standards-based service that can be consumed across multiple platforms easily. For some starter links, see “Additional References.”

In this article, I’ll demonstrate why the semantics of OData make it the perfect vehicle for exposing near-real-time time-series streaming data, as well as data from SQL backing stores. (I’ll use the term time-series to differentiate from static backing stores. This avoids the use of real-time, which has different meanings in different contexts.)

This article will demonstrate a sample implementation of an industrial automation time-series data streaming capability using an OData service. Figure 1 shows a sample test client, written in C# using Windows Presentation Foundation (WPF), that connects to the service and uses simple LINQ queries to process the time-series data stream. I’ll discuss this in more detail shortly. As I mentioned, many possible clients exist, including browser-based and device apps.

Figure 1 A Sample OData Test Client Written in C# Using Windows Presentation Foundation

The client connects with an OData service to manage multiple items and subscriptions. The membership of items in each subscription, the subscription update interval and the dynamic monitoring of subscriptions can all be managed through the test client. I’ll cover the concepts of items and subscriptions in more detail later. Figure 2 shows another window of the test client, with time-series data being streamed from one of the subscriptions.

Figure 2 Time-Series Data Streaming from a Subscription in the OData Service

This technique can be extended to any time-series data from devices on the Internet of Things (IoT). As the IoT continues to build out, devices from rain sensors to point-of-sale terminals are becoming available directly over the Internet. The further this process goes, the greater the need for a simple, consistent way to access and manage all of the data. The technique I show here is ideally suited to traverse the Internet, allowing data from IoT devices to be consumed by computers and other devices, making the information accessible and meaningful.

My example is for demonstration purposes only, and is scoped in capability for purposes of clarity: all of the items it exposes are of the same integer type; the “device” providing the data is a simulation; and no attempt is made to authenticate a user or provide a user identity and per-user session. All these things would be necessary for a production service, but have been eliminated for illustration purposes.

Design the Entity Model

The first step in defining an OData service is designing the entity model. An OData service is exposed at a specific base URL, such as https://localhost:10001/TimeSeriesData.svc, and each URL segment below this base URL represents a different kind of resource within the entity model. The base URL may change, depending on how the OData service is deployed, but the entity model will dictate a fixed set of resource URL segments under it.

With typical OData services that are backed by a database, resources can be modeled using an automated tool such as ADO.NET Entity Framework. My time-series OData service, on the other hand, exposes resources as in-memory structures, so I generated the entity model by hand. A reasonable set of resources for a time-series data streaming service might be Items, Subscriptions and Samples:

Get All Items: https://localhost:10001/TimeSeriesData.svc/Items should return all the items available in the device for which the service supplies data. Each item should have properties such as id, name, type, state, read-write capabilities, current value and so on.

Get All Subscriptions: https://localhost:10001/TimeSeriesData.svc/Subscriptions should return all the subscriptions that have been created and persisted by previous calls to the service. A subscription is a mechanism to group items into collections. An additional feature of a subscription is that the items grouped into each subscription are sampled together at a specific rate. Each subscription should have properties such as id, subscription interval, its collection of items, and its collection of samples for those items.

Get All Samples: https://localhost:10001/TimeSeriesData.svc/Samples should return all the samples exposed globally by the service. For this article, this query never returns anything, because in real-life scenarios it’s a meaningless query. Samples are always associated with a subscription, so they should not be queried at the root. However, for WCF Data Services to expose samples as entities, the root query must be valid.

In addition to the three basic entities, I added associations between them. The Subscription resource is associated one-to-many with both Samples and Items.

Get All Samples Within a Subscription: https://localhost:10001/TimeSeriesData.svc/Subscriptions(25)/Samples should return all the samples available for subscription 25. This assumes the client has previously created a subscription entity with SubscriptionId = 25. Each sample should have properties such as id; the id of the item that was sampled; and the value, time and quality of the item when it was sampled.

Get All Items Within a Subscription: https://localhost:10001/TimeSeriesData.svc/Subscriptions(25)/Items should return the collection of items currently associated with subscription 25. This assumes that the client previously created a subscription entity with SubscriptionId = 25 and associated items with it.
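The resource URLs listed above all follow one pattern: an entity set name, or a keyed subscription followed by a navigation property, appended to the base URL. The following is a minimal sketch of that pattern, not part of the sample service. The class and method names (TimeSeriesUrls, EntitySet, SubscriptionChildren) are hypothetical, and the base URL is the one assumed throughout this article.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of the article's sample code.
public static class TimeSeriesUrls
{
    // Base URL assumed in this article; a real deployment may differ.
    // The trailing slash matters when combining relative segments.
    public static readonly Uri BaseUri =
        new Uri("https://localhost:10001/TimeSeriesData.svc/");

    // Root entity sets: "Items", "Subscriptions" or "Samples".
    public static Uri EntitySet(string name)
    {
        return new Uri(BaseUri, name);
    }

    // Navigation from one subscription to its "Items" or "Samples".
    public static Uri SubscriptionChildren(int subscriptionId, string navigation)
    {
        string key = subscriptionId.ToString(CultureInfo.InvariantCulture);
        return new Uri(BaseUri, "Subscriptions(" + key + ")/" + navigation);
    }

    public static void Main()
    {
        Console.WriteLine(EntitySet("Items").AbsoluteUri);
        Console.WriteLine(SubscriptionChildren(25, "Samples").AbsoluteUri);
    }
}
```

Only the base URL changes between deployments; the segments produced by the helper are fixed by the entity model.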

Figure 3 shows the entity model of the time-series OData service. Items are physical entities in the device, usually addressable through a string name or numeric id. Items typically contain a value exposed by the device, such as temperature or pressure, along with metadata such as the timestamp of the most recent sample and quality of the measurement. As Figure 3 shows, they’re exposed as OData entities by their numeric ItemId.

Figure 3 The Entity Model Exposed by the Time-Series OData Service

As an entity, the Subscription is a fiction invented to group subsets of device items together, and to collect samples from that subset at a fixed period. Because subscriptions don’t really exist, there must be some mechanism in the service to generate the entities in such a way that they appear to be backed by a store.

Sample entities are even more ephemeral than Subscriptions. The sample entity is invented to capture the value and metadata of a single item at a specific time. When a subscription is operating, it typically accumulates Sample entities as needed to capture the recent samples of items associated with that subscription. The fiction with Sample entities is that a vast table of entities exists to fulfill OData requests. In reality, Sample entities are deleted from storage as fast as they are returned to the client. Because clients are typically not interested in obtaining the same sample more than once, this is not an issue.

Dealing with these in-memory abstractions in a manner that removes statefulness from the service runtime is the crux of a solution that allows an OData service to scale—sequential calls from a client shouldn’t depend on the service remembering details about the processing of subscriptions or samples in previous calls.

This means abstract entities such as Subscriptions and Samples must never be remembered as state in the OData service. Instead, they must be considered part of the backing service or device. In my example OData service, I implemented a simulation layer to remove any dependency on real hardware. The entities are provided by the simulated hardware, and simply exposed by the OData service. In particular, Item and Subscription entities are persisted by the simulation layer to an XML file.

The Sample entity is a special case. Samples are virtual in the sense that no Sample entity exists in the simulated hardware, but Samples are also ephemeral. Typical time-series clients—as opposed to historical clients—are interested only in the most recent Samples. Once those Samples are read, they can be discarded in the service. In my sample code, I provide each Sample entity with a unique Id, and maintain the fiction that the OData client might at any time ask for really old Samples. However, older Samples are forgotten by the simulated hardware layer, so any client that actually did this would be disappointed.
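One way to picture this drain-on-read behavior is a queue that hands out each sample exactly once. The sketch below is an illustration only and is not taken from the sample service: the SampleRecord and SampleBuffer types, and every member name except ItemIdSampled, are assumptions.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sample shape; only ItemIdSampled mirrors a name used in the article.
public class SampleRecord
{
    public long SampleId { get; set; }
    public int ItemIdSampled { get; set; }
    public int Value { get; set; }
    public DateTime Time { get; set; }
}

// Hypothetical per-subscription buffer living in the simulated hardware layer.
public class SampleBuffer
{
    private readonly ConcurrentQueue<SampleRecord> queue =
        new ConcurrentQueue<SampleRecord>();
    private long lastId;

    // Called by the sampling side each time an item is sampled.
    public void Add(int itemId, int value, DateTime time)
    {
        queue.Enqueue(new SampleRecord
        {
            SampleId = Interlocked.Increment(ref lastId), // unique, increasing Id
            ItemIdSampled = itemId,
            Value = value,
            Time = time
        });
    }

    // Called when a client reads the subscription's samples.
    // Everything returned is removed, so a second read sees only newer samples.
    public List<SampleRecord> Drain()
    {
        var drained = new List<SampleRecord>();
        SampleRecord sample;
        while (queue.TryDequeue(out sample))
        {
            drained.Add(sample);
        }
        return drained;
    }
}
```

Because every sample keeps a unique Id, the client still sees what looks like a slice of a large table, even though nothing is retained after the read.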

Implement the Sample OData Service Using WCF Data Services

Now that the entity model is defined, the next step is to implement an OData service. The ASP.NET Web API framework is a powerful modern choice, but I fell back on the older WCF Data Services for its simpler implementation of the basics.

The class TimeSeriesDataService exposes the service by deriving from the generic class DataService<T>:

public class TimeSeriesDataService : DataService<TimeSeriesEntities>
{
  public static void InitializeService(DataServiceConfiguration config)
  {
    // Initialize the service here...
  }
}

The type parameter of the DataService base class specifies the class that defines the entity model for this service, TimeSeriesEntities. To allow for the queries shown in the earlier design section, the TimeSeriesEntities class exposes the Sample, Item and Subscription entities as queryable collections, as shown in Figure 4.

Figure 4 The Entity Model in Code

public partial class TimeSeriesEntities
{
  // The root-level Samples set is intentionally empty; samples are
  // only meaningful when reached through a subscription.
  public IQueryable<Sample> Samples
  {
    get { return (new List<Sample>()).AsQueryable<Sample>(); }
  }
  // Project each registered device item into an Item entity.
  public IQueryable<Item> Items
  {
    get
    {
      return (from item in ItemRegistry.Items
              select new Item(item.Id,
                              item.Name,
                              item.Value,
                              item.Time,
                              (int)item.Quality))
             .AsQueryable<Item>();
    }
  }
  // Project each registered subscription into a Subscription entity.
  public IQueryable<Subscription> Subscriptions
  {
    get
    {
      return (from subscription in SubscriptionRegistry.Subscriptions
              select new Subscription()
              {
                SubscriptionId = subscription.Id
              })
             .AsQueryable<Subscription>();
    }
  }
}

By providing public properties that return IQueryable<T> collections, you enable the WCF Data Services framework to process all the powerful OData queries that can be expressed as HTTP URLs. Your code provides the queryable collections, and the framework performs the actual filtering, sorting, pattern matching and so forth.
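To make that division of labor concrete, the sketch below composes ordinary LINQ operators over an in-memory IQueryable<T>, which is roughly what a request such as /Items?$filter=Value gt 10&$orderby=Name&$top=2 asks the framework to do on your behalf. It's an illustration, not the framework's actual code path: the DemoItem type, its data and the QueryDemo class are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's Item entity.
public class DemoItem
{
    public int ItemId { get; set; }
    public string Name { get; set; }
    public int Value { get; set; }
}

public static class QueryDemo
{
    // The service code only has to supply a queryable collection like this one.
    public static IQueryable<DemoItem> Items()
    {
        return new List<DemoItem>
        {
            new DemoItem { ItemId = 1, Name = "Pressure", Value = 30 },
            new DemoItem { ItemId = 2, Name = "Flow", Value = 5 },
            new DemoItem { ItemId = 3, Name = "Temperature", Value = 72 },
            new DemoItem { ItemId = 4, Name = "Level", Value = 12 }
        }.AsQueryable();
    }

    // Hand-written equivalent of $filter=Value gt 10, $orderby=Name, $top=2.
    public static List<string> FilteredNames()
    {
        return Items()
            .Where(item => item.Value > 10)
            .OrderBy(item => item.Name)
            .Take(2)
            .Select(item => item.Name)
            .ToList();
    }

    public static void Main()
    {
        foreach (string name in FilteredNames())
        {
            Console.WriteLine(name);
        }
    }
}
```

In the real service none of this query code is written by hand; the framework derives the equivalent operators from the URL and applies them to the collection returned by the property.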

Make the OData Service Writable

To allow entities to be modified in the service, you have to add a little more code. This code will be called by the WCF Data Services framework in response to modification commands sent by the HTTP client using POST, PUT or DELETE verbs. A typical time-series data service allows the client to modify the state of the service in various ways: write the values of items; create and delete subscriptions; add and remove items from subscriptions; and more.

When using WCF Data Services, this modification ability is provided by implementing the IUpdatable interface. When a client sends an HTTP POST with a description of an entity to create, the Data Services framework handles the request, passing requests to the 12 IUpdatable interface methods, as needed.

My first service modification code allows the creation and deletion of subscriptions. You’ll need to implement other IUpdatable interface methods to support other modification capabilities, such as writing a value to an item.

I enable creation and deletion of subscriptions by implementing the CreateResource and DeleteResource IUpdatable methods, as shown in Figure 5.

Figure 5 Creating and Deleting Subscriptions

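public partial class TimeSeriesEntities : IUpdatable
{
  // Called by the framework when a client asks for a new entity
  // to be created; returns an empty instance of the requested type.
  object IUpdatable.CreateResource(string containerName,
    string fullTypeName)
  {
    // Throws if the type name cannot be resolved.
    Type t = Type.GetType(fullTypeName, true);
    return Activator.CreateInstance(t);
  }
  // Called by the framework when a client deletes an entity; relies
  // on each resource type exposing a public Delete method.
  void IUpdatable.DeleteResource(object targetResource)
  {
    MethodInfo deleteMethod =
      targetResource.GetType().GetMethod("Delete");
    if (deleteMethod != null && deleteMethod.IsPublic)
    {
      deleteMethod.Invoke(targetResource, null);
    }
  }
}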

The WCF Data Services framework manages the bulk of the POST, PUT, and DELETE client calls, and simply needs my code to deal with any types specific to my internal implementation. You can choose to implement static factory methods, or use reflection to create new instances, as shown in Figure 5. My choice of mechanisms to delete resource objects relies on an implementation detail—my resource objects all expose a method called Delete, which can be called from DeleteResource. You can choose other implementations as desired.
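For comparison, here is a minimal sketch of the static factory alternative mentioned above, paired with an interface in place of the reflected Delete call. None of these names come from the sample code: IDeletable, DemoSubscription and ResourceFactory are hypothetical, and the sketch deliberately leaves out the IUpdatable plumbing so it can run on its own.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical contract replacing the "has a public Delete method" convention.
public interface IDeletable
{
    void Delete();
}

// Hypothetical resource type standing in for the article's Subscription.
public class DemoSubscription : IDeletable
{
    public bool Deleted { get; private set; }

    public void Delete()
    {
        Deleted = true;
    }
}

public static class ResourceFactory
{
    // Only types registered here can be created from a client-supplied name.
    private static readonly Dictionary<string, Func<object>> Creators =
        new Dictionary<string, Func<object>>
        {
            { typeof(DemoSubscription).FullName, () => new DemoSubscription() }
        };

    // Could be called from CreateResource instead of Activator.CreateInstance.
    public static object Create(string fullTypeName)
    {
        Func<object> creator;
        if (!Creators.TryGetValue(fullTypeName, out creator))
        {
            throw new InvalidOperationException(
                "Unknown resource type: " + fullTypeName);
        }
        return creator();
    }

    // Could be called from DeleteResource instead of reflecting on "Delete".
    public static void Delete(object resource)
    {
        var deletable = resource as IDeletable;
        if (deletable == null)
        {
            throw new InvalidOperationException("Resource cannot be deleted.");
        }
        deletable.Delete();
    }
}
```

The trade-off is a little more registration code in exchange for compile-time checking of the Delete contract and an explicit list of the types a client request is allowed to create.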

OData Client Made Easy with LINQ

No matter how a time-series OData service is implemented and exposed, its power lies in what a consuming client can do with it. This is especially true with the Microsoft .NET Framework, because LINQ has a set of extensions that directly interface to OData services, passing as much of a query to the service as possible. With LINQ, client code can manage complex queries, including filters, sorts and others, with relative ease.

Create an OData Client in Visual Studio

With my sample OData service running, I started Visual Studio and pointed the Add Service Reference tool to the base URL of the OData service. The tool examined the OData metadata and created a class derived from System.Data.Services.Client.DataServiceContext, and called it TimeSeriesEntities. This class is customized for the OData service, and includes properties that return typed DataServiceQuery<TElement> objects, customized for each of the IQueryable<TElement> collections in the WCF Data Services code:

public DataServiceQuery<Sample> Samples { get; }
public DataServiceQuery<Item> Items { get; }
public DataServiceQuery<Subscription> Subscriptions { get; }

Each of these DataServiceQuery<TElement> instances implements IQueryable<TElement> (and therefore IEnumerable<TElement>), so each is a target for LINQ queries. Where possible, LINQ queries formed against DataServiceQuery<TElement> objects will translate into OData queries, allowing much of the filtering, sorting and pattern matching to occur at the service.

In the examples that follow, I’ll be using the service base URL https://localhost:10001/TimeSeriesData.svc, contained in the variable dataAddress. For all the query examples that follow, a TimeSeriesEntities object named dataContext is assumed to exist, and to have been initialized with the service base URL:

TimeSeriesDataFeed.TimeSeriesEntities dataContext =
  new TimeSeriesDataFeed.TimeSeriesEntities(dataAddress);

As a starter example, the OData URL https://localhost:10001/TimeSeriesData.svc/Items is generated by the following LINQ query:

var serviceQuery = from item in dataContext.Items
                   select item;

The returned LINQ query can be executed with:

serverItemsList = serviceQuery.ToList();

sending the URL and receiving and parsing the OData response, resulting in an enumeration of all Item entity objects available at the root service level.

Note that this query is similar to browsing all items in the device namespace. In other protocols, the ability to browse like this is an added feature, and much effort is expended to define and implement the capability. With OData, it comes for free!

A more nuanced query might be to enumerate all items currently associated with a specific subscription, for example, the subscription with SubscriptionId = 25. The LINQ query:

var subscriptionQuery =
  (from subscription in dataContext.Subscriptions.Expand("Items")
  where subscription.SubscriptionId == 25
  select subscription).FirstOrDefault();

generates the OData URL:

https://localhost:10001/TimeSeriesData.svc/Subscriptions(25)/?$expand=Items

The call to FirstOrDefault dereferences the returned LINQ query and executes it, sending the URL and receiving and parsing the response. This time, the result is a single Subscription entity object with all Item entity objects associated with the subscription. The subscription items can be enumerated with:

serverItemsList = subscriptionQuery.Items.ToList();

Client code pulls published samples from a subscription using a LINQ query like this:

var subscriptionWithSamplesQuery =
  (from subscription in dataContext.Subscriptions.Expand("Samples")
  where subscription.SubscriptionId == 25
  select subscription).FirstOrDefault();

which generates the OData URL:

https://localhost:10001/TimeSeriesData.svc/Subscriptions(25)/?$expand=Samples

In this case, Sample entities associated with the Subscription entity contain sampled values for many Item entities, possibly one or more samples for each item associated with the subscription. As long as I’m using LINQ, I’ll group samples by item with a local LINQ query operating on the IEnumerable collection returned by the previous LINQ query:

var samplesInSubs = from sample in subscriptionWithSamplesQuery.Samples
                    group sample by sample.ItemIdSampled into sampleSubs
                    select sampleSubs;

This example is the basic-course use case of time-series data acquisition. It demonstrates how naturally LINQ queries that generate OData transactions can segue into in-memory LINQ queries that manipulate local data. At this point, samplesInSubs is an IEnumerable of IGrouping objects: a collection of groups of samples, keyed by ItemIdSampled. It’s natural for the client to process the data in this form.
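The grouping step needs no service at all, so it can be tried in isolation. The following sketch runs the same group-by query over a hand-built list. The DemoSample type, the GroupingDemo class and the sample values are hypothetical; only the ItemIdSampled property name is taken from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the generated Sample entity class.
public class DemoSample
{
    public int ItemIdSampled { get; set; }
    public int Value { get; set; }
}

public static class GroupingDemo
{
    // Groups samples by the item they were taken from, then flattens
    // each group into a list of values keyed by item id.
    public static Dictionary<int, List<int>> ValuesByItem(
        IEnumerable<DemoSample> samples)
    {
        var samplesInSubs = from sample in samples
                            group sample by sample.ItemIdSampled into sampleSubs
                            select sampleSubs;

        return samplesInSubs.ToDictionary(
            g => g.Key,
            g => g.Select(s => s.Value).ToList());
    }

    public static void Main()
    {
        var samples = new List<DemoSample>
        {
            new DemoSample { ItemIdSampled = 1, Value = 10 },
            new DemoSample { ItemIdSampled = 2, Value = 20 },
            new DemoSample { ItemIdSampled = 1, Value = 11 }
        };

        foreach (var pair in ValuesByItem(samples))
        {
            Console.WriteLine("Item {0}: {1}",
                pair.Key, string.Join(", ", pair.Value));
        }
    }
}
```

Each group preserves the order in which its samples appeared in the source, which is convenient when feeding one pen of a strip chart per item.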

These examples of OData queries don’t even begin to touch the expressiveness of OData. Much more filtering, sorting and sampling is possible. See “Additional References” for links to more examples.

A Simple WPF OData Test Client

Putting everything together, I wrote a small WPF application to showcase the concept of time-series process data from an OData service. The OData Test Client shown in Figures 6-10 uses the LINQ queries I explored earlier, plus a few others, to read data from and write data to a sample OData service.

When the main screen first comes up (see Figure 6), it makes several OData queries, allowing it to fill in the list boxes with all subscriptions in the service, as well as names and Ids of all items in the service.

Figure 6 The OData Test Client Main Window

In Figure 7, selecting a subscription in the left list box loads the names and Ids of the items in that subscription into the right list box. The selection also lets you read and set the update interval of that subscription, and it enables several command buttons for that subscription, which I’ll describe as I go.

Figure 7 A Subscription Is Selected

No discussion of time-series streaming data is complete without a strip-chart recorder, so I found the excellent Dynamic Data Display project at CodePlex, and used it to build a multi-pen dynamic display. (Note that the Dynamic Data Display for Silverlight library is a Microsoft Research product and its license states that the library is for non-commercial use only.) Data for all items in a single subscription roll as expected from right to left. You get a subscription monitor dialog like the one in Figure 8 every time you select a subscription and press Monitor.

Figure 8 Monitoring a Single Subscription

In addition to monitoring time-series data from many items, the same OData queries can be used to perform item browsing, as shown in Figure 9. All the items from either a single subscription or the service as a whole are displayed in the list box on the right, but browsing also means choosing which items to include in a subscription. To get a browsing window, select a subscription on the left and press the Browse button. The changes made to that subscription by moving items left (out of the subscription) or right (into the subscription) are updated in the service and persisted each time a service call is made, so no Save button is needed.

Figure 9 Browsing Items to be Included in a Subscription

Finally, many subscriptions can be monitored at once, as shown in Figure 10.

Figure 10 Monitoring Four Subscriptions at Once

Challenges and Limitations of OData for Time-Series Streaming Data

Although my company specializes in high-density, high-throughput time-series streaming services and clients, no attempt has been made (yet) to quantify the performance of OData for this use. As I compare my OData approach against other approaches common in my industry, I see potential for some downsides:

No data acquisition mechanism can be safely used in today’s networking environment without security, including mandatory user authentication and optional encryption. The obvious way to provide these features is to shift from HTTP to HTTPS. The additional burden of public key infrastructure (PKI) administration and loading of both CPU and bandwidth resources may make this approach unworkable for some applications.
The approach detailed in this article requires that the service persist some state, such as the definition of multiple subscriptions. When combined with user authentication, this means some state must be persisted at the service side for each user. For scalability reasons, care must be taken to ensure any required state information will be incorporated into the entity model, and not be implemented as part of the service itself. As long as statelessness in the service itself is carefully preserved, the service should scale linearly with added clients, and multi-tenant cloud hosting and server farms should be possible.
Caching responses for clients that make rapid repeated service requests could be a performance enhancement. This issue comes in two flavors: the Subscriptions(x)/Samples entity set and everything else. Samples from a specific subscription are likely to be the most common request by far, because that’s how streaming data is acquired. This works well with the service design, because subscription samples are by definition always in-memory. If other request types occur with enough frequency to be a problem, additional caching might need to be considered.
OData wire performance may limit its effective item density or throughput. At my company, we haven’t yet made measurements to determine this, and it’s only a possibility. When compared to SOAP-based transports, the JSON payload from OData may compare favorably to SOAP envelope overhead.
OData and other REST-based services provide no mechanism to report by exception, or to push notifications of important events to the client. All data transfers must be requested by the client, so only a pull model is possible. This limitation is comparable to most other protocols used by the industry that are capable of traversing the Internet. For example, OPC-UA is an industry-standard SOAP-based protocol, and requires notifications to be pumped using a publish-request loop driven by the client. A valuable direction for future development of this technique would be the use of server-side push technologies such as SignalR.
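The pull model in the last point amounts to a client-driven loop: ask for new samples, process whatever comes back, wait, repeat. The following is a minimal sketch of such a loop under stated assumptions. The PullLoop class and its parameters are hypothetical, and the fetchSamples delegate stands in for a query such as the subscription-with-samples LINQ query shown earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical client-side polling helper; not part of the test client.
public static class PullLoop
{
    // Calls fetchSamples a fixed number of times, handing every returned
    // value to onSample and pausing for the given interval between calls.
    // Returns the total number of values delivered.
    public static int Poll(
        Func<IList<int>> fetchSamples,
        Action<int> onSample,
        TimeSpan interval,
        int iterations)
    {
        int delivered = 0;
        for (int i = 0; i < iterations; i++)
        {
            foreach (int value in fetchSamples())
            {
                onSample(value);
                delivered++;
            }

            // No need to wait after the final request.
            if (i < iterations - 1)
            {
                Thread.Sleep(interval);
            }
        }
        return delivered;
    }
}
```

A real client would typically run this on a background thread or timer and keep going until the monitor window closes; the fixed iteration count here only keeps the sketch easy to reason about.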

From the examples in this article, it seems clear that OData is a feasible approach with several benefits:

Client code is easier to manage and deploy.
Multiple client platforms—from phones to desktops—can consume the same data.
OData is a viable approach for densities of a few dozen to a few hundred items, and throughputs up to a few hundred samples per second, but is untested for high performance.

In my next article, I’ll follow up with an exploration of how to consume “data in flight” from OData services with Reactive Extensions (Rx).

Additional References

OData Web Site: odata.org

OASIS OData Standard: bit.ly/163s1gZ

OData Query Language: See the canonical description at bit.ly/15PVBXv or Microsoft’s take, including limitations of using LINQ to generate the queries, at bit.ly/1zZYMqq.

OData, CKAN and Microsoft Azure: How OData typically provides access to large data sets (bit.ly/1LrzmYp).

LINQ Queries Mapped to OData Queries: There are other third-party extensions that extend LINQ beyond the WCF Data Services Client Library. The basic WCF LINQ queries, and how they map to OData queries, are detailed at bit.ly/1zf8Gp7.

Dynamic Data Display: I found this really excellent set of controls for dynamic data display at bit.ly/1CI73B2. Using it for strip-chart recorders barely scratches the surface of its capabilities.

Louis Ross is a multi-technology architect for Wonderware by Schneider Electric, where he has been developing device-integration technologies for 17 years. Previously, he freelanced as a designer of embedded controls, specializing in hardware and firmware design.

Thanks to the following technical experts for reviewing this article and providing valuable feedback: Ming Fong (Schneider Electric) and Jason Young (Microsoft).
Ming Fong (Schneider Electric) is Development Manager at Wonderware by Schneider Electric.
Jason Young (Microsoft) is a Senior Program Manager, Microsoft DX, and a co-host of the MS Dev Show (msdevshow.com).