Common workloads on Azure Service Fabric

I recently finished reading Programming Microsoft Azure Service Fabric. I think it's the best resource for anything on Service Fabric. I particularly liked the discussion on common workloads that can benefit from running on SF. This post is a synopsis of that discussion. If you need to know more, I recommend buying the book.

Now back to workloads on Service Fabric. Below are some of the main workloads that you can run on Azure Service Fabric.

Web Applications (e-Commerce, Enterprise Portal, Mass-sourcing sites, etc.)

These workloads benefit from the following Service Fabric capabilities.

Stateful Services: These services allow state to be maintained alongside compute resources. This reduces the need for a separate caching component to achieve low-latency, hyper-scale operations.

Heterogeneous clustering: SF lets you create different node types, each with its own durability and reliability characteristics. It's easy to set a different scaling model for each node type. In a typical implementation, web front-end nodes can scale differently from middle-tier nodes.

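Rolling/Partial upgrades: This is a key differentiator in SF. It is great in scenarios where a faulty codebase somehow makes it into production and causes cluster instability: because the upgrade proceeds in stages with health checks, the problem can be caught and rolled back before it reaches every node. There is a very good demo on how this works.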
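To make the idea concrete, here is a minimal sketch in plain Python. It does not use the Service Fabric SDK; the domain names, the `rolling_upgrade` function and the `is_healthy` callback are all hypothetical, and the real platform applies much richer health policies. It only illustrates the general pattern of upgrading one domain at a time and rolling back when a health check fails.

```python
from typing import Callable, Dict


def rolling_upgrade(
    domains: Dict[str, str],
    new_version: str,
    is_healthy: Callable[[str, str], bool],
) -> bool:
    """Upgrade one domain at a time; restore every domain if a check fails.

    `domains` maps a hypothetical upgrade-domain name to the version it runs.
    Returns True if every domain ends up on `new_version`.
    """
    previous = dict(domains)  # snapshot used for rollback
    for name in sorted(domains):
        domains[name] = new_version
        if not is_healthy(name, new_version):
            domains.clear()
            domains.update(previous)  # roll back the domains already upgraded
            return False
    return True


cluster = {"UD0": "1.0", "UD1": "1.0", "UD2": "1.0"}
# Pretend version 2.0 misbehaves as soon as it reaches the second domain.
ok = rolling_upgrade(cluster, "2.0", lambda name, version: name != "UD1")
print(ok, cluster)
```

In this toy run the faulty version never reaches the third domain, and the first two are put back on the old version.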

Actor programming model: SF provides a very prescriptive actor programming model for designing microservices. This is a good option for teams that are just starting to learn how to write and manage microservices.

Real-time data streaming applications (Actor Model based Big Data Solutions)

SF's programming model can be used along with other real-time data streaming/processing services such as Azure Stream Analytics or Azure Functions.

While Azure Stream Analytics predominantly uses a SQL-like query language to gather, process and generate output, SF's stateless and stateful services can leverage the full API of the .NET Framework, such as LINQ, HttpClient, the file system, etc. You can also run any EXE-based application, such as a Node.js, C or C++ application, on top of SF and have it participate in the real-time data streaming pipeline.

A typical real-time data processing workflow that uses SF will look something like below.



IoT Solutions (Device orchestration, Command and Control Patterns)

SF's Actor programming model is a natural fit for IoT solutions. A typical IoT solution involves a large number of devices and sensors, and each device or sensor can be represented by an Actor. Stateful Actors in particular benefit from quick and reliable state updates, moving-average tracking to smooth out momentary drift in device readings, and easy recovery after crashes.

A typical implementation of such an Actor may look something like below.
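As an illustration only, here is a minimal sketch in plain Python rather than the Reliable Actors API. The `SensorActor` class, its `report` method and the window size are my own assumptions, and the state lives in ordinary memory instead of SF's replicated state. It shows one actor per device keeping a short window of readings and returning a moving average.

```python
from collections import deque


class SensorActor:
    """Hypothetical per-device actor that smooths readings with a moving average."""

    def __init__(self, device_id: str, window: int = 3) -> None:
        self.device_id = device_id
        # A bounded deque drops the oldest reading once the window is full.
        self._readings = deque(maxlen=window)

    def report(self, value: float) -> float:
        """Record a new reading and return the current moving average."""
        self._readings.append(value)
        return sum(self._readings) / len(self._readings)


# One actor instance per device, looked up by device id.
actors = {device_id: SensorActor(device_id) for device_id in ("sensor-1", "sensor-2")}

averages = [actors["sensor-1"].report(v) for v in (10, 20, 30, 40)]
print(averages)  # [10.0, 15.0, 20.0, 30.0]
```

A single odd reading moves the average only partly, which is the drift-smoothing effect described above. In a real stateful Actor the window would be persisted by the platform so that it survives a crash.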



Overall, SF can fit into any IoT solution as depicted below.


Just like in the real-time data streaming solution, SF can be used for the transformation and analysis phase. It can connect to Event Hub or IoT Hub, read the sensor/device data, and push the processed data into some kind of permanent storage, where it is made available via a reporting solution.

Multi-tenant applications

Multi-tenant applications are characterized by sharing of resources. In this model, an application can be shared with multiple customers (or tenants). Multi-tenancy brings its own challenges such as data isolation, service boundaries, fault tolerance and many others.

SF provides an underlying platform on top of which such applications can be built. A typical implementation may look something like below.


A single SF cluster can host multiple endpoints, each of which serves a single tenant or customer. A client program written using the SF SDK can connect to the desired tenant endpoint and interact with the application's functionality. Each tenant benefits from the health monitoring, automatic failover and billing capabilities provided by SF. This model makes it easier to on-board new tenants without affecting tenants that are already running.
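The endpoint-per-tenant idea can be sketched in a few lines of plain Python. Nothing here comes from the SF SDK: the `TenantRegistry` class, its methods and the `https://cluster.example` address are hypothetical names I made up for illustration. The point is only that on-boarding a tenant adds a new entry and leaves existing ones untouched.

```python
class TenantRegistry:
    """Hypothetical in-memory map from tenant name to that tenant's endpoint."""

    def __init__(self, base_url: str) -> None:
        self._base_url = base_url.rstrip("/")
        self._endpoints = {}

    def onboard(self, tenant: str) -> str:
        """Register a new tenant and return its endpoint; existing tenants are untouched."""
        if tenant in self._endpoints:
            raise ValueError(f"tenant already onboarded: {tenant}")
        endpoint = f"{self._base_url}/{tenant}"
        self._endpoints[tenant] = endpoint
        return endpoint

    def resolve(self, tenant: str) -> str:
        """Return the endpoint a client should call for this tenant."""
        return self._endpoints[tenant]


registry = TenantRegistry("https://cluster.example/")  # placeholder address
registry.onboard("contoso")
registry.onboard("fabrikam")  # adding a tenant does not change contoso's endpoint
print(registry.resolve("contoso"))
```

On a real cluster the lookup would go through the platform's own service resolution rather than a dictionary.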

A big advantage of this approach is that different tenants can be served by different service models. A common example of where this becomes essential is a customer wanting to try an application before buying it. During the trial period, reliability and availability requirements aren't very stringent. This means that you can place applications under trial on a different "node type", another SF feature, which allows running applications on different sets of nodes depending upon their reliability and durability needs. Such an arrangement might look something like below.


As can be seen, an application running in production can be placed on node type Gold, which has the highest reliability and durability specifications. On the other hand, applications in their trial period can go on node type Bronze, which has lower reliability and durability specifications and lower costs.
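A rough sketch of that placement decision, in plain Python: the stage names, the `NODE_TYPE_FOR_STAGE` table and the `placement_constraint` helper are assumptions of mine. The returned string only imitates the shape of a Service Fabric placement constraint, so check the official documentation for the exact syntax before using anything like it.

```python
# Hypothetical mapping from a tenant's stage to the node type it should run on.
NODE_TYPE_FOR_STAGE = {"production": "Gold", "trial": "Bronze"}


def placement_constraint(stage: str) -> str:
    """Return a constraint-like string pinning an application to a node type."""
    if stage not in NODE_TYPE_FOR_STAGE:
        raise ValueError(f"unknown stage: {stage}")
    return f"NodeType == {NODE_TYPE_FOR_STAGE[stage]}"


print(placement_constraint("trial"))
print(placement_constraint("production"))
```

Moving a tenant from trial to production then comes down to changing its stage, which changes the node type it is pinned to.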

A multi-tenant model can be further enhanced by leveraging other Azure services such as Machine Learning and Power BI to fine-tune resource allocation and reporting for tenants. While running trial applications on a different node type provides good flexibility, you may want to apply the same model to applications that are already running in production. A typical use would be to understand application usage, trends, hotspots, etc. Based on such an analysis, you can make enhancements to billing, provision additional cluster capacity, etc., proactively, before things become difficult to manage. Such an arrangement may look something like below.



Multi-player gaming

This is another application category where SF really shines. The compute (and latency) requirements for complex multi-player games are demanding. Stateful services and actors provide a high degree of resiliency and fault tolerance, which are especially important for multiplayer games. SF handles state retention during scale-up and scale-down operations, which means performance is not affected when there is a surge in the number of players or when only a few players are online. A very popular web-based multiplayer game, Age of Ascent, uses SF heavily. See this Build session to understand how SF can be used to implement this kind of application.