De-batch and filter serverless event processing with Event Hubs

Event Hubs
Functions
Cosmos DB

Solution Idea

This solution idea shows a variation of a serverless event-driven architecture that uses Azure Event Hubs and Azure Functions to ingest and process a stream of data. After the events are de-batched, filtered, and transformed, the results are written to a database for storage and future review.

To learn more about the basic concepts, considerations, and approaches for serverless event processing, consult the Serverless event processing reference architecture.

Potential use cases

A popular use case is end-to-end event stream processing: the Event Hubs streaming ingestion service receives a high volume of events per second, and highly scalable, event hub–triggered functions apply the de-batching and transformation logic.

Architecture

Diagram showing the data flow and key processing points in the architecture described in this article

  1. Events arrive at the Input Event Hub.
  2. The De-batching and Filtering Azure Function is triggered to handle the event. This step filters out unwanted events and de-batches the received events before submitting them to the Output Event Hub.
  3. If the De-batching and Filtering Azure Function fails to submit the event to the Output Event Hub, the event is submitted to the Deadletter Event Hub 1.
  4. Events arriving at the Output Event Hub trigger the Transforming Azure Function. This Azure Function transforms the event into a message for Cosmos DB.
  5. The event is stored in a Cosmos DB database.
  6. If the Transforming Azure Function fails to store the event successfully, the event is saved to the Deadletter Event Hub 2.
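
The de-batching, filtering, and dead-letter routing in steps 2 and 3 can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation behind the diagram: it assumes each incoming message body is a JSON array of event objects, and the names debatch_and_filter, is_wanted, send_to_output, and send_to_dead_letter are hypothetical stand-ins for the function body, the filter rule, and the two Event Hubs senders.

```python
import json
from typing import Callable, Iterable


def debatch_and_filter(
    messages: Iterable[str],
    is_wanted: Callable[[dict], bool],
    send_to_output: Callable[[dict], None],
    send_to_dead_letter: Callable[[dict], None],
) -> None:
    """Split batched messages into single events, drop unwanted ones,
    and forward the rest. Events that cannot be forwarded are dead-lettered."""
    for message in messages:
        # Assumption: each message body is a JSON array of event objects.
        for event in json.loads(message):
            if not is_wanted(event):
                continue  # step 2: filter out unwanted events
            try:
                send_to_output(event)  # step 2: submit to the Output Event Hub
            except Exception:
                # step 3: forwarding failed, route to Deadletter Event Hub 1
                send_to_dead_letter(event)
```

Passing the senders in as callables keeps the routing logic independent of any particular SDK, so the same function can be exercised with in-memory lists before it is wired to real Event Hubs clients or output bindings.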

Components

  • Event Hubs ingests the data stream. Event Hubs is designed for high-throughput data streaming scenarios.
  • Azure Functions is a serverless compute option. It uses an event-driven model, where a piece of code (a function) is invoked by a trigger.
  • Azure Cosmos DB is a multi-model database service that is available in a serverless, consumption-based mode. For this scenario, the event-processing function stores JSON records, using the Cosmos DB SQL API.
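
As a rough sketch of how the Transforming Azure Function might shape an event into a JSON record and fall back to the second dead-letter hub, consider the following. Cosmos DB items carry a string id property; everything else here is an assumption made for illustration, including the deviceId and payload field names and the to_record, transform_and_store, store, and send_to_dead_letter names.

```python
import uuid
from typing import Callable


def to_record(event: dict) -> dict:
    """Shape an event as a JSON-serializable record with a string `id`."""
    event_id = event.get("id")
    return {
        # Reuse the event's id when present, otherwise generate one.
        "id": str(event_id) if event_id is not None else str(uuid.uuid4()),
        "deviceId": event["deviceId"],  # hypothetical partition key field
        "payload": event.get("payload", {}),  # hypothetical payload field
    }


def transform_and_store(
    event: dict,
    store: Callable[[dict], None],
    send_to_dead_letter: Callable[[dict], None],
) -> None:
    """Transform one event and store it; on any failure, dead-letter the
    original event (Deadletter Event Hub 2 in the architecture above)."""
    try:
        store(to_record(event))
    except Exception:
        send_to_dead_letter(event)
```

In this sketch a malformed event (for example, one with no deviceId) fails during transformation and is dead-lettered in the same way as a failed write, so the original event is preserved for later inspection.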

Next steps