Use Azure Schema Registry in Event Hubs from Apache Kafka and other apps

In many event streaming and messaging scenarios, the event or message payload contains structured data. Schema-driven formats such as Apache Avro are often used to serialize or deserialize such structured data.

An event producer uses a schema to serialize the event payload and publishes it to an event broker such as Event Hubs. Event consumers read the event payload from the broker and deserialize it using the same schema. As a result, both producers and consumers can validate the integrity of the data against the same schema.

Image showing producers and consumers serializing and deserializing event payload using schemas from the Schema Registry.
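The producer and consumer flow described above can be sketched with the Python standard library alone. This is a simplified illustration, not Apache Avro and not the Azure SDK: the list-based schema format and the `serialize` and `deserialize` helpers are hypothetical names introduced only for this example.

```python
import struct

# Hypothetical schema: an ordered list of (field name, struct format code).
# Real Avro schemas are JSON documents; this is only a simplified stand-in.
ORDER_SCHEMA = [("order_id", "i"), ("quantity", "i"), ("price", "d")]


def _layout(schema):
    # "<" selects little-endian with no padding, followed by one code per field.
    return "<" + "".join(code for _, code in schema)


def serialize(schema, record):
    """Producer side: pack values in schema order; no field names are written."""
    values = [record[name] for name, _ in schema]
    return struct.pack(_layout(schema), *values)


def deserialize(schema, payload):
    """Consumer side: unpack the bytes and reattach field names from the schema."""
    values = struct.unpack(_layout(schema), payload)
    return {name: value for (name, _), value in zip(schema, values)}


event = {"order_id": 1001, "quantity": 3, "price": 19.5}
payload = serialize(ORDER_SCHEMA, event)        # bytes the producer would publish
received = deserialize(ORDER_SCHEMA, payload)   # record the consumer reconstructs
```

Because both sides use the same `ORDER_SCHEMA`, the consumer recovers the original record even though the payload itself contains no field names or type tags.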

What is Azure Schema Registry?

Azure Schema Registry is a feature of Event Hubs that provides a central repository for schemas for event-driven and messaging-centric applications. It gives your producer and consumer applications the flexibility to exchange data without having to manage and share the schema themselves. It also provides a simple governance framework for reusable schemas and defines relationships between schemas through a grouping construct (schema groups).

Image showing a producer and a consumer serializing and deserializing event payload using a schema from the Schema Registry.
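To make the idea of a central repository with schema groups more concrete, here is a minimal in-memory sketch. It is not the Azure Schema Registry API: the `InMemorySchemaRegistry` class, its method names, and the hash-derived schema ID are all assumptions made for illustration, and the real service assigns and manages IDs itself.

```python
import hashlib
import json


class InMemorySchemaRegistry:
    """Hypothetical stand-in for a schema registry; not the Azure SDK."""

    def __init__(self):
        self._by_id = {}    # schema ID -> schema definition (string)
        self._groups = {}   # group name -> {schema name: schema ID}

    def register(self, group, name, definition):
        # Assumption: derive a stable ID from the content so that
        # registering the same schema twice returns the same ID.
        key = f"{group}/{name}/{definition}".encode("utf-8")
        schema_id = hashlib.sha256(key).hexdigest()[:16]
        self._by_id[schema_id] = definition
        self._groups.setdefault(group, {})[name] = schema_id
        return schema_id

    def get(self, schema_id):
        return self._by_id[schema_id]

    def list_group(self, group):
        return sorted(self._groups.get(group, {}))


registry = InMemorySchemaRegistry()
definition = json.dumps({
    "type": "record",
    "name": "Order",
    "fields": [{"name": "order_id", "type": "int"}],
})

# Producer: register the schema once, then send only its ID with each message.
schema_id = registry.register("orders-group", "Order", definition)
message = {"schema_id": schema_id, "body": b"\x01"}  # body is a placeholder

# Consumer: resolve the schema from the ID carried by the message.
fetched = json.loads(registry.get(message["schema_id"]))
```

In this sketch the producer and consumer never exchange the schema directly. They share only the identifier, and the registry acts as the single source for the definition.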

With schema-driven serialization frameworks like Apache Avro, moving serialization metadata into shared schemas can also help reduce the per-message overhead. That's because each message doesn't need to carry the metadata (type information and field names), as is the case with tagged formats such as JSON.
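A rough way to see the overhead difference is to encode the same record as JSON and as a fixed-layout binary payload. The example below is a sketch using Python's `struct` module as a stand-in for a schema-driven encoder. Actual Avro encoding differs (for example, it uses variable-length integers), so the exact sizes are illustrative only, and the record's field names are made up for this example.

```python
import json
import struct

event = {"device_id": 42, "temperature": 21.5, "humidity": 0.63}

# Tagged format: every message repeats the field names alongside the values.
json_bytes = json.dumps(event).encode("utf-8")

# Schema-driven stand-in: field order and types live in the shared schema,
# so the message holds only the values (one int32 and two float64).
binary_bytes = struct.pack(
    "<idd", event["device_id"], event["temperature"], event["humidity"]
)

print(len(json_bytes), len(binary_bytes))
```

The binary payload stays at 20 bytes however long the field names are, whereas the JSON payload grows with every name that is repeated in each message.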

Note

The Schema Registry feature isn't available in the Basic tier.

Storing schemas alongside the events and inside the eventing infrastructure ensures that the metadata required for serialization or deserialization is always within reach and that schemas can't be misplaced.

Next steps