Introduction

Suppose you work in the analytics department of a large airline. You're part of a team that analyzes the causes of flight delays by using real-time data from flights around the world. Your job is to analyze the continuous flow of incoming data and stream it to Azure Data Lake Storage. That data includes details such as the airline, the reason for each delay, and the departure time. You're using Azure Event Hubs to process your data.

Note

You can complete this module's labs for free by using the Databricks 14-day trial, but you can't use an Azure free trial subscription to create an Azure Databricks workspace. To switch a free trial subscription to pay-as-you-go, go to your profile and change your subscription offer to pay-as-you-go. You might also need to remove the spending limit and request a quota increase for vCPUs in your region. When you create your Azure Databricks workspace, select the Trial (Premium - 14-Days Free DBUs) pricing tier to give the workspace access to free Premium Azure Databricks DBUs for 14 days.

Learning objectives

In this module, you will:

  • Use Spark Structured Streaming, Azure Event Hubs, and Databricks Delta to read from and write to streams.
  • Process streaming data by using Azure Databricks.