Create data pipelines by using Databricks Delta

Intermediate
Developer
Data Engineer
Data Scientist
Azure
Databricks

Learn how to use Databricks Delta in Azure to manage the flow of data (a data pipeline) to and from a data lake. This system includes mechanisms to create, append, and upsert data to Apache Spark tables, taking advantage of built-in reliability and optimizations. Learn how the Databricks Delta architecture helps speed up reads, and how it lets multiple writers modify a dataset simultaneously while seeing consistent views of the data. Finally, implement a Lambda Architecture by processing batch and streaming data with Delta.

In this module, you will:

  • Use Databricks Delta to create tables, and to append and upsert data into them.
  • Work with streaming data.
  • Perform optimizations in Delta.
  • Implement a Lambda Architecture by processing batch and streaming data with Delta.
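The create, append, and upsert operations listed above can be sketched in Delta Lake SQL. This is a minimal illustration, not part of the module's exercises; the table names (`customers`, `customer_updates`) and columns are hypothetical, and a Databricks or Spark SQL environment is assumed.

```sql
-- Create a managed table backed by the Delta format (hypothetical schema)
CREATE TABLE customers (id INT, name STRING, updated TIMESTAMP)
USING DELTA;

-- Append new rows to the table
INSERT INTO customers VALUES (1, 'Avery', current_timestamp());

-- Upsert: update rows that match on id, insert rows that don't.
-- customer_updates is an assumed staging table with the same schema.
MERGE INTO customers AS target
USING customer_updates AS source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

`MERGE INTO` is what makes the upsert atomic: Delta's transaction log commits the updates and inserts together, so concurrent readers see either the old or the new version of the table, never a partial mix.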

Prerequisites

You'll need an Azure subscription. If you don't have one, create an account and add a subscription before you begin. The Azure free trial subscription type won't work with Databricks, but you can complete the exercises for free by using the 14-day Databricks free trial offer on a pay-as-you-go subscription. Instructions for using the free trial offer are included in the exercises.