Databricks data engineering

03/19/2024

Databricks data engineering features provide a robust environment for collaboration among data scientists, data engineers, and data analysts. Data engineering tasks are also the backbone of Databricks machine learning solutions.

Note

If you are a data analyst who works primarily with SQL queries and BI tools, you might prefer Databricks SQL.

Name	Use this when you want to…
Delta Live Tables	Learn how to build data pipelines for ingestion and transformation with Databricks Delta Live Tables.
Structured Streaming	Learn about streaming, incremental, and real-time workloads powered by Structured Streaming on Databricks.
Apache Spark	Learn how Apache Spark works on the Databricks platform.
Compute	Learn about Databricks clusters and how to create and manage them.
Notebooks	Learn what a Databricks notebook is, and how to use and manage notebooks to process, analyze, and visualize your data.
Workflows	Learn how to orchestrate data processing, machine learning, and data analysis workflows on the Databricks platform.
Libraries	Learn how to make third-party or custom code available in Databricks using libraries. Learn about the different modes for installing libraries on Databricks.
Git folders	Learn how to use Git to version control your notebooks and other files for development in Databricks.
DBFS	Learn about Databricks File System (DBFS), a distributed file system mounted into a Databricks workspace and available on Databricks clusters.
Files	Learn about options for working with files on Databricks.
Migration	Learn how to migrate data applications such as ETL jobs, enterprise data warehouses, ML, data science, and analytics to Databricks.
Optimization & performance	Learn about optimizations and performance recommendations on Databricks.
