Ingest data into Delta Lake
Azure Databricks offers a variety of ways to help you ingest data into Delta Lake.
Partner integrations
Partner data integrations let you load data into Azure Databricks from partner product UIs, providing low-code, easy-to-implement, and scalable data ingestion from a variety of sources. For details, see Partner data integrations.
COPY INTO SQL command
The COPY INTO SQL command lets you load data from a file location into a Delta table. It is a retriable and idempotent operation: files in the source location that have already been loaded are skipped. For details, see
- Databricks Runtime 7.0 and above: COPY INTO (Delta Lake on Azure Databricks)
- Databricks Runtime 6.x and below: Copy Into (Delta Lake on Azure Databricks)
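As a minimal sketch, COPY INTO can be run from a Python notebook cell with spark.sql. The table name, storage path, columns, and CSV options below are illustrative placeholders, not values from this article, and the code assumes a Databricks notebook where spark is already defined:

```python
# Create an empty Delta table to load into (name and columns are placeholders).
spark.sql("""
  CREATE TABLE IF NOT EXISTS default.sales_bronze
    (order_id STRING, amount DOUBLE, order_date DATE)
  USING DELTA
""")

# Load any new CSV files from the source path into the table.
# Re-running the same command skips files that were already loaded,
# which is what makes COPY INTO retriable and idempotent.
spark.sql("""
  COPY INTO default.sales_bronze
  FROM 'abfss://landing@mystorageaccount.dfs.core.windows.net/sales/'
  FILEFORMAT = CSV
  FORMAT_OPTIONS ('header' = 'true')
""")
```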
Auto Loader
Auto Loader incrementally and efficiently processes new data files as they arrive in cloud storage without any additional setup. Auto Loader provides a new Structured Streaming source called cloudFiles. Given an input directory path on the cloud file storage, the cloudFiles source automatically processes new files as they arrive, with the option of also processing existing files in that directory. For details, see Load files from Azure Blob storage or Azure Data Lake Storage Gen2 using Auto Loader.
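A minimal sketch of a cloudFiles stream follows. The schema, input path, checkpoint location, and output path are illustrative placeholders, and spark is assumed to be the SparkSession provided by a Databricks notebook:

```python
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

# Illustrative schema for the incoming JSON files (placeholder columns).
schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_time", TimestampType()),
    StructField("payload", StringType()),
])

# Read new files from the input directory with the Auto Loader (cloudFiles) source.
df = (spark.readStream
      .format("cloudFiles")                      # Structured Streaming source provided by Auto Loader
      .option("cloudFiles.format", "json")       # format of the incoming files
      .schema(schema)
      .load("abfss://landing@mycontainer.dfs.core.windows.net/events/"))

# Write the stream to a Delta table location; the checkpoint tracks which
# files have already been processed across restarts.
(df.writeStream
   .format("delta")
   .option("checkpointLocation",
           "abfss://landing@mycontainer.dfs.core.windows.net/_checkpoints/events")
   .start("abfss://landing@mycontainer.dfs.core.windows.net/delta/events_bronze"))
```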