Connect to data sources from Azure Databricks
This article links to the Azure data sources that you can connect to Azure Databricks. Follow the examples in the linked articles to extract data from Azure data sources (for example, Azure Blob Storage and Azure Event Hubs) into an Azure Databricks cluster and run analytical jobs on that data.
- You must have an Azure Databricks workspace and a Spark cluster. Follow the instructions at Get started with Azure Databricks.
Data sources for Azure Databricks
The following list provides the data sources in Azure that you can use with Azure Databricks. For a complete list of data sources that can be used with Azure Databricks, see Data sources for Azure Databricks.
SQL databases using JDBC: The linked article shows how to use the DataFrame API to connect to SQL databases over JDBC and how to control the parallelism of reads through the JDBC interface. It gives detailed examples using the Scala API, with abbreviated Python and Spark SQL examples at the end.
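As a rough illustration of controlling JDBC read parallelism, the following Python sketch assembles the options for a partitioned read. `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` are standard Spark JDBC data source options; the helper names, server, database, table, and column are hypothetical placeholders, and `spark` is assumed to be the session that an Azure Databricks notebook provides.

```python
def jdbc_parallel_read_options(url, table, partition_column,
                               lower_bound, upper_bound, num_partitions):
    """Build Spark JDBC options that split one table read into range queries."""
    if num_partitions < 1:
        raise ValueError("num_partitions must be at least 1")
    if lower_bound >= upper_bound:
        raise ValueError("lower_bound must be smaller than upper_bound")
    return {
        "url": url,
        "dbtable": table,
        # Numeric, date, or timestamp column used to split the read.
        "partitionColumn": partition_column,
        # The bounds only set the partition stride; they do not filter rows.
        "lowerBound": str(lower_bound),
        "upperBound": str(upper_bound),
        # Upper limit on concurrent JDBC connections for this read.
        "numPartitions": str(num_partitions),
    }


def read_table(spark, options, user, password):
    """Run the partitioned read; `spark` is an existing SparkSession."""
    return (spark.read.format("jdbc")
            .options(**options)
            .option("user", user)
            .option("password", password)
            .load())


# Placeholder values for illustration only.
options = jdbc_parallel_read_options(
    url="jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb",
    table="dbo.orders",
    partition_column="order_id",
    lower_bound=1,
    upper_bound=1_000_000,
    num_partitions=8,
)
```

In a notebook, `read_table(spark, options, user, password)` would return a DataFrame read through up to eight parallel connections. Choosing a partition column with evenly distributed values helps keep the partitions balanced.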
Azure Data Lake Storage: The linked article gives examples of how to authenticate with Azure Data Lake Storage by using an Azure Active Directory service principal. It also explains how to access data in Azure Data Lake Storage from Azure Databricks.
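The service principal setup can be sketched in Python as a set of Spark configuration entries. This sketch assumes Azure Data Lake Storage Gen2 accessed through the ABFS driver; Gen1 accounts use a different set of configuration keys. The helper names and all account, tenant, and credential values are hypothetical placeholders, and in practice the client secret would normally come from a secret store rather than from source code.

```python
def adls_gen2_oauth_conf(storage_account, client_id, client_secret, tenant_id):
    """Build Spark conf entries for service principal (OAuth) access.

    Assumes ADLS Gen2 and the ABFS driver; all values are caller-supplied.
    """
    suffix = f"{storage_account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token",
    }


def abfss_url(storage_account, container, path=""):
    """Build the abfss:// URL for a path inside a container."""
    return (f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
            f"{path.lstrip('/')}")


def apply_conf(spark, conf):
    """Apply the entries to an existing SparkSession."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# Placeholder values for illustration only.
conf = adls_gen2_oauth_conf("mylake", "my-client-id", "my-secret", "my-tenant-id")
url = abfss_url("mylake", "raw", "/events/2020/")
```

After `apply_conf(spark, conf)`, a call such as `spark.read.parquet(url)` would read from the lake with the service principal's permissions, provided the principal has been granted access to the storage account.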
Azure Blob Storage: The linked article gives examples of how to access Azure Blob Storage directly from Azure Databricks by using an access key or a shared access signature (SAS) for a given container. It also explains how to access Azure Blob Storage from Azure Databricks by using the RDD API.
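The two credential styles for Blob Storage differ only in the configuration key that is set. The following Python sketch builds each key and the matching `wasbs://` URL; the helper names and the account, container, and credential values are hypothetical placeholders, and `spark` is assumed to be an existing session when the entries are applied.

```python
def blob_account_key_conf(account, access_key):
    """Conf entry granting access to every container in the account."""
    return {f"fs.azure.account.key.{account}.blob.core.windows.net": access_key}


def blob_sas_conf(account, container, sas_token):
    """Conf entry granting access to one container through a SAS token."""
    return {
        f"fs.azure.sas.{container}.{account}.blob.core.windows.net": sas_token
    }


def wasbs_url(account, container, path=""):
    """Build the wasbs:// URL for a blob path inside a container."""
    return (f"wasbs://{container}@{account}.blob.core.windows.net/"
            f"{path.lstrip('/')}")


def apply_conf(spark, conf):
    """Apply the entries to an existing SparkSession (DataFrame API access)."""
    for key, value in conf.items():
        spark.conf.set(key, value)


# Placeholder values for illustration only.
key_conf = blob_account_key_conf("mystorage", "my-access-key")
sas_conf = blob_sas_conf("mystorage", "data", "my-sas-token")
url = wasbs_url("mystorage", "data", "/raw/events.csv")
```

A SAS token limits access to a single container and can be given an expiry, so it is often the narrower choice when the account key does not need to be shared.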
Azure Cosmos DB: The linked article explains how to use the Azure Cosmos DB Spark connector from Azure Databricks to access data in Azure Cosmos DB.
Azure Event Hubs: The linked article explains how to use the Azure Event Hubs Spark connector from Azure Databricks to access data in Azure Event Hubs.
Azure SQL Data Warehouse: The linked article explains how to use the Azure SQL Data Warehouse connector to connect from Azure Databricks.
To learn about the sources from which you can import data into Azure Databricks, see Data sources for Azure Databricks.