Developer tools and guidance

Learn about tools and guidance you can use to work with Azure Databricks assets and data and to develop Azure Databricks applications.

Use an IDE

You can connect many popular third-party IDEs to an Azure Databricks cluster. This allows you to write code on your local development machine by using the Spark APIs and then run that code as jobs remotely on an Azure Databricks cluster.

Databricks recommends that you use dbx by Databricks Labs for local development.

Databricks also provides a code sample that demonstrates how to use an IDE with dbx.
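
For example, a minimal PySpark entry point that you might author locally in an IDE and then deploy with dbx as a job could look like the following sketch. The transformation here is hypothetical and purely illustrative:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


def main() -> None:
    # On an Azure Databricks cluster, this attaches to the cluster's
    # existing Spark session; locally it would need a Spark installation.
    spark = SparkSession.builder.getOrCreate()

    # Hypothetical transformation: bucket 100 generated rows and count each bucket.
    df = (
        spark.range(100)
        .withColumn("bucket", F.col("id") % 10)
        .groupBy("bucket")
        .count()
    )
    df.show()


if __name__ == "__main__":
    main()
```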

Note

Databricks also supports a tool named Databricks Connect. However, Databricks plans no new feature development for Databricks Connect at this time. Also, Databricks Connect has several limitations.

Use a connector or driver

You can use connectors and drivers to connect your code to an Azure Databricks cluster or a Databricks SQL warehouse. These include the Databricks SQL Connector for Python and the Databricks JDBC and ODBC drivers.

For additional information about connecting your code through JDBC or ODBC, see the JDBC and ODBC configuration guidance.
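
As a quick illustration, here is a minimal sketch that uses the Databricks SQL Connector for Python (the databricks-sql-connector package) to run a query against a SQL warehouse. The hostname, HTTP path, and token are placeholders; copy the real values from your warehouse's connection details and a personal access token:

```python
from databricks import sql

# Placeholder connection details for your own workspace.
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="<personal-access-token>",
) as connection:
    with connection.cursor() as cursor:
        cursor.execute("SELECT current_date()")
        print(cursor.fetchall())
```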

Use the command line or a notebook

Databricks provides additional developer tools.

Databricks CLI: Use the command line to work with Data Science & Engineering workspace assets such as cluster policies, clusters, file systems, groups, pools, jobs, libraries, runs, secrets, and tokens.
Databricks SQL CLI: Use the command line to run SQL commands and scripts on a Databricks SQL warehouse.
Databricks Utilities: Run Python, R, or Scala code in a notebook to work with credentials, file systems, libraries, and secrets from an Azure Databricks cluster (see the sketch after this list).
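
For instance, Databricks Utilities (dbutils) is available in a notebook without any import. A short sketch, assuming a secret scope named my-scope with a key named my-key already exists (both names are hypothetical):

```python
# In an Azure Databricks notebook, dbutils is predefined; no import is needed.

# List the contents of the DBFS root.
for entry in dbutils.fs.ls("/"):
    print(entry.path)

# Read a secret. The scope and key names here are hypothetical; create them
# first with the Databricks CLI or the Secrets API.
token = dbutils.secrets.get(scope="my-scope", key="my-key")
```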

Call Databricks REST APIs

You can use popular third-party utilities such as curl, tools such as Postman, and HTTP client libraries such as Python's requests to work with Azure Databricks resources directly through the Databricks REST APIs.
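
For example, the following minimal sketch calls the version 2.0 Clusters API with the Python requests library. The workspace URL and token are placeholders for your own values:

```python
import requests

# Placeholder values; use your workspace URL and a personal access token.
HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<personal-access-token>"

response = requests.get(
    f"{HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
response.raise_for_status()

# The response body contains a "clusters" array (absent if the workspace has none).
for cluster in response.json().get("clusters", []):
    print(cluster["cluster_id"], cluster["state"])
```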

REST API (latest): Work with Data Science & Engineering workspace assets such as clusters, global init scripts, groups, pools, jobs, libraries, permissions, secrets, and tokens, by using the latest version of the Databricks REST API.
REST API 2.1: Work with Data Science & Engineering workspace assets such as jobs, by using version 2.1 of the Databricks REST API.
REST API 2.0: Work with Data Science & Engineering workspace assets such as clusters, global init scripts, groups, pools, jobs, libraries, permissions, secrets, and tokens, by using version 2.0 of the Databricks REST API.
REST API 1.2: Work with command executions and execution contexts by using version 1.2 of the Databricks REST API.

Provision infrastructure

You can use an infrastructure-as-code (IaC) approach to programmatically provision Azure Databricks infrastructure and assets such as workspaces, clusters, cluster policies, pools, jobs, groups, permissions, secrets, tokens, and users. For details, see the Databricks Terraform provider.
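
Dedicated IaC tools such as the Databricks Terraform provider are the usual choice, but the same provisioning can be scripted directly against the REST API. A minimal sketch that creates a cluster follows; the cluster specification is hypothetical, so adjust spark_version and node_type_id to values available in your workspace:

```python
import requests

HOST = "https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
TOKEN = "<personal-access-token>"  # placeholder

# Hypothetical cluster specification.
cluster_spec = {
    "cluster_name": "provisioned-by-script",
    "spark_version": "11.3.x-scala2.12",
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "autotermination_minutes": 30,
}

response = requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
response.raise_for_status()
print("Created cluster:", response.json()["cluster_id"])
```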

Use CI/CD

To manage the lifecycle of Azure Databricks assets and data, you can use continuous integration and continuous delivery (CI/CD) and data pipeline tools.

Continuous integration and delivery on Azure Databricks using Azure DevOps: Develop a CI/CD pipeline for Azure Databricks that uses Azure DevOps.
Continuous integration and delivery on Azure Databricks using GitHub Actions: Develop a CI/CD workflow on GitHub that uses GitHub Actions developed for Azure Databricks.
Continuous integration and delivery on Azure Databricks using Jenkins: Develop a CI/CD pipeline for Azure Databricks that uses Jenkins.
Managing dependencies in data pipelines: Manage and schedule a data pipeline that uses Apache Airflow (see the sketch after this list).
Service principals for CI/CD: Use service principals, instead of users, with CI/CD systems.
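
As a sketch of the Airflow integration, the DAG below triggers an existing Azure Databricks job. It assumes the apache-airflow-providers-databricks package is installed, that an Airflow connection named databricks_default points at your workspace, and that a job with ID 12345 already exists (the job ID is hypothetical):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import DatabricksRunNowOperator

with DAG(
    dag_id="run_databricks_job",
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Trigger the existing Databricks job (ID 12345 is a placeholder).
    run_job = DatabricksRunNowOperator(
        task_id="run_job",
        databricks_conn_id="databricks_default",
        job_id=12345,
    )
```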

Use a SQL database tool

You can use these tools to run SQL commands and scripts and to browse database objects in Azure Databricks.

Databricks SQL CLI: Use a command line to run SQL commands and scripts on a Databricks SQL warehouse.
DataGrip integration with Azure Databricks: Use a query console, schema navigation, smart code completion, and other features to run SQL commands and scripts and to browse database objects in Azure Databricks.
DBeaver integration with Azure Databricks: Run SQL commands and browse database objects in Azure Databricks by using this client software application and database administration tool.
SQL Workbench/J: Run SQL scripts (either interactively or as a batch) in Azure Databricks by using this SQL query tool.

Use other tools

You can connect many popular third-party tools to clusters and SQL warehouses to access data in Azure Databricks. See the Databricks integrations.

To authenticate automated scripts, tools, apps, and systems with Azure Databricks workspaces and resources, Databricks recommends that you use authentication credentials for service principals instead of Azure Databricks workspace user credentials. See Service principals for Azure Databricks automation.