Developer tools and guidance

Learn about tools and guidance you can use to work with Azure Databricks assets and data and to develop Azure Databricks applications.

Develop code in an IDE

You can connect many popular third-party IDEs to an Azure Databricks cluster. This allows you to write code on your local development machine by using the Spark APIs and then run that code as jobs remotely on an Azure Databricks cluster.
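As a minimal sketch, assuming Databricks Connect (or a similar remote-execution setup) has already been configured for your workspace, ordinary PySpark code written in your IDE runs against the remote cluster; the table name below is a placeholder:

```python
# Minimal sketch: ordinary PySpark code that, once Databricks Connect is
# configured for your workspace and cluster, executes remotely.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# "samples.nyctaxi.trips" is a placeholder table name; substitute your own.
df = spark.read.table("samples.nyctaxi.trips")
print(df.limit(5).toPandas())
```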

These third-party IDEs include:

To connect other IDEs through JDBC or ODBC, see the JDBC and ODBC configuration guidance.

If you code in Python, you can also use pyodbc.
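For example, a minimal pyodbc sketch, assuming a DSN named "Databricks" has already been configured with the Databricks ODBC driver and a personal access token (the DSN and table names are placeholders):

```python
# Minimal sketch: query a cluster or SQL endpoint through a preconfigured ODBC DSN.
import pyodbc

conn = pyodbc.connect("DSN=Databricks", autocommit=True)
cursor = conn.cursor()

# "default.diamonds" is a placeholder table name.
cursor.execute("SELECT * FROM default.diamonds LIMIT 2")
for row in cursor.fetchall():
    print(row)

cursor.close()
conn.close()
```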

Work with Azure Databricks resources from the command line or a notebook

Databricks provides additional developer tools. These tools include:

| Name | Use this tool when you want to… |
| --- | --- |
| Databricks CLI | Use the command line to work with Data Science & Engineering workspace assets such as cluster policies, clusters, file systems, groups, pools, jobs, libraries, runs, secrets, and tokens. |
| Databricks Utilities | Run Python, R, or Scala code in a notebook to work with credentials, file systems, libraries, and secrets from an Azure Databricks cluster. |
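For example, a minimal Databricks Utilities sketch in a Python notebook cell, where the secret scope, key, and DBFS path are placeholders:

```python
# Minimal sketch of Databricks Utilities (dbutils) calls in a Python notebook cell.
# "my-scope", "my-key", and the DBFS path are placeholders.

# Read a secret from a secret scope.
token = dbutils.secrets.get(scope="my-scope", key="my-key")

# List files in DBFS.
for file_info in dbutils.fs.ls("dbfs:/tmp"):
    print(file_info.path)
```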

Call the Databricks REST APIs

You can use popular third-party tools such as curl and Postman to work with clusters and SQL endpoints directly through the Databricks REST APIs. These APIs include:

| Category | Use this API to work with… |
| --- | --- |
| REST API 2.0 | Data Science & Engineering workspace assets such as clusters, file systems, global init scripts, groups, pools and profiles, jobs, libraries, permissions, secrets, and tokens. |
| Databricks SQL API reference | Databricks SQL assets such as queries, dashboards, query history, and SQL endpoints. |
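As a minimal sketch, the following Python snippet makes the same kind of call that curl or Postman would, listing clusters through the REST API 2.0; the workspace URL and token are placeholders:

```python
# Minimal sketch: call the Clusters API (REST API 2.0) to list clusters.
# The workspace URL and personal access token below are placeholders.
import requests

DATABRICKS_HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<personal-access-token>"

response = requests.get(
    f"{DATABRICKS_HOST}/api/2.0/clusters/list",
    headers={"Authorization": f"Bearer {TOKEN}"},
)
response.raise_for_status()

for cluster in response.json().get("clusters", []):
    print(cluster["cluster_id"], cluster["cluster_name"])
```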

Provision Azure Databricks infrastructure and assets

You can use an infrastructure-as-code (IaC) approach to programmatically provision Azure Databricks infrastructure and assets such as workspaces, clusters, cluster policies, pools, jobs, groups, permissions, secrets, tokens, users, and more. For details, see Databricks Terraform provider.

Follow Databricks development lifecycle patterns and best practices

To manage the lifecycle of Azure Databricks assets and data, you can use continuous integration and delivery (CI/CD), data pipeline, and data engineering tools.

| Area | Use these patterns and best practices when you want to… |
| --- | --- |
| Continuous integration and delivery on Azure Databricks using Jenkins | Develop a CI/CD pipeline for Azure Databricks that uses Jenkins. |
| Managing dependencies in data pipelines | Manage and schedule a data pipeline that uses Apache Airflow. |
| dbt integration with Azure Databricks | Transform data in Azure Databricks simply by writing select statements; dbt turns these select statements into tables and views. |
| DataGrip integration with Azure Databricks | Use this integrated development environment (IDE) for database developers, which provides a query console, schema navigation, smart code completion, and other features. |
| DBeaver integration with Azure Databricks | Run SQL commands and browse database objects in Azure Databricks by using this client software application and database administration tool. |

Use business intelligence tools with data in Azure Databricks

You can connect many popular business intelligence (BI) tools to clusters and SQL endpoints to access data in Azure Databricks. These tools include:

To connect other BI tools through JDBC or ODBC, see the JDBC and ODBC configuration instructions.