JDBC and ODBC drivers and configuration parameters

You can connect business intelligence (BI) tools to Azure Databricks clusters to query data in tables. Every Azure Databricks cluster runs JDBC and ODBC servers on the driver node. This article describes how to get JDBC and ODBC drivers and configuration parameters to connect to Azure Databricks clusters. For tool-specific connection instructions, see Business intelligence tools.

Cluster requirements

To access a cluster using JDBC or ODBC, you must have Can Attach To permission.

If you connect to a terminated cluster using JDBC or ODBC and have Can Restart permission, the cluster is restarted.

Step 1: Download and install a JDBC or ODBC driver

For all BI tools, you need a JDBC or ODBC driver to connect to Azure Databricks clusters.

  1. Go to the Databricks JDBC / ODBC Driver Download page.
  2. Fill out the form and submit it. The page will update with links to multiple download options.
  3. Select a driver and download it.
  4. Install the driver. For JDBC, the driver is provided as a JAR that does not require installation. For ODBC, an installation package is provided for your chosen platform and must be installed on your system.

Step 2: Gather JDBC or ODBC connection parameters

To configure a JDBC or ODBC driver, you must collect connection parameters from Azure Databricks. Here are some of the parameters a JDBC or ODBC driver might require:

  • Username and password: see Username and password.
  • Host, port, HTTP path, JDBC URL: see Server hostname, port, HTTP path, and JDBC URL.

The following parameters are usually specified in the JDBC URL for JDBC and in the DSN configuration for ODBC:

  • Spark Server Type: Spark Thrift Server
  • Schema/Database: default
  • Authentication Mechanism (AuthMech): Username and password authentication
  • Thrift Transport: http
  • SSL: 1
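
If your BI tool accepts a full ODBC connection string rather than individual fields, the same parameters can be assembled into a DSN-less connection string. The following is a minimal sketch using the third-party pyodbc Python package (an assumption; any ODBC client works), with angle-bracket placeholders to be filled in from the sections that follow:

import pyodbc

# DSN-less connection string assembled from the parameters above.
# The driver path shown is the Linux 64-bit default; adjust it for
# your platform (see the /etc/odbcinst.ini section below).
connection_string = (
    "Driver=/opt/simba/spark/lib/64/libsparkodbc_sb64.so;"
    "HOST=<server-hostname>;"
    "PORT=<port>;"
    "HTTPPath=<http-path>;"
    "SparkServerType=3;"
    "Schema=default;"
    "ThriftTransport=2;"
    "SSL=1;"
    "AuthMech=3;"
    "UID=token;"
    "PWD=<personal-access-token>"
)

# autocommit=True avoids transaction-related errors, because Spark
# does not support transactions.
connection = pyodbc.connect(connection_string, autocommit=True)
cursor = connection.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
connection.close()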

Username and password

No JDBC or ODBC driver supports credential passthrough or Azure Active Directory username-and-password authentication to Azure Databricks. Instead, authenticate with an Azure Databricks personal access token: set the username to the literal string token and the password to the token value.

Server hostname, port, HTTP path, and JDBC URL

  1. On the cluster configuration page, click the Advanced Options toggle.

  2. Click the JDBC/ODBC tab. It contains the hostname, port, protocol, HTTP path, and JDBC URL.


  3. Copy the parameters required by your BI tool.

  4. If your tool requires the JDBC URL, replace <personal-access-token> with the token you created in Username and password. For example:

    jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=dapideadbeefdeadbeefdeadbeefdeadbeef
    
  5. JDBC and ODBC drivers accept SQL queries in the ANSI SQL-92 dialect and translate them to Spark SQL. If your application generates Spark SQL directly, or uses SQL syntax specific to Databricks Runtime that is not part of the ANSI SQL-92 standard, Databricks recommends that you add ;UseNativeQuery=1 to the connection configuration. With that setting, the drivers pass SQL queries verbatim to Databricks Runtime. A minimal connection sketch using the JDBC URL follows these steps.
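
To sanity-check the JDBC URL outside a BI tool, you can open a connection directly against the driver JAR. The following is a minimal sketch using the third-party jaydebeapi Python package (an assumption; a plain Java client works the same way). The driver class name (shown here as com.simba.spark.jdbc.Driver) and the JAR file name vary by driver version, so treat both as placeholders:

import jaydebeapi

# Assumptions: the driver class name and JAR path depend on the
# driver version downloaded in Step 1; check the driver documentation.
connection = jaydebeapi.connect(
    "com.simba.spark.jdbc.Driver",
    "jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;"
    "httpPath=<http-path>;AuthMech=3;UID=token;PWD=<personal-access-token>",
    jars="/path/to/SparkJDBC42.jar",  # hypothetical JAR location
)

cursor = connection.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
connection.close()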

ODBC Data Source Name configuration for the Simba ODBC driver

The Data Source Name (DSN) configuration contains the parameters for communicating with a specific database. BI tools like Tableau usually provide a user interface for entering these parameters. If you have to install and manage the Simba ODBC driver yourself, you might need to create the configuration files and also allow your Driver Manager (ODBC Data Source Administrator on Windows and unixODBC/iODBC on Unix) to access them. Create two files: /etc/odbc.ini and /etc/odbcinst.ini.

/etc/odbc.ini

The content of /etc/odbc.ini can be:

[Databricks-Spark]
Driver=Simba
Server=<server-hostname>
HOST=<server-hostname>
PORT=<port>
SparkServerType=3
Schema=default
ThriftTransport=2
SSL=1
AuthMech=3
UID=token
PWD=<personal-access-token>
HTTPPath=<http-path>

Set <personal-access-token> to the token you retrieved in Username and password and the other parameters to the ones you retrieved in Server hostname, port, HTTP path, and JDBC URL.

/etc/odbcinst.ini

The content of /etc/odbcinst.ini can be:

[ODBC Drivers]
Simba = Installed

[Simba Spark ODBC Driver 64-bit]
Driver = <driver-path>

Set <driver-path> according to the operating system you chose when you downloaded the driver in Step 1:

  • macOS: /Library/simba/spark/lib/libsparkodbc_sbu.dylib
  • Linux (64-bit): /opt/simba/spark/lib/64/libsparkodbc_sb64.so
  • Linux (32-bit): /opt/simba/spark/lib/32/libsparkodbc_sb32.so

Configure paths

Specify the paths of the configuration files in environment variables so that the Driver Manager can find them:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=<simba-ini-path>/simba.sparkodbc.ini # (Contains the configuration for debugging the Simba driver)

where <simba-ini-path> is:

  • macOS: /Library/simba/spark/lib
  • Linux (64-bit): /opt/simba/sparkodbc/lib/64
  • Linux (32-bit): /opt/simba/sparkodbc/lib/32
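
With both configuration files in place and the environment variables exported, you can verify the DSN end to end. The following is a minimal sketch using the third-party pyodbc Python package (an assumption; the unixODBC isql utility also works):

import pyodbc

# Connect through the [Databricks-Spark] DSN defined in /etc/odbc.ini.
# autocommit=True avoids transaction-related errors, because Spark
# does not support transactions.
connection = pyodbc.connect("DSN=Databricks-Spark", autocommit=True)
cursor = connection.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
connection.close()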

Troubleshooting

See Troubleshooting JDBC and ODBC connections.