JDBC and ODBC drivers and configuration parameters

You can connect business intelligence (BI) tools to Azure Databricks clusters and SQL endpoints to query data in tables. This article describes how to get JDBC and ODBC drivers and configuration parameters to connect to Azure Databricks clusters and SQL endpoints.

For tool-specific connection instructions, see Connect BI tools to Azure Databricks or Connect BI tools to SQL endpoints.

Permission requirements

The permissions required to access compute resources using JDBC or ODBC depend on whether you are connecting to an Azure Databricks cluster or a SQL endpoint.

Cluster requirements

To access a cluster, you must have Can Attach To permission.

If you connect to a terminated cluster and have Can Restart permission, the cluster is started.

SQL endpoint requirements

To access a SQL endpoint, you must have Can Use permission.

If you connect to a stopped endpoint and have Can Use permission, the SQL endpoint is started.

Prepare to connect BI tools

This section describes the steps you typically follow to prepare to connect BI tools to Azure Databricks:

Step 1: Download and install a JDBC or ODBC driver

For some BI tools, you use a JDBC or ODBC driver to make a connection to Azure Databricks compute resources.

  1. Go to the Databricks JDBC or ODBC driver download page and download the driver. For ODBC, pick the driver that matches your operating system.
  2. Install the driver. For JDBC, a JAR is provided which does not require installation. For ODBC, an installation package is provided for your chosen platform.

Step 2: Collect JDBC or ODBC connection information

To configure a JDBC or ODBC driver, you must collect connection information from Azure Databricks. Here are some of the parameters a JDBC or ODBC driver might require:

  • Authentication: see Username and password authentication.
  • Host, port, HTTP path, JDBC URL: see Get server hostname, port, HTTP path, and JDBC URL.

The following parameters are usually specified in the httpPath for JDBC and in the DSN configuration for ODBC:

  • Spark server type: Spark Thrift Server
  • Schema/Database: default
  • Authentication mechanism (AuthMech): see Username and password authentication.
  • Thrift transport: http
  • SSL: 1

Username and password authentication

This section describes how to collect the credentials for authenticating BI tools to Azure Databricks compute resources.

When you configure authentication from your BI tool to Azure Databricks resources, you either set the Username and Password fields or use direct Azure Active Directory authentication. Set Username to the string token and set Password to either an Azure Databricks personal access token or an Azure Active Directory token.
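
For tools or APIs that take the username and password as separate fields, the same values can be passed outside the URL. The following is a minimal Java sketch, not a definitive pattern: it assumes the Databricks JDBC driver JAR is on the classpath, and the placeholder values are yours to substitute.

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

public class CredentialFieldsExample {
    public static void main(String[] args) throws Exception {
        // Username is always the literal string "token"; Password is the token itself.
        Properties props = new Properties();
        props.setProperty("UID", "token");
        props.setProperty("PWD", "<personal-access-token>");
        String url = "jdbc:spark://<server-hostname>:443/default;"
                + "transportMode=http;ssl=1;httpPath=<http-path>;AuthMech=3";
        try (Connection conn = DriverManager.getConnection(url, props)) {
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}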

Get server hostname, port, HTTP path, and JDBC URL

The procedure for retrieving JDBC and ODBC parameters depends on whether you are using an Azure Databricks cluster or a SQL endpoint.

Get connection details for a cluster

  1. Click Compute in the sidebar.

  2. Click a cluster.

  3. Click the Advanced Options toggle.

  4. Click the JDBC/ODBC tab.

  5. Copy the parameters required by your BI tool.

Get connection details for a SQL endpoint

  1. Click SQL Endpoints in the sidebar.

  2. Click an endpoint.

  3. Click the Connection Details tab.

  4. Copy the parameters required by your BI tool.

Configure JDBC URL

The steps for configuring the JDBC URL depend on whether you are using an Azure Databricks cluster or a SQL endpoint.

Configure JDBC URL for cluster

Personal access token authentication

In the following URL, replace <personal-access-token> with the token you created in Username and password authentication. For example:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>
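
To sanity-check this URL outside a BI tool, you can open a connection directly from Java. A minimal sketch, assuming the Databricks JDBC driver JAR is on the classpath and that you substitute the placeholders with the values you collected above:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class JdbcSmokeTest {
    public static void main(String[] args) throws Exception {
        // Replace the placeholders with your own connection details.
        String url = "jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;"
                + "httpPath=<http-path>;AuthMech=3;UID=token;PWD=<personal-access-token>";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SELECT 1")) {
            while (rs.next()) {
                System.out.println(rs.getInt(1)); // prints 1 when the round trip works
            }
        }
    }
}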

Azure Active Directory token authentication

  1. In the following URL

     jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>
    

    remove the existing authentication parameters (AuthMech, UID, PWD):

     AuthMech=3;UID=token;PWD=<personal-access-token>
    
  2. Add the following parameters, replacing <Azure AD token> with the Azure AD token you obtained in Username and password authentication.

    AuthMech=11;Auth_Flow=0;Auth_AccessToken=<Azure AD token>
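
After these edits, the resulting URL has the following form:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=11;Auth_Flow=0;Auth_AccessToken=<Azure AD token>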
    

Also see Refresh an Azure Active Directory token.

Configure JDBC URL for SQL endpoint

Replace <personal-access-token> with the token you created in Username and password authentication. For example:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>

Configure connection for native query syntax

JDBC and ODBC drivers accept SQL queries in ANSI SQL-92 dialect and translate the queries to Spark SQL. If your application generates Spark SQL directly or your application uses any non-ANSI SQL-92 standard SQL syntax specific to Azure Databricks, Databricks recommends that you add ;UseNativeQuery=1 to the connection configuration. With that setting, drivers pass the SQL queries verbatim to Azure Databricks.
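
For example, appending the parameter to the cluster URL shown earlier:

jdbc:spark://<server-hostname>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/0/xxxx-xxxxxx-xxxxxxxx;AuthMech=3;UID=token;PWD=<personal-access-token>;UseNativeQuery=1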

Configure ODBC Data Source Name for the Simba ODBC driver

The Data Source Name (DSN) configuration contains the parameters for communicating with a specific database. BI tools like Tableau usually provide a user interface for entering these parameters. If you have to install and manage the Simba ODBC driver yourself, you might need to create the configuration files and also allow your Driver Manager (ODBC Data Source Administrator on Windows and unixODBC/iODBC on Unix) to access them. Create two files: /etc/odbc.ini and /etc/odbcinst.ini.

/etc/odbc.ini

Username and password authentication

  1. Set the content of /etc/odbc.ini to:

    [Databricks-Spark]
    Driver=Simba
    Server=<server-hostname>
    HOST=<server-hostname>
    PORT=<port>
    SparkServerType=3
    Schema=default
    ThriftTransport=2
    SSL=1
    AuthMech=3
    UID=token
    PWD=<personal-access-token>
    HTTPPath=<http-path>
    
  2. Set <personal-access-token> to the token you retrieved in Username and password authentication.

  3. Set the server, port, and HTTP parameters to the ones you retrieved in Get server hostname, port, HTTP path, and JDBC URL.

Azure Active Directory token authentication

  1. Set the content of /etc/odbc.ini to:

    [Databricks-Spark]
    Driver=Simba
    Server=<server-hostname>
    HOST=<server-hostname>
    PORT=443
    HTTPPath=<http-path>
    SparkServerType=3
    Schema=default
    ThriftTransport=2
    SSL=1
    AuthMech=11
    Auth_Flow=0
    Auth_AccessToken=<Azure AD token>
    
  2. Set <Azure AD token> to the Azure Active Directory token you retrieved in Username and password authentication.

  3. Set the server, port, and HTTP parameters to the ones you retrieved in Get server hostname, port, HTTP path, and JDBC URL.

Also see Refresh an Azure Active Directory token.

/etc/odbcinst.ini

Set the content of /etc/odbcinst.ini to:

[ODBC Drivers]
Simba = Installed
[Simba Spark ODBC Driver 64-bit]
Driver = <driver-path>

Set <driver-path> according to the operating system you chose when you downloaded the driver in Step 1:

  • macOS: /Library/simba/spark/lib/libsparkodbc_sbu.dylib
  • Linux (64-bit): /opt/simba/spark/lib/64/libsparkodbc_sb64.so
  • Linux (32-bit): /opt/simba/spark/lib/32/libsparkodbc_sb32.so

Configure paths of ODBC configuration files

Specify the paths of the two files in environment variables so that they can be used by the Driver Manager:

export ODBCINI=/etc/odbc.ini
export ODBCSYSINI=/etc/odbcinst.ini
export SIMBASPARKINI=<simba-ini-path>/simba.sparkodbc.ini # (Contains the configuration for debugging the Simba driver)

where <simba-ini-path> is

  • macOS: /Library/simba/spark/lib
  • Linux (64-bit): /opt/simba/sparkodbc/lib/64
  • Linux (32-bit): /opt/simba/sparkodbc/lib/32
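
With the files and environment variables in place, you can optionally test the DSN from the command line, for example with unixODBC's isql utility:

isql -v Databricks-Spark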

Refresh an Azure Active Directory token

An Azure Active Directory token expires after 1 hour. If the access token is not refreshed using the refresh_token, you must manually replace the token with a new one. This section describes how to programmatically refresh the token of an existing session without breaking the connection.

  1. Follow the steps in Refresh an access token.
  2. Update the token as follows:
    • JDBC: Call java.sql.Connection.setClientInfo with the new value for Auth_AccessToken (see the sketch after this list).
    • ODBC: Call SQLSetConnectAttr for SQL_ATTR_CREDENTIALS with the new value for Auth_AccessToken.
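
For the JDBC case, a minimal Java sketch follows; conn is assumed to be an open connection, and acquireAzureAdToken() is a hypothetical placeholder for however your application obtains a fresh token (for example, through MSAL):

import java.sql.Connection;
import java.sql.SQLClientInfoException;

public final class TokenRefresher {
    // Hypothetical helper: fetch a fresh Azure AD token from your identity library.
    static String acquireAzureAdToken() {
        throw new UnsupportedOperationException("wire up your token acquisition here");
    }

    // Update the live session's access token without dropping the connection.
    static void refresh(Connection conn) throws SQLClientInfoException {
        conn.setClientInfo("Auth_AccessToken", acquireAzureAdToken());
    }
}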

Troubleshooting

See Troubleshooting JDBC and ODBC connections.

See also

  • Databricks SQL Connector for Python
  • Connect Python and pyodbc to Azure Databricks