Databricks SQL Driver for Go
The Databricks SQL Driver for Go is a Go library that allows you to use Go code to run SQL commands on Azure Databricks compute resources. This article supplements the Databricks SQL Driver for Go README, API reference, and examples.
Requirements
- A development machine running Go, version 1.20 or above. To print the installed version of Go, run the command
go version
. Download and install Go. - An existing cluster or SQL warehouse.
- The Server Hostname, Port, and HTTP Path values for the existing cluster or SQL warehouse.
Getting started with the Databricks SQL Driver for Go
On your development machine with Go 1.20 or above already installed and an existing Go code project already created, create a
go.mod
file to track your Go code’s dependencies by running thego mod init
command, for example:go mod init sample
Take a dependency on the Databricks SQL Driver for Go package by running the
go mod edit -require
command, replacingv1.5.2
with the latest version of the Databricks SQL Driver for Go package as listed in the Releases:go mod edit -require github.com/databricks/databricks-sql-go@v1.5.2
Your
go.mod
file should now look like this:module sample go 1.20 require github.com/databricks/databricks-sql-go v1.5.2
In your project, create a Go code file that imports the Databricks SQL Driver for Go. The following example, in a file named
main.go
with the following contents, lists all the clusters in your Azure Databricks workspace:package main import ( "database/sql" "os" _ "github.com/databricks/databricks-sql-go" ) func main() { dsn := os.Getenv("DATABRICKS_DSN") if dsn == "" { panic("No connection string found. " + "Set the DATABRICKS_DSN environment variable, and try again.") } db, err := sql.Open("databricks", dsn) if err != nil { panic(err) } defer db.Close() if err := db.Ping(); err != nil { panic(err) } }
Add any missing module dependencies by running the
go mod tidy
command:go mod tidy
Note
If you get the error
go: warning: "all" matched no packages
, you forgot to add a Go code file that imports the Databricks SQL Driver for Go.Make copies of all packages needed to support builds and tests of packages in your
main
module, by running thego mod vendor
command:go mod vendor
Modify your code as needed to set the
DATABRICKS_DSN
environment variable for Azure Databricks authentication. See also Connect with a DSN connection string.Run your Go code file, assuming a file named
main.go
, by running thego run
command:go run main.go
If no errors are returned, you have successfully authenticated the Databricks SQL Driver for Go with your Azure Databricks workspace and connected to your running Azure Databricks cluster or SQL warehouse in that workspace.
Connect with a DSN connection string
To access clusters and SQL warehouses, use sql.Open()
to create a database handle through a data source name (DSN) connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN
:
package main
import (
"database/sql"
"os"
_ "github.com/databricks/databricks-sql-go"
)
func main() {
dsn := os.Getenv("DATABRICKS_DSN")
if dsn == "" {
panic("No connection string found. " +
"Set the DATABRICKS_DSN environment variable, and try again.")
}
db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()
if err := db.Ping(); err != nil {
panic(err)
}
}
To specify the DSN connection string in the correct format, see the DSN connection string examples in Authentication. For example, for Azure Databricks personal access token authentication, use the following syntax, where:
<personal-access-token>
is your Azure Databricks personal access token from the requirements.<server-hostname>
is the Server Hostname value from the requirements.<port-number>
is the Port value from the requirements, which is typically443
.<http-path>
is the HTTP Path value from the requirements.<paramX=valueX>
is one or more Optional parameters listed later in this article.
token:<personal-access-token>@<server-hostname>:<port-number>/<http-path>?<param1=value1>&<param2=value2>
For example, for a cluster:
token:dapi12345678901234567890123456789012@adb-1234567890123456.7.azuredatabricks.net:443/sql/protocolv1/o/1234567890123456/1234-567890-abcdefgh
For example, for a SQL warehouse:
token:dapi12345678901234567890123456789012@adb-1234567890123456.7.azuredatabricks.net:443/sql/1.0/endpoints/a1b234c5678901d2
Note
As a security best practice, you should not hard-code this DSN connection string into your Go code. Instead, you should retrieve this DSN connection string from a secure location. For example, the code example earlier in this article used an environment variable.
Optional parameters
- Supported optional connection parameters can be specified in
<param=value>
. Some of the more frequently used ones include:catalog
: Sets the initial catalog name in the session.schema
: Sets the initial schema name in the session.maxRows
: Sets up the maximum number of rows fetched per request. The default is10000
.timeout
: Adds the timeout (in seconds) for the server query execution. The default is no timeout.userAgentEntry
: Used to identify partners. For more information, see your partner’s documentation.
- Supported optional session parameters can be specified in
param=value
. Some of the more frequently used ones include:ansi_mode
: A Boolean string.true
for session statements to adhere to rules specified by the ANSI SQL specification. The system default is false.timezone
: A string, for exampleAmerica/Los_Angeles
. Sets the timezone of the session. The system default is UTC.
For example, for a SQL warehouse:
token:dapi12345678901234567890123456789012@adb-1234567890123456.7.azuredatabricks.net:443/sql/1.0/endpoints/a1b234c5678901d2?catalog=hive_metastore&schema=example&maxRows=100&timeout=60&timezone=America/Sao_Paulo&ansi_mode=true
Connect with the NewConnector
function
Alternatively, use sql.OpenDB()
to create a database handle through a new connector object that is created with dbsql.NewConnector()
(connecting to Azure Databricks clusters and SQL warehouses with a new connector object requires v1.0.0 or higher of the Databricks SQL Driver for Go). For example:
package main
import (
"database/sql"
"os"
dbsql "github.com/databricks/databricks-sql-go"
)
func main() {
connector, err := dbsql.NewConnector(
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(443),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
)
if err != nil {
panic(err)
}
db := sql.OpenDB(connector)
defer db.Close()
if err := db.Ping(); err != nil {
panic(err)
}
}
To specify the correct set of NewConnector
settings, see the examples in Authentication.
Note
As a security best practice, you should not hard-code your NewConnector
settings into your Go code. Instead, you should retrieve these values from a secure location. For example, the preceding code uses environment variables.
Some of the more frequently used functional options include:
WithAccessToken(<access-token>)
: Your Azure Databricks personal access token from the requirements. Requiredstring
.WithServerHostname(<server-hostname>)
: The Server Hostname value from the requirements. Requiredstring
.WithPort(<port>)
: The server’s port number, typically443
. Requiredint
.WithHTTPPath(<http-path>)
: The HTTP Path value from the requirements. Requiredstring
.WithInitialNamespace(<catalog>, <schema>)
:The catalog and schema name in the session. Optionalstring, string
.WithMaxRows(<max-rows>)
: The maximum number of rows fetched per request. The default is10000.
Optionalint
.WithSessionParams(<params-map>)
: The session parameters including “timezone” and “ansi_mode”. Optionalmap[string]string
.WithTimeout(<timeout>)
. The timeout (intime.Duration
) for the server query execution. The default is no timeout. Optional.WithUserAgentEntry(<isv-name-plus-product-name>)
. Used to identify partners. For more information, see your partner’s documentation. Optionalstring
.
For example:
connector, err := dbsql.NewConnector(
dbsql.WithAccessToken(os.Getenv("DATABRICKS_ACCESS_TOKEN")),
dbsql.WithServerHostname(os.Getenv("DATABRICKS_HOST")),
dbsql.WithPort(443),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithInitialNamespace("samples", "nyctaxi"),
dbsql.WithMaxRows(100),
dbsql.SessionParams(map[string]string{"timezone": "America/Sao_Paulo", "ansi_mode": "true"}),
dbsql.WithTimeout(time.Minute),
dbsql.WithUserAgentEntry("example-user"),
)
Authentication
The Databricks SQL Driver for Go supports the following Azure Databricks authentication types:
- Databricks personal access token authentication
- Microsoft Entra ID (formerly Azure Active Directory) token authentication
- OAuth user-to-machine (U2M) authentication
- OAuth machine-to-machine (M2M) authentication
The Databricks SQL Driver for Go does not yet support the following Azure Databricks authentication types:
- Azure managed identities authentication
- Microsoft Entra ID service principal authentication
- Azure CLI authentication
Databricks personal access token authentication
To use the Databricks SQL Driver for Go with Azure Databricks personal access token authentication, you must first create an Azure Databricks personal access token, as follows:
- In your Azure Databricks workspace, click your Azure Databricks username in the top bar, and then select User Settings from the drop down.
- Click Developer.
- Next to Access tokens, click Manage.
- Click Generate new token.
- (Optional) Enter a comment that helps you to identify this token in the future, and change the token’s default lifetime of 90 days. To create a token with no lifetime (not recommended), leave the Lifetime (days) box empty (blank).
- Click Generate.
- Copy the displayed token to a secure location, and then click Done.
Note
Be sure to save the copied token in a secure location. Do not share your copied token with others. If you lose the copied token, you cannot regenerate that exact same token. Instead, you must repeat this procedure to create a new token. If you lose the copied token, or you believe that the token has been compromised, Databricks strongly recommends that you immediately delete that token from your workspace by clicking the trash can (Revoke) icon next to the token on the Access tokens page.
If you are not able to create or use tokens in your workspace, this might be because your workspace administrator has disabled tokens or has not given you permission to create or use tokens. See your workspace administrator or the following:
To authenticate the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:
<personal-access-token>
is your Azure Databricks personal access token from the requirements.<server-hostname>
is the Server Hostname value from the requirements.<port-number>
is the Port value from the requirements, which is typically443
.<http-path>
is the HTTP Path value from the requirements.
You can also append one or more Optional parameters listed previously in this article.
token:<personal-access-token>@<server-hostname>:<port-number>/<http-path>
To authenticate the Databricks SQL Driver for Go with the NewConnector
function, use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:
DATABRICKS_SERVER_HOSTNAME
set to the Server Hostname value for your cluster or SQL warehouse.DATABRICKS_HTTP_PATH
, set to HTTP Path value for your cluster or SQL warehouse.DATABRICKS_TOKEN
, set to the Azure Databricks personal access token.
To set environment variables, see your operating system’s documentation.
connector, err := dbsql.NewConnector(
dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")),
dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")),
dbsql.WithPort(443),
dbsql.WithAccessToken(os.Getenv("DATABRICKS_TOKEN")),
)
Microsoft Entra ID (formerly Azure Active Directory) token authentication
The Databricks SQL Driver for Go supports Microsoft Entra ID (formerly Azure Active Directory) tokens for an Azure Databricks user or a Microsoft Entra ID service principal.
To create a Microsoft Entra ID access token, do the following:
For an Azure Databricks user, you can use the Azure CLI. See Get Microsoft Entra ID (formerly Azure Active Directory) tokens for users by using the Azure CLI.
- For a Microsoft Entra ID service principal, see Get a Microsoft Entra ID access token with the Azure CLI. To create a Microsoft Entra ID managed service principal, see Manage service principals.
Microsoft Entra ID tokens have a default lifetime of about 1 hour. To create a new Microsoft Entra ID token, repeat this process.
To authenticate the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:
<microsoft-entra-id-token>
is your Microsoft Entra ID token.<server-hostname>
is the Server Hostname value from the requirements.<port-number>
is the Port value from the requirements, which is typically443
.<http-path>
is the HTTP Path value from the requirements.
You can also append one or more Optional parameters listed previously in this article.
token:<microsoft-entra-id-token>@<server-hostname>:<port-number>/<http-path>
To authenticate the Databricks SQL Driver for Go with the
NewConnector
function, use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:DATABRICKS_SERVER_HOSTNAME
set to the Server Hostname value for your cluster or SQL warehouse.DATABRICKS_HTTP_PATH
, set to HTTP Path value for your cluster or SQL warehouse.DATABRICKS_TOKEN
, set to your Microsoft Entra ID token.
To set environment variables, see your operating system’s documentation.
connector, err := dbsql.NewConnector( dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")), dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")), dbsql.WithPort(443), dbsql.WithAccessToken(os.Getenv("DATABRICKS_TOKEN")), )
OAuth user-to-machine (U2M) authentication
Databricks SQL Driver for Go versions 1.5.0 and above support OAuth user-to-machine (U2M) authentication.
To use the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:
<server-hostname>
is the Server Hostname value from the requirements.<port-number>
is the Port value from the requirements, which is typically443
.<http-path>
is the HTTP Path value from the requirements.
You can also append one or more Optional parameters listed previously in this article.
<server-hostname>:<port-number>/<http-path>?authType=OauthU2M
To authenticate the Databricks SQL Driver for Go with the
NewConnector
function, you must first add the following to yourimport
declaration:"github.com/databricks/databricks-sql-go/auth/oauth/u2m"
Then use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:
DATABRICKS_SERVER_HOSTNAME
set to the Server Hostname value for your cluster or SQL warehouse.DATABRICKS_HTTP_PATH
, set to HTTP Path value for your cluster or SQL warehouse.
To set environment variables, see your operating system’s documentation.
authenticator, err := u2m.NewAuthenticator(os.Getenv("DATABRICKS_SERVER_HOSTNAME"), 1*time.Minute) if err != nil { panic(err) } connector, err := dbsql.NewConnector( dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")), dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")), dbsql.WithPort(443), dbsql.WithAuthenticator(authenticator), )
OAuth machine-to-machine (M2M) authentication
Databricks SQL Driver for Go versions 1.5.2 and above support OAuth machine-to-machine (M2M) authentication.
To use the Databricks SQL Driver for Go with OAuth M2M authentication, follow the setup steps in OAuth machine-to-machine (M2M) authentication.
To authenticate the Databricks SQL Driver for Go with a DSN connection string and the code example in Connect with a DSN connection string, use the following DSN connection string syntax, where:
<server-hostname>
is the Server Hostname value from the requirements.<port-number>
is the Port value from the requirements, which is typically443
.<http-path>
is the HTTP Path value from the requirements.<client-id>
is the service principal’s Application (client) ID value.<client-secret>
is the Secret value for the service principal’s OAuth secret. (Microsoft Entra ID secrets are not supported for OAuth M2M authentication.)
You can also append one or more Optional parameters listed previously in this article.
<server-hostname>:<port-number>/<http-path>?authType=OAuthM2M&clientID=<client-id>&clientSecret=<client-secret>
To authenticate the Databricks SQL Driver for Go with the
NewConnector
function, you must first add the following to yourimport
declaration:"github.com/databricks/databricks-sql-go/auth/oauth/m2m"
Then use the following code snippet and the code example in Connect with the NewConnector function, which assumes that you have set the following environment variables:
DATABRICKS_SERVER_HOSTNAME
set to the Server Hostname value for your cluster or SQL warehouse.DATABRICKS_HTTP_PATH
, set to HTTP Path value for your cluster or SQL warehouse.DATABRICKS_CLIENT_ID
, set to the service principal’s Application (client) ID value.DATABRICKS_CLIENT_SECRET
, set to the Secret value for the service principal’s OAuth secret. (Microsoft Entra ID secrets are not supported for OAuth M2M authentication.)
To set environment variables, see your operating system’s documentation.
authenticator := m2m.NewAuthenticator( os.Getenv("DATABRICKS_CLIENT_ID"), os.Getenv("DATABRICKS_CLIENT_SECRET"), os.Getenv("DATABRICKS_SERVER_HOSTNAME"), ) connector, err := dbsql.NewConnector( dbsql.WithServerHostname(os.Getenv("DATABRICKS_SERVER_HOSTNAME")), dbsql.WithHTTPPath(os.Getenv("DATABRICKS_HTTP_PATH")), dbsql.WithPort(443), dbsql.WithAuthenticator(authenticator), )
Query data
The following code example demonstrates how to call the Databricks SQL Driver for Go to run a basic SQL query on an Azure Databricks compute resource. This command returns the first two rows from the trips
table in the samples
catalog’s nyctaxi
schema.
This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN
.
package main
import (
"database/sql"
"fmt"
"os"
"time"
_ "github.com/databricks/databricks-sql-go"
)
func main() {
dsn := os.Getenv("DATABRICKS_DSN")
if dsn == "" {
panic("No connection string found." +
"Set the DATABRICKS_DSN environment variable, and try again.")
}
db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()
var (
tpep_pickup_datetime time.Time
tpep_dropoff_datetime time.Time
trip_distance float64
fare_amount float64
pickup_zip int
dropoff_zip int
)
rows, err := db.Query("SELECT * FROM samples.nyctaxi.trips LIMIT 2")
if err != nil {
panic(err)
}
defer rows.Close()
fmt.Print("tpep_pickup_datetime,",
"tpep_dropoff_datetime,",
"trip_distance,",
"fare_amount,",
"pickup_zip,",
"dropoff_zip\n")
for rows.Next() {
err := rows.Scan(&tpep_pickup_datetime,
&tpep_dropoff_datetime,
&trip_distance,
&fare_amount,
&pickup_zip,
&dropoff_zip)
if err != nil {
panic(err)
}
fmt.Print(tpep_pickup_datetime, ",",
tpep_dropoff_datetime, ",",
trip_distance, ",",
fare_amount, ",",
pickup_zip, ",",
dropoff_zip, "\n")
}
err = rows.Err()
if err != nil {
panic(err)
}
}
Output:
tpep_pickup_datetime,tpep_dropoff_datetime,trip_distance,fare_amount,pickup_zip,dropoff_zip
2016-02-14 16:52:13 +0000 UTC,2016-02-14 17:16:04 +0000 UTC,4.94,19,10282,10171
2016-02-04 18:44:19 +0000 UTC,2016-02-04 18:46:00 +0000 UTC,0.28,3.5,10110,10110
For additional examples, see the examples folder in the databricks/databricks-sql-go repository on GitHub.
Logging
Use github.com/databricks/databricks-sql-go/logger
to log messages that the Databricks SQL Driver for Go emits. The following code example uses sql.Open()
to create a database handle through a DSN connection string. This code example retrieves the DSN connection string from an environment variable named DATABRICKS_DSN
. All log messages that are emitted at the debug
level and below are written to the results.log
file.
package main
import (
"database/sql"
"io"
"log"
"os"
_ "github.com/databricks/databricks-sql-go"
dbsqllog "github.com/databricks/databricks-sql-go/logger"
)
func main() {
dsn := os.Getenv("DATABRICKS_DSN")
// Use the specified file for logging messages to.
file, err := os.Create("results.log")
if err != nil {
log.Fatal(err)
}
defer file.Close()
writer := io.Writer(file)
// Log messages at the debug level and below.
if err := dbsqllog.SetLogLevel("debug"); err != nil {
log.Fatal(err)
}
// Log messages to the file.
dbsqllog.SetLogOutput(writer)
if dsn == "" {
panic("Error: Cannot connect. No connection string found. " +
"Set the DATABRICKS_DSN environment variable, and try again.")
}
db, err := sql.Open("databricks", dsn)
if err != nil {
panic(err)
}
defer db.Close()
if err := db.Ping(); err != nil {
panic(err)
}
}
Additional resources
- The Databricks SQL Driver for Go repository on GitHub
- The database/sql package home page
- The Databricks SQL Driver for Go examples on GitHub
Feedback
https://aka.ms/ContentUserFeedback.
Coming soon: Throughout 2024 we will be phasing out GitHub Issues as the feedback mechanism for content and replacing it with a new feedback system. For more information see:Submit and view feedback for