Snowflake is a cloud-based SQL data warehouse that focuses on great performance, zero-tuning, diversity of data sources, and security. This article explains how to read data from and write data to Snowflake using the Azure Databricks Snowflake connector.
Azure Databricks and Snowflake have partnered to provide a first-class connector experience for customers of both products. You do not have to import and load libraries into your clusters, which prevents version conflicts and misconfiguration.
The Azure Databricks Snowflake connector is available in Databricks Runtime 4.2 and above.
Use the Azure Databricks Snowflake connector
The following notebooks provide simple examples of how to write data to and read data from Snowflake. See Using the Connector in the Snowflake documentation for more details.
Avoid exposing your Snowflake username and password in notebooks by using the Secrets feature, which is demonstrated in the sample notebooks below.
Snowflake Scala notebook
Snowflake Python notebook
Snowflake R notebook
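The read and write pattern that these notebooks demonstrate can be sketched in Python as follows. This is a minimal illustration, not a copy of the notebooks: the helper names, the secret scope, and the secret key names (`snowflake-user`, `snowflake-password`) are hypothetical, and `spark` and `dbutils` are assumed to be the objects that an Azure Databricks notebook provides.

```python
def snowflake_options(get_secret, scope, url, database, schema, warehouse):
    """Build the connector options, reading credentials through get_secret.

    get_secret is any callable taking (scope, key); in a notebook you would
    pass dbutils.secrets.get. The key names below are hypothetical.
    """
    return {
        "sfUrl": url,
        "sfUser": get_secret(scope, "snowflake-user"),
        "sfPassword": get_secret(scope, "snowflake-password"),
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }


def read_table(spark, options, table):
    """Load a Snowflake table into a Spark DataFrame."""
    return (
        spark.read.format("snowflake")
        .options(**options)
        .option("dbtable", table)
        .load()
    )


def write_table(df, options, table, mode="overwrite"):
    """Write a Spark DataFrame to a Snowflake table."""
    (
        df.write.format("snowflake")
        .options(**options)
        .option("dbtable", table)
        .mode(mode)
        .save()
    )


# Example use inside a notebook (all values are placeholders):
# options = snowflake_options(
#     dbutils.secrets.get, "my-scope",
#     "<account>.snowflakecomputing.com", "MY_DB", "PUBLIC", "MY_WH",
# )
# write_table(spark.range(5), options, "MY_TABLE")
# df = read_table(spark, options, "MY_TABLE")
```

Passing the secret lookup in as a callable keeps the credentials out of the notebook source and lets you check the option-building logic without a cluster.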
Train a machine learning model using Snowflake
Demo Snowflake notebook
Frequently asked questions (FAQ)
Why don’t my Spark DataFrame columns appear in the same order in Snowflake?
The Snowflake Connector for Spark doesn't respect the order of the columns in the table being written to; you must explicitly specify the mapping between DataFrame and Snowflake columns. To specify this mapping, use the `columnmap` parameter.
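As a rough sketch of how such a mapping might be passed, the example below builds the `columnmap` value from a Python dictionary. It assumes the parameter takes a single string of the form `Map(src -> dst, ...)`, as described in the Snowflake connector documentation; the helper names and column names are hypothetical.

```python
def build_columnmap(mapping):
    """Format {dataframe_column: snowflake_column} as a columnmap string."""
    pairs = ", ".join(f"{src} -> {dst}" for src, dst in mapping.items())
    return f"Map({pairs})"


def write_with_columnmap(df, options, table, mapping):
    """Append df to an existing Snowflake table using an explicit mapping.

    Append mode is an assumption here: the mapping targets the columns of a
    table that already exists.
    """
    (
        df.write.format("snowflake")
        .options(**options)
        .option("dbtable", table)
        .option("columnmap", build_columnmap(mapping))
        .mode("append")
        .save()
    )


# Example (hypothetical column names):
# build_columnmap({"one": "ONE_COLUMN", "two": "TWO_COLUMN"})
```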
Why is INTEGER data written to Snowflake always read back as DECIMAL?
Snowflake represents all INTEGER types as NUMBER, which can cause a change in data type when you write data to and read data from Snowflake. For example, INTEGER data can be converted to DECIMAL when writing to Snowflake, because INTEGER and DECIMAL are semantically equivalent in Snowflake (see Snowflake Numeric Data Types).
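If downstream code needs integer columns again, one option is to cast zero-scale decimal columns back after reading. The following is a minimal sketch, not a connector feature: the helper name is hypothetical, it relies on the `(name, type)` pairs returned by a Spark DataFrame's `dtypes` attribute, and it assumes every value fits in the target type.

```python
import re

# Matches Spark type strings such as "decimal(38,0)" (scale of zero only).
_ZERO_SCALE_DECIMAL = re.compile(r"^decimal\(\d+,\s*0\)$")


def integer_cast_exprs(dtypes, target="BIGINT"):
    """Return selectExpr expressions that cast zero-scale decimals to target.

    dtypes is a list of (column_name, type_string) pairs, as in df.dtypes.
    Columns of any other type are passed through unchanged.
    """
    exprs = []
    for name, dtype in dtypes:
        if _ZERO_SCALE_DECIMAL.match(dtype):
            exprs.append(f"CAST(`{name}` AS {target}) AS `{name}`")
        else:
            exprs.append(f"`{name}`")
    return exprs


# Example use on a DataFrame read from Snowflake:
# df = df.selectExpr(*integer_cast_exprs(df.dtypes))
```

Casting is only safe when you know the range of the data; a NUMBER column can hold values larger than a 64-bit integer.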
Why are the fields in my Snowflake table schema always uppercase?
Snowflake uses uppercase fields by default, which means that the table schema is converted to uppercase.
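If the rest of a pipeline expects lowercase column names, one approach is to rename the columns after reading. The sketch below is an illustration under that assumption; the helper name is hypothetical, and it refuses to rename when two columns would collide after lowercasing.

```python
def lowercase_columns(columns):
    """Return lowercase versions of the column names, rejecting collisions."""
    lowered = [name.lower() for name in columns]
    if len(set(lowered)) != len(lowered):
        raise ValueError("column names collide after lowercasing")
    return lowered


# Example use on a DataFrame read from Snowflake:
# df = df.toDF(*lowercase_columns(df.columns))
```

The Snowflake connector documentation also describes a `keep_column_case` option that affects how column names are written; check it against the connector version you are running before relying on it.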