Configure double encryption for DBFS root

Note

This feature is available only in the Premium plan.

Databricks File System (DBFS) is a distributed file system mounted into an Azure Databricks workspace and available on Azure Databricks clusters. DBFS is implemented as a storage account in your Azure Databricks workspace’s managed resource group. The default location in DBFS is known as the DBFS root.

Azure Storage automatically encrypts all data in the workspace storage account, including DBFS root storage, at the service level using 256-bit AES encryption. This is one of the strongest block ciphers available and is FIPS 140-2 compliant. If you require higher levels of assurance that your data is secure, you can also enable 256-bit AES encryption at the Azure Storage infrastructure level. When infrastructure encryption is enabled, data in a storage account is encrypted twice, once at the service level and once at the infrastructure level, with two different encryption algorithms and two different keys. Double encryption of Azure Storage data protects against a scenario where one of the encryption algorithms or keys is compromised. In this scenario, the additional layer of encryption continues to protect your data.

This article describes how to create a workspace with infrastructure encryption (and therefore double encryption) enabled for the workspace storage account. You must enable infrastructure encryption when you create the workspace; you cannot add it to an existing workspace.

Requirements

Create a workspace with double encryption using the Azure portal

Follow the instructions for creating a workspace using the Azure portal in Quickstart: Run a Spark job on Azure Databricks Workspace using the Azure portal, adding these steps:

  1. In PowerShell, run the following commands. The first registers the feature that lets you enable infrastructure encryption in the Azure portal; the second reports the feature's registration state.

    Register-AzProviderFeature -ProviderNamespace Microsoft.Storage -FeatureName AllowRequireInfraStructureEncryption
    
    Get-AzProviderFeature -ProviderNamespace Microsoft.Storage -FeatureName AllowRequireInfraStructureEncryption
    
  2. On the Create an Azure Databricks workspace page (Create a resource > Analytics > Azure Databricks), click the Advanced tab.

  3. Next to Enable Infrastructure Encryption, select Yes.

    (Screenshot: Enable double encryption at workspace creation)

  4. When you have finished your workspace configuration and created the workspace, verify that infrastructure encryption is enabled.

    In the resource page for the Azure Databricks workspace, go to the sidebar menu and select Settings > Encryption. Confirm that Enable Infrastructure Encryption is selected.

    (Screenshot: Verify double encryption after workspace creation)
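
Registration in step 1 is not always immediate: the feature can report a Registering state for a while before it becomes Registered. The following is a minimal sketch of waiting for it from a shell rather than PowerShell. It assumes the Azure CLI is installed and signed in, and that az feature show reports the state at properties.state. The helper name wait_for_feature and its retry defaults are hypothetical.

```shell
# Hypothetical helper: poll a provider feature until it reports "Registered".
# Usage: wait_for_feature <namespace> <feature-name> [max-attempts] [delay-seconds]
# Assumes the Azure CLI (az) is installed and you are already signed in.
wait_for_feature() {
  namespace=$1
  feature=$2
  attempts=${3:-30}   # assumed default: 30 checks
  delay=${4:-10}      # assumed default: 10 seconds between checks
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Ask Azure for the current registration state of the feature.
    state=$(az feature show --namespace "$namespace" --name "$feature" \
      --query properties.state --output tsv)
    if [ "$state" = "Registered" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "Feature $feature is still '$state' after $attempts checks" >&2
  return 1
}
```

For example, calling wait_for_feature Microsoft.Storage AllowRequireInfraStructureEncryption returns successfully once the feature is registered, or fails after the default number of checks.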

Create a workspace with double encryption using PowerShell

Follow the instructions in Quickstart: Create an Azure Databricks workspace using PowerShell, adding the option -RequireInfrastructureEncryption to the command you run in the Create an Azure Databricks workspace step.

For example:

New-AzDatabricksWorkspace -Name databricks-test -ResourceGroupName testgroup -Location eastus -ManagedResourceGroupName databricks-group -Sku premium -RequireInfrastructureEncryption

After your workspace is created, verify that infrastructure encryption is enabled by running:

Get-AzDatabricksWorkspace -Name <workspace-name> -ResourceGroupName <resource-group> | Format-List

RequireInfrastructureEncryption should be set to true.

For more information about PowerShell cmdlets for Azure Databricks workspaces, see the Az.Databricks module reference.

Create a workspace with double encryption using the Azure CLI

When you create a workspace using the Azure CLI, include the option --require-infrastructure-encryption.

For example:

az databricks workspace create --name <workspace-name> --location <workspace-location> --resource-group <resource-group> --sku premium --require-infrastructure-encryption

After your workspace is created, verify that infrastructure encryption is enabled by running:

az databricks workspace show --name <workspace-name> --resource-group <resource-group>

The requireInfrastructureEncryption field should be present in the encryption property and set to true.
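
To turn this check into something a script can act on, you can pipe the command's JSON output through a small filter. The sketch below is an illustration under stated assumptions, not part of the Azure CLI: the function name infra_encryption_enabled is hypothetical, and because the exact nesting of the field may differ between CLI versions, it matches the field anywhere in the output, whether it is a bare true or wrapped in an object with a value property.

```shell
# Hypothetical helper: exit 0 when the JSON on stdin contains
# "requireInfrastructureEncryption" set to true, either as a bare boolean
# or wrapped in an object such as {"type": "Bool", "value": true}.
# This is a plain text match, not a full JSON parser.
#
# Intended use (replace the placeholders with your own values):
#   az databricks workspace show --name <workspace-name> \
#     --resource-group <resource-group> | infra_encryption_enabled
infra_encryption_enabled() {
  # Remove all whitespace so the match does not depend on JSON formatting.
  tr -d ' \t\r\n' |
    grep -Eq '"requireInfrastructureEncryption":(true|\{[^{}]*"value":true)'
}
```

The function prints nothing; it only sets the exit status, so it can be used directly in an if statement or followed by && and ||.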

For more information about Azure CLI commands for Azure Databricks workspaces, see the az databricks workspace command reference.