Secure cluster connectivity (No Public IP / NPIP)

With secure cluster connectivity enabled, customer virtual networks have no open ports and Databricks Runtime cluster nodes have no public IP addresses. Secure cluster connectivity is also known as No Public IP (NPIP).

  • At a network level, each cluster initiates a connection to the control plane secure cluster connectivity relay during cluster creation. The cluster establishes this connection using port 443 (HTTPS) and uses a different IP address than the one used for the web application and REST API.
  • When the control plane logically starts new Databricks Runtime jobs or performs other cluster administration tasks, these requests are sent to the cluster through this reverse tunnel.
  • The data plane (the VNet) has no open ports, and Databricks Runtime cluster nodes have no public IP addresses.

Benefits:

  • Easy network administration, with no need to configure ports on security groups or to configure network peering.
  • With enhanced security and simple network administration, information security teams can expedite approval of Databricks as a PaaS provider.

Note

All Azure Databricks network traffic between the data plane VNet and the Azure Databricks control plane goes across the Microsoft network backbone, not the public Internet. This is true even if secure cluster connectivity is disabled.


Use secure cluster connectivity

To use secure cluster connectivity with a new Azure Databricks workspace, use any of the following options.

  • Azure portal: When you provision the workspace, go to the Networking tab and set the option Deploy Azure Databricks workspace with Secure Cluster Connectivity (No Public IP) to Yes.
  • ARM Templates: For the Microsoft.Databricks/workspaces resource that creates your new workspace, set the enableNoPublicIp Boolean parameter to true.
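As an illustration, a workspace resource with secure cluster connectivity enabled might look like the following sketch. The workspace name, managed resource group name, and API version are placeholders; check the current Microsoft.Databricks/workspaces template reference for the API version you use.

```json
{
  "type": "Microsoft.Databricks/workspaces",
  "apiVersion": "2018-04-01",
  "name": "my-workspace",
  "location": "[resourceGroup().location]",
  "sku": { "name": "premium" },
  "properties": {
    "managedResourceGroupId": "[subscriptionResourceId('Microsoft.Resources/resourceGroups', 'my-workspace-managed-rg')]",
    "parameters": {
      "enableNoPublicIp": { "value": true }
    }
  }
}
```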

Important

In either case, you must register the Azure Resource Provider Microsoft.ManagedIdentity in the Azure subscription that is used to launch workspaces with secure cluster connectivity. This is a one-time operation per subscription. For instructions, see Azure resource providers and types.
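One way to perform this one-time registration is with the Azure CLI; this is a sketch, and you can also register the provider from the Azure portal.

```
# Register the Microsoft.ManagedIdentity resource provider in the current subscription
az provider register --namespace Microsoft.ManagedIdentity

# Verify the registration state; it should eventually report "Registered"
az provider show --namespace Microsoft.ManagedIdentity --query registrationState --output tsv
```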

You cannot add secure cluster connectivity to an existing workspace. To migrate your resources to a new workspace that uses secure cluster connectivity, contact your Microsoft or Databricks account team.

If you’re using ARM templates, add the parameter to one of the following templates, depending on whether you want Azure Databricks to create a default (managed) virtual network for the workspace or to provide your own virtual network, also known as VNet injection. VNet injection is an optional feature that lets you host new Azure Databricks clusters in your own VNet.
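For a VNet-injected workspace, the enableNoPublicIp parameter sits alongside the custom VNet parameters in the workspace resource's properties. The following fragment is a sketch; the subscription, resource group, VNet, and subnet names are placeholders.

```json
"properties": {
  "managedResourceGroupId": "[subscriptionResourceId('Microsoft.Resources/resourceGroups', 'my-workspace-managed-rg')]",
  "parameters": {
    "enableNoPublicIp": { "value": true },
    "customVirtualNetworkId": { "value": "[resourceId('Microsoft.Network/virtualNetworks', 'my-vnet')]" },
    "customPublicSubnetName": { "value": "public-subnet" },
    "customPrivateSubnetName": { "value": "private-subnet" }
  }
}
```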

Egress from workspace subnets

When you enable secure cluster connectivity, both of your workspace subnets are private subnets, since cluster nodes do not have public IP addresses.

The implementation details of network egress vary based on whether you use the default (managed) VNet or whether you use the optional VNet injection feature to provide your own VNet in which to deploy your workspace. See the following sections for details.

Important

Additional costs may be incurred due to increased egress traffic when you use secure cluster connectivity. For a smaller organization that needs a cost-optimized solution, it may be acceptable to disable secure cluster connectivity when you deploy your workspace. However, for the most secure deployment, Microsoft and Databricks strongly recommend that you enable secure cluster connectivity.

Egress with default (managed) VNet

If you use secure cluster connectivity with the default VNet that Azure Databricks creates, Azure Databricks automatically creates a NAT gateway for outbound traffic from your workspace’s subnets to the Azure backbone and public network. The NAT gateway is created within the workspace’s managed resource group. You cannot modify this resource group or any resources provisioned within it.

The automatically created NAT gateway incurs additional cost.

Egress with VNet injection

If you use secure cluster connectivity with optional VNet injection to provide your own VNet, ensure that your workspace has a stable egress public IP and choose one of the following options:

  • For simple deployments, choose an egress load balancer, also called an outbound load balancer. The load balancer’s configuration is managed by Azure Databricks. Clusters have a stable public IP, but you cannot modify the configuration for custom egress needs. This Azure template-only solution has the following requirements:
    • Azure Databricks expects additional fields in the ARM template that creates the workspace: loadBalancerName (load balancer name), loadBalancerBackendPoolName (load balancer backend pool name), loadBalancerFrontendConfigName (load balancer frontend configuration name), and loadBalancerPublicIpName (load balancer public IP name).
    • Azure Databricks expects the Microsoft.Databricks/workspaces resource to have parameters loadBalancerId (load balancer ID) and loadBalancerBackendPoolName (load balancer backend pool name).
    • Azure Databricks does not support changing the configuration of the load balancer.
  • For deployments that need some customization, choose an Azure NAT gateway. Configure the gateway on both of the workspace’s subnets to ensure that all outbound traffic to the Azure backbone and public network transits through it. Clusters have a stable egress public IP, and you can modify the configuration for custom egress needs. You can implement this solution using either an Azure template or from the Azure portal.
  • For deployments with complex routing requirements or deployments that use VNet injection with an egress firewall such as Azure Firewall or other custom networking architectures, you can use custom routes called user-defined routes (UDRs). UDRs ensure that network traffic is routed correctly for your workspace, either directly to the required endpoints or through an egress firewall. If you use such a solution, you must add direct routes or allowed firewall rules for the Azure Databricks secure cluster connectivity relay and other required endpoints listed at User-defined route settings for Azure Databricks.
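For the egress load balancer option described above, the workspace resource's parameters block might reference the load balancer like the following sketch. The loadBalancerName and loadBalancerBackendPoolName values are assumed to be template parameters you define, and the surrounding template must also create the load balancer, its backend pool, frontend configuration, and public IP.

```json
"properties": {
  "parameters": {
    "enableNoPublicIp": { "value": true },
    "loadBalancerId": { "value": "[resourceId('Microsoft.Network/loadBalancers', parameters('loadBalancerName'))]" },
    "loadBalancerBackendPoolName": { "value": "[parameters('loadBalancerBackendPoolName')]" }
  }
}
```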