Network traffic flow when using a secured workspace

When your Azure Machine Learning workspace and associated resources are secured in an Azure Virtual Network, the network traffic between resources changes. Without a virtual network, traffic flows over the public internet or within an Azure data center. Once a virtual network (VNet) is introduced, you may also want to harden network security, for example by blocking inbound and outbound communications between the VNet and the public internet. However, Azure Machine Learning requires access to some resources on the public internet. For example, Azure Resource Manager is used for deployments and management operations.

This article lists the required traffic to/from the public internet. It also explains how network traffic flows between your client development environment and a secured Azure Machine Learning workspace in the following scenarios:

  • Using Azure Machine Learning studio to work with:

    • Your workspace
    • AutoML
    • Designer
    • Datasets and datastores

    Tip

    Azure Machine Learning studio is a web-based UI that runs partially in your web browser, and makes calls to Azure services to perform tasks such as training a model, using designer, or viewing datasets. Some of these calls use a different communication flow than if you are using the SDK, CLI, REST API, or VS Code.

  • Using Azure Machine Learning studio, SDK, CLI, or REST API to work with:

    • Compute instances and clusters
    • Azure Kubernetes Service
    • Docker images managed by Azure Machine Learning

Tip

If a scenario or task is not listed here, it should work the same with or without a secured workspace.

Assumptions

This article assumes the following configuration:

  • Azure Machine Learning workspace using a private endpoint to communicate with the VNet.
  • The Azure Storage Account, Key Vault, and Container Registry used by the workspace also use a private endpoint to communicate with the VNet.
  • A VPN gateway or ExpressRoute is used by the client workstations to access the VNet.

Inbound and outbound requirements

For each scenario, the required inbound traffic, required outbound traffic, and additional configuration are as follows:

  • Access workspace from studio

    • Required inbound: NA
    • Required outbound: Azure Active Directory, Azure Front Door, Azure Machine Learning service
    • Additional configuration: You may need to use a custom DNS server. For more information, see Use your workspace with a custom DNS.

  • Use AutoML, designer, dataset, and datastore from studio

    • Required inbound: NA
    • Required outbound: NA
    • Additional configuration: Workspace service principal configuration, and allow access from trusted Azure services. For more information, see How to secure a workspace in a virtual network.

  • Use compute instance and compute cluster

    • Required inbound: Azure Machine Learning service on port 44224, Azure Batch Management service on ports 29876-29877
    • Required outbound: Azure Active Directory, Azure Resource Manager, Azure Machine Learning service, Azure Storage Account, Azure Key Vault
    • Additional configuration: If you use a firewall, create user-defined routes. For more information, see Configure inbound and outbound traffic.

  • Use Azure Kubernetes Service

    • Required inbound: NA
    • Required outbound: For information on the outbound configuration for AKS, see How to deploy to Azure Kubernetes Service.
    • Additional configuration: Configure the Internal Load Balancer. For more information, see How to deploy to Azure Kubernetes Service.

  • Use Docker images managed by Azure Machine Learning

    • Required inbound: NA
    • Required outbound: Microsoft Container Registry, viennaglobal.azurecr.io global container registry
    • Additional configuration: If the Azure Container Registry for your workspace is behind the VNet, configure the workspace to use a compute cluster to build images. For more information, see How to secure a workspace in a virtual network.

Important

Azure Machine Learning uses multiple storage accounts. Each stores different data, and has a different purpose:

  • Your storage: The Azure Storage Account(s) in your Azure subscription are used to store your data and artifacts such as models, training data, training logs, and Python scripts. For example, the default storage account for your workspace is in your subscription. The Azure Machine Learning compute instance and compute clusters access file and blob data in this storage over ports 445 (SMB) and 443 (HTTPS).

    When using a compute instance or compute cluster, your storage account is mounted as a file share using the SMB protocol. This is how the compute instance/cluster accesses your data.

  • Microsoft storage: The Azure Machine Learning compute instance and compute clusters rely on Azure Batch, and access storage located in a Microsoft subscription. This storage is used only for the management of the compute instance/cluster. None of your data is stored here. The compute instance and compute cluster access the blob, table, and queue data in this storage, using port 443 (HTTPS).

Machine Learning also stores metadata in an Azure Cosmos DB instance. By default, this instance is hosted in a Microsoft subscription and managed by Microsoft. You can optionally use an Azure Cosmos DB instance in your Azure subscription. For more information, see Data encryption with Azure Machine Learning.
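The storage ports named above (445 for SMB and 443 for HTTPS) can be probed from a machine inside the VNet. The following is a minimal sketch that uses only the Python standard library. The helper names `can_connect` and `check_storage_ports` are illustrative, and the host you pass in is whatever storage endpoint name applies in your environment; nothing here is part of the Azure Machine Learning SDK.

```python
import socket
from typing import Dict

# Ports taken from the text above: SMB (445) and HTTPS (443).
STORAGE_PORTS = {"SMB": 445, "HTTPS": 443}


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_storage_ports(host: str) -> Dict[str, bool]:
    """Probe each storage port on host and report which ones are reachable."""
    return {name: can_connect(host, port) for name, port in STORAGE_PORTS.items()}
```

A successful TCP connection only shows that the port is reachable. It does not confirm that authentication or the file share mount will succeed.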

Scenario: Access workspace from studio

Note

The information in this section is specific to using the workspace from the Azure Machine Learning studio. If you use the Azure Machine Learning SDK, REST API, CLI, or Visual Studio Code, the information in this section does not apply to you.

When accessing your workspace from studio, the network traffic flows are as follows:

  • To authenticate to resources, Azure Active Directory is used.
  • For management and deployment operations, Azure Resource Manager is used.
  • For Azure Machine Learning specific tasks, Azure Machine Learning service is used.
  • For access to Azure Machine Learning studio (https://ml.azure.com), Azure Front Door is used.
  • For most storage operations, traffic flows through the private endpoint of the default storage for your workspace. Exceptions are discussed in the Use AutoML, designer, dataset, and datastore section.
  • You also need to configure a DNS solution that allows you to resolve the names of the resources within the VNet. For more information, see Use your workspace with a custom DNS.
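One way to sanity-check the DNS requirement is to confirm that a resource name resolves to a private address when queried from a client on the VNet. The sketch below assumes that a correctly configured private endpoint resolves to a private IP address. The function names are illustrative, and the resolver is injectable so the logic can be exercised without network access.

```python
import ipaddress
import socket
from typing import Callable, List


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver reports for hostname."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def resolves_privately(
    hostname: str, resolver: Callable[[str], List[str]] = resolve_ipv4
) -> bool:
    """True if hostname resolves and every returned address is private."""
    addresses = resolver(hostname)
    return bool(addresses) and all(
        ipaddress.ip_address(address).is_private for address in addresses
    )
```

For example, `resolves_privately("<your-resource-fqdn>")` returning False from a VNet client suggests that the name is still resolving to a public address and that the DNS configuration needs attention.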

Diagram of network traffic between client and workspace when using studio

Scenario: Use AutoML, designer, dataset, and datastore from studio

The following features of Azure Machine Learning studio use data profiling:

  • Dataset: Explore the dataset from studio.
  • Designer: Visualize module output data.
  • AutoML: View a data preview/profile and choose a target column.
  • Labeling

Data profiling depends on the Azure Machine Learning managed service being able to access the default Azure Storage Account for your workspace. The managed service does not exist in your VNet, so it cannot directly access the storage account in the VNet. Instead, the workspace uses a service principal to access storage.

Tip

You can provide a service principal when creating the workspace. If you do not, one is created for you and will have the same name as your workspace.

To allow access to the storage account, configure the storage account to allow a resource instance for your workspace or select the Allow Azure services on the trusted services list to access this storage account. This setting allows the managed service to access storage through the Azure data center network.

Next, assign the service principal for the workspace the Reader role on the private endpoint of the storage account. This role is used to verify the workspace and storage subnet information. If they are the same, access is allowed. Finally, the service principal also requires Storage Blob Data Contributor access to the storage account.

For more information, see the Azure Storage Account section of How to secure a workspace in a virtual network.
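The storage requirements for data profiling can be summarized as a small checklist. The sketch below is only a model of the steps described in this section. The class, its field names, and `missing_steps` are hypothetical, and the values would have to be filled in by hand or by your own tooling; it does not query Azure.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageAccessConfig:
    """Hypothetical record of the settings described in this section."""
    resource_instance_allowed: bool    # storage allows the workspace resource instance
    trusted_services_allowed: bool     # storage allows trusted Azure services
    reader_on_private_endpoint: bool   # service principal has Reader on the private endpoint
    blob_data_contributor: bool        # service principal has blob data contributor access


def missing_steps(cfg: StorageAccessConfig) -> List[str]:
    """List the configuration steps from this section that are not yet done."""
    missing = []
    # Either of the two storage account settings is sufficient.
    if not (cfg.resource_instance_allowed or cfg.trusted_services_allowed):
        missing.append("allow the workspace resource instance or trusted Azure services")
    if not cfg.reader_on_private_endpoint:
        missing.append("grant Reader on the storage private endpoint")
    if not cfg.blob_data_contributor:
        missing.append("grant blob data contributor access on the storage account")
    return missing
```

An empty list means all three requirements from this section are recorded as satisfied.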

Diagram of traffic between client, data profiling, and storage

Scenario: Use compute instance and compute cluster

Azure Machine Learning compute instance and compute cluster are managed services hosted by Microsoft. They are built on top of the Azure Batch service. While they exist in a Microsoft managed environment, they are also injected into your VNet.

When you create a compute instance or compute cluster, the following resources are also created in your VNet:

  • A Network Security Group with the required inbound rules. These rules allow inbound access from the Azure Machine Learning service (TCP on port 44224) and the Azure Batch service (TCP on ports 29876-29877).

    Important

    If you use a firewall to block internet access into the VNet, you must configure the firewall to allow this traffic. For example, with Azure Firewall you can create user-defined routes. For more information, see How to use Azure Machine Learning with a firewall.

  • A load balancer with a public IP.
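When reviewing a firewall or NSG configuration, it can help to compare the allowed inbound port ranges against the inbound ports listed above. The following sketch is illustrative only. `InboundRule`, `REQUIRED_INBOUND`, and `missing_inbound_rules` are hypothetical names, the `source` field is a descriptive label rather than an official service tag, and the check compares protocol and port range only (it ignores the traffic source).

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class InboundRule:
    source: str       # descriptive label, not an official service tag name
    protocol: str
    first_port: int
    last_port: int


# Inbound ports described in the text above.
REQUIRED_INBOUND = (
    InboundRule("Azure Machine Learning", "TCP", 44224, 44224),
    InboundRule("Azure Batch", "TCP", 29876, 29877),
)


def missing_inbound_rules(
    allowed: Iterable[Tuple[str, int, int]]
) -> List[InboundRule]:
    """Return required rules not covered by any (protocol, first, last) entry."""
    allowed = list(allowed)

    def covered(rule: InboundRule) -> bool:
        return any(
            protocol.upper() == rule.protocol
            and first <= rule.first_port
            and rule.last_port <= last
            for protocol, first, last in allowed
        )

    return [rule for rule in REQUIRED_INBOUND if not covered(rule)]
```

For example, `missing_inbound_rules([("TCP", 44224, 44224)])` reports the Azure Batch range as still missing.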

Also allow outbound access to the following service tags. For each tag, replace region with the Azure region of your compute instance/cluster:

  • Storage.region - This outbound access is used to connect to the Azure Storage Account inside the Azure Batch service-managed VNet.
  • Keyvault.region - This outbound access is used to connect to the Azure Key Vault account inside the Azure Batch service-managed VNet.
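The region substitution described above can be expressed as a small helper. This sketch follows the tag names exactly as written in this article. The function name is illustrative, and the region string is passed through unchanged, so use the region spelling that your Azure tooling expects.

```python
from typing import List


def required_outbound_tags(region: str) -> List[str]:
    """Build the outbound service tag names listed above for one region."""
    region = region.strip()
    if not region:
        raise ValueError("region must be a non-empty string")
    # Tag prefixes are copied from this article; the region is substituted as-is.
    return [f"Storage.{region}", f"Keyvault.{region}"]
```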

Data access from your compute instance or cluster goes through the private endpoint of the Storage Account for your VNet.

If you use Visual Studio Code on a compute instance, you must allow other outbound traffic. For more information, see How to use Azure Machine Learning with a firewall.

Diagram of traffic flow when using compute instance or cluster

Scenario: Use Azure Kubernetes Service

For information on the outbound configuration required for Azure Kubernetes Service, see the connectivity requirements section of How to deploy to Azure Kubernetes Service.

Note

The Azure Kubernetes Service load balancer is not the same as the load balancer created by Azure Machine Learning. If you want to host your model as a secured application, only available on the VNet, use the internal load balancer created by Azure Machine Learning. If you want to allow public access, use the public load balancer created by Azure Machine Learning.

If your model requires extra inbound or outbound connectivity, such as to an external data source, use a network security group or your firewall to allow the traffic.

Scenario: Use Docker images managed by Azure ML

Azure Machine Learning provides Docker images that can be used to train models or perform inference. If you don't specify your own images, the ones provided by Azure Machine Learning are used. These images are hosted on the Microsoft Container Registry (MCR). They are also hosted on a geo-replicated Azure Container Registry named viennaglobal.azurecr.io.

If you provide your own Docker images, such as on an Azure Container Registry that you provide, you do not need the outbound communication with MCR or viennaglobal.azurecr.io.
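To decide whether outbound access to the default registries is needed, you can inspect the registry host of each image reference you use. The sketch below is an approximation. It assumes that MCR images are referenced through the host mcr.microsoft.com, it uses a simplified version of Docker's rule for spotting a registry host in an image reference, and the function names are illustrative.

```python
from typing import Iterable

# Registries that host the images provided by Azure Machine Learning.
DEFAULT_REGISTRIES = {"mcr.microsoft.com", "viennaglobal.azurecr.io"}


def registry_host(image: str) -> str:
    """Return the registry host of an image reference (simplified parsing)."""
    first, separator, _ = image.partition("/")
    # A leading component is a registry host if it looks like a hostname.
    if separator and ("." in first or ":" in first or first == "localhost"):
        return first.lower()
    return "docker.io"  # Docker's default registry when no host is given


def needs_default_registry_access(images: Iterable[str]) -> bool:
    """True if any image, or the implicit default, comes from a default registry."""
    images = list(images)
    if not images:
        # No images specified: the provided images are used.
        return True
    return any(registry_host(image) in DEFAULT_REGISTRIES for image in images)
```

For example, a list containing only images from your own registry (a hypothetical `myregistry.azurecr.io/team/train:1`) returns False, while an empty list returns True because the provided images are used.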

Tip

If your Azure Container Registry is secured in the VNet, it cannot be used by Azure Machine Learning to build Docker images. Instead, you must designate an Azure Machine Learning compute cluster to build images. For more information, see How to secure a workspace in a virtual network.

Diagram of traffic flow when using provided Docker images

Next steps

Now that you've learned how network traffic flows in a secured configuration, learn more about securing Azure ML in a virtual network by reading the Virtual network isolation and privacy overview article.

For information on best practices, see the Azure Machine Learning best practices for enterprise security article.