Configure inbound and outbound network traffic
In this article, learn about the network communication requirements when securing an Azure Machine Learning workspace in a virtual network (VNet), including how to configure Azure Firewall to control access to your Azure Machine Learning workspace and the public internet. To learn more about securing Azure Machine Learning, see Enterprise security for Azure Machine Learning.
Note
The information in this article applies to an Azure Machine Learning workspace configured with a private endpoint.
Tip
This article is part of a series on securing an Azure Machine Learning workflow. See the other articles in this series:
- Virtual network overview
- Secure the workspace resources
- Secure the training environment
- For securing inference, see the following documents:
  - If using CLI v1 or SDK v1: Secure inference environment
  - If using CLI v2 or SDK v2: Network isolation for managed online endpoints
- Enable studio functionality
- Use custom DNS
Well-known ports
The following are well-known ports used by services listed in this article. If a port range is used in this article and isn't listed in this section, it's specific to the service and may not have published information on what it's used for:
| Port | Description |
|---|---|
| 80 | Unsecured web traffic (HTTP) |
| 443 | Secured web traffic (HTTPS) |
| 445 | SMB traffic used to access file shares in Azure File storage |
| 8787 | Used when connecting to RStudio on a compute instance |
Required public internet access
Azure Machine Learning requires both inbound and outbound access to the public internet. The following tables provide an overview of what access is required and what it is for. The protocol for all items is TCP. For service tags that end in .region, replace region with the Azure region that contains your workspace. For example, Storage.westus:
| Direction | Ports | Service tag | Purpose |
|---|---|---|---|
| Inbound | 29876-29877 | BatchNodeManagement | Create, update, and delete Azure Machine Learning compute instances and compute clusters. Not required if you use the No Public IP option. |
| Inbound | 44224 | AzureMachineLearning | Create, update, and delete Azure Machine Learning compute instances. Not required if you use the No Public IP option. |
| Outbound | 443 | AzureMonitor | Used to log monitoring and metrics to App Insights and Azure Monitor. |
| Outbound | 80, 443 | AzureActiveDirectory | Authentication using Azure AD. |
| Outbound | 443, 8787, 18881 | AzureMachineLearning | Using Azure Machine Learning services. |
| Outbound | 443 | AzureResourceManager | Creation of Azure resources with Azure Machine Learning. |
| Outbound | 443, 445 (*) | Storage.region | Access data stored in the Azure Storage Account for compute cluster and compute instance. (*) 445 is only required if you have a firewall between your virtual network for Azure ML and a private endpoint for your storage accounts. |
| Outbound | 443 | AzureFrontDoor.FrontEnd (not needed in Azure China) | Global entry point for Azure Machine Learning studio. Store images and environments for AutoML. |
| Outbound | 443 | AzureContainerRegistry.region | Access docker images provided by Microsoft. |
| Outbound | 443 | MicrosoftContainerRegistry.region (this tag has a dependency on the AzureFrontDoor.FirstParty tag) | Access docker images provided by Microsoft. Setup of the Azure Machine Learning router for Azure Kubernetes Service. |
| Outbound | 443 | AzureKeyVault.region | Access the key vault for the Azure Batch service. Only needed if your workspace was created with the hbi_workspace flag enabled. |
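The `.region` substitution described above can be illustrated with a short Python sketch. The function name `resolve_service_tag` is hypothetical and is not part of any Azure SDK; it only performs the string replacement.

```python
def resolve_service_tag(tag: str, region: str) -> str:
    """Replace a trailing '.region' placeholder with a concrete Azure region."""
    suffix = ".region"
    if tag.endswith(suffix):
        return tag[: -len(suffix)] + "." + region
    # Tags without the placeholder (for example, AzureMonitor) are global.
    return tag


tags = ["AzureMonitor", "Storage.region", "MicrosoftContainerRegistry.region"]
resolved = [resolve_service_tag(tag, "westus") for tag in tags]
print(resolved)
```

Tags that don't end in `.region` are returned unchanged, which matches how the table treats global tags.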
Tip
If you need the IP addresses instead of service tags, use one of the following options:
- Download a list from Azure IP Ranges and Service Tags.
- Use the Azure CLI az network list-service-tags command.
- Use the Azure PowerShell Get-AzNetworkServiceTag command.
The IP addresses may change periodically.
Important
When using a compute cluster that is configured for no public IP address, you must allow the following traffic:
- Inbound from source VirtualNetwork, any source port, to destination VirtualNetwork, destination ports 29876 and 29877.
- Inbound from source AzureLoadBalancer, any source port, to destination VirtualNetwork, destination port 44224.
You may also need to allow outbound traffic to Visual Studio Code and non-Microsoft sites for the installation of packages required by your machine learning project. The following table lists commonly used repositories for machine learning:
| Host name | Purpose |
|---|---|
| anaconda.com, *.anaconda.com | Used to install default packages. |
| *.anaconda.org | Used to get repo data. |
| pypi.org | Used to list dependencies from the default index, if any, and the index is not overwritten by user settings. If the index is overwritten, you must also allow *.pythonhosted.org. |
| cloud.r-project.org | Used when installing CRAN packages for R development. |
| *pytorch.org | Used by some examples based on PyTorch. |
| *.tensorflow.org | Used by some examples based on TensorFlow. |
| update.code.visualstudio.com, *.vo.msecnd.net | Used to retrieve VS Code server bits, which are installed on the compute instance through a setup script. |
| raw.githubusercontent.com/microsoft/vscode-tools-for-ai/master/azureml_remote_websocket_server/* | Used to retrieve websocket server bits, which are installed on the compute instance. The websocket server is used to transmit requests from Visual Studio Code client (desktop application) to Visual Studio Code server running on the compute instance. |
When using Azure Kubernetes Service (AKS) with Azure Machine Learning, allow the following traffic to the AKS VNet:
- General inbound/outbound requirements for AKS as described in the Restrict egress traffic in Azure Kubernetes Service article.
- Outbound to mcr.microsoft.com.
- When deploying a model to an AKS cluster, use the guidance in the Deploy ML models to Azure Kubernetes Service article.
Azure Firewall
Important
Azure Firewall provides security for Azure Virtual Network resources. Some Azure Services, such as Azure Storage Accounts, have their own firewall settings that apply to the public endpoint for that specific service instance. The information in this document is specific to Azure Firewall.
For information on service instance firewall settings, see Use studio in a virtual network.
For inbound traffic to Azure Machine Learning compute cluster and compute instance, use user-defined routes (UDRs) to skip the firewall.
For outbound traffic, create network and application rules.
These rule collections are described in more detail in What are some Azure Firewall concepts.
Inbound configuration
When using an Azure Machine Learning compute instance (with a public IP) or compute cluster, allow inbound traffic from Azure Batch management and Azure Machine Learning services. A compute instance with no public IP (preview) doesn't require this inbound communication. A network security group (NSG) that allows this traffic is dynamically created for you. However, if you have a firewall, you may also need to create user-defined routes (UDRs). When creating a UDR for this traffic, you can use either IP addresses or service tags to route the traffic.
Important
Using service tags with user-defined routes is now GA. For more information, see Virtual Network routing.
Tip
While a compute instance without a public IP (a preview feature) does not need a UDR for this inbound traffic, you will still need these UDRs if you also use a compute cluster or a compute instance with a public IP.
For the Azure Machine Learning service, you must add the IP addresses of both the primary and secondary regions. To find the secondary region, see Cross-region replication in Azure. For example, if your Azure Machine Learning service is in East US 2, the secondary region is Central US.
To get a list of IP addresses of the Batch service and Azure Machine Learning service, download the Azure IP Ranges and Service Tags and search the file for BatchNodeManagement.<region> and AzureMachineLearning.<region>, where <region> is your Azure region.
Important
The IP addresses may change over time.
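Searching the downloaded file can also be scripted. The following Python sketch assumes the Azure IP Ranges and Service Tags download is a JSON document with a top-level `values` list whose entries have a `name` and a `properties.addressPrefixes` list; verify this layout against the file you download. The helper name `prefixes_for_tags` is hypothetical, and the sample data uses made-up documentation address ranges.

```python
import json


def prefixes_for_tags(service_tags, tag_names):
    """Return {tag name: [address prefixes]} for the requested service tags."""
    # Compare case-insensitively, because region casing in the file may differ
    # from the lowercase region names used elsewhere.
    wanted = {name.lower() for name in tag_names}
    found = {}
    for entry in service_tags.get("values", []):
        name = entry.get("name", "")
        if name.lower() in wanted:
            found[name] = entry.get("properties", {}).get("addressPrefixes", [])
    return found


# Made-up sample that mimics the assumed file layout. The prefixes are
# documentation-only ranges, not real Azure addresses.
SAMPLE = json.loads("""
{
  "values": [
    {"name": "BatchNodeManagement.EastUS2",
     "properties": {"addressPrefixes": ["203.0.113.0/28"]}},
    {"name": "AzureMachineLearning.EastUS2",
     "properties": {"addressPrefixes": ["198.51.100.0/28", "198.51.100.16/28"]}},
    {"name": "Storage.EastUS2",
     "properties": {"addressPrefixes": ["192.0.2.0/24"]}}
  ]
}
""")

result = prefixes_for_tags(
    SAMPLE, ["BatchNodeManagement.eastus2", "AzureMachineLearning.eastus2"]
)
for tag, prefixes in result.items():
    print(tag, prefixes)
```

For the Azure Machine Learning service, run the same lookup for the secondary region as well (for example, Central US when the workspace is in East US 2). Because the addresses may change, rerun the lookup whenever you download a new file.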
When creating the UDR, set the Next hop type to Internet. With this setting, inbound communication from Azure skips your firewall and reaches the public-IP load balancers of the compute instance and compute cluster directly. The UDR is required because the compute instance and compute cluster receive random public IP addresses when they're created, so you can't register those addresses on your firewall in advance to allow the inbound traffic from Azure.
[Image: example IP address-based UDR in the Azure portal]
For information on configuring UDR, see Route network traffic with a routing table.
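As a rough sketch of the UDR setup described above, the following Python snippet generates one `az network route-table route create` command per address prefix, each with the next hop type set to Internet. The resource group, route table name, route name prefix, and address prefixes are all placeholder assumptions. The snippet only builds command strings and doesn't run them.

```python
import shlex


def udr_commands(resource_group, route_table, prefixes, name_prefix="aml-inbound"):
    """Build Azure CLI commands that add one Internet-next-hop route per prefix."""
    commands = []
    for index, prefix in enumerate(prefixes):
        args = [
            "az", "network", "route-table", "route", "create",
            "--resource-group", resource_group,
            "--route-table-name", route_table,
            "--name", f"{name_prefix}-{index}",
            "--address-prefix", prefix,
            # Internet as the next hop lets this traffic skip the firewall.
            "--next-hop-type", "Internet",
        ]
        commands.append(shlex.join(args))
    return commands


# Placeholder values; replace them with your own resources and the prefixes
# you collected for BatchNodeManagement and AzureMachineLearning.
commands = udr_commands("my-rg", "my-route-table", ["203.0.113.0/28", "198.51.100.0/28"])
for command in commands:
    print(command)
```

Review the generated commands before running them, and regenerate them when the published IP addresses change.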
Outbound configuration
Add Network rules, allowing traffic to and from the following service tags:
| Service tag | Protocol | Port |
|---|---|---|
| AzureActiveDirectory | TCP | 80, 443 |
| AzureMachineLearning | TCP | 443, 8787, 18881 |
| AzureResourceManager | TCP | 443 |
| Storage.region | TCP | 443 |
| AzureFrontDoor.FrontEnd (not needed in Azure China) | TCP | 443 |
| AzureContainerRegistry.region | TCP | 443 |
| MicrosoftContainerRegistry.region (this tag has a dependency on the AzureFrontDoor.FirstParty tag) | TCP | 443 |
| AzureKeyVault.region | TCP | 443 |
Tip
- AzureContainerRegistry.region is only needed for custom Docker images. Including small modifications (such as additional packages) to base images provided by Microsoft.
- MicrosoftContainerRegistry.region is only needed if you plan on using the default Docker images provided by Microsoft, and enabling user-managed dependencies.
- AzureKeyVault.region is only needed if your workspace was created with the hbi_workspace flag enabled.
- For entries that contain region, replace with the Azure region that you're using. For example, AzureContainerRegistry.westus.
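The conditions in the tip above can be expressed as a small Python sketch that assembles the network rule list for a given region. The function name, parameter names, and dictionary shape are hypothetical; this isn't an Azure Firewall API, only a way to work out which service tags apply to your setup.

```python
def outbound_network_rules(region, custom_docker_images=False,
                           microsoft_docker_images=False,
                           hbi_workspace=False, azure_china=False):
    """Return the outbound network rules (service tag, protocol, ports) to allow."""
    def rule(tag, ports):
        return {"service_tag": tag, "protocol": "TCP", "ports": ports}

    rules = [
        rule("AzureActiveDirectory", [80, 443]),
        rule("AzureMachineLearning", [443, 8787, 18881]),
        rule("AzureResourceManager", [443]),
        rule(f"Storage.{region}", [443]),
    ]
    if not azure_china:
        # AzureFrontDoor.FrontEnd isn't needed in Azure China.
        rules.append(rule("AzureFrontDoor.FrontEnd", [443]))
    if custom_docker_images:
        rules.append(rule(f"AzureContainerRegistry.{region}", [443]))
    if microsoft_docker_images:
        # Default Microsoft images with user-managed dependencies enabled.
        rules.append(rule(f"MicrosoftContainerRegistry.{region}", [443]))
    if hbi_workspace:
        rules.append(rule(f"AzureKeyVault.{region}", [443]))
    return rules


for entry in outbound_network_rules("westus", custom_docker_images=True):
    print(entry["service_tag"], entry["protocol"], entry["ports"])
```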
Add Application rules for the following hosts:
Note
This is not a complete list of the hosts required for all Python resources on the internet, only the most commonly used. For example, if you need access to a GitHub repository or other host, you must identify and add the required hosts for that scenario.
| Host name | Purpose |
|---|---|
| graph.windows.net | Used by Azure Machine Learning compute instance/cluster. |
| anaconda.com, *.anaconda.com | Used to install default packages. |
| *.anaconda.org | Used to get repo data. |
| pypi.org | Used to list dependencies from the default index, if any, and the index isn't overwritten by user settings. If the index is overwritten, you must also allow *.pythonhosted.org. |
| cloud.r-project.org | Used when installing CRAN packages for R development. |
| *pytorch.org | Used by some examples based on PyTorch. |
| *.tensorflow.org | Used by some examples based on TensorFlow. |
| update.code.visualstudio.com, *.vo.msecnd.net | Used to retrieve VS Code server bits that are installed on the compute instance through a setup script. |
| raw.githubusercontent.com/microsoft/vscode-tools-for-ai/master/azureml_remote_websocket_server/* | Used to retrieve websocket server bits that are installed on the compute instance. The websocket server is used to transmit requests from Visual Studio Code client (desktop application) to Visual Studio Code server running on the compute instance. |
| dc.applicationinsights.azure.com | Used to collect metrics and diagnostics information when working with Microsoft support. |
| dc.applicationinsights.microsoft.com | Used to collect metrics and diagnostics information when working with Microsoft support. |
| dc.services.visualstudio.com | Used to collect metrics and diagnostics information when working with Microsoft support. |

For Protocol:Port, select http, https.
For more information on configuring application rules, see Deploy and configure Azure Firewall.
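To see how wildcard host entries such as `*.anaconda.org` behave, the following Python sketch checks a host name against a subset of the hosts listed above using shell-style pattern matching from the standard library. This is an approximation: Azure Firewall evaluates FQDN wildcards with its own rules, and the helper name `is_host_allowed` is hypothetical.

```python
from fnmatch import fnmatchcase

# Subset of the application rule hosts listed in this article.
ALLOWED_HOSTS = [
    "graph.windows.net",
    "anaconda.com",
    "*.anaconda.com",
    "*.anaconda.org",
    "pypi.org",
    "cloud.r-project.org",
    "*pytorch.org",
    "*.tensorflow.org",
]


def is_host_allowed(host, patterns=ALLOWED_HOSTS):
    """Return True if the host matches any allowed pattern (case-insensitive)."""
    normalized = host.lower().rstrip(".")
    return any(fnmatchcase(normalized, pattern) for pattern in patterns)


for candidate in ["conda.anaconda.org", "anaconda.org", "download.pytorch.org", "example.com"]:
    print(candidate, is_host_allowed(candidate))
```

In this sketch, `*.anaconda.org` doesn't match the bare `anaconda.org`, which is why the table lists both `anaconda.com` and `*.anaconda.com` separately. A pattern without the dot, such as `*pytorch.org`, matches any name that ends in `pytorch.org`, not only subdomains.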
To restrict outbound traffic for models deployed to Azure Kubernetes Service (AKS), see the Restrict egress traffic in Azure Kubernetes Service and Deploy ML models to Azure Kubernetes Service articles.
Kubernetes Compute
A Kubernetes cluster running behind an outbound proxy server or firewall needs extra network configuration. Configure the Azure Arc network requirements needed by Azure Arc agents. The following outbound URLs are also required for Azure Machine Learning:
| Outbound Endpoint | Port | Description | Training | Inference |
|---|---|---|---|---|
| *.kusto.windows.net, *.table.core.windows.net, *.queue.core.windows.net | https:443 | Required to upload system logs to Kusto. | ✓ | ✓ |
| *.azurecr.io | https:443 | Azure container registry, required to pull docker images used for machine learning workloads. | ✓ | ✓ |
| *.blob.core.windows.net | https:443 | Azure blob storage, required to fetch machine learning project scripts, data or models, and upload job logs/outputs. | ✓ | ✓ |
| *.workspace.<region>.api.azureml.ms, <region>.experiments.azureml.net, <region>.api.azureml.ms | https:443 | Azure Machine Learning service API. | ✓ | ✓ |
| pypi.org | https:443 | Python package index, to install pip packages used for training job environment initialization. | ✓ | N/A |
| archive.ubuntu.com, security.ubuntu.com, ppa.launchpad.net | http:80 | Required to download the necessary security patches. | ✓ | N/A |
Note
<region> is the full lowercase name of the Azure region, for example, eastus or southeastasia.
Other firewalls
The guidance in this section is generic, as each firewall has its own terminology and specific configurations. If you have questions, check the documentation for the firewall you're using.
If not configured correctly, the firewall can cause problems using your workspace. The Azure Machine Learning workspace uses various host names. The following sections list hosts that are required for Azure Machine Learning.
Dependencies API
You can also use the Azure Machine Learning REST API to get a list of hosts and ports that you must allow outbound traffic to. To use this API, use the following steps:
1. Get an authentication token. The following command demonstrates using the Azure CLI to get an authentication token and subscription ID:

    TOKEN=$(az account get-access-token --query accessToken -o tsv)
    SUBSCRIPTION=$(az account show --query id -o tsv)

2. Call the API. In the following command, replace the following values:
   - Replace <region> with the Azure region your workspace is in. For example, westus2.
   - Replace <resource-group> with the resource group that contains your workspace.
   - Replace <workspace-name> with the name of your workspace.

    az rest --method GET \
        --url "https://<region>.api.azureml.ms/rp/workspaces/subscriptions/$SUBSCRIPTION/resourceGroups/<resource-group>/providers/Microsoft.MachineLearningServices/workspaces/<workspace-name>/outboundNetworkDependenciesEndpoints?api-version=2018-03-01-preview" \
        --header Authorization="Bearer $TOKEN"
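If you call the API from a script instead of the Azure CLI, the request URL can be assembled as in the following Python sketch. It reproduces only the URL pattern shown in the command above; the function name `dependencies_url` and the sample values are hypothetical, and the request still needs the bearer token in an Authorization header.

```python
from urllib.parse import quote


def dependencies_url(region, subscription_id, resource_group, workspace_name,
                     api_version="2018-03-01-preview"):
    """Build the outbound network dependencies URL for a workspace."""
    return (
        f"https://{region}.api.azureml.ms/rp/workspaces"
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        "/providers/Microsoft.MachineLearningServices"
        f"/workspaces/{quote(workspace_name, safe='')}"
        f"/outboundNetworkDependenciesEndpoints?api-version={api_version}"
    )


# Placeholder values for illustration only.
url = dependencies_url(
    "westus2", "00000000-0000-0000-0000-000000000000", "my-rg", "my-ws"
)
print(url)
```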
The result of the API call is a JSON document. The following snippet is an excerpt of this document:
{
"value": [
{
"properties": {
"category": "Azure Active Directory",
"endpoints": [
{
"domainName": "login.microsoftonline.com",
"endpointDetails": [
{
"port": 80
},
{
"port": 443
}
]
}
]
}
},
{
"properties": {
"category": "Azure portal",
"endpoints": [
{
"domainName": "management.azure.com",
"endpointDetails": [
{
"port": 443
}
]
}
]
}
},
...
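To turn a response like this into a flat list of firewall entries, you could walk the nested structure as in the following Python sketch. It relies only on the fields visible in the excerpt (`value`, `properties`, `category`, `endpoints`, `domainName`, `endpointDetails`, `port`); the full response may contain other fields, and the helper name `flatten_dependencies` is hypothetical.

```python
import json


def flatten_dependencies(response):
    """Return (category, domain name, port) tuples from the API response."""
    rows = []
    for item in response.get("value", []):
        properties = item.get("properties", {})
        category = properties.get("category", "")
        for endpoint in properties.get("endpoints", []):
            domain = endpoint.get("domainName", "")
            for detail in endpoint.get("endpointDetails", []):
                rows.append((category, domain, detail.get("port")))
    return rows


# The two entries from the excerpt above, closed so that they parse as JSON.
EXCERPT = json.loads("""
{
  "value": [
    {"properties": {"category": "Azure Active Directory", "endpoints": [
      {"domainName": "login.microsoftonline.com",
       "endpointDetails": [{"port": 80}, {"port": 443}]}]}},
    {"properties": {"category": "Azure portal", "endpoints": [
      {"domainName": "management.azure.com",
       "endpointDetails": [{"port": 443}]}]}}
  ]
}
""")

rows = flatten_dependencies(EXCERPT)
for category, domain, port in rows:
    print(f"{category}: {domain}:{port}")
```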
Microsoft hosts
The hosts in the following tables are owned by Microsoft, and provide services required for the proper functioning of your workspace. The tables list hosts for the Azure public, Azure Government, and Azure China 21Vianet regions.
Important
Azure Machine Learning uses Azure Storage Accounts in your subscription and in Microsoft-managed subscriptions. Where applicable, the following terms are used to differentiate between them in this section:
- Your storage: The Azure Storage Account(s) in your subscription, which are used to store your data and artifacts such as models, training data, training logs, and Python scripts.
- Microsoft storage: The Azure Machine Learning compute instance and compute clusters rely on Azure Batch, and must access storage located in a Microsoft subscription. This storage is used only for the management of the compute instances. None of your data is stored here.
General Azure hosts
| Required for | Hosts | Protocol | Ports |
|---|---|---|---|
| Azure Active Directory | login.microsoftonline.com | TCP | 80, 443 |
| Azure portal | management.azure.com | TCP | 443 |
| Azure Resource Manager | management.azure.com | TCP | 443 |
Azure Machine Learning hosts
Important
In the following table, replace <storage> with the name of the default storage account for your Azure Machine Learning workspace.
| Required for | Hosts | Protocol | Ports |
|---|---|---|---|
| Azure Machine Learning studio | ml.azure.com | TCP | 443 |
| API | *.azureml.ms | TCP | 443 |
| API | *.azureml.net | TCP | 443 |
| Model management | *.modelmanagement.azureml.net | TCP | 443 |
| Integrated notebook | *.notebooks.azure.net | TCP | 443 |
| Integrated notebook | <storage>.file.core.windows.net | TCP | 443, 445 |
| Integrated notebook | <storage>.dfs.core.windows.net | TCP | 443 |
| Integrated notebook | <storage>.blob.core.windows.net | TCP | 443 |
| Integrated notebook | graph.microsoft.com | TCP | 443 |
| Integrated notebook | *.aznbcontent.net | TCP | 443 |
Azure Machine Learning compute instance and compute cluster hosts
Tip
- The host for Azure Key Vault is only needed if your workspace was created with the hbi_workspace flag enabled.
- Ports 8787 and 18881 for compute instance are only needed when your Azure Machine Learning workspace has a private endpoint.
- In the following table, replace <storage> with the name of the default storage account for your Azure Machine Learning workspace.
- Websocket communication must be allowed to the compute instance. If you block websocket traffic, Jupyter notebooks won't work correctly.
| Required for | Hosts | Protocol | Ports |
|---|---|---|---|
| Compute cluster/instance | graph.windows.net | TCP | 443 |
| Compute instance | *.instances.azureml.net | TCP | 443 |
| Compute instance | *.instances.azureml.ms | TCP | 443, 8787, 18881 |
| Microsoft storage access | *.blob.core.windows.net | TCP | 443 |
| Microsoft storage access | *.table.core.windows.net | TCP | 443 |
| Microsoft storage access | *.queue.core.windows.net | TCP | 443 |
| Your storage account | <storage>.file.core.windows.net | TCP | 443, 445 |
| Your storage account | <storage>.blob.core.windows.net | TCP | 443 |
| Azure Key Vault | *.vault.azure.net | TCP | 443 |
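After configuring your firewall, one way to spot-check the hosts and ports in this table is a plain TCP connection test run from inside the virtual network. The following Python sketch assumes that a successful TCP handshake is a useful signal. It doesn't validate TLS, application-level filtering, or websocket traffic, and the function names are hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_targets(targets, timeout=3.0):
    """Return {(host, port): reachable} for each (host, port) pair."""
    return {(host, port): can_connect(host, port, timeout) for host, port in targets}
```

For example, `check_targets([("graph.windows.net", 443)])` returns a dictionary that maps each pair to True or False. Wildcard entries such as `*.blob.core.windows.net` need a concrete host name, such as your storage account's endpoint, before they can be tested.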
Docker images maintained by Azure Machine Learning
| Required for | Hosts | Protocol | Ports |
|---|---|---|---|
| Microsoft Container Registry | mcr.microsoft.com, *.data.mcr.microsoft.com | TCP | 443 |
| Azure Machine Learning pre-built images | viennaglobal.azurecr.io | TCP | 443 |
Tip
- Azure Container Registry is required for any custom Docker image. This includes small modifications (such as additional packages) to base images provided by Microsoft.
- Microsoft Container Registry is only needed if you plan on using the default Docker images provided by Microsoft, and enabling user-managed dependencies.
- If you plan on using federated identity, follow the Best practices for securing Active Directory Federation Services article.
Also, use the information in the inbound configuration section to add IP addresses for BatchNodeManagement and AzureMachineLearning.
For information on restricting access to models deployed to AKS, see Restrict egress traffic in Azure Kubernetes Service.
Monitoring, metrics, and diagnostics
To support logging of metrics and other monitoring information to Azure Monitor and Application Insights, allow outbound traffic to the following hosts:
Note
The information logged to these hosts is also used by Microsoft Support to be able to diagnose any problems you run into with your workspace.
- dc.applicationinsights.azure.com
- dc.applicationinsights.microsoft.com
- dc.services.visualstudio.com
- *.in.applicationinsights.azure.com
For a list of IP addresses for these hosts, see IP addresses used by Azure Monitor.
Python hosts
The hosts in this section are used to install Python packages, and are required during development, training, and deployment.
Note
This is not a complete list of the hosts required for all Python resources on the internet, only the most commonly used. For example, if you need access to a GitHub repository or other host, you must identify and add the required hosts for that scenario.
| Host name | Purpose |
|---|---|
| anaconda.com, *.anaconda.com | Used to install default packages. |
| *.anaconda.org | Used to get repo data. |
| pypi.org | Used to list dependencies from the default index, if any, and the index isn't overwritten by user settings. If the index is overwritten, you must also allow *.pythonhosted.org. |
| *pytorch.org | Used by some examples based on PyTorch. |
| *.tensorflow.org | Used by some examples based on TensorFlow. |
R hosts
The hosts in this section are used to install R packages, and are required during development, training, and deployment.
Note
This is not a complete list of the hosts required for all R resources on the internet, only the most commonly used. For example, if you need access to a GitHub repository or other host, you must identify and add the required hosts for that scenario.
| Host name | Purpose |
|---|---|
| cloud.r-project.org | Used when installing CRAN packages. |
Visual Studio Code hosts
The hosts in this section are used to install Visual Studio Code packages to establish a remote connection between Visual Studio Code and compute instances in your Azure Machine Learning workspace.
Note
This is not a complete list of the hosts required for all Visual Studio Code resources on the internet, only the most commonly used. For example, if you need access to a GitHub repository or other host, you must identify and add the required hosts for that scenario.
| Host name | Purpose |
|---|---|
| update.code.visualstudio.com, *.vo.msecnd.net | Used to retrieve VS Code server bits that are installed on the compute instance through a setup script. |
| raw.githubusercontent.com/microsoft/vscode-tools-for-ai/master/azureml_remote_websocket_server/* | Used to retrieve websocket server bits that are installed on the compute instance. The websocket server is used to transmit requests from Visual Studio Code client (desktop application) to Visual Studio Code server running on the compute instance. |
Next steps
This article is part of a series on securing an Azure Machine Learning workflow. See the other articles in this series:
- Virtual network overview
- Secure the workspace resources
- Secure the training environment
- For securing inference, see the following documents:
  - If using CLI v1 or SDK v1: Secure inference environment
  - If using CLI v2 or SDK v2: Network isolation for managed online endpoints
- Enable studio functionality
- Use custom DNS
For more information on configuring Azure Firewall, see Tutorial: Deploy and configure Azure Firewall using the Azure portal.