Use simplified compute node communication

An Azure Batch pool contains one or more compute nodes which execute user-specified workloads in the form of Batch tasks. To enable Batch functionality and Batch pool infrastructure management, compute nodes must communicate with the Azure Batch service.

This document describes forthcoming changes to how the Azure Batch service communicates with Batch pool compute nodes, the network configuration changes that may be required, and how to opt your Batch accounts in or out of the new simplified compute node communication feature during the public preview period.

Important

Support for simplified compute node communication in Azure Batch is currently in public preview. This preview version is provided without a service level agreement, and it's not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Opting in is not required at this time. However, simplified compute node communication will eventually be required for all Batch accounts. An official retirement notice will be provided before then, giving you an opportunity to migrate your Batch pools.

Compute node communication changes

The Azure Batch service is simplifying the way Batch pool infrastructure is managed on behalf of users. The new communication method reduces the complexity and scope of inbound and outbound networking connections required in baseline operations.

Batch pools in accounts which haven't been opted in to simplified compute node communication require the following networking rules in network security groups (NSGs), user-defined routes (UDRs), and firewalls when creating a pool in a virtual network:

  • Inbound:

    • Destination ports 29876, 29877 over TCP from BatchNodeManagement.region
  • Outbound:

    • Destination port 443 over TCP to Storage.region
    • Destination port 443 over TCP to BatchNodeManagement.region for certain workloads that require communication back to the Batch Service, such as Job Manager tasks

With the new model, Batch pools in accounts that use simplified compute node communication require the following networking rules in NSGs, UDRs, and firewalls:

  • Inbound:

    • None
  • Outbound:

    • Destination port 443 over TCP to BatchNodeManagement.region
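To make the difference between the two models concrete, the following Python sketch encodes both baseline rule sets from the lists above as plain data and computes which rules are no longer required. The `Rule` class and the `CLASSIC` and `SIMPLIFIED` names are illustrative only and are not part of any Azure SDK; `region` is the same placeholder used in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    direction: str     # "Inbound" or "Outbound"
    ports: tuple       # destination ports
    protocol: str
    service_tag: str   # source tag for inbound rules, destination tag for outbound


REGION = "region"  # placeholder, as in the text; substitute your Azure region

# Baseline rules for accounts that have NOT opted in.
# The outbound BatchNodeManagement rule is only needed for certain
# workloads (such as Job Manager tasks), but is listed here for completeness.
CLASSIC = frozenset({
    Rule("Inbound", (29876, 29877), "TCP", f"BatchNodeManagement.{REGION}"),
    Rule("Outbound", (443,), "TCP", f"Storage.{REGION}"),
    Rule("Outbound", (443,), "TCP", f"BatchNodeManagement.{REGION}"),
})

# Baseline rules for accounts that use simplified compute node communication.
SIMPLIFIED = frozenset({
    Rule("Outbound", (443,), "TCP", f"BatchNodeManagement.{REGION}"),
})

# Rules that the simplified model no longer needs for baseline operation.
no_longer_required = CLASSIC - SIMPLIFIED

for rule in sorted(no_longer_required, key=lambda r: (r.direction, r.service_tag)):
    print(rule.direction, rule.ports, rule.protocol, rule.service_tag)
```

The set difference leaves the inbound rule for ports 29876 and 29877 and the outbound rule to `Storage.region`, which are the two baseline requirements the new model removes.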

Outbound requirements for a Batch account can be discovered using the List Outbound Network Dependencies Endpoints API. This API reports the base set of dependencies, which depends on the communication model used by the Batch account's pools. User-specific workloads may need additional rules, for example to allow traffic to other Azure resources (such as Azure Storage for application packages or Azure Container Registry) or to endpoints like the Microsoft package repository for virtual file system mounting functionality.
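As a rough illustration of calling this API, the sketch below only builds the request URL. It assumes the operation is exposed through the Azure Resource Manager endpoint under the `Microsoft.Batch/batchAccounts` resource type with an `outboundNetworkDependenciesEndpoints` path segment, and that `2021-06-01` is an acceptable `api-version`; confirm both against the Batch management REST reference before relying on them. The subscription ID, resource group, and account name are placeholders, and authentication (a bearer token in the `Authorization` header) is left out.

```python
from urllib.parse import quote, urlencode

ARM_ENDPOINT = "https://management.azure.com"


def outbound_dependencies_url(subscription_id, resource_group, account_name, api_version):
    """Build the (assumed) ARM URL for listing outbound network dependencies."""
    path = (
        f"/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.Batch/batchAccounts/{quote(account_name, safe='')}"
        "/outboundNetworkDependenciesEndpoints"
    )
    return f"{ARM_ENDPOINT}{path}?{urlencode({'api-version': api_version})}"


# Placeholder identifiers; replace them with your own values.
url = outbound_dependencies_url(
    "00000000-0000-0000-0000-000000000000",
    "my-rg",
    "mybatchaccount",
    "2021-06-01",  # assumed API version
)
print(url)
```

A GET request to this URL with a valid Azure Resource Manager access token should return the dependency list for the account.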

Benefits of the new model

Azure Batch users who opt in to the new communication model benefit from simplification of networking connections and rules.

Simplified compute node communication helps reduce security risks by removing the requirement to open ports for inbound communication from the internet. Only a single outbound rule to a well-known Service Tag is required for baseline operation.

The new model also provides more fine-grained data exfiltration control, since outbound communication to Storage.region is no longer required. You can explicitly lock down outbound communication to Azure Storage if required for your workflow (such as AppPackage storage accounts, other storage accounts for resource files or output files, or other similar scenarios).

Even if your workloads are not currently impacted by the changes (as described in the next section), you may still want to opt in to use simplified compute node communication now. This will ensure your Batch workloads are ready for any future improvements enabled by this model.

Scope of impact

In many cases, this new communication model will not directly affect your Batch workloads. However, simplified compute node communication will have an impact for the following cases:

  • Users who specify a Virtual Network as part of creating a Batch pool and do one or both of the following:
    • Explicitly disable outbound network traffic rules that are incompatible with simplified compute node communication.
    • Use UDRs and firewall rules that are incompatible with simplified compute node communication.
  • Users who enable software firewalls on compute nodes and explicitly disable outbound traffic in software firewall rules which are incompatible with simplified compute node communication.

If either of these cases applies to you, and you would like to opt in to the preview, follow the steps outlined in the next section to ensure that your Batch workloads can still function under the new model.

Required network configuration changes

For impacted users, the following steps are required to migrate to the new communication model:

  1. Ensure your networking configuration as applicable to Batch pools (NSGs, UDRs, firewalls, etc.) includes a union of the models (that is, the network rules prior to simplified compute node communication and after). At a minimum, these rules would be:
    • Inbound:
      • Destination ports 29876, 29877 over TCP from BatchNodeManagement.region
    • Outbound:
      • Destination port 443 over TCP to Storage.region
      • Destination port 443 over TCP to BatchNodeManagement.region
  2. If you have any additional inbound or outbound scenarios required by your workflow, you will need to ensure that your rules reflect these requirements.
  3. Opt in to simplified compute node communication as described below.
  4. Use one of the following options to update your workloads to use the new communication model. Whichever method you use, keep in mind that pools without public IP addresses are unaffected and can't currently use simplified compute node communication. Please see the Current limitations section.
    1. Create new pools and validate that the new pools are working correctly. Migrate your workload to the new pools and delete any earlier pools.
    2. Resize all existing pools to zero nodes and scale back out.
  5. After confirming that all previous pools have been either deleted or scaled to zero and back out, query the List Outbound Network Dependencies Endpoints API to confirm that no outbound rule to Azure Storage for the region exists (excluding any autostorage accounts if linked to your Batch account).
  6. Modify all applicable networking configuration to match the simplified compute node communication rules. At a minimum (plus any additional rules your workflow needs, as discussed above), these are:
    • Inbound:
      • None
    • Outbound:
      • Destination port 443 over TCP to BatchNodeManagement.region
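The rule handling in steps 1 and 6 can be sketched as simple set operations. In this minimal Python example, each rule is a tuple of direction, ports, protocol, and service tag. The `missing_rules` helper and the `configured` set are hypothetical and stand in for whatever your NSG, UDR, or firewall tooling reports.

```python
REGION = "region"  # placeholder, as in the text

# Each rule: (direction, destination ports, protocol, service tag)
before = {
    ("Inbound", "29876,29877", "TCP", f"BatchNodeManagement.{REGION}"),
    ("Outbound", "443", "TCP", f"Storage.{REGION}"),
    ("Outbound", "443", "TCP", f"BatchNodeManagement.{REGION}"),
}
after = {
    ("Outbound", "443", "TCP", f"BatchNodeManagement.{REGION}"),
}


def missing_rules(configured, required):
    """Return the required rules that are absent from the configured set."""
    return required - configured


# Step 1: during the transition, the configuration must cover both models.
transition = before | after

# Step 6: once every pool has been recreated or resized, only the
# simplified rules are needed for baseline operation.
final = after

# Hypothetical current configuration that only allows outbound Storage traffic.
configured = {("Outbound", "443", "TCP", f"Storage.{REGION}")}

for rule in sorted(missing_rules(configured, transition)):
    print("add before opting in:", rule)
```

Because the simplified rules are a subset of the earlier ones, the union in step 1 equals the earlier rule set. Writing it as a union keeps the check correct if either baseline changes.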

If you follow these steps, but later want to stop using simplified compute node communication, you'll need to do the following:

  1. Opt out of simplified compute node communication as described below.
  2. Migrate your workload to new pools, or resize existing pools and scale back out (see step 4 above).
  3. Confirm that all of your pools are no longer using simplified compute node communication by using the List Outbound Network Dependencies Endpoints API. You should see an outbound rule to Azure Storage for the region (independent of any autostorage accounts linked to your Batch account).
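Both the migration check (step 5 of the migration steps) and the opt-out check (step 3 above) come down to looking for an Azure Storage dependency, other than autostorage, in the output of the List Outbound Network Dependencies Endpoints API. The sketch below runs that check on a trimmed, hypothetical payload. The field names (`value`, `category`, `endpoints`, `domainName`), the `Azure Storage` category label, and all of the domain names are assumptions about the response shape, so verify them against a real response from your account.

```python
# Hypothetical, trimmed response payload; real responses may differ.
sample_response = {
    "value": [
        {
            "category": "Azure Batch",
            "endpoints": [
                {"domainName": "mybatchaccount.region.batch.azure.com",
                 "endpointDetails": [{"port": 443}]},
            ],
        },
        {
            "category": "Azure Storage",
            "endpoints": [
                {"domainName": "myautostorage.blob.core.windows.net",
                 "endpointDetails": [{"port": 443}]},
                {"domainName": "*.blob.core.windows.net",
                 "endpointDetails": [{"port": 443}]},
            ],
        },
    ]
}


def storage_domains(response, autostorage_domains=()):
    """List Azure Storage domains in the response, ignoring autostorage accounts."""
    ignored = set(autostorage_domains)
    found = []
    for category in response.get("value", []):
        if category.get("category") != "Azure Storage":
            continue
        for endpoint in category.get("endpoints", []):
            name = endpoint.get("domainName", "")
            if name not in ignored:
                found.append(name)
    return found


remaining = storage_domains(
    sample_response,
    autostorage_domains=["myautostorage.blob.core.windows.net"],
)
if remaining:
    print("Storage dependency still reported:", remaining)
else:
    print("No non-autostorage Storage dependency reported")
```

When migrating to simplified compute node communication, you would expect the list to be empty. After opting out, you would expect it to contain at least one entry.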

Opt your Batch account in or out of simplified compute node communication

To opt a Batch account in or out of simplified compute node communication, create a new support request in the Azure portal.

Important

When you opt in to (or out of) simplified compute node communication, the change only affects future behavior. Any Batch pools with a non-zero number of compute nodes that were created before the request are unaffected and continue to use whichever model was active before the request. See the migration steps for more information on how to migrate existing pools before opting in or opting out.

Use the following options when creating your request.

Screenshot of a support request opting in to simplified compute node communication.

  1. For Issue type, select Technical.
  2. For Service type, select Batch Service.
  3. For Resource, select the Batch account for this request.
  4. For Summary:
    • To opt in, type "Enable simplified compute node communication".
    • To opt out, type "Disable simplified compute node communication".
  5. For Problem type, select Batch Accounts.
  6. For Problem subtype, select Other issues with Batch Accounts.
  7. Select Next, then select Next again to go to the Additional details page.
  8. In Additional details, you can optionally specify that you want to enable all of the Batch accounts in your subscription, or across multiple subscriptions. If you do so, be sure to include the subscription IDs here.
  9. Make any other required selections on the page, then select Next.
  10. Review your request details, then select Create to submit your support request.

After your request has been submitted, you will be notified once the account has been opted in (or out).

Current limitations

The following are known limitations for accounts that opt in to simplified compute node communication:

  • Creating pools without public IP addresses isn't currently supported for accounts which have opted in.
  • Previously created pools without public IP addresses won't use simplified compute node communication, even if the Batch account has opted in.
  • Private Batch accounts can opt in to simplified compute node communication, but Batch pools created by these Batch accounts must have public IP addresses in order to use simplified compute node communication.
  • Cloud Service Configuration pools are currently not supported for simplified compute node communication and are generally deprecated. We recommend using Virtual Machine Configuration for your Batch pools. For more information, see Migrate Batch pool configuration from Cloud Services to Virtual Machine.

Next steps