Burst to Azure IaaS VM from an HPC Pack Cluster

Requirements to Add Azure IaaS Compute Nodes with Microsoft HPC Pack

This section describes the requirements to add Azure IaaS compute nodes to your HPC cluster.

Supported version of Microsoft HPC Pack cluster

To deploy Azure IaaS compute nodes on your HPC Pack cluster, you must be running Microsoft HPC Pack 2016 Update 1 or a later version.

If you want to create a new HPC Pack cluster entirely in Azure, go to Deploy an HPC Pack 2016 cluster in Azure and choose a template to deploy. Otherwise, you must first create an HPC Pack cluster on-premises by following the installation instructions for a hybrid HPC Pack cluster.

Azure subscription account

You must have an Azure subscription or be assigned the Owner role on an existing subscription.

  • To create an Azure subscription, go to the Azure site.

  • To access an existing subscription, go to the Azure portal.

Note

Each Azure subscription is subject to limits (also called quotas). Virtual machine cores have a regional total limit and a regional limit per size series (Dv2, F, etc.), and the two limits are enforced separately. You can check the quotas and usage of your Azure subscription in the Azure portal. If you want to raise a quota, open an online customer support request.
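Because the regional total and the per size series limits are enforced separately, a deployment has to fit both. The following is a minimal sketch of that arithmetic, not an Azure API call; the class name, function name, and all numbers are hypothetical and only illustrate the check you would do by hand with the values shown in the Azure portal.

```python
from dataclasses import dataclass


@dataclass
class CoreQuota:
    """A core quota as read from the portal: a limit and the current usage."""
    limit: int
    used: int

    @property
    def remaining(self) -> int:
        return self.limit - self.used


def fits_quota(node_count: int, cores_per_node: int,
               regional: CoreQuota, family: CoreQuota) -> bool:
    """Return True only if the request fits BOTH the regional and the size-series quota."""
    requested = node_count * cores_per_node
    return requested <= regional.remaining and requested <= family.remaining


# Hypothetical numbers: 10 nodes of an 8-core size need 80 cores.
regional = CoreQuota(limit=100, used=10)   # 90 cores left in the region
family = CoreQuota(limit=64, used=0)       # 64 cores left in the size series
print(fits_quota(10, 8, regional, family))  # False: the size-series quota is the bottleneck
print(fits_quota(8, 8, regional, family))   # True: 64 cores fit both limits
```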

Network infrastructure

You will need to provide an Azure virtual network and subnet for the Azure IaaS compute nodes.

If you plan to create an HPC Pack cluster entirely in Azure, you must create the head node(s) and the Azure IaaS compute nodes in a single Azure virtual network.

Diagram shows an Azure virtual network with an H P C H N being added to a group of nodes.

If, however, you plan to create a hybrid HPC Pack cluster with head node(s) in your on-premises corporate network and create Azure IaaS compute nodes in Azure, you must configure a Site-to-Site VPN or ExpressRoute connection from your on-premises network to the Azure virtual network. The head node(s) must be able to connect over the Internet to Azure services as well. You might need to contact your network administrator to configure this connectivity.

Diagram shows a corporate network with an H P C H N connected by site to site V P N to an Azure virtual network.

Configure Network Security Group for Azure virtual network

It is recommended to configure a Network Security Group for the Azure virtual network subnet. The following HPC Port table lists the listening ports for each HPC node type. For more details about the ports, please refer to this document.

Role | Port | Protocol
Linux Compute Node | 40000, 40002 | TCP
Windows Compute Node | 1856, 6729, 6730, 7998, 8677, 9096, 9100-9611, 42323, 42324 | TCP
Broker Node | 9087, 9091, 9095, 80, 443, and ports for Windows Compute Node | TCP
Head Node | 445, 5800, 5802, 5969, 5970, 5974, 5999, 7997, 9090, 9092, 9094, 9892-9894, and ports for Broker Node; 1433 for Local Databases; 10100, 10101, 10200, 10300, 10400 for Service Fabric Cluster (High availability) | TCP
Head Node | 9894 | UDP
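When you troubleshoot Network Security Group or firewall rules, it can help to probe the TCP ports in the HPC Port table from another machine. The following is a minimal sketch that uses only the Python standard library; the function names are illustrative and are not part of HPC Pack. It expands a port list written as in the table (including ranges such as 9100-9611) and attempts a plain TCP connection. It cannot test the UDP port 9894, and a port that nothing is listening on also appears closed.

```python
import socket


def parse_ports(spec: str) -> list[int]:
    """Expand a port list such as "1856, 9100-9102" into individual port numbers."""
    ports: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
            ports.extend(range(low, high + 1))
        else:
            ports.append(int(part))
    return ports


def is_tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Ports of a Linux compute node, copied from the HPC Port table.
print(parse_ports("40000, 40002"))
```

For example, calling is_tcp_port_open with the name of one of your compute nodes and each port returned by parse_ports shows which listening ports are reachable from the machine where the script runs.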

For HPC Pack cluster with head node(s) in Azure

For an HPC Pack cluster entirely in Azure, the following NSG rules must be configured.

1. Inbound security rules

The default inbound security rule AllowVNetInBound allows all inbound intra-virtual network traffic. But if you have added any rules to deny the traffic with Source VirtualNetwork or Any with higher priority, make sure the ports listed in HPC Port table are not denied.

If you want to submit jobs from an on-premises client over the Internet, you must add the following inbound security rules.

Name | Port | Protocol | Source | Destination | Action
AllowHttpsInBound | 443 | TCP | Any | Any | Allow
AllowHpcSoaInbound | 9087, 9090, 9091, 9094 | TCP | Any | Any | Allow

2. Outbound security rules

The default outbound security rule AllowVNetOutBound allows all outbound intra-virtual network traffic. But if you have added any rules to deny the traffic with Destination VirtualNetwork or Any with higher priority, make sure the ports listed in HPC Port table are not denied.

The default outbound security rule AllowInternetOutBound allows all outbound traffic to Internet. But if you have added any rules to deny the traffic with Destination Internet or Any with higher priority, the following outbound rules must be added with higher priorities:

Name | Port | Protocol | Source | Destination | Action
AllowKeyVaultOutBound | Any | Any | VirtualNetwork | AzureKeyVault | Allow
AllowAzureCloudOutBound | Any | Any | VirtualNetwork | AzureCloud | Allow
AllowHttpsOutBound | 443 | TCP | VirtualNetwork | Any | Allow
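The phrase "with higher priority" matters because a Network Security Group evaluates rules in ascending order of their priority number and applies the first rule that matches. The following is a simplified sketch of that behavior, not the Azure implementation: it matches on port only and ignores source, destination, and protocol. The rule names DenyAllCustom and AllowHpcLinuxNode and their priorities are hypothetical; 65000 is the priority of the default AllowVNetInBound rule.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    priority: int                      # lower number is evaluated first
    ports: Optional[FrozenSet[int]]    # None stands for "Any"
    action: str                        # "Allow" or "Deny"


def first_match(rules: Iterable[Rule], port: int) -> Optional[Rule]:
    """Return the rule that decides the traffic: the matching rule with the lowest priority number."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.ports is None or port in rule.ports:
            return rule
    return None


default_allow = Rule("AllowVNetInBound", 65000, None, "Allow")
custom_deny = Rule("DenyAllCustom", 4000, None, "Deny")  # hypothetical custom rule
hpc_allow = Rule("AllowHpcLinuxNode", 3000, frozenset({40000, 40002}), "Allow")  # hypothetical fix

# The custom deny rule shadows the default allow rule for port 40000.
print(first_match([default_allow, custom_deny], 40000).name)             # DenyAllCustom
# An allow rule with a lower priority number restores access.
print(first_match([default_allow, custom_deny, hpc_allow], 40000).name)  # AllowHpcLinuxNode
```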

For hybrid HPC Pack cluster with on-premises head node(s)

For the hybrid HPC Pack cluster with on-premises head node(s) and broker node(s), and Azure IaaS compute nodes, the following NSG rules must be configured from the perspective of Azure IaaS compute nodes.

1. Inbound security rules

The default inbound security rule AllowVNetInBound allows all inbound intra-virtual network traffic. But if you have added any rules to deny the traffic with Source VirtualNetwork or Any with higher priority, make sure the ports for Linux compute node and Windows compute node listed in HPC Port table are not denied.

Note

If there are firewalls between your corporate network and the Azure virtual network, configure outbound firewall rules to allow these ports from the perspective of the head node(s).

2. Outbound security rules

The default outbound security rule AllowVNetOutBound allows all outbound intra-virtual network traffic. But if you have added any rules to deny the traffic with Destination VirtualNetwork or Any with higher priority, the following outbound rules must be added with higher priorities, so that the Azure IaaS compute nodes can connect to the on-premises head node(s).

Name | Port | Protocol | Source | Destination | Action
AllowHpcIntraVNetTcpOutBound | 443, 5970, 6729, 6730, 8677, 9892, 9893, 9894 | TCP | Any | VirtualNetwork | Allow
AllowHpcIntraVNetUdpOutBound | 9894 | UDP | Any | VirtualNetwork | Allow

Note

If there are firewalls between your corporate network and the Azure virtual network, configure inbound firewall rules to allow these ports from the perspective of the head node(s).

The default outbound security rule AllowInternetOutBound allows all outbound traffic to Internet. But if you have added any rules to deny the traffic with Destination Internet or Any with higher priority, the following outbound rules must be added with higher priorities:

Name | Port | Protocol | Source | Destination | Action
AllowKeyVaultOutBound | Any | Any | VirtualNetwork | AzureKeyVault | Allow
AllowAzureCloudOutBound | Any | Any | VirtualNetwork | AzureCloud | Allow
AllowHttpsOutBound | 443 | TCP | VirtualNetwork | Any | Allow

The HPC Pack head node(s) may access the following public URLs during the Set Azure Deployment Configuration step and the Create and manage Azure IaaS compute nodes step. Add them to the allowlist of your on-premises firewalls.

https://management.core.windows.net

https://management.azure.com

https://login.microsoftonline.com

https://login.live.com

https://login.windows.net

https://graph.windows.net

https://hpcazuresasdispatcher.azurewebsites.net

https://hpcazureconsumptionsb.servicebus.windows.net

https://*.vault.azure.net

https://*.microsoft.com

https://*.msauth.net

https://*.msftauth.net

https://*.core.windows.net
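If you want to check whether a URL seen in a firewall log is covered by the list above, a simple wildcard match is enough for a first pass. The following is a minimal sketch using the Python standard library; the function name and the vault name myvault are hypothetical, and your firewall product may interpret wildcards differently from this shell-style matching.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Host patterns copied from the URL list above.
ALLOWED_HOSTS = [
    "management.core.windows.net",
    "management.azure.com",
    "login.microsoftonline.com",
    "login.live.com",
    "login.windows.net",
    "graph.windows.net",
    "hpcazuresasdispatcher.azurewebsites.net",
    "hpcazureconsumptionsb.servicebus.windows.net",
    "*.vault.azure.net",
    "*.microsoft.com",
    "*.msauth.net",
    "*.msftauth.net",
    "*.core.windows.net",
]


def is_allowlisted(url: str) -> bool:
    """Return True if the host of the URL matches one of the allowlist patterns."""
    host = (urlsplit(url).hostname or "").lower()
    return any(fnmatchcase(host, pattern) for pattern in ALLOWED_HOSTS)


print(is_allowlisted("https://myvault.vault.azure.net/secrets/hpc"))  # True
print(is_allowlisted("https://example.com"))                          # False
```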

Step 1. Configure the cluster to support deployments of Azure IaaS compute nodes

Open HPC Cluster Manager on a head node and, in the Deployment To-do List, complete all three Required deployment tasks. The user name and password of the installation credentials you provide will be used as the administrator user name and password of the Azure virtual machines.

Step 1.1 Set Azure Deployment Configuration

You can set Azure deployment configuration with HPC Cluster Manager or PowerShell Commands.

Set Azure Deployment Configuration with HPC Cluster Manager

Note

The Set Azure Deployment Configuration wizard described in this article is based on HPC Pack 2016 Update 2 or a later version.

You can click Set Azure Deployment Configuration and follow the wizard to complete the configuration.

Screenshot shows Deployment to do list selected in the configuration pane. Set Azure Deployment Configuration is highlighted.

1. Configure Azure Service Principal

The HPC Pack service uses the Azure Service Principal to provision, start, stop, and delete Azure IaaS VMs. To configure the Azure Service Principal, click Login on the Azure Service Principal page to log in to your Azure account.

Screenshot of the Azure Service Principal page. Login and Next are highlighted.

Note

Log in with a Microsoft Entra ID (Azure AD) account. If you log in with a personal Microsoft account, you will encounter the error "This username may be incorrect. Make sure you typed it correctly. Otherwise, contact your admin."

To confirm your account type, sign in to the Azure portal, click Microsoft Entra ID -> Users and groups, and search for your account. If its Identity is MicrosoftAccount rather than the domain name of your directory, your account is a personal Microsoft account. The workaround is to find the User principal name of your account and log in with it.

If your account gives you access to more than one Azure AD tenant, click your account in the top right corner. Then set your portal session to the desired tenant. You must have permission to access resources in the directory.

Click Microsoft Entra ID in the left Services navigation pane, click Users and groups, and make sure that user accounts have already been created or configured.

If you have multiple Azure subscriptions associated with your Azure account, click the button Select to choose the subscription used to deploy the Azure IaaS compute nodes.

You can choose an existing Azure Service Principal from Service Principal Name list, and click the button Browse to choose the correct Management Certificate which was used to create the Azure Service principal, or you can click the button Create to create a new Azure Service Principal.

Screenshot of the Azure Service Principal page. Select, create, browse and the service principal name are highlighted.

If you choose to create a new Azure Service Principal, in the Create Azure Service Principal dialog, specify a friendly, unique Display Name for the new Azure Service Principal. Click Browse to choose a certificate from the Local Computer\Personal store, or click Import to import a PFX format certificate or generate a new self-signed certificate. Then click OK to create the Azure Service Principal.

Screenshot shows the Create Azure Service Principal dialog box. Display name, browse, import and ok are highlighted.

Note

  • The certificate for the Azure Service Principal must be different from the certificate used to secure the communication between HPC nodes.

  • To create the Azure Service Principal, your Azure account must have the Owner role on the Azure subscription. By default, the Azure Service Principal is granted the Contributor role on the Azure subscription. To manually reconfigure the access permissions for the Azure Service Principal according to your scenario, refer to Access control for Azure resources in HPC Pack cluster.

2. Specify Azure Virtual Network

On the Azure virtual network page, specify the information of the Azure virtual network in which your Azure IaaS compute nodes will be created.

Azure Location: The Azure location in which the virtual network is located.

Resource Group Name: The resource group in which the virtual network was created.

Virtual Network Name: The name of the virtual network in which your Azure IaaS compute nodes will be created.

Subnet Name: The name of the subnet in which your Azure IaaS compute nodes will be created.

Screenshot shows the Azure virtual network page, Next is highlighted.

Note

The virtual network you specify must have a Site-to-Site VPN or ExpressRoute connection to the on-premises network where your head node is located.

3. Configure Azure Key Vault Certificate

The HPC Pack service uses an X.509 certificate to secure communication between HPC nodes. You therefore need to import this certificate into Azure Key Vault so that it can be installed on the Azure IaaS VMs during provisioning. On the Azure Key Vault Certificate page, click Select to choose the Azure Key Vault Name and Secret Name if you have already created the Azure Key Vault secret, or click Create to create a new one.

Screenshot shows the Azure Key Vault Certificate page, select, create, and next are highlighted.

If you choose to create a new Key Vault secret, select an existing Azure key vault name from the Vault Name list, or click Create to create a new Azure key vault. Then specify a friendly Secret Name, and click Browse or Import to select the correct certificate.

Screenshot shows the key vault dialog box. The vault name, secret name, and certificate sections are highlighted.

Note

If you are using a self-signed certificate on the head node(s) for HPC node communication, you must upload the same certificate (the one used during head node installation) to the Azure Key Vault secret. If you fail to do so, the head node(s) will not be able to reach the Azure IaaS compute nodes because the certificate is not trusted. You can use the following PowerShell command to get the thumbprint of the certificate that is used for node communication: Get-HPCClusterRegistry -propertyName SSLThumbprint
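To confirm that the certificate you are about to upload is the one the cluster uses, you can compare thumbprints. A Windows certificate thumbprint is the SHA-1 hash of the DER-encoded certificate, so it can be recomputed from an exported certificate file. The following is a minimal sketch using the Python standard library; the function names are illustrative and not part of HPC Pack. The normalization step exists because thumbprints copied from a dialog box often contain spaces or invisible characters.

```python
import hashlib
import ssl


def thumbprint_from_der(der_bytes: bytes) -> str:
    """SHA-1 hash of the DER-encoded certificate, as upper-case hex."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def thumbprint_from_pem(pem_text: str) -> str:
    """Same as above for a PEM (Base64) encoded certificate."""
    return thumbprint_from_der(ssl.PEM_cert_to_DER_cert(pem_text))


def normalize_thumbprint(text: str) -> str:
    """Keep only hex digits, upper-cased, so differently formatted copies compare equal."""
    return "".join(ch for ch in text if ch in "0123456789abcdefABCDEF").upper()


def same_certificate(thumbprint_a: str, thumbprint_b: str) -> bool:
    """Return True if both values are complete SHA-1 thumbprints (40 hex digits) and match."""
    a = normalize_thumbprint(thumbprint_a)
    b = normalize_thumbprint(thumbprint_b)
    return len(a) == 40 and a == b
```

For example, pass the value returned for SSLThumbprint and the result of thumbprint_from_der on the bytes of the exported .cer file to same_certificate.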

Review the settings and click Finish to complete the configuration.

Set Azure Deployment Configuration with PowerShell

You can also choose to run the following PowerShell commands to set Azure Deployment Configuration if you have already:

  • Created the Azure Service Principal and Azure Key Vault Certificate.
  • Installed the certificate for the Azure Service Principal, with its private key, in the Local Computer\Personal certificate store on all head node machines.

Add-PSSnapin Microsoft.Hpc
# Set Azure subscription and Service Principal information
Set-HpcClusterRegistry -PropertyName SubscriptionId -PropertyValue <subscriptionId>
Set-HpcClusterRegistry -PropertyName TenantId -PropertyValue <tenantId>
Set-HpcClusterRegistry -PropertyName ApplicationId -PropertyValue <ServiceprincipalApplicationId>
Set-HpcClusterRegistry -PropertyName Thumbprint -PropertyValue <ServiceprincipalCertThumbprint>

# Set Virtual network information
Set-HpcClusterRegistry -PropertyName VNet -PropertyValue <VNetName>
Set-HpcClusterRegistry -PropertyName Subnet -PropertyValue <SubnetName>
Set-HpcClusterRegistry -PropertyName Location -PropertyValue <VNetLocation>
Set-HpcClusterRegistry -PropertyName ResourceGroup -PropertyValue <VNetResourceGroup>

# Set Azure Key vault certificate
Set-HpcKeyVaultCertificate -ResourceGroup <KeyVaultResourceGroupName> -CertificateUrl <KeyVaultSecretUrlWithVersion> -CertificateThumbprint <KeyVaultCertificateThumbprint>
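Before you substitute real values for the placeholders in the commands above, a quick format check can catch copy and paste mistakes. The following is a minimal sketch using the Python standard library; the function names are hypothetical, and the checks assume that the subscription, tenant, and application IDs are GUIDs, that thumbprints are 40 hexadecimal digits, and that a Key Vault secret URL with version has the form https://<vault host>/secrets/<name>/<version>. It validates the format only, not whether the resources exist.

```python
import re
from urllib.parse import urlsplit

GUID_PATTERN = re.compile(r"[0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12}")
THUMBPRINT_PATTERN = re.compile(r"[0-9a-fA-F]{40}")


def is_guid(value: str) -> bool:
    """True for values such as a subscription, tenant, or application ID."""
    return GUID_PATTERN.fullmatch(value) is not None


def is_thumbprint(value: str) -> bool:
    """True for a SHA-1 certificate thumbprint written as 40 hex digits."""
    return THUMBPRINT_PATTERN.fullmatch(value) is not None


def is_versioned_secret_url(url: str) -> bool:
    """True if the URL path is /secrets/<name>/<version> over HTTPS."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    return (
        parts.scheme == "https"
        and bool(parts.hostname)
        and len(segments) == 3
        and segments[0] == "secrets"
    )


# Hypothetical values for illustration only.
print(is_guid("11111111-2222-3333-4444-555555555555"))                   # True
print(is_versioned_secret_url("https://myvault.vault.azure.net/secrets/hpccert"))  # False: no version
```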

Step 1.2 Configure other cluster properties

If you plan to create non-domain-joined Azure IaaS Windows compute nodes or Linux compute nodes in a subnet different from the one where the head node(s) are located, run the following PowerShell command on a head node to make the cluster add host entries for nodes in different subnets. If you do not, the head node(s) will not be able to reach the nodes because the head node(s) cannot resolve their host names.

Set-HpcClusterRegistry -PropertyName HostFileForOtherSubnet -PropertyValue 1
if($env:CCP_CONNECTIONSTRING -like "*,*,*") {
    Connect-ServiceFabricCluster
    $opId = [Guid]::NewGuid()
    Start-ServiceFabricPartitionRestart -OperationId $opId -RestartPartitionMode AllReplicasOrInstances -ServiceName fabric:/HpcApplication/ManagementStatelessService -ErrorAction Stop
}

If you are running HPC Pack 2016 Update 2 or an earlier version and plan to create Azure IaaS Linux compute nodes with an Azure IaaS node template, run the following PowerShell command on a head node to enable communication over HTTP between the head node(s) and the Linux compute nodes.

Set-HpcClusterRegistry -PropertyName LinuxHttps -PropertyValue 0
if($env:CCP_CONNECTIONSTRING -like "*,*,*") {
    Connect-ServiceFabricCluster
    $opId = [Guid]::NewGuid()
    Start-ServiceFabricPartitionRestart -OperationId $opId -RestartPartitionMode AllReplicasOrInstances -ServiceName fabric:/HpcApplication/SchedulerStatefulService -ErrorAction Stop
} else {
    Restart-Service -Name HpcScheduler
}

Step 2. Create an Azure IaaS node template

Important

  1. If you choose to use a custom image or shared image, the operating system of the VM image must meet the requirements.
  2. Shared Image is NOT supported in HPC Pack 2016 Update 2 or earlier version.
  3. Azure Spot VMs are supported starting from HPC Pack 2019.

On the Configuration panel, click Node Templates, and click New in the Actions list to create an Azure IaaS node template.

Screenshot shows node templates selected. New is highlighted in the Actions pane.

On Choose Node Template Type page, choose the node template type as Azure IaaS node template.

Screenshot shows the Choose Node Template Type page with Azure I a a S node template selected. Next is highlighted.

On the Specify Template Name page, specify a Template name and optionally specify the Description.

Screenshot shows the Specify Template Name page with a template name entered. Next is highlighted.

On the Specify VM Group information page, specify the Resource Group Name of the Azure resource group in which the IaaS compute nodes will be created. You can select an existing resource group or specify a new resource group name. If you specify a new resource group name, the HPC Pack cluster creates the resource group when it deploys the first Azure IaaS compute node with this node template.

Specify whether you want to create nodes in an Azure availability set, and specify the Availability Set Name if necessary. If you specify a new availability set name, HPC Pack cluster will create it when deploying the first Azure IaaS compute node with this node template.

Screenshot shows the Specify V M Group information page with Resource Group Name highlighted. Create nodes in an Azure Availability set is unchecked.

On the Specify VM Image page, specify the VM image used to deploy the IaaS compute nodes. You can select one of the following Image Types: MarketplaceImage, CustomImage, or SharedImage.

If you choose Image Type as MarketplaceImage, select OS Type and Image Label to choose a public VM image in Azure marketplace.

If the OS Type is Windows and your HPC Pack head node(s) are domain joined, specify whether you want to Join the nodes into domain. It is recommended to join the Windows compute nodes to the domain.

Screenshot shows the Specify V M image dialog. The drop down lists are highlighted. Join the nodes into domain is checked.

If you choose Image Type as CustomImage, specify the OS Type, the Image Name of the customized VM image, and the Resource Group in which the image is stored. The VM image must have been created in the same Azure location in which the Azure IaaS compute nodes will be created. Also make sure that the Azure service principal you specified in Step 1.1 is granted Read permission to the custom image. Follow Create Custom Image to create your own customized image for your IaaS VMs.

You can click the link More information about custom VM Image for HPC Pack compute node to learn how to create a customized HPC Pack compute node VM image.

Screenshot shows the Specify the V M image dialog. O S type, resource group and image name are highlighted. Join the nodes into domain is checked.

If you choose Image Type as SharedImage, specify the OS Type, the Azure Resource ID of the shared VM image in Azure Shared Image Gallery. Make sure the Azure service principal you specified in Step 1.1 is granted Read permission to the shared image gallery.

Screenshot shows the Specify the V M image dialog. O S type and resource I D are highlighted.

On the Review page, review the settings you specified, and click Create to create the node template.

Before you create new Azure IaaS compute nodes with the node template, if you want to specify some advanced options, for example, Use Azure Spot VMs, refer to Advanced options for IaaS node template.

Step 3. Create the IaaS compute nodes and manage them

Open HPC Cluster Manager console, click Resource Management bar, and click Add Node to start the Add Node Wizard.

Screenshot shows the Resource Management page with Add Node highlighted in the Actions pane.

On the Select Deployment Method page, select Add Azure IaaS VM nodes.

Screenshot shows the Select Deployment Method page with Add Azure I a a S V M nodes selected.

On the Specify New Nodes page, select the Node template we just created in Step 2, and specify Number of nodes and VM size of nodes, and click Next.

Screenshot shows the Add Node wizard. Specify new nodes is selected.

After you click Finish, the new nodes (two in this example) appear in the Nodes list. The corresponding Azure virtual machines for these nodes have not yet been created on the Azure side.

Screenshot shows nodes selected. Two nodes in the list are highlighted.

You can then choose the nodes and click Start to create the virtual machines in Azure.

Screenshot shows the list of nodes. Start is highlighted in the actions pane.

Wait for the provisioning of the Azure IaaS compute nodes.

Screenshot shows the nodes page. The provisioning log shows the operations that are executing.

After the deployment of Azure IaaS compute nodes is completed and the Node Health becomes OK, you can submit jobs to these nodes.

You can manually stop the nodes by clicking Stop, and the virtual machines in Azure will be de-allocated.

Screenshot shows the Nodes list with two nodes selected. Stop is highlighted in the main pane and in the Actions pane.

You can also Delete the nodes if you no longer need them. The corresponding Azure virtual machines are deleted as well.

If you enabled the auto grow and shrink feature for Azure nodes, the Azure IaaS nodes are automatically started or stopped depending on the cluster workload. See Auto grow shrink for Azure resources.

Advanced options for IaaS node template

In most scenarios, you can use the node template created in Step 2 directly to create Azure IaaS compute nodes, without specifying advanced options. If you want to specify advanced options, go to Configuration -> Node Templates, select the node template you just created, and click Edit.

Note

An Azure IaaS node template can be edited only when no compute nodes have been created with it.

  • Use Azure Spot VMs

Creating Azure IaaS compute nodes with Azure Spot VMs is a feature introduced in HPC Pack 2019. Using Azure Spot VMs allows you to take advantage of unused Azure compute capacity at a significant cost savings. However, there is no SLA for Azure Spot VMs.

  • The deployment of IaaS compute node may fail if there is no Spot capacity available in the specified Azure region.

  • Any running Azure Spot VM can be evicted and moved to stopped-deallocated state at any point in time when Azure infrastructure needs the capacity back. In that case, the corresponding HPC compute node will be shown in Error node health state, and any tasks running on the node will be interrupted and requeued.

  • Azure infrastructure will not automatically redeploy an evicted Azure Spot VM when Spot capacity is available again. You can try to manually take the node offline and then reboot it on HPC Cluster Manager to redeploy the Azure Spot VM at a later time, but the re-deployment may still fail due to no available Spot capacity.

Screenshot shows the Node Template Editor with the Use Spot V Ms specified as True.

  • DNS Servers

By default, the DNS Servers option is not set, and the network interfaces of the Azure VMs obtain their DNS server settings from the virtual network. If you want to explicitly set the DNS servers for the network interfaces, specify the list of DNS servers separated by commas.

Screenshot shows the Node Template Editor with DNS Servers highlighted and H P C Auto Update specified as True.

  • HPC Auto Update

HPC Pack uses an Azure VM extension to deploy HPC Pack components in the Azure VMs. The HPC Auto Update option specifies whether the Azure VM agent automatically upgrades the HPC Pack version when a new version of the HPC Pack compute node VM extension is published. The default value is False. We strongly recommend that you do NOT set it to True, because an automatic upgrade interrupts running tasks and may cause HPC jobs to fail.