Manage Certificates for an HPC Pack 2016 or Later Cluster

An HPC Pack cluster uses several certificates for different purposes. The full list follows:

Certificate: Microsoft HPC Azure Client
Purpose and description: Used by the head node(s) to communicate with Azure PaaS proxy nodes. It is a self-signed certificate auto-generated by the HPC cluster.
Install locations:
  • Head node(s): LocalComputer\Personal

Certificate: Microsoft HPC Azure Service
Purpose and description: Used by Azure PaaS proxy nodes to communicate with the head node(s). It is a self-signed certificate auto-generated by the HPC cluster.
Install locations:
  • Azure proxy nodes: LocalComputer\Personal

Certificate: Microsoft HPC Azure Management
Purpose and description: Used by the head node(s) to communicate with the Azure Management Service to manage Azure resources in classic mode.
Install locations:
  • Head node(s): LocalComputer\Personal; LocalComputer\Trusted Root CA (self-signed only)
  • Azure Portal: Subscriptions\Management certificates

Certificate: Azure Service Principal Certificate
Purpose and description: Used by the head node(s) to communicate with Azure Resource Manager to manage Azure resources in resource manager mode. You can use the same certificate as Microsoft HPC Azure Management.
Install locations:
  • Head node(s): LocalComputer\Personal
  • Azure Service Principal

Certificate: HPC Pack Communication for Head node
Purpose and description: Used by the head node(s) to communicate with other head nodes and with compute, broker, workstation, and Linux nodes (that is, all nodes except Azure PaaS nodes and Azure Batch pools). If it is a self-signed certificate and you plan to use the Burst to Azure IaaS VM feature to deploy Azure IaaS compute nodes, you must also import this certificate into Azure Key Vault so that the Azure IaaS compute nodes can use it to communicate with the head node(s).
Install locations:
  • Head node(s) and IaaS compute nodes(1): LocalComputer\Personal; LocalComputer\Trusted Root CA (self-signed only)
  • On-premises Windows nodes and HPC Client(2)(3): LocalComputer\Trusted Root CA (self-signed only)

Certificate: HPC Pack Communication for compute node
Purpose and description: Used by compute, broker, workstation, and Linux nodes to communicate with the head node(s). You can use the same certificate as HPC Pack Communication for Head node. For an HPC Pack cluster entirely in Azure, or a hybrid cluster with Azure IaaS compute nodes, we recommend using the same certificate as HPC Pack Communication for Head node.
Install locations:
  • On-premises Windows nodes: LocalComputer\Personal; LocalComputer\Trusted Root CA (self-signed only)
  • Linux nodes: /opt/hpcnodemanager/cert
  • Head node(s): LocalComputer\Trusted Root CA (self-signed only)

Certificate: Service Fabric Certificate
Purpose and description: Used by the head nodes to secure Service Fabric cluster communication. By default, it is the same certificate as HPC Pack Communication for Head node. You can add an additional certificate for the Service Fabric cluster.
Install locations:
  • Head nodes: LocalComputer\Personal; LocalComputer\Trusted Root CA (self-signed only); CurrentUser\Personal(4)

(1) Here, the term IaaS compute nodes means compute nodes deployed with the Burst to Azure IaaS VM feature or the HPC Pack cluster deployment template. If you manually run the HPC setup wizard (setup.exe) to install an HPC compute node on an Azure IaaS VM, treat it as an on-premises compute node.

(2) For a domain-joined HPC Client machine, you can skip installing the HPC Pack Communication for Head node certificate in the Local Computer\Trusted Root CA store in either of the following two ways:

  • During HPC Client installation, choose "Skip CA and CN validation".

  • Add a registry value named CertificateValidationType with DWORD value 0 under the registry key HKLM\SOFTWARE\Microsoft\HPC.

(3) For a non-domain-joined HPC Client machine, you must install the HPC Pack Communication for Head node certificate in Local Computer\Personal with the private key and in CurrentUser\Trusted Root CA without the private key. Then add a registry value named SSLThumbprint under the registry key HKLM\SOFTWARE\Microsoft\HPC and set it to the certificate thumbprint.

(4) If you want to access the Service Fabric cluster portal (https://localhost:10400) on the head node, you must also install the certificate, with its private key, under CurrentUser\Personal.
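
Footnote (3) asks for the certificate thumbprint in the SSLThumbprint registry value. A thumbprint copied from a certificate dialog can carry spaces or invisible characters, so a small clean-up step may help before writing it. The following is a minimal sketch, not part of HPC Pack: the function name ConvertTo-NormalizedThumbprint is hypothetical, and the commented registry write assumes that SSLThumbprint is stored as a string value.

```powershell
# Hypothetical helper: strip everything except hex digits from a thumbprint.
function ConvertTo-NormalizedThumbprint {
    param([Parameter(Mandatory)][string]$Thumbprint)

    # Drop spaces, colons, and invisible characters; keep hex digits only.
    $clean = ($Thumbprint -replace '[^0-9a-fA-F]', '').ToUpperInvariant()

    # A SHA-1 certificate thumbprint is 20 bytes, that is, 40 hex digits.
    if ($clean.Length -ne 40) {
        throw "Expected 40 hex digits but found $($clean.Length)."
    }
    $clean
}

# Possible use on the HPC Client machine (Windows only, elevated session).
# The String value type is an assumption, not taken from the HPC Pack documentation:
# $clean = ConvertTo-NormalizedThumbprint '<thumbprint copied from the certificate>'
# New-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint -PropertyType String -Value $clean -Force
```

Failing early on a wrong length is a deliberate choice here: a truncated thumbprint would otherwise only surface later as a connection failure.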

Rotate HPC Pack Node Communication Certificates

Microsoft HPC Pack 2016 (and later) uses certificates to secure communication between the HPC nodes. You can use the same certificate on all HPC nodes, or two different certificates: one for the head node(s) and one for the other nodes. Rotate the certificate(s) before they expire; otherwise, the HPC Pack cluster will stop working.
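
Because an expired certificate stops the cluster, it can be useful to know how many days remain. The following is a minimal sketch under stated assumptions: the function name Get-CertificateDaysRemaining is hypothetical, and the commented lookup assumes the node stores the thumbprint of its communication certificate in the SSLThumbprint registry value under HKLM\SOFTWARE\Microsoft\HPC.

```powershell
# Hypothetical helper: whole days left before a certificate expires.
# A negative result means the certificate has already expired.
function Get-CertificateDaysRemaining {
    param(
        [Parameter(Mandatory)][datetime]$NotAfter,
        [datetime]$Now = (Get-Date)
    )
    [int][math]::Floor(($NotAfter - $Now).TotalDays)
}

# On an HPC node (Windows only), the expiry date could come from the certificate store:
# $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbprint
# $cert = Get-Item Cert:\LocalMachine\My\$thumbprint
# Get-CertificateDaysRemaining -NotAfter $cert.NotAfter
```

Passing the dates in as parameters keeps the calculation separate from the Windows-only certificate lookup, so it can be checked on any machine.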

The certificate(s) must meet the following requirements:

  1. Have a private key capable of key exchange;
  2. Key usage includes Digital Signature and Key Encipherment;
  3. Enhanced key usage includes Client Authentication and Server Authentication;
  4. If two different certificates are used, they must have the same subject name.

If the certificate is used to secure Service Fabric Cluster as well, it must meet the following additional requirements:

  1. The certificate's provider must be Microsoft Enhanced RSA and AES Cryptographic Provider;
  2. The RSA key length must be 2048 bits.
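
Some of these requirements can be inspected from PowerShell through the .NET X509Certificate2 type. The following is a minimal sketch, not an official validation tool: the function name Test-HpcCertificateRequirement is hypothetical, and it only reports key usage, enhanced key usage, RSA key size, and whether a private key is present. It does not check the KeySpec or the cryptographic provider, which still need the CertUtil.exe dump described later.

```powershell
# Hypothetical helper: report which certificate requirements can be confirmed from .NET.
function Test-HpcCertificateRequirement {
    param(
        [Parameter(Mandatory)]
        [System.Security.Cryptography.X509Certificates.X509Certificate2]$Certificate
    )

    $keyUsageExt = $Certificate.Extensions |
        Where-Object { $_ -is [System.Security.Cryptography.X509Certificates.X509KeyUsageExtension] } |
        Select-Object -First 1
    $ekuExt = $Certificate.Extensions |
        Where-Object { $_ -is [System.Security.Cryptography.X509Certificates.X509EnhancedKeyUsageExtension] } |
        Select-Object -First 1

    # Key usage must include Digital Signature and Key Encipherment.
    $flags = [System.Security.Cryptography.X509Certificates.X509KeyUsageFlags]
    $required = [int]($flags::DigitalSignature) -bor [int]($flags::KeyEncipherment)
    $keyUsageOk = $false
    if ($keyUsageExt) {
        $keyUsageOk = (([int]$keyUsageExt.KeyUsages -band $required) -eq $required)
    }

    # Enhanced key usage must include Server Authentication (…3.1) and Client Authentication (…3.2).
    $ekuOids = @()
    if ($ekuExt) { $ekuOids = @($ekuExt.EnhancedKeyUsages | ForEach-Object { $_.Value }) }
    $ekuOk = ($ekuOids -contains '1.3.6.1.5.5.7.3.1') -and ($ekuOids -contains '1.3.6.1.5.5.7.3.2')

    # RSA key size; 2048 bits is required when the certificate also secures Service Fabric.
    $rsa = [System.Security.Cryptography.X509Certificates.RSACertificateExtensions]::GetRSAPublicKey($Certificate)
    $keySize = if ($rsa) { $rsa.KeySize } else { 0 }

    [pscustomobject]@{
        Subject            = $Certificate.Subject
        HasPrivateKey      = $Certificate.HasPrivateKey
        KeyUsageOk         = $keyUsageOk
        EnhancedKeyUsageOk = $ekuOk
        RsaKeySize         = $keySize
    }
}

# Possible use on an HPC node (Windows only):
# Test-HpcCertificateRequirement -Certificate (Get-Item Cert:\LocalMachine\My\<thumbprint>)
```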

Prepare new certificate

When you prepare the new certificate, make sure it uses the same subject name as the old certificate. To get the subject name of the current certificate, run the following PowerShell commands on the HPC node.

$thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
$subjectName = (Get-Item Cert:\LocalMachine\My\$thumbprint).Subject
$subjectName

If you are using a self-signed certificate, you can run the following PowerShell commands (fill in the correct subject name) on a machine running Windows 10 or Windows Server 2016 to generate a new certificate that meets all six requirements above. The commands produce two files in a folder named after the thumbprint of the new certificate: PrivateCert.pfx, with the private key, and PublicCert.cer, without the private key.

# The subject name must match that of the old certificate.
$subjectName = "<subject-name>"
# Create the certificate in the current user's Personal store.
$pfxcert = New-SelfSignedCertificate -Subject $subjectName -KeySpec KeyExchange -KeyLength 2048 -HashAlgorithm SHA256 -TextExtension @("2.5.29.37={text}1.3.6.1.5.5.7.3.1,1.3.6.1.5.5.7.3.2") -Provider "Microsoft Enhanced RSA and AES Cryptographic Provider" -CertStoreLocation Cert:\CurrentUser\My -KeyExportPolicy Exportable -NotAfter (Get-Date).AddYears(10) -NotBefore (Get-Date).AddDays(-1)
$certThumbprint = $pfxcert.Thumbprint
# Export both files to a temp folder named after the thumbprint.
$null = New-Item $env:Temp\$certThumbprint -ItemType Directory
$pfxPassword = Get-Credential -UserName 'Protection password' -Message 'Enter protection password below'
Export-PfxCertificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PrivateCert.pfx" -Password $pfxPassword.Password
Export-Certificate -Cert Cert:\CurrentUser\My\$certThumbprint -FilePath "$env:Temp\$certThumbprint\PublicCert.cer" -Type CERT -Force
# Open the output folder.
start "$env:Temp\$certThumbprint"

If you are using a certificate signed by a certificate authority (CA), or an existing self-signed certificate, run the following command and check the values of KeySpec, Subject, Key Usage, Enhanced Key Usage, Public Key Length, and Provider.

    CertUtil.exe -p "<password>" -v -dump <path-of-pfxFile>

  • If the value of Subject, Key Usage, Enhanced Key Usage, or Public Key Length doesn't match, you must regenerate the certificate.

  • If only the value of KeySpec (which should be "1 -- AT_KEYEXCHANGE") or Provider doesn't match, you don't need to regenerate the certificate. Run the following command to import the certificate with the corrected KeySpec and Provider values, and then run certlm.msc to export the certificate (including the private key) to a new PFX file that meets the requirements.

    CertUtil.exe -f -p "<password>" -csp "Microsoft Enhanced RSA and AES Cryptographic Provider" -importpfx "<path-of-pfxFile>" AT_KEYEXCHANGE
    

Rotate certificate on Broker/Compute/Workstation nodes

  1. In HPC Cluster Manager, go to Deployment To-do List, and click Import a certificate for deployment to import the new CN certificate (for example, PrivateCert.pfx). The new CN certificate is then copied to the Certificates folder under the HPC install share (that is, \\<headnode>\REMINST\Certificates) with the new name HpcCnCommunication.pfx.
  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and put it in the HPC install share (\\<headnode>\REMINST). Open HPC Cluster Manager and click Resource Management -> Nodes. Select all the Windows compute, broker, and workstation nodes (make sure head nodes are NOT included), click Run Command, enter the following command line (fill in the correct values for head node and password), and click Run:

    PowerShell.exe -ExecutionPolicy ByPass -Command "\\<headnode>\REMINST\Update-HpcNodeCertificate.ps1 -PfxFilePath \\<headnode>\REMINST\Certificates\HpcCnCommunication.pfx -Password <password> -RunAsScheduledTask"

  3. If you have Linux compute nodes, open HPC Cluster Manager on the head node, and click Resource Management -> Nodes. Select all the Linux nodes, click Run Command, and run the following commands in sequence.

    First, create a temp directory on all Linux nodes.

    mkdir /tmp/hpcreminst
    

    Second, mount HPC install share on all Linux nodes (fill the correct values for head node, domain name and domain user credentials).

    mount -t cifs //<headnode>/REMINST /tmp/hpcreminst -o vers=2.1,domain=<domainname>,username=<username>,password='<userpassword>',dir_mode=0755,file_mode=0755
    

    Third, schedule a job to rotate certificate on all Linux nodes (fill the correct values for head node, certificate protection password).

    cd /tmp/hpcreminst; echo "python /opt/hpcnodemanager/setup.py -certfile:/tmp/hpcreminst/Certificates/HpcCnCommunication.pfx -certpassword:<password>" | at now + 1 minute
    

Rotate certificate for single Head node

  1. If the new head node certificate is self-signed, you must make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate file PublicCert.cer to the Certificates folder under the HPC install share (\\<headnode>\REMINST\Certificates) with the new name HpcHnPublicCert.cer.
    • Open HPC Cluster Manager -> Resource Management -> Nodes, select all the Windows compute, broker, and workstation nodes, click Run Command, and run the following command line (fill in the correct head node) to make them trust the new head node certificate:

    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-Certificate -FilePath \\<headnode>\REMINST\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. Download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to apply the new certificate (PrivateCert.pfx):

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>
    
  3. If you are using the Burst to Azure IaaS VM feature, in HPC Cluster Manager, click Configuration -> Set Azure Deployment Configuration and import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Alternatively, manually import PrivateCert.pfx into Azure Key Vault, and then specify the values on the Azure Key Vault Certificate page of the Set Azure Deployment Configuration wizard.

Rotate certificate for high availability Head nodes (Service Fabric cluster or HPC Pack 2019 Built-in High availability architecture)

  1. If the new head node certificate is self-signed, you must make all the Windows cluster nodes trust this new self-signed certificate before rotating.

    • Copy the new public certificate PublicCert.cer file to the Certificates folder under the HPC install share (\\<InstallShare>\Certificates) with new name HpcHnPublicCert.cer. You can use the following PowerShell command to get the HPC install share.

      Add-PSSnapin Microsoft.HPC
      Get-HpcClusterRegistry -PropertyName InstallShare
      
    • Open HPC Cluster Manager -> Resource Management -> Nodes, select all the Windows cluster nodes (including all head nodes), click Run Command, and run the following command line (fill in the correct install share) to make them trust the new head node certificate:

    PowerShell.exe -ExecutionPolicy ByPass -Command "Import-certificate -FilePath \\<InstallShare>\Certificates\HpcHnPublicCert.cer -CertStoreLocation cert:\LocalMachine\Root"
    
  2. On every head node, download the PowerShell script Update-HpcNodeCertificate.ps1 and run the following PowerShell command to import and apply the new certificate (PrivateCert.pfx):

    .\Update-HpcNodeCertificate.ps1 -PfxFilePath <path-of-PrivateCert.pfx> -Password <password>

  3. On one head node (any one), run the following PowerShell commands to apply the new certificate, which should already be installed on all the head nodes.

    Add-PSSnapin Microsoft.HPC
    $thumbprint = (Get-ItemProperty -Path HKLM:\SOFTWARE\Microsoft\HPC -Name SSLThumbprint).SSLThumbPrint
    Set-HpcClusterRegistry -PropertyName SSLThumbprint -PropertyValue $thumbprint

  4. If you are using the Burst to Azure IaaS VM feature, in HPC Cluster Manager, click Configuration -> Set Azure Deployment Configuration and import the new certificate PrivateCert.pfx on the Azure Key Vault Certificate page. Alternatively, manually import PrivateCert.pfx into Azure Key Vault, and then specify the values on the Azure Key Vault Certificate page of the Set Azure Deployment Configuration wizard.

  5. [Service Fabric cluster only] If you are using the same certificate to secure the Service Fabric cluster, check whether a Service Fabric cluster configuration upgrade is required. On one head node (any one), run the following PowerShell commands to check the current security configuration of the Service Fabric cluster.

    Connect-ServiceFabricCluster
    Get-ServiceFabricClusterConfiguration | Out-File d:\sfclusterconfig.json
    

    If the security configuration looks like the following, a Service Fabric cluster configuration upgrade is not required, provided that the new certificate is issued by the same issuer.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            },
            "ServerCertificateCommonNames": {
              "CommonNames": [
                {
                  "CertificateCommonName": "[CertificateCommonName]",
                  "CertificateIssuerThumbprint": "[IssuerThumbprint]"
                }
              ],
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
    

    If the security configuration is as below, you need to upgrade the Service Fabric cluster configuration.

        "Security": {
          "CertificateInformation": {
            "ClusterCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            },
            "ServerCertificate": {
              "Thumbprint": "[Thumbprint]",
              "X509StoreName": "My"
            }
          },
          "ClusterCredentialType": "X509",
          "ServerCredentialType": "X509"
        },
    

For more information about certificate rollover for a Service Fabric cluster, refer to Upgrade Service Fabric cluster certificate configuration and Secure a standalone Service Fabric cluster.
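
The check described above, thumbprint-based versus common-name-based security configuration, can be sketched in PowerShell. This is an illustration only: the function name Get-SfCertificateConfigMode is hypothetical, and the commented usage assumes that the Security section sits under Properties in the saved file, which may differ in your output.

```powershell
# Hypothetical helper: classify how the Service Fabric security section pins its certificates.
# 'Thumbprint' -> a configuration upgrade is needed after rotating the certificate.
# 'CommonName' -> no upgrade is needed if the new certificate comes from the same issuer.
function Get-SfCertificateConfigMode {
    param([Parameter(Mandatory)]$Security)

    $info = $Security.CertificateInformation
    if ($info.ClusterCertificate.Thumbprint -or $info.ServerCertificate.Thumbprint) {
        return 'Thumbprint'
    }
    if ($info.ClusterCertificateCommonNames.CommonNames -or $info.ServerCertificateCommonNames.CommonNames) {
        return 'CommonName'
    }
    'Unknown'
}

# Possible use with the file saved earlier (the Properties.Security path is an assumption):
# $config = Get-Content d:\sfclusterconfig.json -Raw | ConvertFrom-Json
# Get-SfCertificateConfigMode -Security $config.Properties.Security
```

Returning 'Unknown' instead of guessing keeps the decision with the operator when the file does not match either layout.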

Upgrade certificate configuration for Service Fabric cluster

  1. Modify the file sfclusterconfig.json as below:

    • Replace the value of Thumbprint under ClusterCertificate and ServerCertificate
    • Remove the properties with name "$id" (if any) under Security and CertificateInformation
    • Change clusterConfigurationVersion to a higher version, for example from 1.0.0 to 1.0.1
  2. Run the following PowerShell commands to start the Service Fabric cluster configuration upgrade.

    Connect-ServiceFabricCluster
    Start-ServiceFabricClusterConfigurationUpgrade -ClusterConfigPath d:\sfclusterconfig.json

  3. Use the following command to query the upgrade status.

    Get-ServiceFabricClusterConfigurationUpgradeStatus
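
The three manual edits to sfclusterconfig.json could also be scripted. The following is a rough sketch under stated assumptions, not a supported tool: the function name Update-SfClusterConfigCertificate is hypothetical, the Security section is assumed to sit under Properties, and the version is assumed to be a dotted number such as 1.0.0 stored at the top level of the file. Round-tripping JSON through PowerShell can change formatting, so review the result before using it for an upgrade.

```powershell
# Hypothetical helper: apply the three edits to a thumbprint-based cluster configuration.
function Update-SfClusterConfigCertificate {
    param(
        [Parameter(Mandatory)][string]$Json,
        [Parameter(Mandatory)][string]$NewThumbprint
    )

    $config = $Json | ConvertFrom-Json
    $security = $config.Properties.Security   # assumed location of the Security section
    if (-not $security) { throw 'Security section not found under Properties.Security.' }
    $certInfo = $security.CertificateInformation

    # 1. Replace the thumbprint under ClusterCertificate and ServerCertificate.
    $certInfo.ClusterCertificate.Thumbprint = $NewThumbprint
    $certInfo.ServerCertificate.Thumbprint = $NewThumbprint

    # 2. Remove "$id" properties (if any) under Security and CertificateInformation.
    foreach ($node in @($security, $certInfo)) {
        if ($node.PSObject.Properties['$id']) { $node.PSObject.Properties.Remove('$id') }
    }

    # 3. Raise the configuration version by incrementing its last component.
    $parts = ([string]$config.ClusterConfigurationVersion).Split('.')
    $parts[-1] = [string]([int]$parts[-1] + 1)
    $config.ClusterConfigurationVersion = $parts -join '.'

    $config | ConvertTo-Json -Depth 100
}

# Possible use (paths as in the steps above; the output file name is only an example):
# $json = Get-Content d:\sfclusterconfig.json -Raw
# Update-SfClusterConfigCertificate -Json $json -NewThumbprint '<new-thumbprint>' |
#     Set-Content d:\sfclusterconfig.updated.json
```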