Release Notes for HPC Pack 2016 Update 2

These release notes address late-breaking issues and information for the high-performance computing (HPC) cluster administrator about Microsoft HPC Pack 2016 Update 2.

Download and install Microsoft HPC Pack 2016 Update 2

HPC Pack 2016 Update 2 is available for download from the Microsoft Download Center. Download it to the computer that will act as the head node. After the download completes, right-click the zip file of the installation package and click Properties to view the file properties. If a security warning indicates that the file might be blocked, click Unblock. Then extract the installation package files to a local folder and run Setup.exe.
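The unblock-and-extract step can also be scripted. On NTFS, the "blocked" state is a Zone.Identifier alternate data stream attached to the downloaded file, so removing that stream is equivalent to clicking Unblock. A minimal Python sketch under that assumption (the file and folder names are hypothetical):

```python
import os
import zipfile

def extract_package(zip_path: str, dest_dir: str) -> None:
    """Unblock the downloaded installation package and extract it locally."""
    # On Windows, the download "block" marker is a Zone.Identifier
    # alternate data stream; deleting it is equivalent to clicking Unblock.
    ads = zip_path + ":Zone.Identifier"
    if os.name == "nt" and os.path.exists(ads):
        os.remove(ads)
    # Extract the package to a local folder, where Setup.exe can be run.
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
```

For example, `extract_package(r"C:\Downloads\HPCPack2016Update2.zip", r"C:\HPCPack")` extracts the package so that Setup.exe can be started from `C:\HPCPack`.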

Known issues

Head node installation may fail if the installation package is located on a network share

Before you install the last head node of a cluster with three head nodes, copy the installation package to a local folder and run Setup.exe from there. If you run Setup.exe from a network share to install the last head node, the installation may fail while installing the Microsoft Service Fabric cluster. This issue occurs only in clusters with three head nodes. For other roles (including “Prerequisites for a new head node”) or a single-head-node cluster, you can still install from a network share.

Installation fails because of a reboot on a computer without Microsoft .NET Framework 4.6.1 (or later) installed

Microsoft HPC Pack 2016 Update 2 depends on Microsoft .NET Framework 4.6.1. If your computer does not have Microsoft .NET Framework 4.6.1 (or later) installed, HPC Pack tries to install it and reboots the computer after that installation. In this case, run Setup.exe again after the reboot to continue the HPC Pack installation.

Microsoft .NET Framework 4.6.1 installation fails on Windows Server 2012 R2 or Windows 8.1 without KB2919355 installed

If you are installing HPC Pack 2016 Update 2 on Windows Server 2012 R2 or Windows 8.1, KB2919355 must be installed before Microsoft .NET Framework 4.6.1 can be installed.

iSCSI node deployment is not supported

Bare metal node deployment is supported in HPC Pack 2016 Update 2, but iSCSI node deployment is still not supported.

NAT is not supported when configuring network topology

The network topology configuration fails if you select NAT in the Network Configuration Wizard. The workaround is to leave NAT unselected in the wizard and enable NAT manually in the operating system.

Not compatible with earlier versions of HPC Pack compute nodes

Compute nodes running HPC Pack 2016 Update 1 or an earlier version are not supported in HPC Pack 2016 Update 2. You must upgrade your compute nodes to the same version as the head node.

Supporting a very large number of tasks in a job

If you want a job to have more than 30,000 tasks or parametric sweep tasks, modify the existing v_TaskGroup view in the HPCScheduler database as shown below to make it more efficient. This fix will be included in the next QFE release.

if object_id('v_TaskGroup') is not null
    drop view v_TaskGroup
go

create view v_TaskGroup as
select tg.JobID, tg.ID
-- Task.State is a bit mask; each column counts the tasks currently in one state
, COUNT(case when (t.State & 0x1 > 0) then 1 else null end) as Configuring
, COUNT(case when (t.State & 0x2 > 0) then 1 else null end) as Submitted
, COUNT(case when (t.State & 0x4 > 0) then 1 else null end) as Validating
, COUNT(case when (t.State & 0x8 > 0) then 1 else null end) as Queued
, COUNT(case when (t.State & 0x10 > 0) then 1 else null end) as Dispatching
, COUNT(case when (t.State & 0x20 > 0) then 1 else null end) as Running
, COUNT(case when (t.State & 0x40 > 0) then 1 else null end) as Finishing
, COUNT(case when (t.State & 0x80 > 0) then 1 else null end) as Finished
, COUNT(case when (t.State & 0x100 > 0) then 1 else null end) as Failed
, COUNT(case when (t.State & 0x200 > 0) then 1 else null end) as Canceled
, COUNT(case when (t.State & 0x400 > 0) then 1 else null end) as Canceling
from TaskGroup tg
left join TaskDetail td on td.GroupId = tg.id
left join Task t on td.RecordID = t.RecordID
group by tg.JobId, tg.ID
go
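The view counts tasks per state by testing bit flags in Task.State. The same counting logic can be illustrated with a short Python sketch; the flag values are taken from the SQL above, while the sample task data is hypothetical:

```python
from collections import Counter

# Task state flags, as used in the v_TaskGroup view above.
STATES = {
    "Configuring": 0x1,
    "Submitted":   0x2,
    "Validating":  0x4,
    "Queued":      0x8,
    "Dispatching": 0x10,
    "Running":     0x20,
    "Finishing":   0x40,
    "Finished":    0x80,
    "Failed":      0x100,
    "Canceled":    0x200,
    "Canceling":   0x400,
}

def count_by_state(task_states):
    """Count tasks per state, as the view's COUNT(case when ...) columns do."""
    counts = Counter()
    for state in task_states:
        for name, flag in STATES.items():
            if state & flag:  # mirrors (t.State & flag > 0) in the SQL
                counts[name] += 1
    return counts

# Hypothetical sample: three queued tasks, two running, one failed.
sample = [0x8, 0x8, 0x8, 0x20, 0x20, 0x100]
```

With this sample, `count_by_state(sample)` yields Queued = 3, Running = 2, and Failed = 1, matching the per-state columns that the view would produce for one task group.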

Some Azure IaaS compute nodes fail to start

Some Azure IaaS compute nodes may fail to start because of a VM allocation failure or a provisioning timeout. To check the detailed error, open HPC Cluster Manager, select the failed node in Resource Management, and click Provisioning Log. The most likely causes and their workarounds are:

  • The core quota limit was exceeded: Increase the core quota of your Azure subscription.

  • Insufficient capacity for the requested VM size: If the failed IaaS nodes were created in an availability set, stop all the IaaS nodes in that availability set and then start them again; see here for more explanation. If the failed IaaS nodes were not created in an availability set, the desired VM size is currently unavailable in the entire region; retry later, or consider adding new nodes with other VM sizes.

  • The wait-for-node-start operation timed out (by default after 60 minutes): Check the node in the Azure portal; if it is stuck in provisioning, delete it from the portal, and then start the node again from HPC Cluster Manager.