HPC Pack 2016 - GPU Information on workstation node with Win10 1909

Thomas 21 Reputation points
2021-02-25T09:37:37.14+00:00

We have upgraded an HPC Pack 2016 workstation node with an NVIDIA RTX 2080 to Windows 10 1909. Since this upgrade, the Cluster Manager of HPC Pack 2016 U3 no longer shows any GPU information for this node. We found that Windows 10 1909 comes with a newer NVIDIA driver that no longer places nvidia-smi.exe where HPC Pack 2016 searches for it. It is now in C:\Windows\System32 instead of C:\Program Files\NVIDIA Corporation\NVSMI.
Is there any way to fix this so that HPC Pack 2016 finds nvidia-smi.exe in its new location and reports the GPU to the Cluster Manager?
A workaround is to uninstall the newer driver and install an older one.

Here is the output from nvidia-smi:

C:\Users\admin-hpc>nvidia-smi.exe
Thu Feb 25 10:34:33 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 432.00       Driver Version: 432.00       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 2080   WDDM  | 00000000:04:00.0  On |                  N/A |
| 30%   32C    P8    17W / 225W |    541MiB /  8192MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

Could you please help us out here?

Best regards,
Thomas

Azure Virtual Machines

Accepted answer
  1. vipullag-MSFT 24,106 Reputation points Microsoft Employee
    2021-02-26T04:53:51.057+00:00

    @Thomas

    I checked with the internal team on this. You can try adding a registry entry 'NvmlPath' under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\HPC with a string value of C:\Windows\System32, and then restarting the HpcManagement service on the node.
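
For anyone who wants to script the suggestion above, here is a minimal sketch in Python. It is not an official HPC Pack tool. It assumes that 'NvmlPath' is a REG_SZ entry under the HPC key, that the key already exists on the node, and that the script runs from an elevated prompt on the Windows node. The function names `find_nvidia_smi_dir`, `set_nvml_path`, `restart_service`, and `apply_workaround` are made up for illustration.

```python
import os
import subprocess

HPC_KEY = r"SOFTWARE\Microsoft\HPC"  # key named in the answer
VALUE_NAME = "NvmlPath"              # entry named in the answer
SERVICE = "HpcManagement"            # service named in the answer

# Directories mentioned in the question: new driver location first, then the old one.
CANDIDATE_DIRS = [
    r"C:\Windows\System32",
    r"C:\Program Files\NVIDIA Corporation\NVSMI",
]


def find_nvidia_smi_dir(candidates=CANDIDATE_DIRS, exe="nvidia-smi.exe"):
    """Return the first directory that contains nvidia-smi.exe, or None."""
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, exe)):
            return directory
    return None


def set_nvml_path(directory):
    """Write NvmlPath under HKLM\\SOFTWARE\\Microsoft\\HPC (Windows only).

    Assumption: the entry is a plain string (REG_SZ) and the key exists.
    """
    import winreg  # imported here because the module only exists on Windows

    access = winreg.KEY_SET_VALUE | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, HPC_KEY, 0, access) as key:
        winreg.SetValueEx(key, VALUE_NAME, 0, winreg.REG_SZ, directory)


def restart_service(name=SERVICE):
    """Stop and start the service with the built-in 'net' command."""
    subprocess.run(["net", "stop", name], check=True)
    subprocess.run(["net", "start", name], check=True)


def apply_workaround():
    """Locate nvidia-smi.exe, point NvmlPath at its directory, restart the service."""
    found = find_nvidia_smi_dir()
    if found is None:
        raise FileNotFoundError("nvidia-smi.exe not found in the expected directories")
    set_nvml_path(found)
    restart_service()
    return found
```

Calling `apply_workaround()` on the node performs all three steps. The lookup is kept separate from the registry write so that you can first check which directory would be used before changing anything.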

    Hope this helps.

    Please 'Accept as answer' if the provided information is helpful, so that it can help others in the community looking for help on similar topics.

    1 person found this answer helpful.

1 additional answer

  1. Thomas 21 Reputation points
    2021-03-04T13:14:56.427+00:00

    Perfect! That helps a lot. One further question: can you correct this in the source code or add it to the documentation?