Micky-4218 asked:

HPC 2016 Cluster - run jobs on fewer cores than available

I have a heterogeneous HPC cluster: half the nodes have 56 cores and the other half have 48. When a job runs on a 56-core node, the results differ slightly from when the same job runs on a 48-core node. The difference is small, but it has a major impact on business decisions.

I have tried using job templates to specify the number of cores a job should run with, but the job fails if the numcores value does not match the number of cores the node has.

So my question is: is there a way in HPC to make a node present a specified number of cores, fewer than it actually has, to the jobs it runs, preferably without changing anything in the node BIOS or applying some hardware fix?

Alternatively, is there any other way to make this work, such as a combination of job template configuration and an HPC setting, so that jobs use only the number of cores specified in numcores, no matter how many cores a node actually has?



azure-hpc-pack

1 Answer

YutongSun-5052 answered:

Hi Micky,

Yes. HPC Pack supports undersubscribing or oversubscribing the cores or sockets on compute nodes. Bring the nodes offline, then edit the node properties in HPC Cluster Manager as shown below. The PowerShell cmdlet Set-HpcNode can do the same job.

132158-image.png
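
For the PowerShell route, a minimal sketch might look like the following. It assumes it is run on the head node (or a machine with the HPC Pack client utilities installed) by a cluster administrator. The node names `NODE01` and `NODE02` are hypothetical placeholders for the 56-core nodes, and 48 is the core count from the question.

```powershell
# Load the HPC Pack cmdlets; this is unnecessary inside the "HPC PowerShell" console.
Add-PSSnapin Microsoft.HPC -ErrorAction SilentlyContinue

# Hypothetical names of the 56-core nodes; replace with your own.
$nodes = 'NODE01', 'NODE02'

# Take the nodes offline before editing their properties, as the answer advises.
Set-HpcNodeState -Name $nodes -State Offline

# Tell the scheduler to treat each node as having 48 cores instead of 56.
Set-HpcNode -Name $nodes -SubscribedCores 48

# Bring the nodes back online so they can accept jobs again.
Set-HpcNodeState -Name $nodes -State Online

# Inspect the node properties to confirm the change.
Get-HpcNode -Name $nodes | Format-List *
```

Subscribing sockets instead of cores would use the `-SubscribedSockets` parameter of the same cmdlet.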




Regards,
Yutong Sun



Thank you @YutongSun-5052

A few questions however:

  1. If a node has 40 cores and I specify 32 as the subscribed cores, are the remaining 8 cores still accessible to run jobs with?

  2. Can a user run a job with fewer cores than specified in the subscribed cores?

Micky-4218 commented:

To answer my own questions:

  1. Yes

  2. Yes

This configuration also allows jobs to be split across nodes, so a 50-core job can run across two 25-core nodes.
You will have to include /singlenode:true to force your job to stay on one node.
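
As an illustration, a submission that pins a job to 48 cores on a single node might look like the sketch below. It assumes the HPC Pack `job` command-line tool is on the PATH, and `MyApp.exe` is a hypothetical placeholder for the real command line. The HPC Pack `job` command takes its parameters in `/name:value` form.

```powershell
# Request exactly 48 cores (minimum and maximum both 48) and keep them on one node.
# MyApp.exe is a placeholder; replace it with the actual application command line.
job submit /numcores:48-48 /singlenode:true MyApp.exe
```

Without `/singlenode:true`, the scheduler is free to satisfy the 48-core request with cores drawn from more than one node.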



