Add Compute Nodes to the DSC

Before you can run LINQ to HPC jobs, you must add the compute nodes to the DSC. To function properly, the DSC needs at least as many nodes as the cluster replication factor. The default cluster replication factor is three, so in that case a minimum of three nodes must be added to the DSC in order to create file sets in it. For more information, see Configuring the replication factor.

In this topic:

  • How to add compute nodes to the DSC

  • Configuring the replication factor

  • Selecting locations for HpcData and HpcTemp

  • Additional steps for compute nodes that run Windows Server 2008

How to add compute nodes to the DSC

To add nodes to the DSC, you can use the dsc node add command. This command adds the specified compute node to the DSC database and configures the compute node for use by the DSC. The node is configured with two file shares, HpcData and HpcTemp. These are used to store the DSC data and temporary files associated with LINQ to HPC jobs. The /datapath and /temppath parameters of the dsc node add command allow you to specify the location for these folders. Ensure that the directories that you specify have enough space to accommodate the files that you want to store there. For additional considerations, see Selecting locations for HpcData and HpcTemp.

The following procedures describe how to run the dsc node add command to add a single node to the DSC, and how to include the command in a script to add more than one node at a time. For more information about the dsc node add command, see Node Operations in the DSC command reference.

To add a node to the DSC

  1. Log on to the head node or client computer as a domain user with administrative permissions on the HPC cluster.

  2. Open a command prompt window. If you are on a client computer, open an elevated command prompt (run as administrator).

  3. Type a command using the following syntax:

    dsc node add [compute node name] /temppath:[local path for HpcTemp share] /datapath:[local path for HpcData share] /service:[head node name]

    For example:

    dsc node add computeNode1 /temppath:c:\L2H\HpcTemp /datapath:c:\L2H\HpcData /service:myHeadNode

    You do not need to specify the /service parameter if the CCP_SCHEDULER environment variable is set on the computer where you are running the command. On the head node computer, CCP_SCHEDULER is set by default to the name of the head node.

  4. Repeat step 3 for each node in the cluster.
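The /service rule in step 3 can be sketched as a small PowerShell helper. Get-DscNodeAddArguments is a hypothetical function name, not part of HPC Pack or the DSC tools. It only assembles the arguments shown in the syntax above and adds /service when a head node name is supplied. When no name is supplied, it assumes that the dsc command reads CCP_SCHEDULER, as described in step 3.

```powershell
# Hypothetical helper (not part of HPC Pack): builds the argument list for
# 'dsc node add' from the parameters described in the procedure above.
function Get-DscNodeAddArguments {
    param(
        [Parameter(Mandatory = $true)][string]$NodeName,
        [Parameter(Mandatory = $true)][string]$TempPath,
        [Parameter(Mandatory = $true)][string]$DataPath,
        [string]$HeadNode
    )

    $arguments = @('node', 'add', $NodeName, "/temppath:$TempPath", "/datapath:$DataPath")

    if ($HeadNode) {
        # A head node name was given explicitly, so pass it with /service.
        $arguments += "/service:$HeadNode"
    }
    elseif (-not $env:CCP_SCHEDULER) {
        # Neither /service nor CCP_SCHEDULER is available.
        throw 'Specify -HeadNode, or set the CCP_SCHEDULER environment variable.'
    }

    return $arguments
}

$dscArgs = Get-DscNodeAddArguments -NodeName 'computeNode1' -TempPath 'c:\L2H\HpcTemp' -DataPath 'c:\L2H\HpcData' -HeadNode 'myHeadNode'

# Print the command line for review.
"dsc $($dscArgs -join ' ')"

# To run the command instead of printing it:
# & dsc @dscArgs
```

Printing the command line before running it makes it easier to confirm the paths and the head node name for each node.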

Note

A node that is already in the DSC must be removed (by using the dsc node remove command) before it can be added again. The dsc node add command might not function as expected if you run it for a node that has already been added to the DSC.

If you are adding a node to the DSC that already has the HpcData and HpcTemp shares configured, it is recommended that you manually delete all the data in any existing shares before you recreate them by running dsc node add. Otherwise, the dsc node add action can take a long time.

Alternatively, you can use HPC PowerShell to add all nodes to the DSC at the same time, as described in the following procedure. The LINQ to HPC code samples include a similar PowerShell script, named dsc-nodes-add, that adds all the nodes within a specified node group to the DSC.

To run ‘dsc node add’ on all nodes in the cluster

  1. Log on to the head node or client computer as a domain user with administrative permissions on the HPC cluster.

  2. Open an elevated HPC PowerShell window: Click Start, point to All Programs, click Microsoft HPC Pack, right-click HPC PowerShell, and then click Run as administrator.

  3. Type the following script to add all nodes to the DSC:

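    # Get every node in the cluster (run this in an HPC PowerShell session).
    $nodes = Get-HpcNode
    foreach ($n in $nodes)
    {
      # Add the node to the DSC by its NetBIOS name.
      $name = $n.NetBiosName
      dsc node add $name /temppath:c:\LinqHpc\Temp /datapath:c:\LinqHpc\Data /service:MyHeadNode
    }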

The Get-HpcNode cmdlet returns a collection of objects, one for each node in the cluster. The script loops through the collection and runs dsc node add for each node. Get-HpcNode includes additional parameters that you can specify to return a subset of nodes (for example, to run a command only on the compute nodes).
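As a minimal sketch of working with a subset of nodes, the following function accepts node objects from the pipeline and runs dsc node add for each one. Add-NodesToDsc and its -DryRun switch are hypothetical names introduced for illustration. The sketch assumes that each node object exposes a NetBiosName property, as in the script above, and it reuses the example paths from that script.

```powershell
# Hypothetical helper (not part of HPC Pack): runs 'dsc node add' for every
# node object piped in. With -DryRun it only returns the command lines.
function Add-NodesToDsc {
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]$Node,
        [Parameter(Mandatory = $true)][string]$HeadNode,
        [string]$TempPath = 'c:\LinqHpc\Temp',
        [string]$DataPath = 'c:\LinqHpc\Data',
        [switch]$DryRun
    )
    process {
        $dscArgs = @('node', 'add', $Node.NetBiosName,
                     "/temppath:$TempPath", "/datapath:$DataPath", "/service:$HeadNode")
        if ($DryRun) {
            # Return the command line so that it can be reviewed first.
            "dsc $($dscArgs -join ' ')"
        }
        else {
            & dsc @dscArgs
        }
    }
}

# In an HPC PowerShell session, restrict the set with the -GroupName parameter
# (ComputeNodes is assumed here; substitute your own node group):
# Get-HpcNode -GroupName ComputeNodes | Add-NodesToDsc -HeadNode MyHeadNode -DryRun
```

Running with -DryRun first shows exactly which nodes would be added, which helps avoid adding the head node or other unintended nodes to the DSC.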

Configuring the replication factor

You can change the replication factor on your cluster depending on the desired tradeoff between storage overhead and tolerance to node failures. The replication factor controls how many copies (that is, replicas) of each DSC file in each new DSC file set will be created. The value must be 1, 2, 3, or 4. A value of 1 means that no redundant copies will be kept. Without replication, you can lose data if a DSC node fails. Also, the graph manager has fewer options when it decides where to execute vertices, which can result in decreased performance for LINQ to HPC queries.

You can set the replication factor property by using the dsc params command (for detailed information, see PARAMS in the DSC command reference). For example, to set the replication factor to two, type the following command:

    dsc params set replicationFactor 2
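The storage side of this tradeoff is simple arithmetic, sketched below. Both function names are hypothetical. The storage estimate assumes only that each DSC file is stored once per replica, and it ignores temporary file sets and any other overhead. The node count check restates the requirement that the DSC needs at least as many nodes as the replication factor.

```powershell
# Hypothetical helper: approximate raw disk space used across the DSC for a
# file set, assuming each DSC file is stored once per replica.
function Get-DscRawStorageGB {
    param(
        [Parameter(Mandatory = $true)][double]$FileSetSizeGB,
        [ValidateRange(1, 4)][int]$ReplicationFactor = 3
    )
    $FileSetSizeGB * $ReplicationFactor
}

# Hypothetical helper: the DSC needs at least as many nodes as the
# replication factor before file sets can be created.
function Test-DscNodeCount {
    param(
        [Parameter(Mandatory = $true)][int]$NodeCount,
        [ValidateRange(1, 4)][int]$ReplicationFactor = 3
    )
    $NodeCount -ge $ReplicationFactor
}

Get-DscRawStorageGB -FileSetSizeGB 100 -ReplicationFactor 3   # 300
Get-DscRawStorageGB -FileSetSizeGB 100 -ReplicationFactor 2   # 200
Test-DscNodeCount -NodeCount 2 -ReplicationFactor 3           # False
```

For example, lowering the replication factor from three to two reduces the estimated space for a 100 GB file set from 300 GB to 200 GB, at the cost of tolerating fewer node failures.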

Selecting locations for HpcData and HpcTemp

When specifying the location for the HpcData and HpcTemp shared folders, consider the following:

  • Assigning the HpcData and HpcTemp shares to the same partition is recommended, but can lead to small inaccuracies in the size of the available free space calculated by the DSC. This is because the DSC assumes that the whole partition is allocated to the DSC data and that no other files are present. Typically, HpcTemp is small relative to HpcData, so this inaccuracy is insignificant.

  • The recommended configuration for these shares is to associate both shares with one volume where that volume is constructed using software striping across multiple physical disks. The striped physical disks should be different from the system disk.

  • Usage of the HpcData share is governed by the size of the file sets created by users and by the replication factor used. In addition to file sets created by users, temporary file sets are created during query execution. These file sets have a lease time of 24 hours and are automatically removed.

  • The HpcTemp share is used to store information related to LINQ to HPC queries. Information is stored in a UserName\jobID folder for each job. This share is automatically cleaned up, and files relating to jobs older than 24 hours are removed. Typically, each job creates megabytes of data.

Additional steps for compute nodes that run Windows Server 2008

A cluster that supports LINQ to HPC workloads must have the Windows Server 2008 R2 operating system installed on the head node. However, the compute nodes can have either the Windows Server 2008 or the Windows Server 2008 R2 operating system installed.

In order to run LINQ to HPC on compute nodes that run Windows Server 2008, you must manually install the Microsoft Visual C++ 2008 SP1 Redistributable Package (x64) on the compute node (this will require a reboot).

The Microsoft Visual C++ 2008 SP1 Redistributable Package (x64) can be downloaded from: https://www.microsoft.com/downloads/en/details.aspx?familyid=BA9257CA-337F-4B40-8C14-157CFDFFEE4E&displaylang=en.

You might also receive the following error message if you try to add a Windows Server 2008 compute node to the DSC while its firewall is enabled: “\\[cnname]\HpcData already exists. Failed to get disk free space for \\[cnname]\C:, The RPC server is unavailable. (Exception from HRESULT: 0x800706BA).”

If this error occurs, do the following:

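  1. Run the following command to turn on remote administration on the compute nodes. This example targets a node group named LinqToHpcNodes; substitute the node group that contains your compute nodes: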

    clusrun /nodegroup:LinqToHpcNodes netsh firewall set service type=remoteadmin
    
  2. Run the dsc node add command again with the appropriate temp and data path parameters:

    dsc node add [cnname] ...
    
  3. The command should succeed.

For nodes that do not have the firewall enabled, dsc node add should work normally.

Important

After you add a node running the Windows Server 2008 operating system to the DSC, you must restart the node.