Azure Cluster Management Pack client - HPC 2019

The Azure Cluster Management (ACM) Pack client is a set of command-line tools for diagnosing Azure HPC clusters. The tools are distributed as a Python package, hpc-acm-cli, which is built on the ACM Pack API.

Prerequisites

Python 2.7, 3.5 or 3.6 is required.
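
As a quick check before installing, the following minimal Python sketch compares an interpreter version against the versions listed above. The helper name is_supported_python is hypothetical and not part of hpc-acm-cli. Whether versions outside this list work is not stated here, so the check only reflects the list as written.

```python
import sys

# Versions listed in the prerequisites above, as (major, minor) pairs.
SUPPORTED_VERSIONS = {(2, 7), (3, 5), (3, 6)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is one of the listed versions.

    Hypothetical helper for illustration; not part of hpc-acm-cli.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    print("Listed as supported:", is_supported_python())
```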

Installation

There are several ways to install the client. Installing from the Python Package Index (PyPI) is recommended. The other methods are mainly for package development.

Install from PyPI

This command is a standard way to install a Python package.

python -m pip install --user hpc-acm-cli

Note

On some Linux distributions, the python command is named python2 or python3, for Python 2 or Python 3 respectively.

Install from GitHub

You can install the latest code in development from GitHub by running this command:

python -m pip install --user git+https://github.com/Azure/hpcpack-acm-cli.git#egg=hpc-acm-cli

Install from source

Get the source code to your computer and then run this command:

python -m pip install --user -e <path-to-the-source-directory>

Note

The -e option enables the editable mode for the package. Any change you make in the source takes effect without reinstallation.

Usage

After installation, three commands are available, for checking cluster nodes, checking and running diagnostic jobs, and checking and running general commands, respectively:

  • clusnode
  • clusdiag
  • clusrun

The commands each have subcommands, such as list, show, and new.

  • Run a command with the -h parameter to list its subcommands, like clusnode -h.
  • For help about a subcommand, for instance, list, run clusnode list -h.
  • All these commands require some common parameters, like host, issuer-url, client-id, and client-secret. You can save their values in a configuration file to avoid entering them repeatedly. See the Configuration section below for more information.
  • The example commands below assume you have the required parameters provided in the configuration file. Otherwise, you'll get an error at runtime.

clusnode

clusnode is for checking cluster nodes.

For example, to list the nodes in a cluster, run this command:

clusnode list

By default, it lists 100 nodes at once. To list more, use the --count parameter:

clusnode list --count 1000

The --last-id parameter supports paging. Refer to the command help for more information.
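
The interaction between --count and --last-id can be pictured as cursor-style paging. The sketch below is an assumption about that behavior, not a description of the actual implementation: it assumes each page holds at most count nodes and that last-id selects nodes whose id is greater than the given value. The functions fetch_page and iter_all_nodes and the sample data are hypothetical and run entirely in memory. Check clusnode list -h for the real semantics.

```python
# Hypothetical in-memory stand-in for a node listing; ids and names are made up.
NODES = [{"id": i, "name": "node%03d" % i} for i in range(1, 8)]


def fetch_page(count, last_id=0):
    """Return up to `count` nodes whose id is greater than `last_id`.

    ASSUMPTION: this mirrors how --count and --last-id might combine.
    """
    return [n for n in NODES if n["id"] > last_id][:count]


def iter_all_nodes(count):
    """Walk every page by feeding the last seen id back in as the cursor."""
    last_id = 0
    while True:
        page = fetch_page(count, last_id)
        if not page:
            break
        for node in page:
            yield node
        last_id = page[-1]["id"]


if __name__ == "__main__":
    for node in iter_all_nodes(count=3):
        print(node["id"], node["name"])
```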

To check a specific node, run this command:

clusnode show <node-name>

clusdiag

clusdiag is for checking and running diagnostic tests on a cluster.

For example, to list available diagnostic tests, use this command:

clusdiag tests

To run a diagnostic test, run this command:

clusdiag new <test-name> --pattern <your-node-name-pattern>

The --pattern parameter specifies a glob pattern, just like file name globbing on most operating systems. For example, abc* matches any name starting with abc, so abc, abc1, and abc2 all match. Use * to match all nodes.
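
To get a feel for glob matching, the sketch below uses Python's standard fnmatch module on some made-up node names. It only illustrates glob semantics: it is an assumption that the --pattern parameter matches names in exactly the same way, and the node names and the match_nodes helper are hypothetical.

```python
from fnmatch import fnmatchcase

# Made-up node names for illustration only.
node_names = ["abc", "abc1", "abc2", "xabc", "def"]


def match_nodes(pattern, names):
    """Return the names that match a glob pattern (case-sensitive)."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    print(match_nodes("abc*", node_names))
    print(match_nodes("*", node_names))
```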

You can also name the nodes to run the test on explicitly, with the --nodes parameter:

clusdiag new <test-name> --nodes "n1 n2 n3"

This specifies the nodes named n1, n2, and n3. The names are separated by spaces and the whole list is enclosed in double quotes.

To see a list of diagnostic jobs, run this command:

clusdiag list

To check detailed results of a test, use this command:

clusdiag show <id>

clusrun

clusrun is for checking or running general commands on a cluster.

For example, to run a command on all nodes of the cluster:

clusrun new --pattern "*" "hostname && date"

It runs hostname && date on all nodes in a cluster.
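
The quotes in this command matter: they keep the shell from expanding * and from treating && as its own operator. The sketch below uses Python's standard shlex module to show how a POSIX-style shell splits the command line into words. It illustrates shell quoting only and does not call clusrun. Quoting rules in other shells, such as the Windows command prompt, may differ.

```python
import shlex

quoted = 'clusrun new --pattern "*" "hostname && date"'
unquoted = "clusrun new --pattern * hostname && date"

# With quotes, the pattern and the whole remote command are single arguments.
quoted_args = shlex.split(quoted)

# Without quotes, the remote command falls apart into separate words. A real
# shell would additionally expand * against local file names and treat &&
# as an operator, running `date` locally.
unquoted_args = shlex.split(unquoted)

if __name__ == "__main__":
    print(quoted_args)
    print(unquoted_args)
```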

Configuration

The commands above share a configuration file, .hpc_acm_cli_config, which holds default values for command-line parameters.

The file is generated the first time you run any of the commands. It is located in the user's home directory (~), typically /home/{username} on Linux and C:\Users\{username} on Windows.

The configuration file sets default values for command parameters. Values provided on the command line override these defaults. See the comments in the file for configurable options and examples.
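
The precedence described above, where command-line values win over file defaults, can be sketched as follows. This is a minimal illustration using argparse, not the client's actual code. The function name resolve_options and the sample values are hypothetical; only the parameter names host and client-id come from this article.

```python
import argparse


def resolve_options(file_defaults, argv):
    """Merge defaults from a config file with command-line arguments.

    Values given on the command line override the defaults.
    Hypothetical helper; not part of hpc-acm-cli.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--host")
    parser.add_argument("--client-id")
    # These defaults stand in for values read from the configuration file.
    parser.set_defaults(**file_defaults)
    return vars(parser.parse_args(argv))


if __name__ == "__main__":
    defaults = {
        "host": "https://myacmapp.azurewebsites.net/v1",
        "client_id": "from-file",  # made-up value
    }
    print(resolve_options(defaults, ["--client-id", "from-command-line"]))
```

In this sketch, host keeps its file default because it is not given on the command line, while client_id takes the command-line value.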

Authentication

If you have configured your ACM app with Azure Active Directory (AAD), you must provide the issuer-url, client-id, and client-secret parameters. You can find these values in the Azure portal. For more information, see Configure Azure Active Directory for Web Portal.

API Base Point

The host parameter is required. It sets the base point of the ACM web API and has the form {Your ACM App's URL}/v1, for example https://myacmapp.azurewebsites.net/v1.
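
As a small illustration of this form, the hypothetical helper below derives the host value from an ACM app URL by appending /v1. It is a sketch for clarity, not part of the client.

```python
def acm_host(app_url):
    """Build the host parameter value: {app URL}/v1.

    Hypothetical helper; trailing slashes on the app URL are dropped first
    so the result never contains a double slash before v1.
    """
    return app_url.rstrip("/") + "/v1"


if __name__ == "__main__":
    print(acm_host("https://myacmapp.azurewebsites.net/"))
```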

Next steps

Run ACM Pack diagnostics tests.