Parallel Programming - Using PVM

PVM is no longer an active direction in the HPC community, but
many lessons can be learned from its programming model, its
architecture design/implementation, and how/why it failed to become the
dominant system.

Part I - What's PVM?

PVM (Parallel Virtual Machine) is a software framework for heterogeneous parallel computing in networked environments, based on the message passing
model. Its main focus is to provide a uniform parallel computing framework on
interconnected heterogeneous computers of varied architectures (from Unix
to Windows, from PCs and workstations to MPPs).

(diagram captured from [1])

The PVM system is composed of two parts:
- The first part is a daemon, called pvmd3 and sometimes abbreviated pvmd, that resides on all the computers making up the virtual machine.
- The second part of the system is a library of PVM interface routines, which contains a functionally complete set of primitives needed for cooperation between the tasks of an application.
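
As a minimal sketch of the library side (assuming a working PVM 3 installation), a program becomes a PVM task simply by calling into the library and enrolling itself with the local pvmd:

    /* hello.c - a minimal PVM task: enroll, report identity, leave.
       A sketch only; assumes pvm3.h and the PVM library are installed. */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int mytid = pvm_mytid();      /* enroll this process with the local pvmd */
        if (mytid < 0) {
            fprintf(stderr, "pvm_mytid failed: %d\n", mytid);
            return 1;
        }
        printf("hello from task t%x\n", mytid);
        pvm_exit();                   /* leave the virtual machine */
        return 0;
    }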

Part II - PVM Programming Model

A PVM application consists of a collection of cooperating tasks, each
of which is responsible for part of the workload of a larger problem. Tasks can
be created and terminated across the network, and each task can communicate and synchronize with other tasks.

Sometimes an application is parallelized along its functions; that is, each task
performs a different function, for example, input, problem setup,
solution, output, and display. This process is often called functional parallelism.

A more common method of parallelizing an application is called data parallelism. In this method all the tasks are the same, but each one knows about and works on only a small part of the data.

(diagram captured from [1])

PVM supports either of these methods, or a mixture of both.

Parallel applications can also be viewed from another perspective, based on the
organization of their computing tasks. Roughly speaking, there are three
types:

- Crowd/Star Computing,
a collection of closely related tasks performs computations on
different portions of the workload, usually involving the periodic
exchange of intermediate results. A star-like parent-child task
relationship exists among the tasks.

- Tree Computing,
tasks are spawned, usually dynamically as the computation progresses,
in a tree-like manner, thereby establishing a tree-like, parent-child
relationship.

A classic example is tree computation using parallel merge sort (a spawning sketch follows this list).

- Hybrid Model, a combination of the above two models.
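
A sketch of the tree pattern (not a real merge sort): each task spawns two copies of its own executable until a depth limit is reached. The executable name "tree" and the command-line depth argument are my assumptions for illustration.

    /* tree.c - sketch of tree-style (parent-child) task creation in PVM. */
    #include <stdio.h>
    #include <stdlib.h>
    #include "pvm3.h"

    int main(int argc, char **argv)
    {
        int mytid = pvm_mytid();                      /* enroll in PVM */
        int depth = (argc > 1) ? atoi(argv[1]) : 2;   /* assumed depth argument */

        if (depth > 0) {
            char buf[16];
            char *args[2];
            int children[2];
            sprintf(buf, "%d", depth - 1);
            args[0] = buf;
            args[1] = NULL;
            /* spawn two children of this same program anywhere in the VM */
            int n = pvm_spawn("tree", args, PvmTaskDefault, "", 2, children);
            printf("task t%x at depth %d spawned %d children\n", mytid, depth, n);
        }
        pvm_exit();
        return 0;
    }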

Crowd Computation typically involves three phases:
- Initialization
- Computation
- Result aggregation

This model can be further divided into two types:

- Master/Worker,
the master is responsible for process spawning, initialization, collection
and display of results, and perhaps timing of functions. The worker
programs perform the actual computation involved. Their workloads are
typically assigned by the master, statically or dynamically (a minimal sketch follows this list).

- Work Crew, multiple
instances of a single program execute, with one task (typically the one
initiated manually) taking over the non-computational responsibilities
in addition to contributing to the computation itself.
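
Here is a minimal master-side sketch of the master/worker pattern. The worker executable name ("worker"), the message tags, and the trivial workload are my assumptions; a matching worker sketch appears in Part V.

    /* master.c - sketch of the master side of the master/worker pattern. */
    #include <stdio.h>
    #include "pvm3.h"

    #define NWORKERS   4
    #define TAG_WORK   1
    #define TAG_RESULT 2

    int main(void)
    {
        int tids[NWORKERS], i, sum = 0;

        pvm_mytid();   /* enroll the master in PVM */

        /* spawn the workers anywhere in the virtual machine */
        int n = pvm_spawn("worker", NULL, PvmTaskDefault, "", NWORKERS, tids);
        if (n < NWORKERS)
            fprintf(stderr, "only %d of %d workers spawned\n", n, NWORKERS);

        /* hand each worker its piece of the workload (here: just an index) */
        for (i = 0; i < n; i++) {
            int work = i;
            pvm_initsend(PvmDataDefault);    /* new send buffer, XDR encoding */
            pvm_pkint(&work, 1, 1);
            pvm_send(tids[i], TAG_WORK);
        }

        /* collect one result from each worker */
        for (i = 0; i < n; i++) {
            int partial;
            pvm_recv(-1, TAG_RESULT);        /* -1: receive from any task */
            pvm_upkint(&partial, 1, 1);
            sum += partial;
        }
        printf("sum of results: %d\n", sum);

        pvm_exit();
        return 0;
    }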

Part III - PVM Features/Interface

- Process/Task Management
- Resource Management/Configuration
- Message Passing

Detailed interface usage documentation can be found in the PVM User Interface manual.
Some PVM example application source code is also available; a small resource-management sketch follows.
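
To connect the categories above to concrete calls, here is a small resource management/configuration sketch (the host name "node2" is a placeholder); process management and message passing calls are shown in the other sketches in this post.

    /* config.c - sketch of PVM resource management/configuration calls. */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int nhost, narch, i, info;
        struct pvmhostinfo *hosts;
        char *newhosts[] = { "node2" };   /* hypothetical host to add */

        pvm_mytid();                      /* enroll in PVM */

        /* Resource management: grow the virtual machine at runtime */
        pvm_addhosts(newhosts, 1, &info);
        if (info < 0)
            fprintf(stderr, "could not add host (info=%d)\n", info);

        /* Configuration: query the current virtual machine */
        pvm_config(&nhost, &narch, &hosts);
        for (i = 0; i < nhost; i++)
            printf("host %s arch %s speed %d\n",
                   hosts[i].hi_name, hosts[i].hi_arch, hosts[i].hi_speed);

        pvm_exit();
        return 0;
    }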

Part IV - How PVM works

Geist's book has a great chapter on the internals of PVM design and implementation.

The core design lies in the host table and message routing:

1. The host table describes all hosts in a virtual machine. It is issued by
the master pvmd and kept synchronized across the virtual machine.

PVM Host Table (from [1])

2. Some host manipulation operations involve several hosts (for example, host addition), so a 2-phase or 3-phase commit protocol is applied, with the master pvmd acting as coordinator.

3. Message routing is accomplished through the pvmds. A message contains the
target task ID, which in turn encodes the ID of the pvmd hosting that task.
The pvmd uses the host table to identify the target pvmd and puts the message
into the corresponding send queue.
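
From the application's point of view, the fact that a task ID encodes its hosting pvmd can be observed with pvm_tidtohost(). A sketch, assuming a spawnable executable named "worker":

    /* route.c - sketch: a task id encodes the pvmd (host) that owns the task. */
    #include <stdio.h>
    #include "pvm3.h"

    int main(void)
    {
        int tid, dtid;

        pvm_mytid();                         /* enroll in PVM */

        if (pvm_spawn("worker", NULL, PvmTaskDefault, "", 1, &tid) == 1) {
            dtid = pvm_tidtohost(tid);       /* pvmd tid of the host running the task */
            printf("task t%x lives on pvmd t%x\n", tid, dtid);
            pvm_kill(tid);                   /* clean up the spawned task */
        }

        pvm_exit();
        return 0;
    }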

Part V - Developing PVM Applications

PVM app development cycle:
1. A user writes one or more sequential programs in C, C++, or Fortran 77 that contain embedded calls to the PVM library.
2. These programs are compiled for each architecture in the host pool, and
the resulting object files are placed at a location accessible from
machines in the host pool.
3. To execute an application, a user
typically starts one copy of one task (usually the "master" task) by
hand from a machine within the host pool. This process subsequently
starts other PVM tasks, eventually resulting in a collection of active
tasks that then compute locally and exchange messages with each other
to solve the problem.
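
As an illustration of step 3, a worker-side counterpart to the master sketch from Part II might look like the following (the tags mirror that sketch and are my own choice):

    /* worker.c - sketch of the worker side of the master/worker pattern. */
    #include "pvm3.h"

    #define TAG_WORK   1
    #define TAG_RESULT 2

    int main(void)
    {
        int work, result, parent;

        pvm_mytid();                     /* enroll in PVM */
        parent = pvm_parent();           /* tid of the task that spawned us */

        pvm_recv(parent, TAG_WORK);      /* wait for our piece of the workload */
        pvm_upkint(&work, 1, 1);

        result = work * work;            /* placeholder computation */

        pvm_initsend(PvmDataDefault);
        pvm_pkint(&result, 1, 1);
        pvm_send(parent, TAG_RESULT);    /* return the partial result */

        pvm_exit();
        return 0;
    }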

Notes for PVM on Windows:
1. Code/Bin can be found at https://www.netlib.org/pvm3/win32/
2. Don't use the InstallShield version; it contains many bugs. Use the manual install version instead.
3. If the error message "[pvmd pid4604] mksocs() somepath\pvmd.xxx failed. You are required
to run on NTFS" shows up, check the file, delete it, and restart PVM.
4. A C/C++ PVM application needs to include "pvm3.h" and link libpvm.lib and ws2_32.lib (also link libgpvm3.lib if group communication functions are called)
5. The prebuilt PVM library uses an old C runtime library, so you should ignore "libc.lib" in the VC++ Linker/Input settings
6. The prebuilt PVM library only works with the static C/C++ runtime library,
so change the VC++ project setting Property->C/C++->Code
Generation->Runtime Library to "Multi-Threaded (/MT)"
7. To run your application, executable files should be put into the $(PVM_ROOT)\Bin\$(PVM_ARCH) directory

I wrote some PVM applications; trying to build and run them is a good way to start the PVM journey.

Part VI - MPI vs PVM

MPI is a pure standard/specification, while PVM can be regarded as both a
standard and an implementation. The MPI standard is the work of the MPI Forum,
which is formed by over 40 organizations, companies and individuals, while
PVM's standard and implementation are maintained by a single project group.

Let's focus on the technical differences between the two systems.

Programming Model

Both are based on the message passing model and can support the SPMD (Single Program Multiple Data) and MPMD (Multiple Program Multiple Data) patterns.

But MPI is treated as a static model, where process communication happens in a static manner. For example, tasks are assumed to be created statically, with no failures.

PVM, in contrast, is thought of as a dynamic model,
where system resources can be reconfigured and tasks can be created/destroyed
at runtime. Node failure is also taken into consideration.

What's more, PVM provides the conceptual view of a virtual machine consisting of various nodes.
Each node is a physical machine, such as a PC, workstation, SMP, cluster,
or even an entire MPP system. MPI focuses more on communication
and has no such concept.

Message Passing

Both systems are based on the message passing model, so passing messages is
the core feature of each. But MPI provides more options and richer semantics,
and its various implementations are considered to be faster than those of PVM.
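
To make the stylistic difference concrete, here is the same 100-integer send written in both APIs (a fragment-style sketch; dest_tid/dest_rank and the tag 99 are placeholders, and the two halves would normally live in separate programs):

    /* compare_send.c - sketch comparing the send style of the two APIs.
       For illustration only; compiling both halves into one program is unusual. */
    #include "pvm3.h"
    #include <mpi.h>

    void pvm_style_send(int dest_tid, int *data)
    {
        pvm_initsend(PvmDataDefault);   /* pack step: XDR encoding for heterogeneity */
        pvm_pkint(data, 100, 1);
        pvm_send(dest_tid, 99);         /* 99: user-chosen message tag */
    }

    void mpi_style_send(int dest_rank, int *data)
    {
        /* buffer, count and datatype given directly; the communicator scopes it */
        MPI_Send(data, 100, MPI_INT, dest_rank, 99, MPI_COMM_WORLD);
    }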

Process Control

The initial MPI standard didn't contain specifications on how to create/destroy
processes/tasks; later revisions (MPI-2) added the related APIs.
PVM considered these dynamic features from the beginning, so interfaces for
spawning and killing tasks were included in the initial release of the system.
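
The MPI-2 counterpart of pvm_spawn is MPI_Comm_spawn; a minimal sketch, with the "worker" executable name as a placeholder:

    /* spawn_mpi.c - sketch of MPI-2 dynamic process creation. */
    #include <mpi.h>

    int main(int argc, char **argv)
    {
        MPI_Comm children;

        MPI_Init(&argc, &argv);

        /* start 4 copies of "worker"; they are reached through an intercommunicator */
        MPI_Comm_spawn("worker", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                       0, MPI_COMM_WORLD, &children, MPI_ERRCODES_IGNORE);

        MPI_Finalize();
        return 0;
    }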

Failure Handling

PVM considered failure scenarios from the beginning; it provides a (failure)
event notification mechanism that lets application developers write
fault-tolerant programs, although PVM itself is not fault tolerant (for
example, the master PVM daemon is a single point of failure).
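
A sketch of that notification mechanism: the parent asks the pvmd to send it a message when a spawned task exits, so the application can re-spawn the lost work. The tag value and the "worker" executable name are my choices.

    /* notify.c - sketch of PVM failure/event notification. */
    #include <stdio.h>
    #include "pvm3.h"

    #define TAG_TASK_EXIT 77   /* user-chosen tag for notification messages */

    int main(void)
    {
        int tid, deadtid;

        pvm_mytid();   /* enroll in PVM */

        if (pvm_spawn("worker", NULL, PvmTaskDefault, "", 1, &tid) == 1) {
            /* ask the pvmd to send us a TAG_TASK_EXIT message if the task dies */
            pvm_notify(PvmTaskExit, TAG_TASK_EXIT, 1, &tid);

            pvm_recv(-1, TAG_TASK_EXIT);     /* blocks until the task exits */
            pvm_upkint(&deadtid, 1, 1);      /* message carries the dead task's tid */
            printf("task t%x exited; its work could be re-spawned here\n", deadtid);
        }

        pvm_exit();
        return 0;
    }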

MPI didn't specify how to deal with failures at the beginning; a PVM-like
event notification feature was added in later versions, but it is still very
limited (its main purpose is to locate the root cause of a failure, not to
serve as a helper mechanism for writing fault-tolerant applications).
MPI-3 is considering checkpoint-based failure handling features.

Summary

PVM is designed for heterogeneous and dynamic environments; it provides a uniform conceptual view of a virtual machine.

MPI, in contrast, is mainly designed for high performance and source-code-level portability.

References [21], [24] and [27] are very good material on this topic.

Part VII - Why PVM Failed

Standing at today's vantage point, we can easily tell that MPI beat PVM to
become the standard message passing mechanism. But why, given that PVM's
features seem more powerful and its concepts and architecture are elegant?

Here are some of my thoughts:

1. Do One Thing and Do it Well

MPI initially focused only on communication, with very limited dynamic
mechanisms. But at that time, performance and portability were critical. Since
node counts were relatively small and the systems were special-purpose
supercomputers, dynamic behavior and failure handling were not that important.

PVM offers many great features, but its practical performance is not as good,
since performance optimization is not its main focus.

2. Ecosystem is Important

MPI started as a cooperative industry/academia effort. Various HPC hardware
vendors had the motivation to adopt the standard (to win more customers). HPC
application developers liked to use it because there were no more porting
pains. Both sides were very happy.

PVM started as a pure research project; a single team defined the spec and
wrote the implementation. Although it could respond quickly to end-user
requirements, the lack of industrial vendor support was very dangerous for its
long-term survival.

PVM's main focus is heterogeneous system integration. But HPC hardware
systems are very expensive; how many users really need to integrate several
such systems? Industrial vendors are also very reluctant to develop a system
that will talk to competitors' similar products.

3. Vision Driven vs. Problem Driven

PVM is fancy and elegant, but MPI solves the biggest problems of HPC
application developers. PVM is a very good research project, but it was born
for research purposes only.

[Reference]

Tutorials
01. The PVM Book
02. An Introduction to PVM Programming
03. Advanced PVM tutorial
04. PVM Beginner Guide
05. Parallel Processing using PVM
06. PVM programming hands-on tutorial
07. PVM User Manual

General
11. PVM Wiki
12. PVM official Home
13. PVM3 Home
14. PVM++
15. PVM/MPI Conference
16. Java over PVM (Paper, Home)
17. PVM Papers

MPI vs PVM
21. PVM and MPI: a Comparison of Features
22. PVM MPI are completely Different
23. Why are PVM/MPI so different?
24. Goals Guiding Design: MPI/PVM
25. PVM vs MPI Article
26. Performance Comparison PVM, MPI and DSM (ppt)
27. PVM/MPI Comparison using Physical Applications
28. PVM/MPI comparison for Java HPC