A Hybrid Operating System Cluster Solution: Dual-Boot and Virtualization with Windows HPC Server 2008 and Linux Bull Advanced Server for Xeon
Applies To: Windows HPC Server 2008
Choosing the right operating system (OS) for a high performance computing (HPC) cluster can be a difficult decision for IT departments, and it usually has a large impact on the Total Cost of Ownership (TCO) of the cluster. Factors such as diverse user needs, application environment requirements, and security policies add to the complex human factors involved in training, maintenance, and support planning, all of which carry risks for the final return on investment (ROI) of the whole HPC infrastructure. The goal of this paper is to show that simple techniques are available today to make that choice unnecessary and to keep your HPC infrastructure versatile and flexible.
In this white paper we study how to provide the greatest flexibility for running several OSs on an HPC cluster. There are two main approaches to providing this service, depending on whether a single operating system is selected each time the whole cluster is booted, or several operating systems run simultaneously on the cluster. The most common approach of the first type is the dual-boot cluster (described in  and ). For the second type, we introduce the concept of a Hybrid Operating System Cluster (HOSC): a cluster in which some compute nodes run one OS type while the remaining nodes run another. Several approaches of both types are studied in this document in order to determine their properties (requirements, limits, feasibility, and usefulness), with a clear focus on computing performance and management flexibility.
The study is limited to two operating systems: Linux Bull Advanced Server for Xeon 5v1.1 and Microsoft Windows HPC Server 2008 (noted XBAS and HPCS, respectively, in this paper). To optimize interoperability between the two OS worlds, we use the Subsystem for UNIX-based Applications (SUA) for Windows. The description of the methodologies is kept as general as possible so that it can apply to other OS distributions, but examples are given exclusively in the XBAS/HPCS context. The concepts developed in this document could apply, with slight adaptations, to three or more simultaneous OSs; however, that is outside the scope of this paper.
We introduce a meta-scheduler that provides a single submission point for both Linux and Windows. It selects the cluster nodes with the OS type required by submitted jobs. The OS type of compute nodes can be switched automatically and safely without administrator intervention. This optimizes computational workloads by adapting the distribution of OS types among the compute nodes.
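The node-selection logic described above can be illustrated with a short sketch. This is not PBS Professional code; the `Node` class, the `plan_job` function, and the node names are hypothetical, and the sketch only shows the decision the meta-scheduler must make: run a job on idle nodes that already have the requested OS type, and otherwise nominate idle nodes of the other OS type for an automatic switch.

```python
# Illustrative sketch of meta-scheduler OS-type placement (hypothetical API,
# not PBS Professional). Each compute node runs either XBAS or HPCS.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    os: str            # "XBAS" or "HPCS"
    busy: bool = False

def plan_job(nodes, required_os, count):
    """Return (run_nodes, switch_nodes): idle nodes already running the
    required OS, plus idle nodes of the other OS type to reboot/switch
    when there are not enough of the former. Empty result = job waits."""
    idle_same = [n for n in nodes if not n.busy and n.os == required_os]
    run_nodes = idle_same[:count]
    missing = count - len(run_nodes)
    if missing <= 0:
        return run_nodes, []
    idle_other = [n for n in nodes if not n.busy and n.os != required_os]
    if len(idle_other) < missing:
        return [], []   # not enough idle nodes even after switching
    return run_nodes, idle_other[:missing]

# Example: a 4-node cluster, one XBAS node busy; a 3-node XBAS job
# triggers an OS switch of the two idle HPCS nodes.
nodes = [Node("n1", "XBAS"), Node("n2", "XBAS", busy=True),
         Node("n3", "HPCS"), Node("n4", "HPCS")]
run, switch = plan_job(nodes, "XBAS", 3)
```

In this example the job runs on `n1` while `n3` and `n4` are switched to XBAS first, which is exactly the workload-driven redistribution of OS types among compute nodes described above.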
A technical proof of concept is given by designing, installing, and running an HOSC prototype. This prototype provides computing power under both XBAS and HPCS simultaneously. It has two virtual management nodes (also known as head nodes) on a single server, and the distribution of OSs among the compute nodes can be changed dynamically. We chose Altair PBS Professional software to demonstrate a meta-scheduler implementation. This project is the result of the collaborative work of Microsoft and Bull.
Chapter 2 defines the main technologies used in an HOSC: the Master Boot Record (MBR), the dual-boot method, virtualization, the Pre-boot eXecution Environment (PXE), and resource manager and job scheduler tools. If you are already familiar with these concepts, you may want to skip this chapter and go directly to Chapter 3, which analyzes different approaches to HOSC architectures and gives technical recommendations for their design. These recommendations are implemented in Chapter 4 in order to determine the best technical choices for building an HOSC prototype. The installation setup of the prototype and the configuration steps are explained in Chapter 5; Appendix D shows the files that were used during this step. Finally, basic HOSC administration operations are listed in Chapter 6, and ideas for future work are proposed in Chapter 7, which concludes this paper.
This document is intended for computer scientists who are familiar with HPC cluster administration.
All acronyms used in this paper are listed in Appendix A. Complementary information can be found in the documents and web pages listed in Appendix B.
In this document: