Cloud Computing: Virtualization Classes
There are several different approaches to, or classes of, virtualization, each suited to its own specific situations.
Kai Hwang, Jack Dongarra, Geoffrey Fox
Adapted from “Distributed and Cloud Computing: From Parallel Processing to the Internet of Things” (Syngress, an imprint of Elsevier)
Generally speaking, there are three typical classes of virtual machine (VM) architecture—the hypervisor, host-based virtualization and para-virtualization, which are differentiated by the position of the virtualization layer. The hypervisor is also known as the Virtual Machine Monitor (VMM).
The hypervisor supports hardware-level virtualization on bare-metal devices like CPU, memory, disk and network interfaces. The hypervisor software sits directly between the physical hardware and the OS. This virtualization layer is referred to as either the VMM or the hypervisor.
The hypervisor provides hypercalls for the guest OSes and applications. A hypervisor can assume a micro-kernel architecture, as Microsoft Hyper-V does. It can also assume a monolithic hypervisor architecture, as VMware ESX does for server virtualization.
A micro-kernel hypervisor includes only the basic and unchanging functions (such as physical memory management and processor scheduling). The device drivers and other changeable components are outside the hypervisor. A monolithic hypervisor implements all the aforementioned functions, including those of the device drivers.
Therefore, the size of the hypervisor code of a micro-kernel hypervisor is smaller than that of a monolithic hypervisor. Essentially, a hypervisor must be able to convert physical devices into virtual resources dedicated for the deployed VM to use.
The Xen Architecture
Xen is an open source hypervisor program developed by Cambridge University. The core components of a Xen system are the hypervisor, kernel and applications. The organization of the three components is important.
Xen is a microkernel hypervisor, which separates the policy from the mechanism. The Xen hypervisor implements all the mechanisms, leaving the policy to be handled by Domain 0. It does not include any device drivers natively. It just provides a mechanism by which a guest OS can have direct access to the physical devices.
As a result, the size of the Xen hypervisor is kept rather small. Xen provides a virtual environment located between the hardware and the OS. A number of vendors are in the process of developing commercial Xen hypervisors; among them are Citrix XenServer and Oracle VM.
Like other virtualization systems, many guest OSes can run on top of the hypervisor. However, not all guest OSes are created equal, and one in particular controls the others. The guest OS, which has control ability, is called Domain 0, and the others are called Domain U. Domain 0 is a privileged guest OS of Xen. It is first loaded when Xen boots without any file system drivers being available. Domain 0 is designed to access hardware directly and manage devices. Therefore, one of the responsibilities of Domain 0 is to allocate and map hardware resources for the guest domains (the Domain U domains).
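The relationship between Domain 0 and the Domain U guests can be sketched as a toy model. This is only an illustration of the privilege split described above: the class names, the method names and the memory figures are all hypothetical and do not correspond to any Xen interface.

```python
from dataclasses import dataclass, field


@dataclass
class Domain:
    """A guest OS running on the hypervisor (hypothetical model, not a Xen API)."""
    name: str
    privileged: bool = False
    resources: dict = field(default_factory=dict)


class ToyHypervisor:
    """Provides the mechanism; only the privileged domain decides the policy."""

    def __init__(self, memory_mb: int):
        self.free_memory_mb = memory_mb
        self.domains = {}

    def boot(self) -> Domain:
        # The privileged domain is the first one loaded when the hypervisor boots.
        dom0 = Domain("Domain 0", privileged=True)
        self.domains[dom0.name] = dom0
        return dom0

    def create_guest(self, caller: Domain, name: str, memory_mb: int) -> Domain:
        # Only the privileged domain may allocate and map hardware resources.
        if not caller.privileged:
            raise PermissionError(f"{caller.name} may not allocate resources")
        if memory_mb > self.free_memory_mb:
            raise MemoryError("not enough free memory on the host")
        self.free_memory_mb -= memory_mb
        guest = Domain(name, resources={"memory_mb": memory_mb})
        self.domains[name] = guest
        return guest


hv = ToyHypervisor(memory_mb=4096)
dom0 = hv.boot()
dom_u = hv.create_guest(dom0, "Domain U-1", memory_mb=1024)
```

If `dom_u` tries to call `create_guest` itself, the sketch raises `PermissionError`. That is the point of the model: every allocation decision passes through the one privileged domain.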
Xen is based on Linux and its security level is C2. Its management VM is named Domain 0, which has the privilege to manage other VMs implemented on the same host. If Domain 0 is compromised, the attacker can control the entire system.
So in the VM system, you need security policies to improve the security of Domain 0. Domain 0, behaving as a VMM, lets users create, copy, save, read, modify, share, migrate and roll back VMs as easily as manipulating a file. Unfortunately, this also adds security problems during the software lifecycle and data lifetime.
Traditionally, you could envision a machine’s lifetime as a straight line. The current state of the machine is a point that progresses monotonically as the software executes. During this time, you make configuration changes, install software and apply patches.
In a virtualized environment, by contrast, the VM state is akin to a tree: at any point, execution can go into N different branches, and multiple instances of a VM can exist at any point in this tree at any given time. VMs are allowed to roll back to previous states in their execution (for example, to fix configuration errors) or rerun from the same point many times (for example, as a means of distributing dynamic content or circulating a “live” system image).
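The tree-shaped lifetime can be made concrete with a small data structure. The sketch below is an assumption-laden illustration, not how any particular VMM stores snapshots: the `Snapshot` class and the state labels are invented for the example.

```python
class Snapshot:
    """One saved VM state; children are states that diverged from it."""

    def __init__(self, label, parent=None):
        self.label = label
        self.parent = parent
        self.children = []

    def branch(self, label):
        # Resuming from this state starts a new branch of execution.
        child = Snapshot(label, parent=self)
        self.children.append(child)
        return child

    def history(self):
        # Walk back to the root to recover the path that led to this state.
        node, path = self, []
        while node is not None:
            path.append(node.label)
            node = node.parent
        return list(reversed(path))


root = Snapshot("fresh install")
patched = root.branch("patches applied")
misconfigured = patched.branch("bad config change")

# Roll back to the patched state and continue along a different branch.
fixed = patched.branch("good config change")

print(fixed.history())
print(len(patched.children))
```

The rollback does not erase the misconfigured state. It stays in the tree as a sibling branch, which is why tracking what ran where becomes harder than on a machine with a straight-line history.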
Binary Translation with Full Virtualization
Depending on implementation technologies, hardware virtualization can be classified into two categories: full virtualization and host-based virtualization.
Full virtualization does not need to modify the host OS. It relies on binary translation to trap and virtualize the execution of certain sensitive, non-virtualizable instructions. The guest OSes and their applications consist of noncritical and critical instructions.
In a host-based system, both a host OS and a guest OS are used. A virtualization software layer is built between the host OS and guest OS.
With full virtualization, noncritical instructions run directly on the hardware while critical instructions are discovered and replaced with traps into the VMM to be emulated by software. Both the hypervisor and VMM approaches are considered full virtualization.
Why are only critical instructions trapped into the VMM? This is because binary translation can incur a large performance overhead. Noncritical instructions do not control hardware or threaten the security of the system, but critical instructions do. Therefore, running noncritical instructions on hardware not only can promote efficiency, but also can ensure system security.
This approach was implemented by VMware and many other software companies. The VMM scans the instruction stream and identifies the privileged, control- and behavior-sensitive instructions. When these instructions are identified, they’re trapped into the VMM, which emulates the behavior of these instructions. The method used in this emulation is called binary translation.
Therefore, full virtualization combines binary translation and direct execution. The guest OS is completely decoupled from the underlying hardware. Consequently, the guest OS is unaware that it’s being virtualized.
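A toy interpreter can illustrate how direct execution and trapping combine. Everything here is a simplification made for illustration: the instruction tuples, the choice of which mnemonics count as critical, and the `ToyVMM` class are assumptions. A real VMM works on machine code, not Python tuples.

```python
# Assumption: these mnemonics stand in for the critical instructions.
CRITICAL = {"cli", "sti", "out"}


class ToyVMM:
    """Emulates critical instructions against per-VM virtual state."""

    def __init__(self):
        self.virtual_interrupts_enabled = True
        self.virtual_io_log = []
        self.trapped = 0

    def emulate(self, op, args):
        # Critical instructions change the VM's virtual state, never real hardware.
        self.trapped += 1
        if op == "cli":
            self.virtual_interrupts_enabled = False
        elif op == "sti":
            self.virtual_interrupts_enabled = True
        elif op == "out":
            self.virtual_io_log.append(tuple(args))


def run_guest(stream, vmm):
    registers = {}
    for op, *args in stream:
        if op in CRITICAL:
            vmm.emulate(op, args)          # trap into the VMM
        elif op == "mov":                  # noncritical: stands in for direct execution
            registers[args[0]] = args[1]
        elif op == "add":
            registers[args[0]] += args[1]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return registers


guest_code = [
    ("mov", "eax", 5),
    ("cli",),
    ("add", "eax", 3),
    ("out", 0x3F8, 65),
    ("sti",),
]
vmm = ToyVMM()
regs = run_guest(guest_code, vmm)
```

The guest code contains no reference to the VMM. In this sketch, that is what it means for the guest to be unaware it is being virtualized: the interception happens entirely in `run_guest`.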
The performance of full virtualization may not be ideal because it involves binary translation, which is rather time-consuming. Full virtualization of I/O-intensive applications is a challenge. Binary translation employs a code cache to store translated hot instructions to improve performance, but it increases the cost of memory usage. The performance of full virtualization on the x86 architecture is typically 80 percent to 97 percent of that of the host machine.
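The trade-off between translation time and memory can be sketched with a dictionary acting as the code cache. The block addresses, the `CRITICAL_OPS` set and the `TranslationCache` class are hypothetical. For simplicity this sketch caches every block it sees, whereas the text describes caching hot instructions only.

```python
# Assumption: these mnemonics stand in for instructions that need translation.
CRITICAL_OPS = {"cli", "hlt"}


class TranslationCache:
    """Caches translated blocks keyed by guest address (toy model)."""

    def __init__(self):
        self.cache = {}
        self.translations = 0   # how many times the slow path ran

    def translate(self, block):
        # Stand-in for the expensive step: rewrite critical instructions as traps.
        self.translations += 1
        return tuple("trap_to_vmm" if op in CRITICAL_OPS else op for op in block)

    def fetch(self, address, block):
        # Translate only on a miss; later visits reuse the stored translation.
        if address not in self.cache:
            self.cache[address] = self.translate(block)
        return self.cache[address]


guest_blocks = {
    0x1000: ("mov", "cli", "add"),   # executed in a hot loop
    0x2000: ("mov", "hlt"),          # executed once
}

tc = TranslationCache()
for _ in range(1000):
    hot = tc.fetch(0x1000, guest_blocks[0x1000])
cold = tc.fetch(0x2000, guest_blocks[0x2000])

print(tc.translations, len(tc.cache))
```

The hot block is translated once and reused on every later pass, but each cached block stays in memory. That is the memory cost the paragraph above refers to.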
An alternate VM architecture is to install a virtualization layer on top of the host OS. This host OS is still responsible for managing the hardware. The guest OSes are installed and run on top of the virtualization layer. Dedicated applications might run on the VMs. Some other applications can also run with the host OS directly.
This host-based architecture has some distinct advantages. First, the user can install this VM architecture without modifying the host OS. The virtualizing software can rely on the host OS to provide device drivers and other low-level services. This will simplify the VM design and ease its deployment.
Second, the host-based approach appeals to many host machine configurations. Compared to the hypervisor/VMM architecture, however, the performance of the host-based architecture might be low. When an application requests hardware access, it involves four layers of mapping, which downgrades performance significantly. When the instruction set architecture (ISA) of a guest OS is different from the ISA of the underlying hardware, binary translation must be adopted. Although the host-based architecture has flexibility, the performance is too low to be useful in practice.
Para-virtualization needs to modify the guest OS. A para-virtualized VM provides special APIs requiring substantial OS modifications in user applications. Performance degradation is a critical issue of a virtualized system. No one wants to use a VM if it’s much slower than using a physical machine.
You can insert the virtualization layer at different positions in a machine software stack. However, para-virtualization attempts to reduce the virtualization overhead, and thus improve performance by modifying only the guest OS kernel. When guest OSes are para-virtualized, they’re assisted by an intelligent compiler to replace the non-virtualizable OS instructions with hypercalls.
The traditional x86 processor offers four instruction execution rings: Rings 0, 1, 2 and 3. The lower the ring number, the higher the privilege of the instruction being executed. The OS is responsible for managing the hardware, and its privileged instructions execute at Ring 0, while user-level applications run at Ring 3. The best example of para-virtualization is kernel-based VM (KVM).
When the x86 processor is virtualized, a virtualization layer is inserted between the hardware and the OS. According to the x86 ring definition, the virtualization layer should also be installed at Ring 0. Different instructions at Ring 0 might cause some problems. However, when the guest OS kernel is modified for virtualization, it can no longer directly run on the hardware.
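The ring rule can be reduced to a simple check. The sketch below is a deliberate simplification: the `execute` function is hypothetical, the set of privileged mnemonics is only a small sample, and real x86 privilege checks involve more state than a single ring number.

```python
# A small sample of instructions that x86 permits only at Ring 0.
PRIVILEGED = {"hlt", "lgdt", "lidt"}


def execute(instruction, ring):
    """Return what a toy CPU would do with an instruction issued at a given ring."""
    if ring not in (0, 1, 2, 3):
        raise ValueError("x86 defines Rings 0 through 3 only")
    if instruction in PRIVILEGED and ring != 0:
        # Privileged instructions fault when issued outside Ring 0.
        return "general protection fault"
    return "executed"


print(execute("hlt", ring=0))   # OS kernel or virtualization layer
print(execute("hlt", ring=3))   # user-level application
print(execute("add", ring=3))   # ordinary instruction
```

The sketch shows why the position of the virtualization layer matters: whichever software sits at Ring 0 is the only software whose privileged instructions run unchecked.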
Although para-virtualization reduces overhead, it incurs other problems. First, its compatibility and portability may be in doubt, because it must support the unmodified OS as well. Second, the cost of maintaining para-virtualized OSes is high, because they could require deep OS kernel modifications.
Finally, the performance advantage of para-virtualization varies greatly due to workload variations. Compared with full virtualization, para-virtualization is relatively easy and more practical. The main problem in full virtualization is its low performance in binary translation. Speeding up binary translation is difficult. Therefore, many virtualization products employ the para-virtualization architecture. The popular Xen, KVM and VMware ESX are good examples.
KVM is a hardware-assisted para-virtualization tool, which improves performance and supports unmodified guest OSes such as Windows, Linux, Solaris and other Unix variants.
KVM is a Linux para-virtualization system and is part of the Linux version 2.6.20 kernel. The existing Linux kernel carries out memory management and scheduling activities. KVM does the rest, which makes it simpler than a hypervisor that controls the entire machine.
Unlike the full virtualization architecture that intercepts and emulates privileged and sensitive instructions at run time, para-virtualization handles these instructions at compile time. The guest OS kernel is modified to replace the privileged and sensitive instructions with hypercalls to the hypervisor or VMM. Xen is one example of such para-virtualization architecture.
The privileged instructions are implemented by hypercalls to the hypervisor. After replacing the instructions with hypercalls, the modified guest OS emulates the behavior of the original guest OS. On a Unix system, a system call involves an interrupt or service routine. The hypercalls apply a dedicated service routine in Xen.
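The replacement of sensitive instructions with hypercalls can be sketched as a rewriting pass followed by a dispatch table. The operation names, the `HYPERCALL_FOR` mapping and the `ParaHypervisor` class are all invented for illustration. They are not Xen's actual hypercall names or interface.

```python
# Hypothetical mapping from sensitive instructions to hypercall names.
HYPERCALL_FOR = {
    "write_cr3": "hypercall_update_page_table",
    "cli": "hypercall_mask_events",
}


def paravirtualize(kernel_ops):
    """'Compile-time' pass: replace sensitive instructions with hypercalls."""
    return [HYPERCALL_FOR.get(op, op) for op in kernel_ops]


class ParaHypervisor:
    """Dispatches each hypercall to its dedicated service routine."""

    def __init__(self):
        self.log = []
        self.routines = {
            "hypercall_update_page_table": lambda: self.log.append("page table updated"),
            "hypercall_mask_events": lambda: self.log.append("events masked"),
        }

    def hypercall(self, name):
        self.routines[name]()


def run(kernel_ops, hypervisor):
    direct = 0
    for op in kernel_ops:
        if op in hypervisor.routines:
            hypervisor.hypercall(op)   # explicit call into the hypervisor
        else:
            direct += 1                # everything else runs without intervention
    return direct


original_kernel = ["mov", "write_cr3", "add", "cli"]
modified_kernel = paravirtualize(original_kernel)

hv = ParaHypervisor()
direct_count = run(modified_kernel, hv)
```

Unlike the run-time scanning used in full virtualization, the rewriting here happens once, before the guest ever runs. The cost is that the guest kernel itself has to be changed.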
These many different types of virtualization architecture have different strengths and weaknesses. Examine each and you can apply the most suitable architecture to your environment.
Kai Hwang is a professor of computer engineering for the University of Southern California and a visiting Chair Professor for Tsinghua University, China. He earned a Ph.D. in EECS from the University of California, Berkeley. He has published extensively in computer architecture, digital arithmetic, parallel processing, distributed systems, Internet security and cloud computing.
Jack Dongarra is a University Distinguished Professor of Electrical Engineering and Computer Science for the University of Tennessee, a Distinguished Research Staff member at Oak Ridge National Laboratory and a Turing Fellow at the University of Manchester. Dongarra pioneered the areas of supercomputer benchmarks, numerical analysis, linear algebra solvers and high-performance computing, and has published extensively in these areas.
Geoffrey Fox is a Distinguished Professor of Informatics, Computing and Physics and Associate Dean of Graduate Studies and Research in the School of Informatics and Computing at Indiana University. He received his Ph.D. from Cambridge University, U.K. Fox is well-known for his comprehensive work and extensive publications in parallel architecture, distributed programming, grid computing, Web services and Internet applications.
©2011 Elsevier Inc. All rights reserved. Printed with permission from Syngress, an imprint of Elsevier. Copyright 2011. “Distributed and Cloud Computing: From Parallel Processing to the Internet of Things” by Kai Hwang, Jack Dongarra, Geoffrey Fox. For more information on this title and other similar books, please visit elsevierdirect.com.