Estimating system capacity requirements

I've gotten a number of queries about estimating how many virtual machines can be run on a given host computer. In order to provide a general answer from which you can extrapolate for your own situation, I'm going to approach this issue from the other side -- estimating the host system capacity required to run the workload created by a given set of virtual machines.

This is actually a fairly complicated issue because so many factors affect virtual machine performance, such as:

  • Amount of physical RAM in the host computer
  • Amount of RAM assigned to each guest operating system
  • Workload for each guest operating system
  • Total workload for all running guest operating systems
  • Workload for the host operating system
  • Type of guest operating system
  • Number of physical NICs used by virtual machines
  • Speed of the NICs (100Mb vs. 1Gb, switched vs. non-switched)
  • Speed and number of host processors
  • Number of hard disk spindles
  • Type of physical hard disk (IDE, SCSI, Fibre Channel, iSCSI, RAID 5, etc.)
  • Number of host bus adapters
  • Virtual Server CPU resource allocation settings

There are some general approaches and rules of thumb that you can use for your estimate, though, which I'll cover in this article.

1. Gather workload data
 
As a first step in estimating the required system capacity, gather information about the workload of the servers you plan to virtualize. You'll need to determine the CPU, RAM, hard disk, and network usage for each candidate server's workload. I understand that Unisys has a tool called Sentinel that can help in this respect, at least in the hands of one of their consultants. Aogtech also provides decision support tools (https://www.aogtech.com), as does BMC with its Perform and Predict tool (https://www.bmc.com/products/proddocview/0,2832,19052_19429_26203_8707,00.html). Platespin has also just announced a product designed for this very purpose: PowerRecon (https://www.platespin.com/products/PowerRecon.aspx). I haven't tried any of these tools. If you have, the rest of us would like to hear about your experiences. You can also use Microsoft Operations Manager (MOM) to determine CPU utilization.
 
2. Decide how to group workloads

Once you've estimated the workloads of the physical servers that you want to virtualize, you can determine which ones to run together on the same instance of Virtual Server. It's best to run dissimilar workloads together. In other words, you'll want to deploy workloads that have similar resource requirements on different instances of Virtual Server. For example, mix network-intensive applications with those that are CPU-intensive and hard disk-intensive. This avoids placing excessive demand on any single system resource, such as the processors. For additional best practices, see "Improving performance" in the Virtual Server 2005 Administrator's Guide.

3. Calculate system requirements

Finally, you need to determine the system requirements for each host computer, based on the workloads you intend to run on it. You'll likely need to adjust your estimates and your workloads iteratively until you find a sensible balance between a proposed workload for a host computer and the hardware required to run it.

To calculate how much system capacity you'll need for a given set of server workloads, you can use the guidelines and formulas in the Planning Guide included in the Solution Accelerator for Consolidating and Migrating LOB Applications (https://www.microsoft.com/technet/itsolutions/ucs/lob/lobsa/default.mspx). The following are some quick rules of thumb you can use when estimating system capacity requirements:
 
RAM:
Required RAM is the sum of RAM needed for each proposed server workload plus a 20 percent buffer per virtual machine plus the RAM required by the host. In Virtual Server, any memory assigned to a virtual machine is fully committed. If you assign 1 GB of RAM to a virtual machine, that 1 GB is unavailable for any other purpose while the virtual machine is running. As a rule of thumb, a virtual machine requires the RAM that's assigned to it plus an additional 20 MB. The host operating system also needs some available RAM, and that amount varies with how many files are open, whether network connections are active, and so on. In practice, the host operating system takes up about 15% of the total RAM in a system with 2 GB of RAM or less, or 10% of the total RAM in a system with more than 2 GB.
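
Here's a minimal sketch of that arithmetic in Python, in case it helps to see it written out. It uses only the two rules of thumb above (assigned RAM plus 20 MB per virtual machine, and a host share of 15% or 10% of total RAM). The function names and the sample numbers are my own for illustration, and it checks whether a proposed set of virtual machines fits a candidate host rather than solving for an exact host size.

```python
VM_OVERHEAD_MB = 20  # per-VM overhead from the rule of thumb above


def host_reserve_mb(host_total_mb):
    """RAM the host OS is assumed to use: ~15% at 2 GB or less, ~10% above."""
    fraction = 0.15 if host_total_mb <= 2048 else 0.10
    return host_total_mb * fraction


def ram_needed_mb(host_total_mb, vm_assigned_mb):
    """Total RAM consumed by the VMs (assigned + overhead) plus the host share."""
    vm_need = sum(assigned + VM_OVERHEAD_MB for assigned in vm_assigned_mb)
    return vm_need + host_reserve_mb(host_total_mb)


def fits_in_ram(host_total_mb, vm_assigned_mb):
    """True if the proposed VMs and the host share fit in physical RAM."""
    return ram_needed_mb(host_total_mb, vm_assigned_mb) <= host_total_mb


if __name__ == "__main__":
    print(fits_in_ram(4096, [1024, 1024, 512]))  # True
    print(fits_in_ram(2048, [1024, 768]))        # False
```

In the second case the virtual machines need 1,832 MB and the host about 307 MB, which is more than the 2,048 MB available, so you'd either add RAM or move a workload to another host.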
 
Processor:
Required CPU capacity is the sum of the CPU needed for each proposed server workload plus that required for the host. Host processors can be overcommitted, and when that happens, the virtual machines are time-sliced in the same way that threads are time-sliced by the system. If the multi-programming level (MPL) -- that is, the number of virtual machines that can run on one processor at any instant -- exceeds 4.0, then virtual machine response time may suffer. You can estimate MPL as the sum of the average utilizations of the virtual machines (as percentages) divided by 100, times the number of virtual machines, divided by the number of host logical processors:
 
            MPL = Sum(vm_util1,…,vm_utiln) / 100 * total_VMs / total_host_logical_processors
 
The number of virtual machines that can run reasonably well depends on how active they are.
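
If you want to try the MPL estimate on your own numbers, a small Python sketch follows. It is a literal transcription of the formula as written above, evaluated left to right. The function name, the threshold constant, and the sample utilization figures are illustrative assumptions, not part of Virtual Server.

```python
MPL_THRESHOLD = 4.0  # above this, response time may suffer (per the text above)


def estimate_mpl(vm_utils_percent, logical_processors):
    """Evaluate Sum(vm_util) / 100 * total_VMs / total_host_logical_processors."""
    if logical_processors <= 0:
        raise ValueError("logical_processors must be positive")
    total_vms = len(vm_utils_percent)
    return sum(vm_utils_percent) / 100 * total_vms / logical_processors


if __name__ == "__main__":
    # Four VMs averaging 50% utilization each on a host with 2 logical processors.
    mpl = estimate_mpl([50, 50, 50, 50], 2)
    print(mpl)                  # 4.0
    print(mpl > MPL_THRESHOLD)  # False
```

Adding a fifth virtual machine at the same utilization on the same host raises the estimate to 6.25, which is past the 4.0 mark.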
 
Disk:
Required hard disk space is the sum of hard disk space needed for each proposed server workload plus that required for the host. Quantifying disk throughput requirements is difficult because disk activity is bursty. When configuring virtual machines, it's important to locate the virtual hard disks (VHDs) on different physical drives if the amount of activity is significant.
 
Network:

Required network capacity is the sum needed for each proposed server workload plus that required for the host. It's best to dedicate a NIC to the host. Networking has similar issues to disk due to bursty activity. If the total bandwidth being used by the virtual machines begins to approach the maximum bandwidth supported by a physical NIC, then you need to add physical NICs to the host machine.
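
As a rough sketch of the NIC guideline, the Python below counts how many physical NICs the virtual machines would need and then adds one dedicated to the host. The 80% headroom figure is purely my assumption for what "begins to approach the maximum" means; pick whatever margin suits your environment. The function names and sample bandwidth figures are likewise illustrative.

```python
import math


def vm_nics_needed(vm_bandwidth_mbps, nic_capacity_mbps, headroom=0.8):
    """NICs needed so VM traffic stays under `headroom` of each NIC's capacity.

    `headroom` is an assumed safety margin, not a documented value.
    """
    usable_per_nic = nic_capacity_mbps * headroom
    return max(1, math.ceil(sum(vm_bandwidth_mbps) / usable_per_nic))


def total_nics_needed(vm_bandwidth_mbps, nic_capacity_mbps, headroom=0.8):
    """NICs for the virtual machines plus one dedicated to the host."""
    return vm_nics_needed(vm_bandwidth_mbps, nic_capacity_mbps, headroom) + 1


if __name__ == "__main__":
    # Three VMs using 300, 250, and 400 Mb/s on 1 Gb NICs.
    print(vm_nics_needed([300, 250, 400], 1000))     # 2
    print(total_nics_needed([300, 250, 400], 1000))  # 3
```

Because network activity is bursty, feed this peak figures rather than averages if you have them.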