Impact of virtualization on heavy web workloads
As the global trend is to virtualize servers in order to simplify management and lower TCO, I am seeing more and more customers willing to virtualize their web workloads. As a result I am often asked what the impact of virtualizing their existing environment would be compared with keeping dedicated servers.
I did my best to come up below with a list of items that need planning, as each of them might have an impact on an application once it is virtualized.
Each runtime (CLR, ASP 3.0, …) will behave differently in virtual and non-virtual environments, so in this article I will stick to the potential impact of virtualization on ASP.NET applications.
· Physical RAM
When running an ASP.NET web application, the worker process hosting the code will be either 32-bit or 64-bit, so the amount of virtual memory addressable by the process will be very different. On a 64-bit system such as Windows Server 2008 R2, an x86 worker process can address up to 4GB of virtual address space, versus 8TB of user-mode virtual address space (out of a 16TB total address space) for an x64 worker process. In practice each app is likely to behave differently when moving to an x64 worker process, because the outcome depends on how much memory the application allocates (the type and rate of objects allocated).
By default, when the CLR starts in an IIS process it initializes using the AutoConfig feature ( http://technet.microsoft.com/en-us/magazine/2006.11.insidemscom.aspx ), so the CLR looks at various settings and does the equivalent of a !ProcInfo to see the current memory limits and the number of cores exposed by the host:
Process Started at: 2010 Jul 1 10:8:19.63
Kernel CPU time : 50 days 00:08:00.57
User CPU time : 50 days 00:08:00.73
Total CPU time : 50 days 00:08:01.30
WorkingSetSize: 61268 KB PeakWorkingSetSize: 61476 KB
VirtualSize: 531388 KB PeakVirtualSize: 535300 KB
PagefileUsage: 49028 KB PeakPagefileUsage: 49668 KB
52 percent of memory is in use.
Memory Availability (Numbers in MB)
Physical Memory 4095 1925
Page File 4095 4095
Virtual Memory 4095 3993
The key thing to note here is that the way the process and the CLR initialize will vary based on the VM host configuration. The VM host also shares its physical resources (such as RAM in this scenario) among more consumers than a dedicated server does, and higher concurrency means some level of additional contention.
It is also important to underline that the CLR subscribes at process startup to system memory pressure events (http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/creatememoryresourcenotification.asp ). When a memory pressure event occurs, every process that subscribed to it knows it is a good time to see whether some memory can be released (caches, for example). As a result the GC works more intensively to free as much memory as possible, and CPU activity can be high at this point. Now let’s say you have 8 virtual machines hosted on the same physical server and they are all under peak load because they are running the same app. All the VMs on that physical host could start spinning the CPU as hard as they can at the same time. If the CPU is not fast enough at this point, the end user is likely to experience slow responses.
One of the great strengths of an x64 worker process is its much larger addressable virtual memory. If an application needs that memory, for example to keep a large number of objects in in-memory caches, and it is hosted in a VM that has only a small amount of memory, the benefits of x64 may be wiped out because the caches will be trimmed more often.
When an application starts and runs, JIT/NGEN processing occurs under various scenarios in order to execute code. When the application starts, .NET compilation and Fusion activity occur in: “%Windir%\Microsoft.NET\Framework (bitness) \ (version)\Temporary ASP.NET Files”.
If a significant amount of code needs to be processed, you might start hitting some level of disk latency due to virtual hard disks. This matters at process startup but also at runtime, especially if many dynamic assemblies are created on the fly. For more information see http://blogs.msdn.com/b/tom/archive/2007/12/05/dynamic-assemblies-and-what-to-do-about-them.aspx .
On IIS 7.x static compression is enabled by default. The compressed static content is saved by default here: “%SystemDrive%\inetpub\temp\IIS Temporary Compressed Files”. As a result, in a virtualized environment the host will potentially see higher disk utilization than a single machine, therefore increasing latency.
o Application log files
It is common to see web sites logging some level of application errors (most commonly exceptions) into a log file. There are two common issues with this if the application is highly verbose (especially at peak load):
§ This will likely create a single point of contention, as only one thread at a time can write to the file. Many threads are likely to be blocked waiting for access to that file.
§ Text files are commonly not very well formatted, so stack traces appear on multiple lines. This makes it hard to identify the most frequent exceptions.
As a result, having the log files on a virtual disk is likely to cause higher latency and therefore block threads much longer (especially if no disk is dedicated to them). On the other hand, virtualization may help if the application supports load balancing across multiple VMs.
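One common way to reduce this kind of file contention is to have request threads hand log records to an in-memory queue so that a single background thread does all the disk writes. The sketch below illustrates the idea with Python's standard logging module rather than ASP.NET; the logger name, file name, message format and the one_line helper are assumptions made for the illustration, not something taken from a real application.

```python
import logging
import logging.handlers
import os
import queue
import tempfile
import threading


def one_line(exc):
    """Flatten an exception to a single line so identical errors are easy to count."""
    return f"{type(exc).__name__}: {exc}".replace("\r", "").replace("\n", " | ")


def build_logger(log_path):
    """Return (logger, listener, file_handler); only the listener thread touches the file."""
    log_queue = queue.Queue(-1)  # unbounded queue: request threads never wait on disk I/O
    file_handler = logging.FileHandler(log_path, encoding="utf-8")
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s [%(threadName)s] %(message)s"))
    listener = logging.handlers.QueueListener(log_queue, file_handler)

    logger = logging.getLogger("app_errors_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers.clear()
    logger.addHandler(logging.handlers.QueueHandler(log_queue))
    listener.start()
    return logger, listener, file_handler


def run_demo(workers=8, errors_per_worker=50):
    """Simulate several request threads logging exceptions, then return the log lines."""
    log_path = os.path.join(tempfile.mkdtemp(), "app_errors.log")
    logger, listener, file_handler = build_logger(log_path)

    def handle_requests(worker_id):
        for n in range(errors_per_worker):
            try:
                raise ValueError(f"request {worker_id}-{n} failed\ndetail on a second line")
            except ValueError as exc:
                logger.error(one_line(exc))  # enqueue only, no file access here

    threads = [threading.Thread(target=handle_requests, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    listener.stop()       # drains the queue before returning
    file_handler.close()
    with open(log_path, encoding="utf-8") as f:
        return f.read().splitlines()
```

Calling run_demo() returns one log line per simulated error. The trade-off of this design is that records still sitting in the queue are lost if the process dies abruptly, so it suits verbose diagnostic logging better than audit trails.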
· System Anti-virus
A physical host running ‘Y’ VMs also runs ‘Y’ antivirus runtimes, one per virtual machine… Compilation or static file compression may happen at the same time on several VMs, triggering concurrent AV scanning and therefore increasing CPU load and contention. As a result it is even more important in virtualized environments to make sure that antivirus exclusions are configured properly.
· File System Defrag
File fragmentation is likely to occur at some level on the various disks for many reasons (compression, logs, compilation, page file, …). The good news is that Windows 2008 comes with a built-in rule to defrag disks every Wednesday at 1am. The downside is that when several VMs are hosted on the same server, all of them may start defragmenting at the same time. It is a good practice to spread the defrag tasks across different times in these scenarios.
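As a rough illustration of spreading the defrag tasks, the sketch below divides a maintenance window evenly between the VMs on one host. The VM names, the 4-hour window and the staggered_starts helper are all hypothetical; applying the resulting times to each VM's scheduled task is left out.

```python
from datetime import datetime, timedelta


def staggered_starts(vm_names, first_start, window_hours):
    """Spread one start time per VM evenly across a maintenance window.

    vm_names      -- VMs sharing the same physical host (hypothetical names)
    first_start   -- datetime at which the first VM may start defragmenting
    window_hours  -- length of the window over which starts are spread
    """
    if not vm_names:
        return {}
    step = timedelta(hours=window_hours) / len(vm_names)
    return {name: first_start + i * step for i, name in enumerate(vm_names)}


if __name__ == "__main__":
    # Wednesday 1am, matching the default schedule mentioned above.
    schedule = staggered_starts(
        ["web1", "web2", "web3", "web4"], datetime(2010, 7, 7, 1, 0), window_hours=4)
    for vm, start in schedule.items():
        print(vm, start.strftime("%a %H:%M"))
```

An even split is the simplest choice; a real plan would probably also account for how long each VM's defrag usually takes.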
· IIS Logs
Archiving and analyzing your IIS logs is key to the overall quality process for understanding how the application behaves. While HTTPERR logs are usually small, depending on the activity, the IIS W3C logs on the web servers can put some level of pressure on the disks. So keep in mind that by default these files are generated inside the VMs.
· CPUs & CLR
We discussed the AutoConfig process previously. One important thing to know is that the number of cores seen by the CLR drives the number of GC heaps.
If you do a “!eeversion” on a process you might see something like this:
Server mode with 8 gc heaps
This tells you two important things: in this example the process is using the “server” version of the GC (as opposed to the workstation GC), and it shows the number of heaps created. The “server” GC creates one heap per core detected. As a result the number of heaps strongly impacts the way the GC behaves and performs. So if you host a VM with only one CPU, the process will have only one heap (and on a single-processor machine the CLR falls back to the workstation GC)…
· Network contention
On large web farms it is common to have multiple racks, each hosting multiple web servers. Now let’s say you have 7 web servers, each generating 500 Mb/s of HTTP traffic, plus the traffic required to access databases hosted on other racks… It is very likely that this will saturate the rack uplinks at peak load and slow traffic down. Now let’s say you have 7 hosts, each running 4 VMs… For large sites it is strongly recommended that the entire backbone infrastructure be 10 Gb and that database mirrors be placed on the same racks, to minimize network bandwidth use and latency and avoid congestion. I would also strongly suggest splitting HTTP and database traffic onto different uplinks. Now let’s say that for some reason it is decided to move 8 of the VMs hosted on Rack 1 to Rack 2: how does this impact the rack uplinks? The conclusion is that the network infrastructure design and capacity plan need to be carefully anticipated, and this gets even more important when moving virtual images around.
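The arithmetic behind this scenario can be sketched as follows. The figures come from the example above, with two assumptions that the text leaves open: each VM generates as much HTTP traffic as a dedicated server (500 Mb/s), and the rack shares a single 10 Gb uplink. Database traffic is ignored, so real utilization would be higher.

```python
def uplink_utilization(hosts, vms_per_host, mbps_per_vm, uplink_gbps):
    """Return peak HTTP demand as a fraction of uplink capacity (1.0 = saturated)."""
    demand_mbps = hosts * vms_per_host * mbps_per_vm
    capacity_mbps = uplink_gbps * 1000
    return demand_mbps / capacity_mbps


if __name__ == "__main__":
    # 7 dedicated servers at 500 Mb/s each: 3.5 Gb/s of HTTP traffic.
    print(uplink_utilization(hosts=7, vms_per_host=1, mbps_per_vm=500, uplink_gbps=10))
    # 7 hosts x 4 VMs at 500 Mb/s each: 14 Gb/s, more than one 10 Gb uplink can carry.
    print(uplink_utilization(hosts=7, vms_per_host=4, mbps_per_vm=500, uplink_gbps=10))
```

Under these assumptions the dedicated-server layout uses roughly a third of the uplink, while the virtualized layout would need more than the uplink can provide. The same function can be rerun after moving VMs between racks to see how each rack's uplink is affected.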
· Power & heat capacity planning
Having dedicated servers makes heat generation and power consumption more predictable, whereas moving images around brings ease of management. What if you have 2 racks of servers and, for some reason, 80% of the VMs go down because of an unexpected peak load and overheating in the same rack? Someone might also have moved a couple of other CPU, network and disk intensive VMs there just before that happened. These types of scenarios also need to be taken into consideration: sharing space or resources with others brings higher complexity, and for peak loads it really means that this type of infrastructure monitoring must be fine-tuned.
· Backups & configuration
Virtualization may really help in the backup area, as each frontend may be “recycled” in a few seconds. IIS 7.5 with Web Deploy ( http://www.iis.net/download/webdeploy ) and shared configuration really make our lives easier.
To summarize, will virtualization have an impact on performance? It really depends on where your application's contention points are. If you currently run a web site on a single host, the app suffers from file lock contention, and you virtualize this environment on two VMs (if the app supports it), then the app is likely to benefit from virtualization performance-wise. Again, it all depends on which resources are consumed, when they are consumed, and where the application's contention points are.
All the best,
EMEA Senior PFE Solutions Support Engineer / IIS HC Certified PFE