Virtually a solution...

Surely it's time to consider virtualising at least some of your new Exchange 2007 deployment?  Microsoft, VMware and others now have some great solutions that, on the face of it, seem perfect for increasing the efficiency, scalability and uptime of your Exchange solution.  In my opinion, yes, it's worth considering, but only as part of a requirements gathering exercise to understand exactly what you are designing your Exchange solution to achieve.  And to be honest, I haven't yet come across a design where virtualising any Exchange roles for production actually makes sense right now.  I'm not saying they don't exist; it's just that I haven't come across them.

So why not?

Well, firstly, is it supported?  Much has already been said about Microsoft's support stance on running Exchange 2003 servers on a virtualised platform with a Windows Server 2003 OS, and for now we have to wait to see how that stance will change with Exchange Server 2007.  I believe an official statement on support for Exchange Server 2007 SP1 running on Windows Server 2008 and Hyper-V is imminent, but until then I would be reluctant to virtualise any Exchange roles.  I think that deploying an unsupported or partially supported solution is a risk too far.  But let's say the story changes and Microsoft issues a statement saying that, provided certain criteria are met, it will fully support all Exchange roles running on a virtualised platform...

(I think this is unlikely, by the way.  I would expect support for CAS, Hub and Edge only.)

Surely virtualisation offers a simple solution to increase uptime and meet our aggressive SLAs?  Well, maybe.  This is the second reason why you might consider virtualising Exchange.

I think at this point the discussion diverges depending on which role we are considering virtualising.  For the mailbox role there are numerous failure scenarios that we would need to recover from (physical database corruption, disk failure, server failure and so on).  Which of these would virtualisation help us recover from?  If we have deployed continuous replication then, to be honest, none.  If we haven't deployed continuous replication in any form, at a branch office for example, then the only benefit I can see is the use of 'Quick Migration' to another host.  But if you have the capacity to keep a host on standby, then why not use continuous replication?  If the standby host is actually running other virtualised services then you need to scale that server to cope.  And in my opinion virtualisation introduces a layer of additional performance requirements which, for the mailbox role certainly, means that this host would have to be scaled to such an extent that I would consider reverting to a standby host with continuous replication.

There is also a lot of discussion about virtualising the secondary node of a CCR solution or the SCR target server.  My question would be: what happens when you have to activate your SCR target permanently, or move the CMS to the second node, after an unrecoverable failure of the active node?  You would then be running your production mailbox server on the virtualised platform, with all of the problems described above.

(I think we can also set aside the Unified Messaging role which I believe is too processor intensive to be considered for virtualisation.)

So what about increasing uptime for the Hub, Edge and CAS roles?  Well, what might we need to recover from?  Disk failure and server failure would be the most likely, I would have thought, for these roles.  Would virtualisation hasten recovery in these situations?  Again, maybe.  If we had a problem with the host operating system we could 'quick migrate' the server to another host.  This is certainly an option, but my preference would be to have additional role servers available for resilience, which would also cater for periods where demand on the service is unusually high.  An additional role server would 'seamlessly' take over the work of the failed server temporarily whilst the failed server is rebuilt using, for example, a combination of SCCM and PowerShell scripts.

I think another particularly important factor in this argument about virtualisation and uptime is that, for most administrators, running a virtualised Exchange Server introduces an unfamiliar additional layer of complexity.  In a real-world disaster recovery scenario this might actually increase the time it takes to recover the server rather than reduce it.  Of course this risk is mitigated somewhat by cast-iron documented recovery procedures backed up with regular fire drills, but in my experience few have the time or resources to do this.

The third main reason that I am going to cover here is consolidation.  If you look at the average processor usage across most Exchange deployments I bet it's not that high.  There is definitely a case for maximising the processor efficiency of a host, and one of the options is to run multiple virtual servers on a single host.  I think this is the most compelling reason for virtualising Exchange, and it often surfaces because it appears to affect the bottom line.  But I'm not sure it really does.

I think if you scale out where possible and maximise the usage of the servers that you deploy then you can claw back some of that 'wasted' processing power.  Of course, in order to cater for periods of high utilisation you need to make sure you do not scale out too far, but you would need to do this anyway, virtualised or not.  CAS, Hub and Edge roles do tend to be relatively processor-intensive applications and, certainly for enterprise deployments, are also dependent on suitably designed storage.  If you virtualise you would need to cater for the additional load that the virtualisation layer will add.  And the problem here is that, like any new technology or combination of technologies, the success of this relies on suitably intensive and relevant testing.  At the moment there is not a lot of information out there, so you would to a certain extent be taking a step into the unknown.  This is a risk that I am not sure I would want to take.
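To make the sizing point concrete, here is a rough sketch of how virtualisation overhead and peak headroom eat into the number of role servers a host can carry.  Every figure and name in it is hypothetical (the overhead fraction, the headroom and the per-guest peak demand are illustrative assumptions, not measured values), so treat it as a way of framing the question rather than as sizing guidance.

```python
import math


def usable_cores(host_cores: int, overhead_fraction: float, peak_headroom: float) -> float:
    """Cores' worth of capacity left for guests.

    overhead_fraction: share of the host assumed lost to the virtualisation layer.
    peak_headroom: share of the remainder held back for unusually busy periods.
    Both are hypothetical inputs that would have to come from your own testing.
    """
    for name, value in (("overhead_fraction", overhead_fraction),
                        ("peak_headroom", peak_headroom)):
        if not 0 <= value < 1:
            raise ValueError(f"{name} must be in the range [0, 1)")
    return host_cores * (1 - overhead_fraction) * (1 - peak_headroom)


def guests_per_host(host_cores: int, guest_peak_cores: float,
                    overhead_fraction: float, peak_headroom: float) -> int:
    """Whole number of guests that fit, given each guest's peak demand in cores."""
    if guest_peak_cores <= 0:
        raise ValueError("guest_peak_cores must be positive")
    capacity = usable_cores(host_cores, overhead_fraction, peak_headroom)
    return math.floor(capacity / guest_peak_cores)


if __name__ == "__main__":
    # Hypothetical 8-core host, 25% overhead, 25% headroom, 1.5 cores per guest at peak.
    print(guests_per_host(8, 1.5, 0.25, 0.25))  # prints 3
```

With no overhead at all the same host would carry four such guests, so in this made-up case the virtualisation layer costs you one role server per host.  The real overhead is exactly the number that needs testing before you commit.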

The other point to make here is that the case for consolidation will definitely change.  What if we could deploy cost-effective 64-core servers now?  Well, then the case for running multiple Hub Transport servers on the same host gets stronger.  If we can only deploy a virtualised environment on 4-core servers then it might be more cost-effective to buy more lower-spec boxes, scale them correctly and forget virtualisation for now.
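The comparison above boils down to cost per role server, which can be sketched in a few lines.  The prices, host counts and guest densities below are invented purely for illustration, and the `Option` class is a hypothetical helper, not part of any product.  The sketch also ignores licensing, storage and the resilience cost of putting many roles on one host.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One way of hosting a set of role servers (all figures hypothetical)."""
    name: str
    hosts: int
    cost_per_host: float
    role_servers_per_host: int  # 1 for a physical box, more for a virtualisation host

    @property
    def total_cost(self) -> float:
        return self.hosts * self.cost_per_host

    @property
    def role_servers(self) -> int:
        return self.hosts * self.role_servers_per_host

    @property
    def cost_per_role_server(self) -> float:
        return self.total_cost / self.role_servers


def cheaper(a: Option, b: Option) -> Option:
    """Return the option with the lower cost per role server."""
    return min((a, b), key=lambda option: option.cost_per_role_server)


if __name__ == "__main__":
    # Invented prices: four small physical boxes versus two virtualised hosts.
    physical = Option("four 4-core physical boxes", 4, 4000.0, 1)
    small_virtual = Option("two 4-core hosts, two guests each", 2, 9000.0, 2)
    large_virtual = Option("one 64-core host, sixteen guests", 1, 40000.0, 16)

    print(cheaper(physical, small_virtual).name)
    print(cheaper(physical, large_virtual).name)
```

With these made-up numbers the physical boxes win against the small virtualised hosts and lose against the large one, which is the shape of the argument: the answer depends on the hardware you can actually buy at the time.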

So in summary, it is vital that you first combine your SLAs and requirements to understand exactly the parameters of your design and then, in my view, architect a solution to meet those needs with the fewest possible moving parts.  In my opinion that currently precludes the use of virtualisation for most deployments.  I am certain the story will change as the availability of scalable hardware increases, Hyper-V matures and ultimately when the next versions of Exchange Server are released.  But for now, and certainly for Exchange Server 2007, I think the negatives will outweigh the benefits for most.