Everything you wanted to know about SR-IOV in Hyper-V Part 6

Summarising the series so far, we’ve answered “Why SR-IOV?”, looked at hardware and firmware dependencies, and run through the user interface and PowerShell cmdlets to configure SR-IOV. Next on the agenda is how SR-IOV and Live Migration co-operate. In addition, there’s a video showing covering the configuration aspects of SR-IOV from the previous part which also shows Live Migration in action.

A goal very early on in Windows “8” planning was that we should consider features which are incompatible with mobility scenarios such as Live Migration. (Actually SR-IOV has been in the pipeline for considerably longer than that as you can probably tell from the deck back at WinHEC 2008 I linked to previously, and even talked about at WinHEC 2006.)

That goal poses somewhat of a problem when hardware is assigned to a virtual machine. Let’s ignore SR-IOV for a moment and take a step back a few years into the initial development and prototyping of the feature. You may be familiar with the term “Discrete Device Assignment”. This is where a fully-fledged PCI Express device is assigned to a virtual machine. Discrete assignment, from a software engineering perspective can be viewed in some ways as a stepping stone towards SR-IOV support. However, discrete assignment is fraught with issues in several areas:

  • Security
  • Usability & Mobility
  • Scalability

From a security perspective, a virtual machine with significant uncontrolled access to hardware is too risky to entertain in a shipping product, while still providing a production support statement. It would be extremely difficult to secure the VM to the extent that it was unable to cause side effects to other partitions, thus breaking a critical security tenet.

From a scalability perspective, it is difficult to describe assigning a device taking an entire PCI Express slot into a single virtual machine as scalable. It would be very expensive to scale to tens or hundreds of VMs in this manner. If indeed, you could find a server with that many PCI Express slots.

From a usability perspective, it is difficult to describe this as a common user scenario. Granted, there are niche exceptions which do have some merit, but several things would be lost in the process. The most important of these is Live Migration support, shortly followed by state changes (apart from running or shutdown), then snapshots. Probably more as well. The reason is that is extremely difficult to save the state of a piece of arbitrary hardware inside a running VM, and subsequently restore it to a running state on a different platform. Further, even if we could temporarily halt the VM with hardware state intact (as would be necessary during Live Migration black-out), without an absolutely identical configuration on the target at all levels (not just hardware) the chances of successful restoration are non-existent.

Hence, we considered discrete assignment not very useful except to an extremely niche segment of our user base who aren’t concerned about security, scalability or mobility. Instead, we focused on something that addresses all of these concerns, namely SR-IOV. Security is built in at all levels. Only a single PCI Express slot is needed to support many virtual machines. And we support mobility (live and quick migration), state changes and snapshots.

So you may be wondering how SR-IOV overcomes the statement I made about it “being difficult to save the state of a piece of hardware inside a running VM, and subsequently restore it to a running state on a different platform”, yet still be able to achieve all these goals. After all, a VF is true hardware and it running in the VM.

As simple as it is, the answer probably isn’t immediately obvious. The answer is that we don’t save the hardware state at all, and don’t even attempt to tackle the problem. And yet we are able to migrate to a platform which could have a completely different physical NIC, the same type of NIC at a different firmware release level, or even a platform which doesn’t have SR-IOV support. And through all of these scenarios, keep networking fully functional in the VM. Confused yet?

One more minor backtrack. You may have noticed I’ve said a couple of times that a VF “backs” a software based network adapter. By this I mean that the VM always has a software based network adapter, but when a VF is available, we “failover” automatically to the hardware path for I/O. The software path is always present, only the VF is transient. So now it should be a little more obvious. Whenever we go through a state transition that would require hardware state to be saved, we remove any VFs from the VM beforehand, falling back to software based networking. (I say VFs in plural as a VM can have multiple software network adapters, up to eight, hence up to eight VFs assigned.) Once the VF is removed, we can perform any operation necessary on the VM as it is a complete software based container at that point. Once the operation has been completed, assuming hardware resources are available and other dependencies met, we will give a VF back to the VM. This completely solves the problem.

Those paying attention may have noticed I said that “we remove any VFs from the VM beforehand” and are wondering whether there are implications if the guest operating system isn’t co-operative. The short answer is no, our state model covers this scenario, although it is certainly easier when the guest operating system co-operates in VF deallocation.

Note also I used the word “failover” in quotation marks. I could have said “team”, but that implies a little more functionality than “failover”. Truthfully, we haven’t come up with a good term yet, but the point is that this “failover” is nothing to do with NIC teaming, also now native in Windows Server “8”. It simply means that we use a VF automatically if it is present, or software based networking if it is not, and during the transition either way, there will be no loss of network packets. As we will see shortly, NIC teaming and SR-IOV can co-exist as well in a virtual machine.

At this point, the adage about a picture speaking a thousand is apt. Rather than attempt a series of badly drawn block diagrams, here’s a video showing SR-IOV configuration and Live Migration. There are so many other new features in Hyper-V also shown in this video which aren’t immediately obvious, I’m struggling to contain myself, but do manage to limit myself at least to talking just about SR-IOV in the recording. Maybe you will notice the use of an SMB file share for the VM, the use of VHDX, Live Migration without a cluster, and of course PowerShell support.



Windows Server “8” Beta. Demonstration of SR-IOV in Hyper-V and Live Migration

More to follow in the next part.