Steph-7528 asked LloydPlayfair-6407 commented

Hyper-V problem after adding 10 VMs

Hi,

We have bought a new AMD server (EPYC 7413 24-core processor, 512 GB of RAM, all-SSD storage). We only worked with Intel before, but we wanted to try AMD.
We installed Windows Server 2022 on this server (our four other Hyper-V servers all run Server 2016).

Yesterday I had to roll back my migration because, after moving 10 Hyper-V VMs onto this new server, it became unstable:

I can't control the VMs at all: those that are online can't be stopped, for example, and some of those that are stopped can't be started (the status stays stuck on "Starting..."). It's as if VMMS is completely stuck. I tried restarting the vmms service many times, but that doesn't help.

If I restart the whole server, the service works again and I can start VMs fine, but after a few minutes it breaks again and I lose control of the VMs.

I can't open a VM's settings, for example. I need to restart the whole server again to regain control.

At first, I thought one VM in the batch had crashed Hyper-V. (I had that problem once on another server, but the symptoms were not exactly the same; in that case I could not launch the console, for example.)
In that case the issue was caused by a VM's network card, and I had to delete the card and add a new one. I'm wondering whether the same thing is happening again. I found a similar problem on a website, where the user had a BSOD caused by a Linux VM on an AMD server. No BSOD in my case, however.

On the problematic server, I found this warning in the log many times:

VMs Utilization Plan Vport QueuePairs adjusted from the request number 16 to real number 4.
Reason:
The number of requests exceeds the maximum number of QP per VPort supported by the physical network.
Event ID => 280

I found nothing on the web about this event. Do you know what it is? The timestamps in the log seem to match the problem.
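For anyone who wants to check how often this warning appears and with which numbers, here is a minimal Python sketch (not from the original post) that pulls the requested and actual queue-pair counts out of the event text. It assumes the two message wordings quoted in this thread; the function name `parse_queue_pairs` and the regular expression are illustrative only, and other Windows builds or display languages may word the event differently.

```python
import re

# Matches both wordings quoted in this thread (assumption: other builds may differ):
#   "... QueuePairs adjusted from the request number 16 to real number 4."
#   "... QueuePairs adjusted from requested number (16) to actual (4)."
QP_PATTERN = re.compile(
    r"QueuePairs adjusted from\D*(\d+)\D*?to\D*(\d+)",
    re.IGNORECASE,
)


def parse_queue_pairs(message):
    """Return (requested, actual) queue-pair counts, or None if the text does not match."""
    match = QP_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    sample = (
        "VMs Utilization Plan Vport QueuePairs adjusted from "
        "the request number 16 to real number 4."
    )
    print(parse_queue_pairs(sample))
```

Running this over event messages exported from the host would show whether every occurrence reports the same requested and actual values; it does not by itself explain the hang.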

I plan to reinstall the whole server with Windows Server 2019 and reinstall the Hyper-V role, to find out whether this is caused by some incompatibility between the OS and the drivers. I didn't find drivers for Windows Server 2022, and I'm wondering whether that could be the problem.

If you have any idea...


Thanks again !!

windows-server-hyper-v

FYI: we are building an updated test environment at work using two HPE DL380 G8 servers with Windows Server 2022. It was stable with 5 or so VMs; once we started migrating more VMs, we got the same "VMS Utilization Plan Vport QueuePairs adjusted from requested number (16) to actual (4)." events.
So this doesn't seem to be an Intel/AMD processor issue; it seems to be a network issue. Before we get a BSOD and/or cluster disks not responding, etc., we also get network interface events like "HP Ethernet 10Gb 2-port 560FLR-SFP+ Adapter #2" has begun resetting. There will be a momentary disruption in network connectivity while the hardware resets. Reason: The network driver did not respond to an OID request in a timely fashion. This network interface has reset 1 time(s) since it was last initialized."
I know our servers and NICs are not certified for 2022, but I suspect your EPYC hardware is current, so this seems to be a driver or Windows Server 2022 bug.
We also have issues (live migration hangs and intermittent cluster errors) with Server 2019 on production G10 hardware, thankfully not as bad as the above, but it seems Hyper-V on OSes past Server 2016 has become quite flaky. The G8s came from our retired 2012 R2 Hyper-V cluster environment, where only one node (4-way cluster) rebooted unexpectedly once in 5 years!


DSPatrick answered DSPatrick commented

"I didn't find drivers for Windows server 2022 and I'm wondering if it can be the problem"

Yes, that is likely the problem. I'd check here and with the manufacturer about support for Server 2022.
https://www.windowsservercatalog.com/

--please don't forget to upvote and Accept as answer if the reply is helpful--


Just checking if there's any progress or updates?


JamieCormack-1716 answered Steph-7528 commented

I had basically the same issue with a system using 2x AMD EPYC 7302 16-core processors, but I could only get to 5 VMs before it stopped working. I tried all kinds of things: removing the virtual switch, turning off Hyper-V's "Allow NUMA spanning" setting. Once, I started them up manually in a different order and managed to get to 6, then it went wrong again.

I also didn't have any host crashes, BSODs, etc., but as soon as I tried to start the 6th VM, I couldn't stop it starting or access the settings of any of the VMs; everything just stopped working.

I wiped the server and installed Server 2019, and it all started working straight away; I got 10 VMs running with no issue.

The only other difference between my 2022 and 2019 setups was that on 2022 I had to use a PowerShell-created Hyper-V teamed virtual switch rather than the Windows Server teaming. When I went back to 2019, I used the Windows teaming method, which works fine for us.

I don't think that was the issue, but it's worth mentioning. The issue for me feels like the combination of an AMD processor and Server 2022. I have to say my experience with AMD processors in servers is that they always add unnecessary pain; sadly, the more expensive Intel Xeons just work. (The first PC I built had an AMD K6-3, so I've worked with them for a long time too.)


Hi Jamie,

Thanks for your feedback!
It's nice to see I'm not alone with this issue.

I have reinstalled the whole server with Server 2019 and... NO PROBLEM :D

I have never seen a problem like that with Intel.

Hope your server is fine too now ;-)

Have a nice day
