TonyGedge-0532 asked MichaelSurmanian-5246 published

RDMA & QoS configured, but no OperationalTrafficClasses shown on adapters

I have a Windows Server 2016 cluster with RDMA NICs which for some reason doesn't apply QoS Policies to the NICs.

I've set up the policies, traffic classes and configured RDMA/QoS on the appropriate adapters:

New-NetQosPolicy "Live Migration" -LiveMigration -Priority 5
New-NetQosPolicy "SMB Direct" -NetDirectPort 445 -Priority 3
New-NetQosPolicy "Cluster" -IPDstPort 3343 -Priority 6
New-NetQosTrafficClass "SMB Direct" -Priority 3 -Algorithm ETS -Bandwidth 90
New-NetQosTrafficClass "Cluster" -Priority 6 -Algorithm ETS -Bandwidth 5
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7
Set-NetAdapterQos -Name storage -Enabled:$True
Set-NetAdapterRDMA -Name storage -Enabled:$True

With the above set, there are no OperationalTrafficClasses shown on the storage adapters:
46009-image.png

The strange thing is this configuration did have OperationalTrafficClasses showing yesterday, but I can't seem to get it back to the same state. Does anyone know why this might be the case or where to look to try to diagnose the issue?



windows-server-hyper-v windows-server-clustering

MicoMi-MSFT answered TonyGedge-0532 commented

Hi,
Typically, networks operate on a best-effort delivery basis, which means that all traffic has equal priority and an equal chance of being delivered in a timely manner. When congestion occurs, all traffic has an equal chance of being dropped.
When you configure the QoS feature, you can select specific network traffic, prioritize it according to its relative importance, and use congestion-management and congestion-avoidance techniques to provide preferential treatment. Implementing QoS in your network makes network performance more predictable and bandwidth utilization more effective.

If the DCBX willing bit on a device is set to true, the device is willing to accept configurations from the DCB switch through the DCBX protocol. If the willing bit is set to false, the device rejects all configuration attempts from remote devices and enforces only the local configurations.
Here are some documents to help you better understand DCB:
http://www.darrylvanderpeijl.com/windows-server-2016-networking-rdma-dcb-pfc-ets-etc/
https://docs.microsoft.com/en-us/windows-server/networking/technologies/dcb/dcb-manage
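
To see how the willing bit is currently set on a Windows host, something like the following may help. It is a minimal sketch, assuming the Data Center Bridging feature and its NetQos cmdlets are installed. Whether to turn Willing off is a choice for your environment, not something this thread settles.

```powershell
# Show the global DCBX setting, including the Willing bit.
Get-NetQosDcbxSetting

# Assumption: the host should keep its local DCB configuration rather than
# accept one from the switch, so Willing is turned off here.
Set-NetQosDcbxSetting -Willing $false -Confirm:$false

# Confirm the change.
Get-NetQosDcbxSetting
```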

Thanks for your time!
Best Regards,
Mico Mi


If the Answer is helpful, please click "Accept Answer" and upvote it.
Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


Thanks for that - I was actually after lower-level details. For example:

  • I assume RemoteTrafficClasses is the DCBX config received by the adapter.

  • Is OperationalTrafficClasses from the local config only, or is it the active config as modified by any config received through DCBX?

  • What does it mean if one or the other is not present?

  • According to the documentation I read, Windows doesn't respect DCBX even if Willing is True. What is the actual behaviour in this case?

It'd be nice if this output and behaviour were properly defined in the command documentation.

MicoMi-MSFT answered TonyGedge-0532 commented

Hi,
As the article states, if QoS is disabled, this cmdlet gets only the hardware QoS capabilities of the network adapter. If QoS is enabled, it also gets the operational traffic class and flow control configurations.
https://docs.microsoft.com/en-us/powershell/module/netadapter/get-netadapterqos?view=win10-ps
Since you have configured QoS, you can try to disable and then enable QoS and check if it works.

Disable QoS on the network adapter (replace "storage" with your adapter name):
Disable-NetAdapterQos -Name "storage"
Enable QoS on the same network adapter:
Enable-NetAdapterQos -Name "storage"
Then check again:
Get-NetAdapterQos -Name "*" | Where-Object -FilterScript { $_.Enabled }
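
A possible way to run that toggle and then look only at the fields in question is sketched below. The adapter name "storage" is taken from the original question. The property names are those that Get-NetAdapterQos normally reports, and an empty value simply means the adapter returned nothing for that field.

```powershell
$nic = 'storage'   # adapter name from the question; change as needed

# Turn QoS off and back on for this one adapter.
Disable-NetAdapterQos -Name $nic
Enable-NetAdapterQos -Name $nic

# Show only the operational and remote DCB fields.
Get-NetAdapterQos -Name $nic |
    Format-List Name, Enabled,
        OperationalTrafficClasses, OperationalFlowControl,
        RemoteTrafficClasses, RemoteFlowControl
```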

Thanks for your time!
Best Regards,
Mico Mi





I thought I did this, but I'll retest just to be sure.


Yep, tested again and that does not work:
46425-image.png

And there are QoS Policies defined:
46426-image.png

And there are QoS Traffic classes defined:
46427-image.png

And the adapters have priority and VLAN tagging on:
46441-image.png

So I don't know why the OperationalTrafficClasses aren't being set.
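
The screenshots are not reproduced here. The checks described above could be repeated with commands along these lines. This is only a sketch, and the exact commands the poster ran are an assumption. "storage" is the adapter name from the question, and the registry keywords shown are the standardized ones, which a given driver may or may not expose.

```powershell
$nic = 'storage'

Get-NetQosPolicy          # policies and their 802.1p priority values
Get-NetQosTrafficClass    # ETS traffic classes and bandwidth shares
Get-NetQosFlowControl     # PFC state per priority

# Priority/VLAN tagging and VLAN ID as advanced driver properties.
Get-NetAdapterAdvancedProperty -Name $nic `
        -RegistryKeyword '*PriorityVLANTag', 'VlanID' `
        -ErrorAction SilentlyContinue |
    Format-Table Name, DisplayName, DisplayValue, RegistryKeyword, RegistryValue

# What the adapter itself reports for QoS.
Get-NetAdapterQos -Name $nic
```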

xeonkeeper answered MichaelSurmanian-5246 published

I have the same issue on my Intel X722 network cards, but on other cards (Mellanox or QLogic) everything is OK. The Intel cards use iWARP; on the others I am using RoCE.
How can I check that ETS and PFC are working correctly?


Windows Admin Center will allow you to create a Network QoS Policy Performance Monitor:
57874-capturea.png
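
The same counters can probably be read from PowerShell as well. The sketch below assumes the "Network QoS Policy" counter set is present on the host; the first command lists the QoS-related sets so you can confirm that. Which counters appear depends on the policies defined. These counters reflect Windows QoS policies and may not see RDMA traffic that bypasses the host networking stack, so adapter or switch counters may still be needed to judge PFC and ETS.

```powershell
# List QoS-related counter sets and the counter paths they expose.
Get-Counter -ListSet '*QoS*' | Select-Object CounterSetName, Paths

# Assumption: the "Network QoS Policy" set exists. Sample every counter
# instance in it five times, one second apart.
$set = Get-Counter -ListSet 'Network QoS Policy' -ErrorAction SilentlyContinue
if ($set -and $set.PathsWithInstances) {
    Get-Counter -Counter $set.PathsWithInstances -SampleInterval 1 -MaxSamples 5
}
```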


MicoMi-MSFT answered TonyGedge-0532 commented

Hi,
Before the issue happened, did you make any changes to your environment?
Best Regards,
Mico Mi


It's difficult to be 100% sure. I believe the only change was to set a VLAN ID on the NICs. We are in the middle of verifying a new physical cluster install.

Originally QoS was enabled but no VLAN was set as we had configured the network switch ports in access mode on the appropriate VLAN. With this configuration operational traffic classes were displayed. Our network team advised us that the switches required an actual VLAN tag to be set in the frames in order for QoS processing to occur, so we set the VLAN on the adapters and had the network switch ports reconfigured to trunk mode. Somewhere about this time operational traffic classes stopped being displayed.

DCBX is disabled globally, so the card configuration shouldn't be negotiated.

What I can't determine is why no operational traffic classes are being displayed. We've even reinstalled the OS and there's no change in behaviour.

TonyGedge-0532 answered

We have changed the switch configuration and we now have OperationalTrafficClasses showing but no RemoteTrafficClasses showing.

Clearly I don't understand where these values are coming from and what they're used for.

Can anyone outline what Windows uses OperationalTrafficClasses and RemoteTrafficClasses for (in the QoS output)? If DCBX Willing is false, are they just advisory? What would their presence or absence indicate?
