GlenHarrison-3233 asked

Storage Spaces Direct Cluster (Validation fails on port 3343 over the Mellanox NICs)

Hi Everyone,

I am building a four-node Storage Spaces Direct (S2D) cluster running Windows Server 2022.

Each node has two dual-port NICs: an Intel 10 GbE and a Mellanox 100 GbE.

When running the cluster validation test, is it normal to see errors on the Mellanox NICs for port 3343?

My config is:

Intel nic0 and nic1 attached to a SET vSwitch (vNICs for management, cluster, live migration)

Mellanox nic0 and nic1 for storage

The report is all green except for this one error. It's not the firewall, as I've checked that the ports are allowed.

Thanks!!

windows-server · windows-server-hyper-v · windows-server-storage

Hey,

What model are the Intel NICs?
Can you provide a screenshot of the validation error?

The Mellanox NICs are not configured in any SET team, correct? Have you made sure they're tagged for the right VLANs, and that network QoS is configured properly for RoCE on both the host NICs and the switch interfaces they're connected to?

Are there any issues with pinging the various Mellanox interfaces from each host?
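
Ping only proves ICMP works, so it can be worth testing UDP between the storage IPs directly. Here's a rough Python sketch of that idea. It is not what the validation wizard runs, just a stand-in: one node echoes datagrams, the other sends a probe from a chosen source IP. The function names and the payload are made up for illustration, and it assumes nothing else (such as a running cluster service) already holds the port on the listening node.

```python
import socket


def run_echo_listener(bind_ip, port, max_packets=1, timeout=10.0):
    """Echo back up to max_packets UDP datagrams received on bind_ip:port.

    Returns the number of datagrams echoed. Stops early if no datagram
    arrives within `timeout` seconds.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((bind_ip, port))
        handled = 0
        while handled < max_packets:
            try:
                data, peer = sock.recvfrom(2048)
            except socket.timeout:
                break
            sock.sendto(data, peer)  # send the same bytes back to the sender
            handled += 1
        return handled


def probe(source_ip, target_ip, port, timeout=2.0):
    """Send one datagram from source_ip to target_ip:port and wait for the echo.

    Binding to source_ip forces the traffic out of that interface's address,
    which matters on multi-homed hosts. Returns True only if the echoed
    payload matches what was sent.
    """
    payload = b"udp-probe"  # arbitrary illustrative payload
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.bind((source_ip, 0))  # port 0 lets the OS pick a source port
        try:
            sock.sendto(payload, (target_ip, port))
            data, _ = sock.recvfrom(2048)
        except OSError:
            # Covers timeouts and "port unreachable" resets alike.
            return False
        return data == payload
```

On one node you'd run `run_echo_listener("10.0.1.11", 3343)` and on another `probe("10.0.1.12", "10.0.1.11", 3343)`; those addresses are placeholders for your Mellanox IPs. A `False` result only means no echo came back within the timeout, so it narrows things down to the path or filtering rather than telling you the exact cause.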


The Intel NICs are X710s.

I'll try to get a screenshot when I'm next in the office, but it just listed every pair of Mellanox NICs with a port 3343 error.
There is no vSwitch on the Mellanox NICs, only on the Intels.
Ping works fine between nodes, and I can ping the IPs assigned to the Mellanox NICs.

Weird thing: I've since created the cluster, which worked great. With the cluster built, I reran the cluster validation tests, and they came back error-free?

It's only the validation report run before the cluster is created that fails.

The switch ports for the Mellanox NICs have VLAN tagging, but the VLAN ID isn't set on the adapters in Windows. They are set to VLAN 0; if I change that to match the VLAN tag of the switch port, it breaks comms.

I've tested live migration over the Mellanox NICs, which works fine, which makes me think the earlier error was a red herring...

NZ-BenThomas replied to GlenHarrison-3233:

Sorry I missed this reply last week!

So with the Intel X710 NICs, I've seen this issue a few times. Are you using untagged VLANs on the Intel NICs for management traffic? If you change to tagged VLANs for management as well, the error generally goes away.

As for the Mellanox NICs, in order for RoCE to work correctly, you need to configure VLANs, PFC, ETS and DCB correctly. If you can't set the NICs on the host to match the VLAN on the switch without breaking connectivity, then the switches aren't configured correctly. The switch port needs to be configured as a trunk port with the storage VLANs tagged, and the NICs on the host then need to be tagged (VLAN ID set) with the matching storage VLAN.

Live migration won't fail if RoCE is misconfigured, but the misconfiguration will increase CPU usage. It will also hurt the latency and storage performance of S2D, and can cause timeout issues that affect reliability during patching operations.


1 Answer

LimitlessTechnology-2700 answered

Hi there,

Some points to note here.

  • Patch the server with all Windows OS Updates and restart it.

  • Try temporarily disabling the antivirus on the servers, then run validation again.

Here are two threads that discuss the same issue; the troubleshooting steps in them may help you resolve it.

Cluster Network Validation - fail UDP port 3343
https://docs.microsoft.com/en-us/answers/questions/249241/cluster-network-validation-fail-udp-port-3343.html

S2D Cluster Validation Fails Firewall and UDP Port 3343
https://social.technet.microsoft.com/Forums/office/en-US/c3e15170-2a83-48a8-b671-efc2a9afe4cf/s2d-cluster-validation-fails-firewall-and-udp-port-3343?forum=winserverfiles



--If the reply is helpful, please Upvote and Accept it as an answer--