We are having an issue with our failover cluster.
We have 4 nodes, all connected to a storage array and all running Server 2016. Whenever we restart just 1 node, it starts to take down the entire cluster. The cluster crashes and can no longer connect to any node. The VMs are put into a failed state. We have to wait until the node has completely booted, then wait again until the cluster sees all the nodes again.
This happens when we restart any node, not one specific node. The same thing happens when a node crashes. I have tried migrating all VMs off the node before restarting it, but the exact same issue occurs. Once everything starts crashing, it can take hours before everything goes back to normal.
What could cause the cluster to crash when only 1 of 4 nodes restarts, and how can I fix this issue?