OptimusPrime-5538 asked jiayaozhu-MSFT answered

Physical host in Hyper-V cluster going down randomly

Hi,
I have two physical hosts, 01 and 02, configured as a Hyper-V cluster. They lose connectivity at random, roughly once a week; sometimes it is host 01 and sometimes host 02. All virtual machines on the affected host also stop working, and I cannot ping the host either. I have to access the machine manually and reboot it to get it working again.
I checked the event logs and found the error "Nodes are not consistently configured with IPv4 and/or IPv6 addresses on network adapters that are usable by the cluster". I found a suggested fix of disabling the Microsoft ISATAP adapters in Device Manager. After doing that, I ran cluster validation and it passed all network-related tests.
But the same issue came back after one week: host 02 lost connectivity. I checked the logs on Host-01 today and found this: "DCOM was unable to communicate with the computer clust-srv-02.matager.sa using any of the configured protocols; requested by PID 408 (C:\Windows\system32\mmc.exe)."

Does anyone have an idea what the issue could be?

windows-server-hyper-v

jiayaozhu-MSFT answered

Hi,

Thanks for your posting!

The error message "DCOM was unable to communicate with the computer" typically occurs when an application on one server tries to communicate with another server and fails because the remote server no longer exists or is unavailable.

To narrow this down, I need to know more about your network configuration:

1) "Nodes are not consistently configured with IPv4 and/or IPv6 addresses on network adapters that are usable by the cluster"
For this error message, please tell me more about your NIC, adapter and protocol configuration.

2) "DCOM was unable to communicate with the computer"
For this error message, I would like to know whether you performed any operation before the message appeared, such as running DCDiag.exe, ServerManager.exe, etc.

3) After you ran cluster validation and it passed all network-related tests, did the loss of connectivity occur only on Node2, or did it appear on both nodes as before?

4) Can you go to Task Manager\Processes, find the process with PID 408, and end it?

5) Please follow the guidance in this article to further troubleshoot your issue:
https://www.itexperience.net/event-10028-dcom-unable-communicate-computer/

(Please note: Information posted in the given link is hosted by a third party. Microsoft does not guarantee the accuracy and effectiveness of information.)
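
As a rough illustration of the "remote server is not there or unavailable" case, the sketch below checks two things from the surviving node: whether the other node's name still resolves, and whether anything answers on TCP port 135 (the RPC endpoint mapper that DCOM connections start from). The function name `check_node` and the timeout are assumptions made for this example; it is a minimal probe, not a replacement for cluster validation.

```python
import socket


def check_node(hostname, port=135, timeout=3.0):
    """Report whether hostname resolves and accepts a TCP connection on port."""
    result = {"host": hostname, "resolved": [], "tcp_open": False, "error": None}
    try:
        # getaddrinfo returns every address (IPv4 and IPv6) known for the name.
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"DNS lookup failed: {exc}"
        return result

    result["resolved"] = sorted({info[4][0] for info in infos})

    # Try each resolved address until one accepts the connection.
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            result["tcp_open"] = True
            result["error"] = None
            return result
        except OSError as exc:
            result["error"] = f"connect to {sockaddr[0]} failed: {exc}"
    return result
```

Calling `check_node("clust-srv-02.matager.sa")` from Host-01 while the other node is unresponsive would show whether the failure is at name resolution or at the network connection. A name that resolves but does not answer points toward the host or its network adapter rather than DNS.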

Thanks for your support!

BR,
Joan


If the Answer is helpful, please click "Accept Answer" and upvote it.

Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


OptimusPrime-5538 answered

Hi,
Thanks for your answer. Yes, the DCOM error may have appeared because one host lost connectivity.

1- An external virtual switch is used as the NIC, and only IPv4 is configured (screenshot attached).
2- I disabled the ISATAP adapters on both hosts, but the same problem occurred again afterwards, and I found this DCOM error (screenshot attached).
3- I ran the validation tests on both hosts of the cluster and the result was a pass; this time the loss of connectivity happened to Host-1.
4- No process with PID 408 was found in Task Manager.
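
One possible reason the PID lookup came up empty can be sketched in code. The example below pulls the target computer, PID, and process path out of the event text quoted in this thread. The regular expression is modeled only on that one message, and `parse_dcom_event` is a hypothetical helper name. The hexadecimal reading of the PID is an assumption: this event is often reported with a hex PID, in which case 408 would appear as 1032 in Task Manager. The requesting process (here mmc.exe) may also simply have exited by the time anyone looks.

```python
import re

# Pattern modeled on the single event message quoted in this thread;
# other Windows versions may word the message differently.
DCOM_EVENT = re.compile(
    r"DCOM was unable to communicate with the computer (?P<computer>\S+) "
    r"using any of the configured protocols; "
    r"requested by PID\s+(?P<pid>[0-9A-Fa-f]+)\s*\((?P<process>[^)]+)\)"
)


def parse_dcom_event(message):
    """Extract computer, PID and process path from the event text, or None."""
    match = DCOM_EVENT.search(message)
    if match is None:
        return None
    pid_text = match.group("pid")
    return {
        "computer": match.group("computer"),
        "process": match.group("process"),
        "pid_text": pid_text,
        # Assumption: the PID may be hexadecimal, so keep both readings.
        "pid_if_hex": int(pid_text, 16),
        "pid_if_decimal": int(pid_text) if pid_text.isdigit() else None,
    }
```

Checking Task Manager for both candidate PIDs, and noting which console (an MMC snap-in in this case) was open on Host-01 at the time of the event, helps confirm that the message is a symptom of the other node being down rather than a separate fault.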
Attachments: cluster-01-ip.jpg (124.8 KiB), cluster-01-isatap.jpg, cluster-02-ip.jpg (118.5 KiB), cluster-02-isatap.jpg

jiayaozhu-MSFT answered

Hi,

Thanks for your patience!

1) Have you tried the steps in the article I sent you? If so, what were the results?

2) Can you create a test machine (a VM is fine), add it to the cluster, and remove Node1 from the cluster at the same time, to see whether this DCOM error persists in your cluster?

3) I found an article to help you resolve this DCOM issue:

https://support.solarwinds.com/SuccessCenter/s/article/Event-10028-DCOM-was-unable-to-communicate-with-the-computer?language=en_US

Please note: Information posted in the given link is hosted by a third party. Microsoft does not guarantee the accuracy and effectiveness of information.

4) I found an issue similar to yours:

https://docs.microsoft.com/en-us/answers/questions/52702/dcom-was-unable-to-communicate-with-the-computer-x.html

Based on this similar issue, I would like to check whether you can see Node1 in Server Manager. It may be that Node1 no longer exists on your network while its DNS record still does. So try cleaning up the DNS record, then remove Node1 from the cluster and re-add it, and see what happens.
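
Before cleaning up DNS, it may help to confirm that a stale record actually exists. The sketch below compares the IPv4 addresses that name resolution returns for a node with the addresses the node is known to have (for example, from the adapter settings in the attached screenshots). The function name `stale_addresses`, the host name, and the IP values in the usage note are hypothetical placeholders. The lookup goes through the local resolver, so cached entries and hosts-file entries can affect the result.

```python
import socket


def stale_addresses(hostname, expected_ips, resolver=socket.gethostbyname_ex):
    """Return resolved IPv4 addresses for hostname that are not in expected_ips."""
    try:
        # gethostbyname_ex returns (canonical_name, alias_list, ipv4_address_list).
        _name, _aliases, addresses = resolver(hostname)
    except (socket.gaierror, socket.herror):
        # The name does not resolve at all, so there is no stale record to report.
        return []
    expected = set(expected_ips)
    return sorted(ip for ip in addresses if ip not in expected)
```

For instance, `stale_addresses("node1.example.test", ["10.0.0.11"])` would return any extra address still registered for that name. An empty list suggests DNS matches the node's real configuration, in which case the weekly outage is more likely a host or network adapter problem than a DNS one.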

Thanks for your support!

BR,
Joan

