Some or all Cluster resources are not discovered by the OpsMgr agent

RANJAN PANDEY 96 Reputation points
2021-10-22T19:12:03.857+00:00

We faced a discovery issue with SQL instances hosted on a 3-node cluster: all of the SQL instances disappeared from SCOM and are no longer being discovered.

Tags: Operations Manager, SQL Server, Windows Server Clustering, Transact-SQL

Accepted answer
  1. RANJAN PANDEY 96 Reputation points
    2021-10-22T19:19:37.78+00:00

     We ran the PowerShell cmdlet below and found two resources that are offline in the cluster database but not present in the Failover Cluster Manager (FCM) console.
    Get-ClusterResource | where {$_.state -eq "Offline"}

     We discussed this with the SQL DBA, who confirmed that these resources were removed as part of a maintenance activity last weekend.
    Recommendation:

     If you decommission a cluster node in the future, remove it from SCOM first and then decommission it. It is best practice to always remove the SCOM agent before decommissioning any server; otherwise, stale entries remain in SCOM and can cause various problems and inconsistencies.
     The orphaned resource issue is an edge case, and you might not run into it again. However, to be sure you do not hit this issue, make sure that after any maintenance in the cluster the output of the PowerShell cmdlet below is empty.
    Get-ClusterResource | where {$_.state -eq "Offline"}
    ** Bear in mind that some cluster resources can legitimately be offline, and that is fine. But if a resource shows as offline in PowerShell and the same resource is not present in Failover Cluster Manager, it is most likely orphaned. The cluster admin can confirm that.
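
As a minimal sketch of the post-maintenance check described above, the snippet below lists offline resources together with their owner group and resource type, which makes it easier to compare each entry against what Failover Cluster Manager shows. It assumes the FailoverClusters module is available and that it is run on a cluster node; the selected columns are only a suggestion, and the comparison against FCM still has to be done manually.

```powershell
# Sketch only: run on a cluster node with the FailoverClusters module installed.
Import-Module FailoverClusters

# Collect every resource the cluster database reports as offline.
$offline = Get-ClusterResource | Where-Object { $_.State -eq 'Offline' }

if (-not $offline) {
    Write-Output 'No offline cluster resources found.'
}
else {
    # Show enough detail to look each resource up in Failover Cluster Manager.
    # A resource listed here but missing from FCM is a candidate orphan;
    # confirm with the cluster admin before removing anything.
    $offline | Select-Object Name, State, OwnerGroup, ResourceType |
        Format-Table -AutoSize
}
```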

     We removed both orphaned resources using the PowerShell cmdlets below.

    Remove-ClusterResource -Name "File Server (\XXXXXXX)"
    Remove-ClusterResource -Name "SQL Network Name (\YYYYYY)"
     We then restarted the Health Service on all 3 nodes. After that, all the cluster objects were discovered in SCOM, including the missing SQL instances.
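
One possible way to script the Health Service restart across the nodes is sketched below. It assumes PowerShell remoting is enabled on every node, that the account has administrative rights, and that the agent service uses the name HealthService; none of this is stated in the original steps, so treat it as an illustration rather than the exact procedure that was followed.

```powershell
# Sketch only: assumes PowerShell remoting is enabled on all cluster nodes.
Import-Module FailoverClusters

# Get the names of all nodes in the local cluster.
$nodes = (Get-ClusterNode).Name

# Restart the SCOM agent (Health Service) on each node.
# 'HealthService' is assumed to be the agent's service name.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Restart-Service -Name HealthService
}
```

Discovery runs on its own schedule after the agent restarts, so the cluster objects may take some time to reappear in SCOM.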

     This is by-design behavior of the SCOM cluster discovery: whenever it detects orphaned resources in the cluster, it removes the cluster from SCOM. This is documented in the article below.
    Support Tip: Some or all Cluster resources are not discovered by the OpsMgr agent - Microsoft Tech Community
    https://techcommunity.microsoft.com/t5/system-center-blog/support-tip-some-or-all-cluster-resources-are-not-discovered-by/ba-p/349391
