Discovery never stops… and you've tried everything!
Have you ever been in a situation where nothing you did helped, and every blog post, forum entry and search result offered advice you had already tried three times or more? It happens, and then you have to fight on your own (or with a little help from Support Escalation Engineers).
Whenever you see the blue bar in the discovery wizard sliding from left to right for ages, your mind replays every blog post you've read and reminds you of three things to check before going wild. The first is Microsoft SQL Server Service Broker (which has been discussed a thousand times). In Operations Manager 2012, Service Broker is enabled on the OperationsManager database by default, which eases the pain (http://support.microsoft.com/kb/941409).
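If you want to confirm the Service Broker state rather than trust the default, a minimal sketch could look like the one below. It assumes you already have a DB-API style cursor connected to the SQL Server instance (how you obtain it is up to you; the driver is not part of this post), and the function name broker_enabled is made up for illustration. The only SQL Server specifics it relies on are the sys.databases catalog view and its is_broker_enabled column.

```python
# Sketch only: 'cursor' is assumed to be any DB-API style cursor that is
# already connected to the SQL Server instance hosting OperationsManager.
BROKER_QUERY = (
    "SELECT is_broker_enabled FROM sys.databases "
    "WHERE name = 'OperationsManager'"
)


def broker_enabled(cursor):
    """Return True/False for the broker flag, or None if the database is not found."""
    cursor.execute(BROKER_QUERY)
    row = cursor.fetchone()
    if row is None:
        # No row means no database with that name on this instance.
        return None
    # is_broker_enabled is a bit column: 1 = enabled, 0 = disabled.
    return bool(row[0])
```

A result of False points back to the KB article above; None suggests you are connected to the wrong instance or the database has a different name.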
The second is the good old Health Service State folder, which can become stale after many operations and then prevents results from showing up in the discovery pane. The third: you didn't specify the SPNs for the SDK Service correctly. A good write-up and a nice bunch of links can be found in Marnix Wolf’s post from two years ago (http://thoughtsonopsmgr.blogspot.gr/2010/08/discovery-wizard-is-running-for-ever.html). But what if there's no luck? What if all of those options fail?
If everything works fine except one thing and you don't know why, it has to be a permission, as the old legends say. But which one, why, and when? Let's see exactly how the discovery process works (from http://blogs.technet.com/b/momteam/archive/2007/12/10/how-does-computer-discovery-work-in-opsmgr-2007.aspx)
I was discovering servers only, and the option to check whether they are accessible is enabled by default, so in the Alert View I saw multiple entries about no connection to a remote host. That means steps 8 to 10 went through, but returning the results never finished. Since step 11 didn't show proper results, and permissions have nothing to do with the discovery engine returning results, something had to go wrong before step 8.
One step that is not documented in that list is SQL Server checking whether the SDK account is in the proper group to use Service Broker on the OperationsManager database. This happens somewhere between steps 5 and 7.
What was the turning point? It was the Application log on the SQL Server machine, which showed this entry:
Log Name: Application
Date: 10/24/2012 4:55:48 PM
Event ID: 28005
Task Category: Server
An exception occurred while enqueueing a message in the target queue. Error: 15404, State: 19. Could not obtain information about Windows NT group/user 'domain\OmSDK', error code 0x5.
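The interesting parts of that message are the SQL Server error number, the account name and the trailing Windows error code. Here is a small sketch that pulls them out of the text; the regular expression and the parse_event name are my own, written against this one message, so treat them as an illustration rather than a general parser. Windows error code 0x5 is ERROR_ACCESS_DENIED, which is what points at a permission problem.

```python
import re

# Only the code seen in this post is mapped; extend as needed.
WIN32_ERRORS = {0x5: "ERROR_ACCESS_DENIED"}

# Pattern written against the event 28005 text shown above (an assumption,
# not an official message format).
PATTERN = re.compile(
    r"Error: (?P<error>\d+), State: (?P<state>\d+)\..*?"
    r"group/user '(?P<account>[^']+)', error code (?P<code>0x[0-9a-fA-F]+)"
)


def parse_event(message):
    """Extract error number, state, account and Win32 code, or None if no match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    code = int(match.group("code"), 16)
    return {
        "sql_error": int(match.group("error")),
        "state": int(match.group("state")),
        "account": match.group("account"),
        "win32_code": code,
        "win32_name": WIN32_ERRORS.get(code, "unknown"),
    }


MESSAGE = (
    r"An exception occurred while enqueueing a message in the target queue. "
    r"Error: 15404, State: 19. Could not obtain information about "
    r"Windows NT group/user 'domain\OmSDK', error code 0x5."
)

info = parse_event(MESSAGE)
```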
So it turns out that the SQL Server service account must be able to read the attributes of the OMSDK service account. Normally you should see permissions like this:
Real life showed something different. The new OU for the SCOM accounts didn't have the Authenticated Users entry; it was missing entirely. I didn't want to break anything, since I couldn't tell whether that was on purpose or not, so I left it alone. Granting the Read right to the SQL Server service account resolved the hanging discovery process.
I hope you can add this to the list of possible solutions for the hanging discovery problem.