Happens all the time, and has happened across several versions of SCOM and several versions of Windows. An agent-managed server goes from 15% CPU to 100% CPU in about 3 seconds, which is super unhealthy. But SCOM never notices, because the server is too busy to let the agent tell SCOM about it, I guess? So the monitor never changes state, no alerts are generated, and no recoveries are started. Has anybody else seen this, and how do you handle it?
Here's a box that was at 100% CPU for over two days straight. No alert. SCOM didn't start seeing CPU readings again until somebody manually went in and restarted a runaway service this morning.
