We are having issues with our SQL Server failover cluster instance. Here is what is happening.
When the cluster fails over (manually or automatically) and ownership moves to the secondary node, I see the SQL Server service in the "Running" state on the secondary node, while SQL Agent stays in the stopped state. At the same time, the SQL Server resource in WSFC shows as "online pending" on the second node and stays in that state for a long time. Then it fails, and all resources fail back to the primary node.
After failing back to the primary node, WSFC shows the SQL Server resource as "online pending". When I go to Services, I see the SQL Server service in the running state, but SQL Agent stays in the stopped state. The resource stays in "online pending" for a long time, just as it does in the first failover, and then the SQL Server resource in WSFC goes into the "failed" state.
What I do next, on the primary node, is go to Services and start the SQL Server service manually. The service comes back up in the running state, and then I try to bring the SQL Server resource in WSFC online by right-clicking on it. It shows "online pending" for some time and then fails again.
Next, I go to Services again and start both the SQL Server and SQL Agent services, one after the other. They start successfully and go into the "running" state. Then I try to bring the SQL Server cluster resource online, and it comes online along with the SQL Agent cluster resource.
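For reference, the manual sequence that works can be written out roughly as below. This is only a sketch of the steps I click through, not a fix. The service names `MSSQLSERVER` and `SQLSERVERAGENT` assume a default instance, the cluster resource name `SQL Server` is an assumption (a named instance would differ), and the helper function names are made up for illustration. `Start-Service` and `Start-ClusterResource` are the standard PowerShell cmdlets, the latter from the FailoverClusters module.

```python
import subprocess


def build_workaround_commands(sql_service="MSSQLSERVER",
                              agent_service="SQLSERVERAGENT",
                              cluster_resource="SQL Server"):
    """Return the manual recovery steps, in order, as argv lists.

    All three names are assumptions for a default instance; adjust them
    to match the actual service and cluster resource names.
    """
    def ps(command):
        # Wrap a single PowerShell command so it can be passed to subprocess.
        return ["powershell", "-NoProfile", "-Command", command]

    return [
        ps(f"Start-Service -Name '{sql_service}'"),              # 1. SQL Server service
        ps(f"Start-Service -Name '{agent_service}'"),            # 2. SQL Agent service
        ps(f"Start-ClusterResource -Name '{cluster_resource}'"), # 3. WSFC resource
    ]


def run_workaround(commands, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, check=True)
```

Calling `run_workaround(build_workaround_commands())` only prints the three commands. Passing `dry_run=False` on the owning node would actually run them, in the same order as the manual steps above.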
Can someone tell me what is going on? Why is WSFC not able to bring the SQL resources online automatically, and why is the only way to bring the SQL Server cluster resource online to manually start both the SQL Server and SQL Server Agent services and then manually bring the cluster resource online?