Getting Kerberos error (4771) for Failover Cluster (Windows Server 2019)

Marcos dos Santos de Oliveira 6 Reputation points
2021-08-16T13:47:21.183+00:00

I'm trying to debug some errors (1260) we are getting in Failover Cluster Manager.
123579-1260.png
I found it very strange, as the cluster is able to create and modify DNS records successfully. Both the cluster and the nodes' computer accounts have permissions to edit the records. So I searched the DNS (AD) servers' logs for correlated events. In the DNS audit log, I could see that the records were indeed updated successfully:
123626-dns-update-success.png
But in the AD security event logs, the following events are being logged:
123609-4771.png

I checked the AD objects and both the nodes and the cluster (CNO) itself have full permissions on the CNO.
I have run out of ideas. The cluster was created by a colleague who has since left the company, so I don't know if the objects were created manually or automatically by Failover Cluster Manager.

Does anyone know of anything I could have missed?

SQL Server
Windows Server Clustering

2 answers

  1. Seeya Xi-MSFT 16,436 Reputation points
    2021-08-17T05:51:09.41+00:00

    Hi @Marcos dos Santos de Oliveira ,

    Here is an introduction to error 4771.
    The event includes a failure code, so please take a closer look at the description of the event fields.
    Yours is probably 0x18 (I could not see it in the screenshot). The usual reason for this code is that the wrong password was provided.
    The source can be something as simple as a mapped drive, or a cached password in a scheduled task or service.
    Check the account status in AD and enter the correct account password and try again.
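
To confirm which failure code the events actually carry, the text of an exported 4771 event can be scanned for its "Failure Code" field. The following is a minimal Python sketch, not part of the original answer: the function name `describe_failure_code` and the sample event text are hypothetical, and the table covers only a small subset of Kerberos error codes (names as defined in RFC 4120).

```python
import re

# Small subset of Kerberos failure codes that can appear in event 4771.
# Names follow RFC 4120; the descriptions are informal summaries.
FAILURE_CODES = {
    0x06: "KDC_ERR_C_PRINCIPAL_UNKNOWN: client not found in the Kerberos database",
    0x12: "KDC_ERR_CLIENT_REVOKED: client credentials have been revoked",
    0x17: "KDC_ERR_KEY_EXPIRED: password has expired",
    0x18: "KDC_ERR_PREAUTH_FAILED: pre-authentication failed, usually a wrong password",
    0x25: "KRB_AP_ERR_SKEW: clock skew too great",
}


def describe_failure_code(event_text):
    """Find 'Failure Code: 0x..' in event text and return a description.

    Returns None when the text contains no failure code field.
    """
    match = re.search(r"Failure Code:\s*(0x[0-9A-Fa-f]+)", event_text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    return FAILURE_CODES.get(code, f"unlisted failure code {hex(code)}")


# Hypothetical excerpt of an exported event message, for illustration only.
sample = "Additional Information:\n\tFailure Code:\t\t0x18\n\tPre-Authentication Type:\t2"
print(describe_failure_code(sample))
```

If the code turns out to be something other than 0x18, the wrong-password explanation above may not apply.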

    Best regards,
    Seeya




  2. Chris Kolson 1 Reputation point
    2021-11-09T16:04:14.917+00:00

    We have had this same issue, and I battled it for a month, checking literally every nook and cranny relating to security, authentication, and DNS. I finally found the root cause, and I wanted to get this out there since our environment is identical with regard to using gMSA on 2012 R2 for AD.

    TL;DR: in my case, it is KB related. KB5006672, KB5005568, KB5005030, and KB5004244 all cause this. It is also really easy to test for anyone who wants to confirm:
    In Failover Cluster Manager, double-click your cluster under Cluster Core Resources to check the DNS status:
    147886-clusterdns.png

    This will be your confirmation check after uninstalling each KB.
    Uninstall any of the KBs mentioned above. The problem appears to have started in July 2021 at the earliest. Preview cumulative updates can also potentially carry this problem.

    After the uninstall, if DNS shows OK, that was the root cause. Please note that since these are cumulative security updates, later monthly updates can potentially cause the problem as well.
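
Before uninstalling anything, it can help to see which of the KBs named above are actually present on a node. Below is a minimal Python sketch under stated assumptions: it expects a plain-text listing of installed updates (for example, saved output of PowerShell's `Get-HotFix`), and the function name `find_suspect_kbs` and the sample listing are hypothetical. It only matches KB numbers in text and does not query Windows itself.

```python
import re

# KB numbers named in this answer as triggering the problem.
SUSPECT_KBS = {"KB5006672", "KB5005568", "KB5005030", "KB5004244"}


def find_suspect_kbs(hotfix_listing):
    """Return the suspect KBs found in a text listing of installed updates."""
    found = re.findall(r"KB\d+", hotfix_listing, flags=re.IGNORECASE)
    installed = {kb.upper() for kb in found}
    return sorted(installed & SUSPECT_KBS)


# Hypothetical listing, for illustration only.
listing = """
Update           KB5005112
Security Update  KB5006672
Update           kb5005568
"""
print(find_suspect_kbs(listing))
```

An empty result only means none of these four KBs were listed. As noted above, later cumulative updates may carry the same change.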

    147825-image-2021-11-02-13-13-03-867.png

    I also tested this on a sandbox AD DS .local domain running on Server 2019 with AD fully up to date. The thinking was that, since our environment is security heavy, a GPO could have been causing the issue. We confirmed that is not the case: using an out-of-the-box domain without any GPOs whatsoever, we still reproduced the missing DNS credentials problem with the KBs listed above.

    Hope this helps.

    PS: if this does end up fixing anyone's problem (as of November 9th, 2021), please open a ticket with Microsoft and share your findings. The more light people can shed, the better.

    Edit: our environment was AD on 2012 R2, and the cluster was on Server 2019 using group managed service accounts.
    I also tested on an "out-of-the-box" test AD environment with only default settings on Server 2019 and reproduced the issue, confirming it was Windows Update related.