Hello,
We are experiencing intermittent authentication failures between application servers and SQL Server instances that are members of an Azure AD Domain Services (AAD DS) managed domain.
Environment:
- Application Server - Azure VM running Windows Server 2019 with IIS and a .NET application. Connects to SQL Server using Integrated Security.
- SQL Server - Azure VM running Windows Server 2019 and SQL Server 2019. Configured for Windows authentication.
- All application servers and SQL servers are members of the same managed domain (AAD DS).
- Service accounts for the applications and SQL servers are created in AAD and synced to the AAD DS domain.
Issue Description:
All application servers lose their connection to the SQL Server simultaneously, with the same messages:
System.Data.Entity.Core.EntityException: The underlying provider failed on Open. ---> System.Data.SqlClient.SqlException: Login failed. The login is from an untrusted domain and cannot be used with Integrated authentication.
System.Data.SqlClient.SqlException: System.Data.SqlClient.SqlException: Connection Timeout Expired. The timeout period elapsed during the post-login phase. The connection could have timed out while waiting for server to complete the login process and respond
At the same time, we see the following errors on the SQL Server:
Login failed. The login is from an untrusted domain and cannot be used with Integrated authentication.
SSPI handshake failed with error code 0x80090304, state 14 while establishing a connection with integrated security; the connection has been closed. Reason: AcceptSecurityContext failed. The operating system error code indicates the cause of failure. The Local Security Authority cannot be contacted
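To collect these entries for comparison, something like the following could pull the error code and state out of the SSPI message. This is only a minimal Python sketch: the regular expression assumes the exact wording quoted above, and the function name `parse_sspi_failure` is made up for illustration.

```python
import re

# Matches the SQL Server error-log wording quoted above (assumed to be stable).
SSPI_PATTERN = re.compile(
    r"SSPI handshake failed with error code (0x[0-9A-Fa-f]+), state (\d+)"
)


def parse_sspi_failure(message):
    """Return (error_code, state) as integers, or None if the line does not match."""
    match = SSPI_PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2))


# The message text is copied from the SQL Server log entry above.
sample = (
    "SSPI handshake failed with error code 0x80090304, state 14 while "
    "establishing a connection with integrated security; "
    "the connection has been closed."
)

code, state = parse_sspi_failure(sample)
print(hex(code), state)
```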
Each occurrence lasts about 10-30 seconds, after which everything works as usual.
Neither the SQL servers nor the application servers show any performance problems while it is happening.
It looks like such messages can be logged when a domain controller is unavailable, so the cause is probably related to DC reboots and/or patching, or to host maintenance. Here is a similar issue related to domain controller unavailability with the same messages logged: https://www.sqlservercentral.com/forums/topic/sspi-handshake-failed-error-when-one-of-several-domain-controllers-is-restarted
How can we check the AAD DS or host server logs to see whether their timestamps match the connection issues? Or could this be related to the AAD to AAD DS sync process?
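If we can get timestamps for DC reboots or maintenance, the comparison we have in mind is roughly the one sketched below in Python. Everything here is an assumption for illustration: the function name, the 60-second tolerance, and the timestamps are placeholders, not values taken from our logs.

```python
from datetime import datetime, timedelta


def match_failures_to_events(failure_times, event_times, tolerance=timedelta(seconds=60)):
    """Map each failure timestamp to the events that fall within `tolerance` of it."""
    matches = {}
    for failure in failure_times:
        matches[failure] = [
            event for event in event_times if abs(failure - event) <= tolerance
        ]
    return matches


# Placeholder timestamps only; real values would come from the SQL Server
# error log and from whatever DC or host maintenance log is available.
failures = [
    datetime(2024, 1, 10, 2, 15, 5),
    datetime(2024, 1, 12, 14, 0, 40),
]
dc_events = [datetime(2024, 1, 10, 2, 14, 50)]

result = match_failures_to_events(failures, dc_events)
for failure, nearby in result.items():
    print(failure.isoformat(), "->", [event.isoformat() for event in nearby])
```

A failure with an empty list has no known DC or host event nearby, which would point away from the maintenance theory for that occurrence.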