question

Jean-MichelJacques asked tbgangav-MSFT answered

Azure Update Management fails for servers behind Log Analytics Gateway

Hello,
For the past few days we have been unable to use Update Management on our on-premises servers that connect to Azure through a Log Analytics gateway. The Update Agent readiness shows "Disconnected" for those servers, and every update deployment fails with a message that the hybrid worker is not available.
When checking the Operations Manager event log on those servers, we regularly find this error:

 A module of type "Microsoft.EnterpriseManagement.HealthService.AzureAutomation.HybridAgent" reported an exception Microsoft.EnterpriseManagement.HealthService.ModuleException: Unable to Register Machine for Patch Management, Registration Failed with Exception System.InvalidOperationException: System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. ---> System.IO.IOException: Authentication failed because the remote party has closed the transport stream.
    at System.Net.TlsStream.EndWrite(IAsyncResult asyncResult)
    at System.Net.ConnectStream.WriteHeadersCallback(IAsyncResult ar)
    --- End of inner exception stack trace ---
    at System.Net.HttpWebRequest.EndGetRequestStream(IAsyncResult asyncResult, TransportContext& context)
    at System.Net.Http.HttpClientHandler.GetRequestStreamCallback(IAsyncResult ar)
    --- End of inner exception stack trace ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at AgentService.HybridRegistration.PowerShell.WebClient.AgentServiceClient.<Put>d__30`2.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at AgentService.HybridRegistration.PowerShell.WebClient.AgentServiceClient.<RegisterV2>d__18.MoveNext()
 --- End of stack trace from previous location where exception was thrown ---
    at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
    at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
    at AgentService.OmsHybridRegistration.PowerShell.Commandlets.OmsHybridRunbookWorker.<RegisterWithService>d__62.MoveNext()
    
    at AgentService.OmsHybridRegistration.PowerShell.Commandlets.OmsHybridRunbookWorker.ExtractErrorMessageAndThrow(AggregateException exception)
    at AgentService.OmsHybridRegistration.PowerShell.Commandlets.OmsHybridRunbookWorker.RegisterGroupInDatabase()
    at AgentService.OmsHybridRegistration.PowerShell.Commandlets.OmsHybridRunbookWorker.Register() AgentServiceURI: https://cbff4b08-5c3a-4d34-8edf-7e75c597ca19.agentsvc.azure-automation.net/accounts/cbff4b08-5c3a-4d34-8edf-7e75c597ca19 which was running as part of rule "Microsoft.IntelligencePacks.AzureAutomation.HybridAgent.Init" running for instance "" with id:"{42DF9A9B-7F94-FCF8-BD0E-8BA9462687B1}" in management group "AOI-cbff4b08-5c3a-4d34-8edf-7e75c597ca19".

When checking the OMS Gateway log on the server acting as gateway, we find this event occurring constantly (event ID 406):

 2020-10-26 13:11:00 [10] ERROR TcpConnection - Server certificate chain does not include a trusted root certificate. Cert count in chain: 3. Root cert: CN=DigiCert Global Root G2, OU=www.digicert.com, O=DigiCert Inc, C=US
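
For anyone triaging the same gateway error, here is a minimal Python sketch that counts which root certificates the gateway log reports as untrusted and checks whether a given root is present in a trust store. The log format is inferred from the single line quoted above, and the function names (`untrusted_roots`, `common_name`, `store_has_root`) are made up for illustration; none of this is part of the gateway or the agent.

```python
import re
from collections import Counter

# Pattern matched against the gateway log line quoted above. The format is
# inferred from that one sample, so treat it as an assumption.
UNTRUSTED_ROOT = re.compile(
    r"Server certificate chain does not include a trusted root certificate\."
    r"\s*Cert count in chain: (?P<count>\d+)\.\s*Root cert: (?P<subject>.+)$"
)


def untrusted_roots(lines):
    """Count how often each root certificate subject is reported as untrusted."""
    counts = Counter()
    for line in lines:
        match = UNTRUSTED_ROOT.search(line.strip())
        if match:
            counts[match.group("subject")] += 1
    return counts


def common_name(subject):
    """Extract the CN value from a subject string such as 'CN=..., OU=..., O=...'."""
    match = re.search(r"CN=([^,]+)", subject)
    return match.group(1).strip() if match else None


def store_has_root(ca_certs, name):
    """Return True if any certificate in ca_certs has the given common name.

    ca_certs is a list of dicts in the shape returned by
    ssl.SSLContext.get_ca_certs().
    """
    for cert in ca_certs:
        for rdn in cert.get("subject", ()):
            for key, value in rdn:
                if key == "commonName" and value == name:
                    return True
    return False
```

A possible use on the gateway server is `store_has_root(ssl.create_default_context().get_ca_certs(), "DigiCert Global Root G2")` after `import ssl`. On Windows, Python's default context loads the system certificate stores, but the gateway is a separate service and may evaluate trust differently, so a result here is only a hint.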

The client servers and the gateway are in the same on-premises AD domain, so no certificate is defined for communication between the Microsoft Monitoring Agents and the Log Analytics gateway.
The client servers run Windows Server 2012, 2012 R2, 2016, and 2019; all of them have the problem.
We tried enabling TLS 1.2 on a few clients, as recommended in
https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/azure-monitor/platform/agent-windows.md
but this did not solve the problem.
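
To separate a TLS version problem from a certificate trust problem, a rough Python sketch like the one below attempts a verified handshake that refuses anything older than TLS 1.2 and reports either the negotiated protocol and issuer or the error text. The helper names (`tls12_context`, `issuer_common_name`, `probe`) are hypothetical. It uses Python's own TLS stack rather than the agent's, and it connects directly instead of going through the gateway, so it would need to run on a machine with direct outbound access, such as the gateway server.

```python
import socket
import ssl


def tls12_context():
    """Build a client context that refuses anything older than TLS 1.2."""
    context = ssl.create_default_context()  # verifies the chain and the hostname
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def issuer_common_name(peer_cert):
    """Return the issuer CN from the dict produced by SSLSocket.getpeercert()."""
    for rdn in peer_cert.get("issuer", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


def probe(host, port=443, timeout=10.0):
    """Attempt a verified TLS handshake and report the outcome.

    Returns (ok, detail): detail is 'protocol / issuer CN' on success,
    or the error text on failure.
    """
    context = tls12_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with context.wrap_socket(raw, server_hostname=host) as tls:
                issuer = issuer_common_name(tls.getpeercert())
                return True, f"{tls.version()} / {issuer}"
    except (ssl.SSLError, OSError) as exc:
        return False, str(exc)
```

Calling `probe(...)` with the host name from the AgentServiceURI in the event log error would show whether the handshake fails on certificate verification or earlier. A success here does not prove the agent itself is configured for TLS 1.2.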
Communication and updates work for the few servers that connect to Azure directly without using the gateway (including the server acting as the gateway itself).
The problem began 3 days ago, after Update Management had been used successfully on those servers for a long time.
Can someone help me identify the problem? Thank you!

azure-monitor azure-automation

Hi @Jean-MichelJacques, thank you for reaching out. I am checking with our internal team to understand whether there is a current limitation on this feature, and will get back to you as soon as possible.


Hi @Jean-MichelJacques, if you are still facing the issue, can you please provide your region details so that we can investigate further?


Hello, the problem has been solved since yesterday.
However, we took no action on our side to solve it.
Do you know if something could have changed? My region is West Europe.
Thank you!


1 Answer

tbgangav-MSFT answered

Hi @Jean-MichelJacques, it was determined that a recent SSL certificate update to Automation caused connection issues to the Automation endpoints. To mitigate the impact, the certificates have been rolled back in all regions.
