Tomwuyts-9002 asked Tomwuyts-9002 commented

SQL Connection - error 35

Hi,

Since at least last Monday (22/3), I've been getting the error below from a background task that runs in AKS on Linux. When I run the same task from Visual Studio (on Windows), the error does not occur.

The error occurs between 1 and 10 times per minute across the 30-50 pods running the task. It still happens when I scale down.

The full error:

 Microsoft.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 35 - An internal exception was caught)  ---> System.ObjectDisposedException: Cannot access a disposed object. Object name: 'System.Net.Sockets.Socket'.    
    
 at System.Net.Sockets.Socket.SetSocketOption(SocketOptionLevel optionLevel, SocketOptionName optionName, Int32 optionValue)   
 at System.Net.Sockets.Socket.set_NoDelay(Boolean value)    
 at Microsoft.Data.SqlClient.SNI.SNITCPHandle..ctor(String serverName, Int32 port, Int64 timerExpire, Object callbackObject, Boolean parallel)    
 at Calidos.Artemis.Services.RelinkerADTService.Relink(String hisPatientUid) in /src/Models/Calidos.Artemis.Services/MessageProcesses/RelinkerADTService.cs:line 84    
 at Calidos.Artemis.QueueProcessor.RelinkADTAction.Run(RelinkMsg msg, CloudQueue theQueue) in /src/MessageProcessor/Calidos.Artemis.QueueProcessor/Actions/RelinkADTAction.cs:line 27    
 at Calidos.Artemis.QueueProcessor.QueueWorker.ProcessMessage_Relinker(CloudQueue theQueue, Boolean msgOk, String[] msg, Object msgType) in /src/MessageProcessor/Calidos.Artemis.QueueProcessor/QueueWorker.cs:line 272 ClientConnectionId:de553492-0d5d-4937-957c-2a76dbb3e8ee Routing Destination:d31d9ada42e9.tr2384.westeurope1-a.worker.database.windows.net,11047   
 at Calidos.Artemis.Services.RelinkerADTService.Relink(String hisPatientUid) in /src/Models/Calidos.Artemis.Services/MessageProcesses/RelinkerADTService.cs:line 84    
 at Calidos.Artemis.QueueProcessor.RelinkADTAction.Run(RelinkMsg msg, CloudQueue theQueue) in /src/MessageProcessor/Calidos.Artemis.QueueProcessor/Actions/RelinkADTAction.cs:line 27    
 at Calidos.Artemis.QueueProcessor.QueueWorker.ProcessMessage_Relinker(CloudQueue theQueue, Boolean msgOk, String[] msg, Object msgType) in /src/MessageProcessor/Calidos.Artemis.QueueProcessor/QueueWorker.cs:line 272
    
 System.ObjectDisposedException: Cannot access a disposed object. Object name: 'System.Net.Sockets.Socket'.    
 at System.Net.Sockets.Socket.SetSocketOption(SocketOptionLevel optionLevel, SocketOptionName optionName, Int32 optionValue)    
 at System.Net.Sockets.Socket.set_NoDelay(Boolean value)    
 at Microsoft.Data.SqlClient.SNI.SNITCPHandle..ctor(String serverName, Int32 port, Int64 timerExpire, Object callbackObject, Boolean parallel)   
 at System.Net.Sockets.Socket.SetSocketOption(SocketOptionLevel optionLevel, SocketOptionName optionName, Int32 optionValue)    
 at System.Net.Sockets.Socket.set_NoDelay(Boolean value)    
 at Microsoft.Data.SqlClient.SNI.SNITCPHandle..ctor(String serverName, Int32 port, Int64 timerExpire, Object callbackObject, Boolean parallel) 


Occasionally, I also see this error 40:

 Microsoft.Data.SqlClient.SqlException (0x80131904): A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 40 - Could not open a connection to SQL Server) 


In https://github.com/dotnet/SqlClient/issues/449, I was advised to contact Azure support, and they forwarded me here.

Thanks in advance,
Tom Wuyts

azure-kubernetes-service

Tomwuyts-9002 answered

I've found the cause. The database connection string used by our AKS pods had Pooling set to "False", which made the task open a very large number of connections (500-1000 per minute across 30 pods). Setting it to "True" reduced the number of connections to 30-40, and the error no longer appears!
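
For anyone landing here with the same error, below is a minimal sketch of what the fixed setting might look like when the connection string is built in code with Microsoft.Data.SqlClient (it needs the Microsoft.Data.SqlClient NuGet package). The server name, database name, and pool size are placeholders and assumptions, not values from this thread, and authentication settings are left out on purpose. Pooling is on by default in this provider, so simply removing "Pooling=False" from the connection string should have the same effect.

```csharp
using System;
using Microsoft.Data.SqlClient;

class PoolingExample
{
    static void Main()
    {
        var builder = new SqlConnectionStringBuilder
        {
            // Hypothetical server and database names; replace with your own.
            DataSource = "tcp:example-server.database.windows.net,1433",
            InitialCatalog = "ExampleDb",
            // The fix from this thread: the setting was False before.
            Pooling = true,
            // Assumed value for illustration; 100 is the provider default.
            MaxPoolSize = 100
        };

        // Print the resulting connection string to check the Pooling keyword.
        Console.WriteLine(builder.ConnectionString);

        // With pooling enabled, disposing a SqlConnection returns the
        // underlying connection to the pool instead of closing the socket.
        using (var connection = new SqlConnection(builder.ConnectionString))
        {
            // connection.Open(); // needs a reachable server and valid credentials
        }
    }
}
```

The pool is kept per process and per distinct connection string, so each pod ends up holding a small number of reusable connections rather than opening a new one for every message.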


shivapatpi-MSFT answered karishmatiwari-msft commented

Hello @Tomwuyts-9002,
Are you still seeing the issue? If so, kindly let us know.
Can you SSH into the node on which the pod is running and check the latest logs in /var/log/syslog (or syslog.1) for errors like "eth0: Lost carrier"?

If you see those errors, you might be hitting the issue described at https://github.com/Azure/aks-engine/issues/4341
It should be fixed as of today. Kindly let us know if you are still seeing the timeout errors.







Hi @shivapatpi-MSFT, I could not find that error in the logs of the 3 nodes.

I do see a few out-of-memory errors that cause a pod to be killed. These happen less frequently than the error above and, as far as I can tell, not at similar timestamps.




Forgot to mention: the error still occurs today.


Hi @shivapatpi-MSFT, I'm still experiencing this error and it's starting to impact our customers. Do you have any other suggestions on where to look for a solution?

I noticed that a lot of connections are being made to the SQL database (about 500-600 per minute). Could that be related?

Muhammadhamid-8726 answered Tomwuyts-9002 commented

Hi, I am facing the same error. If someone has solved it, please help me. Thanks,

Regards, M.Asad


Hi, I solved it by setting Pooling to true in the connection string.
