Log shipping fails to update metadata tables in monitoring and secondary servers

 

 

Recently I was working with a Customer and came across a situation where log shipping does not update the

system meta data table log_shipping_monitor_secondary in monitor server and secondary server in msdb.

there is a known issue when SQL agent account does not have permissions to update tables in msdb in monitoring server.

we need to grant full permissions in msdb however it does not help in our scenario.

Log shipping configuration is primary server ,two secondary server and monitoring server . Restore and Copy job completes successfully however it wont update the meta data table log_shipping_monitor_secondary.

We captured profiler to understand what is happening and we see the below errors in profiler

OLE DB provider "SQLNCLI10" for linked server "LOGSHIPLINK_Test\Test-1499715552"
returned message "Login timeout expired".

OLE DB provider "SQLNCLI10" for linked server "LOGSHIPLINK_Test\Test_-1499715552" returned message
"A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections.
For more information see SQL Server Books Online.".

OLE DB provider "SQLNCLI10" for linked server "LOGSHIPLINK_Test\Test_-1499715552" returned message "Login timeout expired".

OLE DB provider "SQLNCLI10" for linked server "LOGSHIPLINK_Test\Test_-1499715552" returned message "A network-related or instance-specific error has occurred while establishing a connection to SQL Server. Server is not found or not accessible. Check if instance name is correct and if SQL Server is configured to allow remote connections. For more information see SQL Server Books Online.".

OLE DB provider "SQLNCLI10" for linked server "LOGSHIPLINK_Test\Test_-1499715552" returned message "Login timeout expired".

So when restore job runs it try to connect monitor server and it fails with login time out expired.

Then we need to understand why it does not update the data in secondary server itself. Restore job executes the stored procedure sp_processlogshippingmonitorhistory which triggers sp_MSprocesslogshippingmonitorsecondary and this is responsible to update the meta data tables with the last restored file name and date.

however since it failed to execute the below command it did not update the metadata tables

select @linkcmd = quotename(sys.fn_MSgetlogshippingmoniterlinkname(upper(@monitor_server))) + N'.msdb.sys.sp_processlogshippingmonitorhistory'

Now it is no more log shipping issue instead it is a connectivity issue , we need to troubleshoot why we were getting login time out. then I understand that the two servers are in different domain and monitoring server is configured with NetBIOS name \instance name. if we try connecting using NetBIOS name it fails however it works fine if we connect with FQDN\instance name

The best option is to reconfigure the Monitoring server with FQDN or troubleshoot why NetBIOS name is not working.

The possible cause for NETBIOS name doesn't work possibly because it is not defined in the DNS search order. It is possible to use fully-qualified domain names, or even raw IP addresses. The Domain Name System is hierarchical and the DNS server cannot uniquely resolve a NETBIOS name.
Before sending a host name to the server, DNS client will try to guess its fully-qualified domain name (FQDN) based on the list of known DNS suffixes it has access to through various configuration settings. It will keep asking the configured DNS servers to resolve the potential names until it finds a match. The order of DNS servers and domain suffixes is important because the DNS client will use the first one it can resolve. If it happens to be a wrong guess, you will not be able to connect to the desired target host.

we can also make it work by creating an alias in configuration manager which eliminates the login time out error and allow log shipping jobs to update meta data tables.

So if you encounter this issue , the things that needs to be validated are

1. SQL Server agent account permissions in monitoring server and secondary servers

2. connectivity issues between monitoring server and secondary server

3. any permissions issues at the object level in msdb

once you rule out all these settings then best place to start is to capture SQL profiler with statement level and errors &warning events.

Happy reading Smile