question

Heisenberg avatar image
0 Votes"
Heisenberg asked Yufeishao-MSFT answered

sql server transaction replication delays

hello, we have a transactional replication with 1 article for 1 table with 2 subscribers (publisher and distributor are on the same server). This table is sized around 100G and index on it is sized around 200G. at times transferring data to subscriber runs very slow. When i use tracer tokens i see there is no delay in publisher to distributor , however distr. to subscriber has a very long latency sometimes around 1 hour or more.

How can i troubleshoot to see what is happening on subscriber side. Is there any way to find if large update or delete or insert operation happened on the publisher side which cause this delay?
Can issue caused by busy publisher in delay in transferring data to subscriber ?

Few points.
There are no blocks on the subscribers but most of the time i see pageiolatch_ex or _sh waits on the subscriber for the replication connections sessions. Table has around 27 non-clustered indexes, thats a worrysome issue i know, but i need find out what is happening when delays like this are going on.

sql-server-general
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

TomPhillips-1744 avatar image
0 Votes"
TomPhillips-1744 answered
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Yufeishao-MSFT avatar image
0 Votes"
Yufeishao-MSFT answered

Hi @SQLServerBro,

Replication is one of the log based operations in SQL Server, the configuration of the T-Log for the publisher database closely relates to the performance of replication. So taking a closer look at parameter values for the Log Reader is an essential part of troubleshooting replication issues including latency. It executes from the distributor and scans the T-Log of the publisher database.

You can query the MSLogreader_history table in the distribution database to see the log reader statistics, the message of this table can make you nail down your analysis of Log Reader Agent performance to Reader or Writer Thread issues.

The distribution database has many tables to support SQL Server replication, it is important to know how big the replication specific tables are, this should be a part of your troubleshooting effort.

you find the replication latency is between the distributor and subscriber, some points are worth considering:
publishing stored procedure execution
enable multiple streams for subscriber
maintain indexes and statistics in distribution database
distribution agent performance
distribution agent profile
https://www.mssqltips.com/sqlservertip/3598/troubleshooting-transactional-replication-latency-issues-in-sql-server/

for your concerns about busy publisher, in fact, the much more common scenario is that the subscriber goes down, log records accumulate in the publisher, fill the disk and your publisher goes down.

And about pageiolatch_ex or _sh, it may be about latch connection, SQL Server uses buffer latches to protected pages in the buffer pool and I/O latches to protect pages not into the buffer pool. If SQL Server needs to wait for the I/O subsystem to respond it will wait on an exclusive or shared I/O latch depending on the type of request.
https://docs.microsoft.com/en-us/sql/relational-databases/diagnose-resolve-latch-contention?view=sql-server-ver15

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.