
JasonW-5564 asked EchoLiu-msft commented

Intermittent slow inserts on a large table with lots of data churn

I have a rather large table that has anywhere from 400 million to 700 million rows in it at a time. This table contains individual business transactions, each with a unique id field (generated by a third-party system). We keep the last 15 days' worth of transactions in this table. We have a unique index on this table for the unique id field. We have to keep this unique index as a check, since the unique identifier is generated by a third party and we need to make sure we don't import the same record twice into this table.

The data is imported continuously, round the clock, by an SSIS job that loads it from CSV files. Every couple of hours we run a process to aggregate the new data and write the aggregates to a new table. Every evening we delete all rows in the table that are older than 15 days, to keep only the most recent 15 days' worth of data. We ingest approximately 40 million rows each business day and substantially less on weekends. We end up with a range of 400-700 million rows of data in the table at all times.

The problem is that sometimes the inserts are fast and other times they are slow, and I cannot figure out why. I have a maintenance job that rebuilds the index (not a reorg) twice a day with a 50% fill factor (I have played with this setting and seem to get my longest run of good throughput with it). Even when I rebuild it, the index is less than 15% fragmented, but the rebuild seems to help the import performance each time anyway.

I also have a clustered index on the importId column, which is just a sequential identity column added to each row. We delete in order, so this index is never above 4% fragmented. Any ideas on what could be causing this? I have run traces, checked Resource Monitor, etc. I am not seeing blocking, overtaxed disks, CPU, etc. I rarely if ever have anything else touch this table. It exists only to prevent duplicates and then generate the aggregates.

Below is the table definition. I am open for any and all ideas as this has me stumped! I would like to solve this in the short term, but open to any long term better solution ideas as well!

USE [BizUnit_Dev]
GO

/****** Object: Table [dbo].[BizUnitProd]    Script Date: 10/15/2020 11:23:34 PM ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[BizUnitProd](
[order] [varchar](50) NULL,
[account] [varchar](300) NULL,
[name] [varchar](300) NULL,
[network] [varchar](300) NULL,
[cpm] [float] NULL,
[fixedCost] [float] NULL,
[leaseId] [uniqueidentifier] NULL,
[creativeId] [varchar](50) NULL,
[issued] [datetime2](7) NULL,
[impressions] [int] NULL,
[completed] [datetime2](7) NULL,
[channelKey] [varchar](50) NULL,
[playerId] [uniqueidentifier] NOT NULL,
[leaseProgress] [float] NULL,
[flightProgress] [float] NULL,
[bonus] [bit] NULL,
[id] [varchar](50) NULL,
[orderName] [varchar](300) NULL,
[creative] [varchar](300) NULL,
[importedOn] [datetime2](7) NOT NULL,
[importId_orig] [bigint] NOT NULL,
[localtime] [datetime2](7) NULL,
[RAdded] [datetime2](7) NULL,
[orderNumber] [varchar](100) NULL,
[__FileName] [varchar](200) NULL,
[ImportID] [bigint] IDENTITY(1,1) NOT NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [idxBizUnitProd__importId] ON [dbo].[BizUnitProd]
(
[importId] ASC
)
INCLUDE([network],[completed],[playerId],[creative]) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_BizUnitProd_ID] ON [dbo].[BizUnitProd]
(
[id] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO







sql-server-transact-sql · azure-sql-database

Do you have any updates?
Please remember to accept the answers if they helped. Your action would be helpful to other users who encounter the same issue and read this thread.

Echo


Please, @JasonW-5564. Any updates to provide or additional questions we can assist you with, just let us know. Thank you, Mike.


Do you have any updates?
Experts have provided some solutions, and you can try to solve the problem based on our suggestions. If you have new questions, please let us know.

Echo

EchoLiu-msft answered EchoLiu-msft edited

Hi @JasonW-5564

Rebuilding the index and setting the fill factor can reduce index fragmentation, thereby improving query efficiency and reducing the impact of the indexes on inserted data.
Regarding the inserts sometimes being slow, could you provide an execution plan?

In addition, you can check whether there are uncommitted transactions and deadlocks (deadlock tracking can be turned on in advance). When inserting a large amount of data, you can first insert it into an intermediate table and then insert it into the target table.
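For illustration, the intermediate-table pattern might look something like this. The staging table name (`BizUnitProd_Staging`) and the abbreviated column list are hypothetical; this is only a sketch of the idea, not the poster's actual job:

```sql
-- Hypothetical staging table: same shape as the target, but heap/no indexes,
-- so the SSIS bulk insert into it is as cheap as possible.

-- Move only rows whose third-party id is not already in the main table:
INSERT INTO dbo.BizUnitProd (id, [order], account, importedOn /* ...remaining columns... */)
SELECT s.id, s.[order], s.account, s.importedOn /* ... */
FROM dbo.BizUnitProd_Staging AS s
WHERE NOT EXISTS (SELECT 1
                  FROM dbo.BizUnitProd AS p
                  WHERE p.id = s.id);

-- Clear the staging table for the next batch.
TRUNCATE TABLE dbo.BizUnitProd_Staging;
```

This keeps the duplicate check as a set-based anti-join instead of relying solely on unique-index violations during the bulk load.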

I found a similar problem, which may help you:
Insert statement sometimes runs slow and sometimes fast SQL Server 2014

If you have any questions, please feel free to let me know.
If the response is helpful, please click "Accept Answer" and upvote it.


Regards
Echo




TiborKaraszi answered

Checking execution plan might be helpful, if possible.

You have already played with the fill factor, reducing page splits when inserts happen. The clustered index is on something ever-increasing, so that shouldn't become fragmented. But the other indexes might, of course.

Can you spot a pattern where the slowness is more prominent the longer it has been since the last defrag? I.e., is it defrag related?

You checked for blocking already. It shouldn't be deadlocks, since I'd assume you'd spot the 1205 errors for the selected victim.

How about autogrow of database files? That can take a while! You can capture that in an XE trace.

Or auto update statistics? Again, you can capture that in a trace.
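As a rough sketch, an Extended Events session to catch both file growth and auto-stats activity could look like the below. The session name and file name are placeholders, and the event names should be verified against `sys.dm_xe_objects` on your build:

```sql
CREATE EVENT SESSION [TrackGrowthAndStats] ON SERVER
ADD EVENT sqlserver.database_file_size_change,  -- fires on (auto)growth and shrink of data/log files
ADD EVENT sqlserver.auto_stats                  -- fires on automatic statistics create/update
ADD TARGET package0.event_file (SET filename = N'TrackGrowthAndStats.xel')
WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);
GO
ALTER EVENT SESSION [TrackGrowthAndStats] ON SERVER STATE = START;
```

Leave it running across a slow import window, then read the .xel file and correlate event timestamps with the periods of slowness.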

You can for instance use my "looking for strange" trace found here: https://karaszi.com/looking-for-strange. Perhaps run it in two phases. First to see which events actually happen. Then modify the trace, remove the events that aren't interesting, and add a file target so you not only see how many of each event occurred but also capture the actual events.

Another thing is to check what you are waiting for. I don't know what your insert job looks like, but you can get session-level wait stats and possibly insert them into a table so you can check them at a later time. Wait stats for the whole instance are less relevant, since your import's wait stats will be muddled by everything else.


JasonW-5564 answered

I am sorry for the delayed response. I did not get a notice of replies! I am new to this format and am used to the previous MSDN forums. Not much change.

I did want to reiterate that I have used perfmon, Resource Monitor, etc. to ensure I do not have a disk system bottleneck, and I ran multiple benchmark tests to confirm. I now rebuild this index 4 times a day, taking a 30-minute penalty where I stop the import/inserts, rebuild, then turn the import job back on. My index is never more than 2% fragmented nor its pages more than 53% full (with the 50% fill factor specified on the index). This seems to help, but I am still seeing occasional periods of slowness. In fact, I saw a period of slowness last night for ~5 hours, and when I looked there was no other job running and no other active queries at the time via sp_who/sp_who2.

I have never looked at query plans in terms of bulk inserts, only selects really. It has the feeling that this insert job occasionally gets a bad query plan for whatever reason. Does this sound feasible? If so, how do I correct it? It is a straight bulk insert, with no where clause or conditions. I don't understand what is happening!


ErlandSommarskog answered ErlandSommarskog commented

First, you can set up to get mail replies from this forum as well. Click on your avatar in the upper-right corner, select your Profile, and then Settings.

What does "SELECT @@version" report? That is always good to know.

To be honest, I don't have that much input to offer; this kind of problem is difficult to assess without access to the system, and I have to rely on what you tell me about what you have already done.

But one thing about the query plans: there are two indexes on the table, and this means that SQL Server has a choice for the query plans, so the slowness can indeed be found in the query plan.

If you have SQL 2016 or later, Query Store is a great tool to track this kind of thing, not least since the issue is intermittent. By default, Query Store aggregates data by the hour, so it permits you to go back and check the plans from when there was a slowness issue.
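On SQL 2016 or later, such a Query Store lookup could be sketched as follows; the time window in the filter is a placeholder for the slow period being investigated:

```sql
-- Plans and runtime stats for queries that ran during a slow window.
SELECT qt.query_sql_text,
       p.plan_id,
       rs.avg_duration,
       rs.count_executions,
       i.start_time, i.end_time
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON q.query_text_id = qt.query_text_id
JOIN sys.query_store_plan  AS p ON p.query_id = q.query_id
JOIN sys.query_store_runtime_stats AS rs ON rs.plan_id = p.plan_id
JOIN sys.query_store_runtime_stats_interval AS i
     ON i.runtime_stats_interval_id = rs.runtime_stats_interval_id
WHERE i.start_time >= '2020-10-15 22:00'   -- slow window start (placeholder)
  AND i.end_time   <= '2020-10-16 03:00'   -- slow window end (placeholder)
ORDER BY rs.avg_duration DESC;
```

Comparing `plan_id` values across fast and slow intervals for the same query would reveal whether a plan change lines up with the slowness. (As noted below, this does not apply on SQL 2014.)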


Erland,
It is SQL 2014
(Microsoft SQL Server 2014 (SP2-CU17-GDR) (KB4505419) - 12.0.5659.1 (X64) May 24 2019 19:09:40 Copyright (c) Microsoft Corporation Enterprise Edition: Core-based Licensing (64-bit) on Windows NT 6.3 <X64> (Build 9600: ) (Hypervisor) ).

I ran sp_who2 to get the spid of the SSIS bulk insert job, then ran dbcc inputbuffer (<spid>). I cut and pasted that into a new query window, selected Display Estimated Execution Plan, and it just said:

"Msg 4890, Level 16, State 2, Line 1
Insert bulk is not supported in showplan mode."

I am at a loss for what to do /try next.


It's a bummer that you don't have access to Query Store. That makes my idea a lot more difficult to pursue. You need to run the statement to get the actual plan. But if that run goes fine, you only have that plan, and you will not know if it is different next time when it runs slow.

Would it be possible for you to disable the non-clustered index during the load? If nothing else, it would speed up the load. You then need to rebuild it afterwards, but that usually pays off.
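A minimal sketch of that disable/load/rebuild pattern, using the index from the question. One caveat worth hedging: while the unique index is disabled, duplicate ids are not caught, and if any slipped in, the rebuild fails with a duplicate-key error and the duplicates must be removed first:

```sql
-- Before the load: disable the unique non-clustered index.
ALTER INDEX [IX_BizUnitProd_ID] ON dbo.BizUnitProd DISABLE;

-- ...run the SSIS bulk insert here...

-- After the load: rebuild (this also re-applies the fill factor).
ALTER INDEX [IX_BizUnitProd_ID] ON dbo.BizUnitProd
REBUILD WITH (FILLFACTOR = 50, SORT_IN_TEMPDB = ON);
```

Disabling (rather than dropping) keeps the index definition in the metadata, so the rebuild needs no column list.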

Also, you are on SP2. I would certainly recommend that you install SP3 and also apply CU4, to be on the latest version there is of SQL 2014.

Finally, Tibor has some great suggestions as well. (And yes, sys.dm_exec_session_wait_stats came in SQL 2016, so you don't have that one.)

TiborKaraszi answered

I don't think that bulk loading operations have execution plans, at least not in the same way as traditional DML commands.

What can affect a bulk load are things like the recovery model for the database, what indexes you have on the table, whether you specify TABLOCK, etc. So think long and hard about whether it could be anything like that.

I reiterate my earlier suggestions:

Wait stats. Unless you know what you are waiting for, you are in the dark. I believe that sys.dm_exec_session_wait_stats came in 2016, so you have to rely on sys.dm_os_wait_stats. Grab a snapshot into a table before and after the import. See what you waited for. Compare fast and slow runs to see what differs. Use Glenn Berry's diagnostic scripts to filter out the wait types that are just noise.
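That snapshot-and-diff could be sketched like this (instance-wide numbers, so other activity is included, as noted above):

```sql
-- Before the import: snapshot the cumulative wait stats.
SELECT wait_type, waiting_tasks_count, wait_time_ms
INTO #waits_before
FROM sys.dm_os_wait_stats;

-- ...run the import...

-- After the import: what accumulated while it ran?
SELECT w.wait_type,
       w.wait_time_ms - b.wait_time_ms               AS wait_time_ms_delta,
       w.waiting_tasks_count - b.waiting_tasks_count AS tasks_delta
FROM sys.dm_os_wait_stats AS w
JOIN #waits_before AS b ON b.wait_type = w.wait_type
WHERE w.wait_time_ms > b.wait_time_ms
ORDER BY wait_time_ms_delta DESC;
```

Persist the deltas from fast and slow runs into a permanent table and the top wait types can be compared side by side.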

And also use my "looking for strange" trace. Start it before a run and check it in the morning to see if anything strange happened during the import. If it did and you need more info, just modify the trace and add a file destination. Then you don't just count how many of each event occurred, you also capture the actual events.

See my earlier reply for some more info.
