I have a rather large table that has anywhere for 400 million to 700 million rows in it at a time. This table contains individual business transactions that contain a unique id field (generated by a third party system). We keep the last 15 days worth of transactions in this table. We have a unique index on this table for unique id field. We have to keep this unique index as a check as the unique identifier is generated by a third party and we need to make sure we don't import the same record twice into this table.
The data is imported continuously round the clock by an ssis job that imports it from CSV files. Every couple of hours we run a process to aggregate the new data and write the aggregates to a new table. Every evening we delete all rows in the table that are older than 15 days to keep only the most recent 15 days worth of data in the table. We ingest approximately 40 million rows each business day and substantially less on the weekends. We end up with a a range of 400-700 million rows of data in the table at all times.
The problem is that sometimes the inserts are fast and other are slow and I cannot figure out why. I have maintenance job that rebuilds the index (not re-org) twice a day and includes a 50% fill factor (I have played with this setting and seem to get my longest run of good throughput with this setting). Even when I rebuild it the index is less than 15% fragmented, but the rebuild seems to help the import performance each time anyways.
I also have have a clustered index on the importid column, which is just a sequential identity column added to each row. We delete in order as so this index is never above 4% fragmented. Any ideas on what could be causing this? I have ran traces, checked resource monitor etc. I am not seeing blocking, over taxed disks, cpu, etc. I rarely if ever have anything else touch this table. It only exists to prevent duplicates and then generate the aggregates.
Below is the table definition. I am open for any and all ideas as this has me stumped! I would like to solve this in the short term, but open to any long term better solution ideas as well!
USE [BizUnit_Dev]
GO
/****** Object: Table [dbo].[BizUnitProd] Script Date: 10/15/2020 11:23:34 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[BizUnitProd](
[order] varchar NULL,
[account] varchar NULL,
[name] varchar NULL,
[network] varchar NULL,
[cpm] [float] NULL,
[fixedCost] [float] NULL,
[leaseId] [uniqueidentifier] NULL,
[creativeId] varchar NULL,
[issued] datetime2 NULL,
[impressions] [int] NULL,
[completed] datetime2 NULL,
[channelKey] varchar NULL,
[playerId] [uniqueidentifier] NOT NULL,
[leaseProgress] [float] NULL,
[flightProgress] [float] NULL,
[bonus] [bit] NULL,
[id] varchar NULL,
[orderName] varchar NULL,
[creative] varchar NULL,
[importedOn] datetime2 NOT NULL,
[importId_orig] [bigint] NOT NULL,
[localtime] datetime2 NULL,
[RAdded] datetime2 NULL,
[orderNumber] varchar NULL,
[__FileName] varchar NULL,
[ImportID] [bigint] IDENTITY(1,1) NOT NULL
) ON [PRIMARY]
GO
CREATE CLUSTERED INDEX [idxBizUnitProd__importId] ON [dbo].[BizUnitProd]
(
[importId] ASC
)
INCLUDE([network],[completed],[playerId],[creative]) WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO
CREATE UNIQUE NONCLUSTERED INDEX [IX_BizUnitProd_ID] ON [dbo].[BizUnitProd]
(
[id] ASC
)WITH (PAD_INDEX = ON, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 50, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
GO