Some things I learned about SQL Indexing Tuning recently (posted by Paul)


Where to put the clustered index

Last month there was a heated discussion in our group about whether to move the clustered index of a table from the primary key (int identity) to two columns that are heavily referenced in queries.

Many people would think that the best place to put the key would be on the columns that are heavily queried with the thinking that this would translate into the best performance, e.g. a clustered index is always faster than a non-clustered index.

About the same time I also witnessed an architect rip a database schema apart because said developer was using synthetic keys instead of natural ones, and how one should always use a natural key instead of a synthetic and put the clustered key on that.

Pretty convincing arguments until I came across Kimberly Tripp's blog on the topic. It seems that the above two arguments are not always - or even in most cases - valid. In fact she asserts that the best place (usually) to put a clustered index would be on a synthetic key. The criteria she looks for in a clustering key are:

1. Unique
2. Narrow
3. Static (not changed)

You can read the full explanation at http://www.sqlskills.com/blogs/kimberly/default.aspx#abdaee3f7-1e15-414b-b75f-a290db645159. I haven't checked out her webcasts yet so am not sure it's there, but I've seen her present these arguments mathematically, walking through how the indexes are actually used and calculating out the IO involved with each step. She also goes into a fair amount of explanation as to why tuning practices that were developed for earlier versions of SQL have been superceded.

In the military we were taught to not to memorize the maintenance manuals because procedures change, and if we tried to go from memory we could find ourselves doing unsafe things, such as applying the wrong amount of torque to a bolt.

In the same vein there is no silver bullet to SQL server index tuning; if you want to get the most out of your database you have to take the time to tune your indexes and not rely on the strategy of "we always did it this way" or "because I just don't like [some] practice".


In the where clause, put the most restrictive (selective) index at the beginning

To be honest I never thought much about how I ordered my AND statements in the where clause. But apparently SQL filters data progressively and if you have indexes the order is important. So the first statement will run faster than the second.

SELECT m.Member_No, m.FirstName, m.Region_No
FROM dbo.Member AS m WITH (INDEX (MemberFirstName, member_region_link))
WHERE m.FirstName LIKE 'K%'
AND m.Region_No > 6
AND m.Member_No < 5000 

SELECT m.Member_No, m.FirstName, m.Region_No
FROM dbo.Member AS m WITH (INDEX (member_region_link, MemberFirstName))
WHERE m.FirstName LIKE 'K%'
AND m.Region_No > 6
AND m.Member_No < 5000 

Such a simple thing and yet it seems missed from a lot of SQL performance tip sheets.

Paul