SQL Server 2008 Indexing Best Practices

Article
03/17/2009

Here are some additional notes on "SQL Server 2008 Indexing Best Practices" I took while attending an advanced class on SQL Server taught by Kimberly Tripp (https://sqlskills.com/AboutKimberlyLTripp.asp).

Please note that, although these notes were taken during the class, I might have added some of my own (mis)interpretation :-). Always check your facts on Books Online (I try to provide links when applicable). As with anything else you get from a blog, never use any of this in production before you thoroughly validate it a test environment. Please keep in mind that some of those will be hard to follow without the right pre-requisite knowledge and the right context. Reading the post from top to bottom will help.

The best index

There's a best index for every query
You don't want to do that for every query, though
You would end up with too many indexes
You need to select a good set of base indexes

Find a balance

You need to find a balance between no indexes and one ideal index on every query
Do not put a non-clustered index on every columns :-)
You can find (mathematically) if a non-clustered index will help or not
You can sometimes replace a narrow non-clustered index with a wider one that is more useful

Strategies

Create constraints for primary keys and alternate/candidate keys
Add indexes to foreign key constraints. It can help some joins
Capture workloads and use the database tuning advisor (DTA)
Add indexes to help with specific queries using joins, aggregations, etc.
Missing index DMVs query https://blogs.msdn.com/bartd/archive/2007/07/19/are-you-using-sql-s-missing-index-dmvs.aspx
These are only strategies you should consider, not absolutes.
Also, you need to maintain the indexes over time. Things change over time.

OLTP x OLAP

OLTP tables typically uses fewer indexes than OLAP tables
OLTP tables can make good use of indexes
However, there are certain indexes on OLTP that will be bad

Clustered index keys

Clustered Index keys: Unique, narrow, static, ever increasing
Examples: Identity / Date,Identity / GUID (with NEWSEQUENTIALID function)
Add something like identity to avoid ending up with uniquefiers
This means that you will have a lot less management over time
If you can have a really good clustering key, avoid heaps.

Identity

Identity is naturally unique, static, narrow, hot spot for inserts
Being a hot spot in this case is actually a good thing
It minimizes cache requirements, naturally needs less maintenance
Could have contention in GAM/SGAM pages if you have lots of tables with identity
Can overcome with multiple files, see https://support.microsoft.com/kb/328551

Lookups vs. Table Scans

Lookups. Physical vs. Logical I/Os
Potential for physical I/Os is in the leaf level (non-leaf are usually cached)
SQL knows how many pages are on the leaf level, how many rows per page
SQL needs to make a decision on Lookup x Table Scan
If it's in the WHERE clause, it doesn't mean an index on that will be used
Estimating query cost: tipping point for a query to go TS over a lookup
Won't spoil the results, but it is smaller than most would think...
Details at https://sqlskills.com/BLOGS/KIMBERLY/post/Tipping-Point-Queries-More-Questions-To-Really-Test-You!.aspx

Hints

You can force SQL to do Lookup x TS, but SQL usually does a good estimate
There are cases when the statistics for the table are not up to date
People end up using more hints (like INDEX, FORCESEEK) or plan guides than they should
If you use a hint and you're wrong, SQL will follow your hint anyway

Coverage

The power of covering. Leaf level of the non-clustered index has all the rows
People use indexes with "LN,FN,MI including SSN" and "LN, FN, MI including PHONE" and "LN with SSN", etc..
You can cover the same with a single index on "LN, FN, MI including SSN, PHONE"
Order of included columns do not matter. Just the order of the keys
Could this be happening because of the missing index message in query results?
Multiple developers not in sync can be a common source if this kind of thing.

Included columns

INCLUDE option lets you separate what goes in the leaf and non-leaf of index
There is no limit to what you can INCLUDE, but be careful with the duplication
You can create a non-clustered, covering, seekable index just for a query
For instance, you could create a non-clustered index on the SSN key and a few INCLUDED columns just to help a specific set of queries without hitting data pages at all
Full scan on a small non-clustered index could even beat seeks in certain cases.

Filtered Indexes

SQL Server 2008 includes the ability to do Filtered Indexes
For instance, filter for "status=1" to index only active items
For instance, filter for orderdate in last year for certain reports
Now that the index is much smaller, you can now maybe consider including more columns
This will give you more coverage without using a lot of space
Filtered indexes statistics are usually more accurate, if they cover fewer rows
This is not a substitute for partitions. Don't create filtered indexes if all sets are useful
Kimberly's SP_HELPINDEX2 shows included columns and filtered indexes
https://www.sqlskills.com/blogs/Kimberly/post/Updates-(fixes)-to-sp_helpindex2.aspx

Sparse Columns

Combine with sparse columns for indexes on subsets with only certain columns INCLUDED
Beware: Create indexes on tables with sparse columns might materialize sparse columns.

Conclusion

The best case: index covers all you need (key+included), nothing you don't and is seekable
Indexes do use more space and come at a cost for inserts, updates, maintenance
You don't have to do this for every query or every table
Put effort in the top few queries and you'll fix most of your issues
There are usually a few situations that cause you most of the grief
Look at that stored procedure that is executed thousand times per hour
You can overdo it. Be careful

Kimberly has published a related presentation at https://sqlskills.com/resources/conferences/Tripp_SPR302_PreconferenceSession_UPDATED_VERSION.pdf

SQL Server 2008 Indexing Best Practices

Additional resources