Indexes Supporting Foreign Keys
Most developers building an OLTP system know that you should declare and enforce relationships between tables using Foreign Keys. But what physical database structures should you implement to support those relationships?
Every Foreign Key is a relationship from one or more columns on the table on which the Foreign Key is declared to a Key on the target table. That’s why it’s called a “Foreign” key: the columns refer to a Key on some other table. In SQL Server, the target columns can be the Primary Key of the target table or a Unique Constraint. In all cases there is a unique index on the target table. But should you index the Foreign Key columns as well?
That would have some benefits:
- Reduced cost for DELETE and UPDATE of the target key on the referenced Table
- Reduced cost for lookups by the FK columns and joins between the two tables
And some costs:
- Increased cost for all CRUD operations on the FK table
- Increased database size
- Developer/DBA time
That’s all well and makes me inclined to index the foreign keys, but IMO it’s not enough to support a proactive best-practice of supporting Foreign Keys with indexes because:
- These are merely performance considerations
- There are performance costs as as well as benefits
- The performance impact will not always be material
- If you properly test and monitor your workload you will discover any really helpful indexes anyway
However there is one more more benefit to indexing your foreign keys, and it’s not really a performance benefit. Indexing the foreign keys reduces the scope and duration of locks required to perform DELETE (and key UPDATE) on the referenced table, and can prevent deadlocks. I first learned about this for Oracle, where I learned almost everything I know about Oracle, from the incomparable Tom Kyte. You can read about this here. This applies equally in SQL Server. Unindexed foreign keys reduce concurrency and make your database more prone to deadlocks.
I think the bottom line is that supporting your Foreign Keys with indexes is required to ensure correct behavior of your application if you perform frequent deletes on the referenced table. It is also required if you update the target key on the referenced table. You really shouldn’t be updating your keys, but even if you don’t some applications will include the key columns in the column update list in an UPDATE statement, which actually is updating the keys.
So what do we mean fora Foreign Key to be supported by an index?
Definition: A Foreign Key F defined on n columns a table T is supported by an index I on T just in case the first n columns of I are all foreign key columns.
It doesn’t matter what order the Foreign Key columns appear in in the index, so long as they comprise the leading “edge” of the index. An important special case for this is where a table’s Primary Key supports a Foreign Key. For Parent/Child relationships, Child tables are often modeled with a compound Primary Key with the Parent Primary Key as the leading edge of the Child Key. Such a Primary Key also supports the Foreign Key, and so you only need a single index on the table. For instance in AdventureWorks2008, the Primary Key of SalesOrderDetail supports the ForeignKey referencing SalesOrderHeader:
CREATE TABLE Sales.SalesOrderDetail( SalesOrderID int NOT NULL, SalesOrderDetailID int IDENTITY(1,1) NOT NULL, . . . CONSTRAINT PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID PRIMARY KEY (SalesOrderID, SalesOrderDetailID), CONSTRAINT FK_SalesOrderDetail_SalesOrderHeader_SalesOrderID FOREIGN KEY(SalesOrderID) REFERENCES Sales.SalesOrderHeader (SalesOrderID) )
In addition to supporting the Foreign Key without requiring a secondary index on SalesOrderDetail, this pattern has the added benefit of placing all the SalesOrderDetail rows for a SalesOrderHeader onto a contiguous set of pages. IMO this makes it vastly superior to a pattern you see sometimes where the child table will have only a single Primary Key column:
CREATE TABLE Sales.SalesOrderDetail( SalesOrderID int NOT NULL, SalesOrderDetailID int IDENTITY(1,1) NOT NULL, . . . CONSTRAINT PK_SalesOrderDetail_SalesOrderID_SalesOrderDetailID PRIMARY KEY (SalesOrderDetailID), CONSTRAINT FK_SalesOrderDetail_SalesOrderHeader_SalesOrderID FOREIGN KEY(SalesOrderID) REFERENCES Sales.SalesOrderHeader (SalesOrderID) )
Since this would require a non-clustered index on SalesOrderDetail.SalesOrderID to support the Foreign Key, and the SalesOrderDetail rows for an order might be spread across separate pages.
Anyway, how do you identify what Foreign Keys in your database are not supported by indexes, and which ones need to be? I would start by looking at all Foreign Keys that are not supported by an index, and evaluate each one to see how often deletes are performed on the referenced table, and how many rows are the the Foreign Key table. The more often you delete from the referenced table, and the more rows in the Foreign Key table, the more you need to support the Foreign Key with an index.
Here’s a query that will find all the Foreign Keys without a supporting index, sort them by size of the Foreign Key table, and indicate how many rows have been deleted from the referenced table recently.
with fkc as ( select fk.name fk_name, fkc.*, COUNT(*) over (partition by fkc.constraint_object_id) fk_column_count from sys.foreign_key_columns fkc join sys.foreign_keys fk on fk.object_id = fkc.constraint_object_id ), ic as ( select object_name(i.object_id) table_name, i.name index_name, ic.*, COUNT(*) over (partition by ic.object_id, ic.index_id) ix_column_count from sys.index_columns ic join sys.indexes i on i.index_id = ic.index_id and i.object_id = ic.object_id where i.type in (1,2) --clustered and nonclustered and i.is_disabled = 0 and i.is_hypothetical = 0 --order by table_name ), matched_columns as ( select ic.object_id, ic.table_name, ic.index_id, ic.index_name, fkc.constraint_object_id, fkc.fk_name, ic.column_id, ic.key_ordinal, fkc.fk_column_count, COUNT(*) over (partition by ic.object_id, ic.index_id) supported_column_count --count the supported columns from fkc join ic on fkc.parent_object_id = ic.object_id and ic.ix_column_count >= fkc.fk_column_count --exclude indexes with too few columns and ic.key_ordinal <= fkc.fk_column_count --include only as many columns as the FK has and ic.column_id = fkc.parent_column_id --match the index columns to the FK columns ), supported_fks as ( select distinct object_id, table_name, constraint_object_id, fk_name, index_name supporting_index, index_id supporting_index_id from matched_columns where supported_column_count = fk_column_count ), istats as ( select object_id, SUM(leaf_delete_count + leaf_ghost_count) deletes from sys.dm_db_index_operational_stats(db_id(),null,null,null) istats where index_id = (select MIN(index_id) from sys.indexes i where i.object_id = istats.object_id and (i.is_primary_key = 1 or i.is_unique_constraint = 1) ) group by object_id ) select schema_name(schema_id) fk_table_schema, object_name(parent_object_id) fk_table, object_name(fk.referenced_object_id) referenced_table, name constraint_name, coalesce(istats.deletes,0) referenced_table_delete_count, (select SUM(rows) from sys.partitions where object_id = fk.parent_object_id and index_id in (0,1)) fk_table_rows from sys.foreign_keys fk left join istats on fk.referenced_object_id = istats.object_id where fk.object_id not in (select constraint_object_id from supported_fks) order by fk_table_rows desc