Partitioned Tables in SQL Server 2008

In this post, I introduced how SQL Server 2005 implements query plans on partitioned tables.  If you've read that post or used partitioned tables, you may recall that SQL Server 2005 uses a constant scan operator to enumerate the list of partition ids that need to be scanned.  As a refresher, here is the example from that post showing the plan for scanning a table with four partitions:

CREATE PARTITION FUNCTION PF(INT) AS RANGE FOR VALUES (0, 10, 100)
CREATE PARTITION SCHEME PS AS PARTITION PF ALL TO ([PRIMARY])
CREATE TABLE T (A INT, B INT) ON PS(A)

SELECT * FROM T

  |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PARTITION ID:([PtnIds1004]))
       |--Constant Scan(VALUES:(((1)),((2)),((3)),((4))))
       |--Table Scan(OBJECT:([t]))

SQL Server 2005 treats partitioned tables specially and creates special plans, such as the above one, for partitioned tables.  SQL Server 2008, on the other hand, for the most part treats partitioned tables as regular tables that just happen to be logically indexed on the partition id column.  For example, for the purposes or query optimization and query execution, SQL Server 2008 treats the above table not as a heap but as an index on [PtnId].  If we create a partitioned index (clustered or non-clustered), SQL Server 2008 logically adds the partition id column as the first column of the index.  For instance, if we create the following index:

CREATE INDEX TA ON T(A) ON PS(A)

SQL Server 2008 treats it not as a single column index on T(A), but rather as a multi-column or composite index on T([PtnId],A).  Note that the table and index are still stored physically as partitioned tables.  In this example, the table and non-clustered index are decomposed into four heaps and four non-clustered indexes.

By treating partitioned tables as indexes, SQL Server 2008 can frequently use much simpler query plans.  Let's look at the SQL Server 2008 plan for the above example:

  |--Table Scan(OBJECT:([T]))

This query plan is no different than what you might see if you scanned any other ordinary table!  So, how can we tell that the table is really partitioned and how can we tell what partitions the plan is going to scan?  Luckily, the graphical and XML plans identify partitioned tables by adding a "Partitioned" attribute to any scan, seek, or update operator that processes a partitioned table.  Moreover, the actual graphical plan and the STATISTICS XML output identify the exact partitions that the plan touches during execution.  For example, here is an excerpt from the STATISTICS XML output generated by running the above query:

<RunTimePartitionSummary>
  <PartitionsAccessed PartitionCount="4">
    <PartitionRange Start="1" End="4"/>
  </PartitionsAccessed>
</RunTimePartitionSummary>

This output shows that, as expected, the query plan scanned all four partitions of the table.  Now, suppose we insert a row into the second partition (up until now, the table has been empty) and run the following slightly modified query:

INSERT T VALUES (1, 1)

SELECT TOP 1 * FROM T

This query can stop executing as soon as it finds a single row.  It scans the first partition and finding no rows continues on to scan the second partition.  At that point, it finds a row and stops.  Thus, the query terminates after scanning only two partitions.  This outcome is clearly reflected in the STATISTICS XML output:

<RunTimePartitionSummary>
  <PartitionsAccessed PartitionCount="2" >
    <PartitionRange Start="1" End="2" />
  </PartitionsAccessed>
</RunTimePartitionSummary>

Now let's see how partition elimination works.  Let's start with static partition elimination:

SELECT * FROM T WHERE A < 100

  |--Table Scan(OBJECT:([T]), SEEK:([PtnId1001] >= (1) AND [PtnId1001] <= (3)) ,  WHERE:([T].[A]<(100)) ORDERED FORWARD)

Because SQL Server 2008 treats the partitioned table like an index on [PtnId], it can simply calculate the range of partitions that need to be scanned and perform a "seek" on those partitions.  It may seem a little strange to see a SEEK predicate on a table scan, but it just means that the plan is going to scan partitions 1 through 3.

And now, let's look at dynamic partition elimination:

DECLARE @I INT
SELECT @I = 0
SELECT * FROM T WHERE A < @I

  |--Table Scan(OBJECT:([T]), SEEK:([PtnId1001] >= (1) AND [PtnId1001] <= RangePartitionNew([@I],(0),(0),(10),(100)) ),  WHERE:([T].[A]<[@I]) ORDERED FORWARD)

This plan is identical to the prior plan except that the constant partition id 3 has been replaced by the RangePartitionNew function which computes the correct partition id from the variable @I.  To find out exactly which partitions were scanned, we can once again check the STATISTICS XML output:

<RunTimePartitionSummary>
  <PartitionsAccessed PartitionCount="1">
    <PartitionRange Start="1" End="1"/>
  </PartitionsAccessed>
</RunTimePartitionSummary>

Note that this plan computes exactly which partitions to scan.  Compare that to the SQL Server 2005 plan which evaluates a filter for each and every partition id:

  |--Nested Loops(Inner Join, OUTER REFERENCES:([PtnIds1004]) PARTITION ID:([PtnIds1004]))
       |--Filter(WHERE:([PtnIds1004]<=RangePartitionNew([@i],(0),(0),(10),(100))))
| |--Constant Scan(VALUES:(((1)),((2)),((3)),((4))))

       |--Table Scan(OBJECT:([t]), WHERE:([t].[a]<[@i]) PARTITION ID:([PtnIds1004]))

While evaluating the filter repeatedly may not cost too much if we have only four partitions, it is certainly going to waste some cycles if we have many partitions!

In my next post, I'll take a look at how SQL Server 2008 handles scans and seeks on partitioned indexes.