Recursive CTEs

One of the most important uses of CTEs is to write recursive queries.  In fact, CTEs provide the only means to write recursive queries.  As I noted last week, there are several excellent CTE examples, including recursive CTE examples, in Books Online.  I'll once again start with the examples from Books Online.  To run these examples, you'll need to install the Adventure Works Cycles OLTP sample database.

Recursive CTEs always follow the same pattern.  The body of the CTE is a UNION ALL query that combines one or more anchor sub-selects which seed the result set and one or more recursive sub-selects which generate the remainder of the result set.  The recursive sub-selects reference the recursive CTE itself.

Let's look at a simple example which computes a list of all employees and their level within the organization.  Computing the level with the organization can only be done using a recursive CTE as it requires counting the number of joins required to reach each employee:

WITH DirectReports(ManagerID, EmployeeID, EmployeeLevel) AS
(
    SELECT ManagerID, EmployeeID, 0 AS EmployeeLevel
    FROM HumanResources.Employee
    WHERE ManagerID IS NULL
    UNION ALL
    SELECT e.ManagerID, e.EmployeeID, EmployeeLevel + 1
    FROM HumanResources.Employee e
        INNER JOIN DirectReports d
        ON e.ManagerID = d.EmployeeID
)
SELECT ManagerID, EmployeeID, EmployeeLevel
FROM DirectReports

This example has a single anchor sub-select and a single recursive sub-select.  The anchor sub-select computes a list of top level employees who have no manager (in this case, there is only one such employee) and assigns these employees level 0.  The recursive sub-select then searches for employees, who work for each top level manager, and assigns these employees level 1.  The process then repeats with the recursive sub-select searching for employees progressively deeper within the organization.  The query terminates when it reaches employees who are not managers.  At this point the recursive sub-select returns no further employees.

Now let's look at the plan used by this query:

  |--Index Spool(WITH STACK)
       |--Concatenation
            |--Compute Scalar(DEFINE:([Expr1013]=(0)))
            |    |--Compute Scalar(DEFINE:([Expr1003]=(0)))
            |         |--Index Seek(OBJECT:([HumanResources].[Employee].[IX_Employee_ManagerID]), SEEK:([HumanResources].[Employee].[ManagerID]=NULL) ORDERED FORWARD)
            |--Assert(WHERE:(CASE WHEN [Expr1015]>(100) THEN (0) ELSE NULL END))
                 |--Nested Loops(Inner Join, OUTER REFERENCES:([Expr1015], [Recr1006], [Recr1007], [Recr1008]))
                      |--Compute Scalar(DEFINE:([Expr1015]=[Expr1014]+(1)))
                      |    |--Table Spool(WITH STACK)
                      |--Compute Scalar(DEFINE:([Expr1009]=[Recr1008]+(1)))
                           |--Index Seek(OBJECT:([HumanResources].[Employee].[IX_Employee_ManagerID] AS [e]), SEEK:([e].[ManagerID]=[Recr1007]) ORDERED FORWARD)

Like the query itself, this plan has two parts: an anchor part and a recursive part.  These two parts are "glued" together using the concatenation operator which implements the UNION ALL.

The single most interesting operator in this plan is the spool.  This is a special spool known as a stack spool.  A stack spool is a bit like a common subexpression spool in that it has a primary instance which loads input rows into the worktable and a secondary instance which reads and returns the rows from the worktable.  The primary spool has a child since it load input rows while the secondary spool does not have a child since it merely reads and returns the rows loaded by the primary spool.  (See my posts on Optimized Non-clustered Index Maintenance in Per-Index Plans and Aggregation WITH CUBE for examples of common subexpression spools.)

Unlike ordinary common subexpression spools which eagerly load all rows into the worktable before returning any rows, a stack spool is a lazy spool which means that it begins returning rows immediately.  The primary spool (the topmost operator in the plan) is fairly mundane.  Like any other lazy spool, it reads each input row, inserts the row into the spool, and returns the row to the client.  The secondary spool (below the nested loops join) is where all the interesting action is.  Each time the nested loops join requests a row from the secondary spool, the spool reads the last row from the spool, returns it, and then deletes it to ensure that it is not returned more than once.  Note that the term stack spool comes from this last in first out behavior.  The primary spool pushes rows onto the stack and the secondary spool pops rows off of the stack.

Now that we understand how the spool works, let's step through the plan and see how it works.  The query processor begins by executing the first input to the concatenation operator which also happens to be the anchor part of the plan.  This part of the plan uses an index seek to find all employees who have no manager.  The index seek returns the following row which the primary spool inserts into the worktable:

 ManagerID   EmployeeID  EmployeeLevel
----------- ----------- -------------
NULL        109         0

Once it finishes executing the anchor part of the plan, the query processor continues by executing the second input to the concatenation operator which happens to be the recursive part of the plan.  This part of the plan has a nested loops join with the secondary spool as its outer input.  The nested loops join requests a row from the spool which so far contains only the one row for employee 109.  The spool returns this row and deletes it from the worktable.  The join then executes the index seek on its inner input to find all employees who work for employee 109.  This index seek returns the following rows which the primary spool inserts into the worktable:

 ManagerID   EmployeeID  EmployeeLevel
----------- ----------- -------------
109         6           1
109         12          1
109         42          1
109         140         1
109         148         1
109         273         1

Next, the nested loops join requests another row from the secondary spool.  This time the worktable contains the six rows shown above, so the spool returns (and deletes) the last row corresponding to employee 273.  The nested loops join again executes the index seek on its inner input and returns the following rows:

 ManagerID   EmployeeID  EmployeeLevel
----------- ----------- -------------
273         268         2
273         284         2
273         288         2

Now the worktable contains the following rows:

 ManagerID   EmployeeID  EmployeeLevel
----------- ----------- -------------
109         6           1
109         12          1
109         42          1
109         140         1
109         148         1
273         268         2
273         284         2
273         288         2

This process repeats continuing with employee 288.  Eventually, the spool returns a row corresponding to an employee who is not a manager.  At that point, the index seek on the inner side of the join returns no rows.  After the join consumes the final row from the spool, the query terminates.

There are few other interesting points worth noting about this plan.  First, the plan includes two sets of compute scalars to compute the recursion level.  One set ([Expr1003] and [Expr1009]) is used to compute EmployeeLevel.  The other set ([Expr1013] and [Expr1015]) is generated internally and used by the assert operator to ensure that the query terminates and does not get into an infinite loop.  If you repeat this query without EmployeeLevel, you will see that one set of the compute scalars disappears from the plan while the internally generated set remains.

We can control the maximum recursion level.  For example, if you run the query with OPTION (MAXRECURSION 2), you will see that the constant in the assert operator changes from the default of 100 to 2.  Since this query requires more than two levels of recursion to complete, if you execute it with this option, it will fail with the following error:

Msg 530, Level 16, State 1, Line 1
The statement terminated. The maximum recursion 2 has been exhausted before statement completion.

In my next post I'll continue this discussion of recursive CTEs.