Row Goals Gone Rogue


This post discusses “row goals“, but with a twist. The point is to illustrate how row goals can cause unnecessarily slow queries. First, run this script:

     USE tempdb
    CREATE TABLE even (c1 int, c2 CHAR(30));
    CREATE TABLE odd (c1 int, c2 CHAR(30));
    DECLARE @x int;
    SET @x = 1;
    WHILE (@x <= 10000)
        INSERT INTO even (c1, c2) VALUES (@x * 2, '');
        INSERT INTO odd (c1, c2) VALUES (@x * 2 - 1, '');
        IF @x % 1000 = 0
            RAISERROR ('Inserted %d rows...', 0, 1, @x) WITH NOWAIT;
            COMMIT TRAN;
            BEGIN TRAN;
        SET @x = @x + 1;

This will insert 10,000 even integers into a table named [even], and the same number of odd values into a table named [odd]. Then consider this simple query:

     -- QUERY #1
    SELECT *
    FROM even t1
    INNER JOIN even t2 ON t1.c1 = t2.c1

This joins the [even] table to itself, and will return all 10,000 rows from the table. If you run with “SET STATISTICS TIME ON; SET STATISTICS PROFILE ON; ” to see the query plan that SQL selects for this query, you should find that it chooses a hash match:

  |--Hash Match(Inner Join, HASH:([t1].[c1])=([t2].[c1]), RESIDUAL:([t2].[c1]=[t1].[c1]))
      |--Table Scan(OBJECT:([tempdb].[dbo].[even] AS [t1]))
      |--Table Scan(OBJECT:([tempdb].[dbo].[even] AS [t2]))

While a hash join is often the fastest join option, a hash has the highest initial build cost of any join strategy. Before the hash join can evaluate even a single row for the join, it must first hash all of the rows in one of the join inputs and store these in a hash table (“build cost” refers to this up-front work). Only then can it begin to join the rows in the other input with the rows in the under-the-covers hash table. In contrast to a hash join, a nested loop join strategy has an extremely low build cost. There are no expensive up-front delays before a nested join can produce its first row, but by default SQL optimizes with the goal of minimizing total query execution time. In this case, the nature of a hash join allows it to complete the query in a small fraction of the time it would take for a loop join (on my machine the hash join plan runs in about 100ms), despite the high initial setup cost. So the optimizer is doing the right thing by selecting a hash join for this query. However, watch what happens if you add a “TOP 1” to the query:

     -- QUERY #2
    SELECT TOP 1 *
    FROM even t1
    INNER JOIN even t2 ON t1.c1 = t2.c1

The plan changes to use a loop join, and the query completes in a few milliseconds:

    Rows  EstRows  Executes  EstExecs StmtText
   ----- -------- --------- --------- -------------------------------------------------------------
   1     1        1         1         |--Top(TOP EXPRESSION:((1)))
   1     1        1         1             |--Nested Loops(Inner Join, WHERE:([t2].[c1]=[t1].[c1]))
   1     1        1         1                 |--Table Scan(OBJECT:([tempdb].[dbo].[even] AS [t1]))
   1     10000    1         1                 |--Table Scan(OBJECT:([tempdb].[dbo].[even] AS [t2]))

The addition of a TOP 1 tells the optimizer that it doesn’t have to execute the whole query; it only has to produce enough rows out of each child operator in the query to get a single row out the top. The optimizer takes this into account and correctly estimates that for this join it won’t have to scan many rows at all before finding a match and taking an early exit. As a result, it decides to use a nested loop join, because a loop join should be able to produce that single row and exit long before the hash join could build its hash table. This is also the right thing to do. If you want to prove it to yourself, run query #2 with SET STATISTICS TIME ON, then again with an OPTION(HASH JOIN) hint to force a hash join despite the higher cost. You should find that the hash join is faster when returning all rows, but the loop join is faster when only one row must be returned. This query optimizer intelligence is possible because of the “row goal” feature.

The TOP operator isn’t the only thing that can cause the optimizer to set a row goal. An OPTION (FAST N) query hint sets a row goal of N — the optimizer will cost plans in about the same way that it would with a TOP N, but without artificially restricting the number of rows the query will return. The chosen plan should return the first N rows as quickly as possible, but the end-to-end query execution time might be much slower than you’d get without the hint. Also, semi-joins or anti-semi-joins, like you’d get with an EXISTS or NOT EXISTS clause, can also implicitly set a row goal of 1.

If row goals are a new concept for you, I recommend reading through Paul White’s Row Goals In Depth article before continuing.

Now, let’s run the same TOP 1 query, but instead of joining [even] to itself, we’ll join [even] to the [odd] table:

      -- QUERY #3
     SELECT TOP 1 *
     FROM even t1
     INNER JOIN odd t2 ON t1.c1 = t2.c1

You’ll note that this query is extremely slow relative to the others. On my machine it takes about 9 seconds, which is several thousand times slower than the similar query that does a self-join on the [even] table. Now you and I know that the [even] table contains even integers, and the [odd] table contains odd integers. Based on that understanding, we can guess that this query should return no rows. But the query optimizer has no concept of “odd” and “even”. Remember that statistics only provide a high-level summary of the data in a column. In this case, the statistics on the [even] and [odd] tables seem to indicate that the [c1] columns in these two tables have similar data distributions, so from the query optimizer’s perspective this is the same problem it was faced with in query #2. It guesses that it will very quickly find a matching row, so it chooses a loop join-based plan to minimize operator startup costs. If you look at the plan, you can see that it selects the same plan that was used for query #2. Unfortunately, in this case the results are disastrous. The query plan slogged through 100 million rows (10,000 * 10,000) looking for a match for the join, and it never found one. The Top operator in this plan reflects the row goal of 1.

    Rows      EstRows  Executes  EstExecs StmtText
   --------- -------- --------- --------- -------------------------------------------------------------
   0         1        1         1         |--Top(TOP EXPRESSION:((1)))
   0         1        1         1             |--Nested Loops(Inner Join, WHERE:([t2].[c1]=[t1].[c1]))
   10000     1        1         1                 |--Table Scan(OBJECT:([tempdb].[dbo].[even] AS [t1]))
   100000000 10000    1         2                 |--Table Scan(OBJECT:([tempdb].[dbo].[odd] AS [t2]))

What happened here? This mess is caused by a combination of things:

  • The “TOP 1” told the query optimizer that it should choose a plan that is likely to return a single row as fast as possible, rather than optimizing for return of all rows that could satisfy the join.
  • The data in the [even] and [odd] tables exhibit negative correlation. That is, the values in [even] are never going to match a value in [odd], even though the data distribution is similar. (If you can make a prediction about the data in one column based on the data in another column, you can say that those columns are correlated.)
  • There are a few exceptions, but in general you can say that the query optimizer is ignorant of data correlation. In this case, the implication is that the QO can’t know at compile time that it will never find a value in [even] that matches a value in [odd].
  • Where the data distributions described by SQL’s statistics indicate overlapping data ranges, the optimizer is generally optimistic, and assumes that it will find a match if it joins those two sets.

The last item is key. It means that there is a fundamental assumption of positive correlation built into join cardinality estimation. Usually when you join two sets this is the right assumption. For example, consider:

     SELECT * FROM Employee
    INNER JOIN Manager ON Employee.ManagerId = Manager.Id

Every employee’s [ManagerId] field will generally refer to a row that exists in the [Manager] table. But occasionally you join two sets together expecting to get no or few matches even though the high-level data distribution looks similar. The example in this post is artificial, but there are plenty of real world examples. Consider a data cleansing routine that is looking for illegal values in some data imported from an untrusted source.

     SELECT * FROM ImportedData
    WHERE RegionCode IN (SELECT RegionCode FROM DisallowedRegions)

That query by itself won’t hit this problem because there’s no row goal. But combine it with an EXISTS so that the QO sets an implied row goal of 1, and you may find yourself waiting for a while:

     IF EXISTS (
        SELECT * FROM ImportedData
        WHERE RegionCode IN (SELECT RegionCode FROM DisallowedRegions))
        RAISERROR ('Found some bad data', 16, 1)

This is by-design behavior. While the optimizer is pretty smart, the hypothetical world it models to come up with cost estimates doesn’t always capture all of the nuances of real world data. Usually these simplifying assumptions are good, because they allow SQL to produce good plans without taking too long to compile. But sometimes the simplified model leads to suboptimal plans. It pays to know the optimizer’s limits, so that you can quickly recognize these cases during query tuning.


(cross-posted here)