Filtered Indexes and Forced Parameterization: Can't we all just get along?
We have a database here that stores information about Azure SQLDB health issues; it was chugging along fine for months, then earlier this week it started to struggle. You can see the large increase in steady-state CPU consumption:
After asking around, we learned that forced parameterization had been enabled on this database around the time that the performance problem appeared.
ALTER DATABASE HealthDB SET PARAMETERIZATION FORCED
We could have disabled the feature, but it had been enabled for a good reason. We wanted to understand what was going on so we could look for other solutions. QDS and Query Performance Insight made it easy to identify the problem query.
The query was trivial, barely more complicated than this:
SELECT * FROM CurrentHealthReports WHERE ProjectToCluster = 1
QDS told us this query was scanning the whole table on each execution, just to return a row or two. The table is about 125 million rows and 600GB. That's hardly one for the record books, but it's still large enough that a full table scan is painful.
|--Parallelism(Gather Streams) |--Clustered Index Scan(OBJECT:([CurrentHealthReports].[PKCurrentHealthReports]), WHERE:([CurrentHealthReports].[ProjectToCluster]=[@0]))
This plan choice took me by surprise, because there are only a tiny number of rows with ProjectToCluster = 1, and there's a nonclustered index that has ProjectToCluster as its leading column. A seek on this index would have been much more efficient than the current scan-based plan. And, in fact, QPI also showed us that up until Tuesday this query had been using a much more efficient plan that did a seek on this index. (I have to say: I have a giant crush on QDS and the analysis features built on top of it. The job of identifying problem queries and good vs. bad plan choices used to be so much more difficult!)
|--Nested Loops(Inner Join, OUTER REFERENCES:([CurrentHealthReports].[PKCol])) |--Index Seek(OBJECT:([CurrentHealthReports].[idx_NotYetProjected]), SEEK:([CurrentHealthReports].[ProjectToCluster]=(1)) ORDERED FORWARD) |--Clustered Index Seek(OBJECT:([CurrentHealthReports].[PKCurrentHealthReports]), SEEK:([CurrentHealthReports].[PKCol]=[CurrentHealthReports].[PKCol]) LOOKUP ORDERED FORWARD)
My first thought was that the optimizer must be over-costing the index seek plan for some reason. I tried to force the old seek plan, so I could explore the reason for the costing error. I was expecting a bad cardinality estimate; instead, I was surprised to see it fail:
SELECT * FROM CurrentHealthReports WITH (INDEX = idx_NotYetProjected) WHERE ProjectToCluster = 1 Msg 8622, Level 16, State 1, Line 48 Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.
Now we knew the superficial reason that the optimizer wasn't choosing this plan: it thought that the plan was illegal. But the reason why the plan was illegal stumped us for a while. Eventually, I looked more closely at the index definition. The fact that it was a filtered index stood out. A filtered index has a WHERE clause, and its B-tree only indexes rows that satisfy the filter condition. Filtering is a smart choice for this index. But I was suspicious because we were seeing unusual behavior when forcing the index, and most indexes aren't filtered.
CREATE INDEX idx_NotYetProjected ON CurrentHealthReports (ProjectToCluster) WHERE (ProjectToCluster = 1)
After reviewing the plans again, we noticed that the query was parameterized. The parameter name "@0" is typical of an auto-parameterized query:
select * from CurrentHealthReports where ProjectToCluster = @0
Now the 8622 error made sense: after forced parameterization was enabled, the predicate "WHERE ProjectToCluster = 1" gets turned into "WHERE ProjectToCluster = @0". The query plan that is generated and cached for this query needs to be able to work for any possible value of @0 -- after all, that's the whole point of forced parameterization. But the index we expected SQL to use is filtered, and it can only be used to find rows where ProjectToCluster = 1. Any plan that relies on this index could never be used for arbitrary values of the parameter. When we tried to force the use of the index, the 8622 was SQL's way of telling us that the plan we were asking for wasn't legal. And, of course, this also explained why the optimizer was no longer choosing the index seek query plan on its own.
Here's the lesson I'm taking away from this: The presence of filtered indexes should make you cautious about enabling forced parameterization at the database level. The two features don't always play nicely, and the combination can cause query performance problems and/or 8622 errors. To fix the query perf problem, we considered disabling forced parameterization:
ALTER DATABASE HealthDB SET PARAMETERIZATION SIMPLE
In the end, we decided to leave forced parameterization enabled. Instead, we disabled the feature for this particular query with an OPTION (PARAMETERIZATION SIMPLE) hint.
DECLARE @stmt nvarchar(max); DECLARE @params nvarchar(max); EXEC sp_get_query_template N'SELECT * FROM CurrentHealthReports WHERE ProjectToCluster = 1', @stmt OUTPUT, @params OUTPUT; EXEC sp_create_plan_guide N'PlanGuide_HealthNotYetProjectedQuery', @stmt, N'TEMPLATE', NULL, @params, N'OPTION(PARAMETERIZATION SIMPLE)'
The PARAMETERIZATION SIMPLE hint can only be applied via a plan guide like the one shown above. A slightly less surgical alternative would be an OPTION(RECOMPILE) query hint, which happens to have the side effect of disabling forced parameterization for the query. This one can be applied by a change to the query text.
SELECT * FROM CurrentHealthReports WHERE ProjectToCluster = 1 OPTION (RECOMPILE)
If you want to play with this scenario yourself, I've uploaded a simplified version of it here.
(Cross-posted to personal blog at https://bartduncansql.wordpress.com/2018/02/08/filtered-indexes-and-forced-parameterization)