UPDATE STATISTICS undocumented options

Article
07/21/2006

If you read the Books Online page describing the UPDATE STATISTICS command, you will see that there are some undocumented options.

UPDATE STATISTICS table | view
[
{
{ index | statistics_name }
| ( { index |statistics_name } [ ,...n ] )
}
]
[ WITH
[
[ FULLSCAN ]
| SAMPLE number { PERCENT | ROWS } ]
| RESAMPLE
| <update_stats_stream_option> [ ,...n ]
]
[ [ , ] [ ALL | COLUMNS | INDEX ]
[ [ , ] NORECOMPUTE ]
] ;

<update_stats_stream_option> ::=
[ STATS_STREAM = stats_stream ]
[ ROWCOUNT = numeric_constant ]
[ PAGECOUNT = numeric contant ]

<update_stats_stream_option>

This syntax is for internal use only and is not supported. Microsoft reserves the right to change this syntax at any time.

There is a very good reason why these options are undocumented. They are meant for testing and debugging purposes, and should never ever be used on production systems.

However, these options can also be extremely helpful to create examples and sample scripts to demonstrate various features and behaviors of the Query Optimizer and the related query plan shapes. We decided to explain what ROWCOUNT and PAGECOUNT do, so that to be able to use these commands in examples we’ll be posting in some future. Feel free to use these options on development systems for experimental and educational purposes, but please do not play with fire and do not use them in production!

As the name of these options suggest, ROWCOUNT and PAGECOUNT alter the internal metadata of the specified table or index by overriding the counters containing the row and page counts of the object. These counters are in turn read by the Query Optimizer when processing queries that access the table and/or index in question. These commands can basically cheat the Optimizer into thinking that a table or index is extremely large.

SQL Server’s Query Optimization process is structured in multiple stages. Further optimization stages consider a progressively larger and more sophisticated set of possible tree transformations and query optimizations. Later stages of optimization are only entered when the estimated cost of the query is sufficiently high, in order to avoid wasting precious CPU cycles against simple queries that do not need that level of sophistication anyway. The multiple optimization stages are a mean to produce efficient query plans without consuming excessive amounts of CPU. Typically, in order to make the Optimizer “think” a lot and enter these later stages it is necessary to have big tables with a large number of rows and pages, which in turn take time and space to populate. Using ROWCOUNT and PAGECOUNT allows us to exercise these code paths with relatively simple scripts that do not require an extremely complex setup phase.

Here is an example. When running this simple script on your SQL 2005 instance you will likely see a different query plan for the two selects before and after updating the statistics. The recompile option is used to ensure that the query plans are regenerated. From the statistics profile, you’ll also see very different estimated row counts and consequently costs.

use tempdb

create table t1(i int, j int)

create table t2(h int, k int)

set statistics profile on

select distinct(i) from t1

select * from t1, t2 where i = k order by j + k

update statistics t1 with rowcount = 10000, pagecount = 10000

update statistics t2 with rowcount = 100000, pagecount = 100000

select distinct(i) from t1 option (recompile)

select * from t1, t2 where i = k order by j + k option (recompile)

Enjoy!

Stefano

UPDATE STATISTICS undocumented options

Additional resources