The Read Ahead that doesn’t count as Read Ahead
As you may know from having read the documentation, whitepapers, or personal synthetic experiments, SQL Server’s Storage Engine supports a performance optimization mechanism named read-ahead (RA for short). Its aim is anticipating the data and index pages needed to fulfill a query execution plan and bringing those pages into the buffer pool (cache) before they are actually used by the query. This allows computation and I/O to overlap, taking full advantage of both the CPU and the disk.
The mechanism varies depending on what type of object, a heap or an index, you are accessing. For example, when a Heap is scanned, it must be 16 pages or more so that the storage engine considers prefetching (just another name for reading ahead) its pages into the buffer pool. This reduces the number of potential waits that would occur if none of the referred pages were cached when they were actually accessed by the query. The reason why that minimum number of 16 pages must be met is because setting up the read-ahead mechanism puts some overhead that is not worth assuming for such small tables. Those pages prefetched through this mean are accounted into the particular statistics for the query that brought them into cache. Those stats can be seen in the “read-ahead reads” output item produced when STATISTICS IO is enabled for the session running the query. And they are also globally counted for in the Buffer Manager->Readahead pages/sec performance counter.
But what I wanted to talk about in this post, was another prefetching mechanism that can be triggered by any query when SQL is running any of the editions considered Enterprise (i.e. Developer, Evaluation, and Enterprise itself). The aim of this optimization is to warm up the cache as quickly as possible. To do so, the buffer pool converts any request to read a single page from disk into a request that will read the whole extent containing the page initially requested.
It is important to notice that even if you are running an Enterprise edition, SQL doesn’t apply this optimization every time it reads a page from disk. When the buffer pool cannot grow more, because it has reached its maximum size (Buffer Manager’s Total Pages reaches Target Pages), and there are less than 2048 free pages it won’t do this aggressive cache filling. In any other case, it will apply it. So this is not something that will only occur after a service recycle. It could happen that you detach a database whose pages were occupying an important percentage of the buffer pool. Once the database is detached, those buffers are marked as free, and any subsequent singleton read might be converted into an 8-pages read if the new number of free pages exceeds the hard limit of 2048.
Now, one important thing to know is that these pages are not accounted as read-ahead pages, neither in the personal statistics of the query that triggered its read from disk (STATISTICS IO) nor in the global performance counter Buffer Manager->Readahead pages/sec.
Actually, in the IO statistics of the particular query only the single page requests will be counted. And the global performance counter that will expose the total number of pages read by this ramp up mechanism is Buffer Manager->Pages read/sec.
This is the reason why if you run the following example in an Enterprise edition, and it happens that any of the allocation pages (IAM and the PFS covering it) that are read before the data page that needs to be read is identified, are contained in the same extent as the data page, by the time the query requests the data page from the buffer pool it will already be in memory and won’t need to go to disk to get it. So, in my case, I create a table t1 an inserted one row in it. Its IAM page happened to be page 1:90 and the data page happened to be 1:89. After having issued a checkpoint and purged all clean buffers from the cache, when I run the SELECT statement that scans the whole heap it will, first of all, have to get the IAM page from the buffer pool. Since that IAM page is not in memory, it has to read it from disk and since, in my case, I’m running an Enterprise edition and the sum of current free pages plus the number of pages my buffer pool can still grow until it reaches its target is way beyond 2048, it decides to read the whole extent that the IAM page (1:90) is part of. That extent is identified as extent 1:88, and contains the range of pages from 1:88 through 1:95. As you can see, our data page is part of that range, so it will be prefetched. Therefore, by the time the query asks the buffer pool for page 1:89 to read the rows in it, it won’t have to go to disk to read it because it is in memory already.
Warning: Don’t run this example in a production server because the DBCC DROPCLEANBUFFERS could cause a big and undesired performance impact.
create table t1 (c1 int, c2 char(4000))
insert into t1 values (1, REPLICATE('x', 4000))
set statistics io on
select * from t1
set statistics io off
So when I run the previous SELECT in my Enterprise edition with plenty of memory to meet the conditions described before, and the allocation layout is just the one I mentioned in the previous paragraph, it returns the following IO statistics:
Table 't1'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
While if I run it when some of the previous conditions are not met, the data page is not prefetched and when the query requests it from the buffer pool, it will have to be read from disk at that point:
Table 't1'. Scan count 1, logical reads 1, physical reads 1, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
This optimization is available and enabled by default in Enterprise editions of the product. What I was not aware of until today, and probably many of you didn’t know about it either, is that beginning with SQL Server 2005 Service Pack 1 – and that includes SQL Server 2008 and SQL Server 2008 R2, both of them from their RTM releases – lower editions of SQL Server (i.e. Standard, or even Express) can also benefit from this performance booster if you enable globally traceflag 840.