读取页Reading Pages

适用于:Applies to: 是SQL ServerSQL Server(所有支持的版本)yesSQL ServerSQL Server (all supported versions) 适用于:Applies to: 是SQL ServerSQL Server(所有支持的版本)yesSQL ServerSQL Server (all supported versions)

SQL Server 数据库引擎Database Engine 实例的 I/O 包括逻辑读取和物理读取。The I/O from an instance of the SQL Server 数据库引擎Database Engine includes logical and physical reads. 每次 数据库引擎Database Engine缓冲区缓存请求页时都会发生逻辑读取。A logical read occurs every time the 数据库引擎Database Engine requests a page from the buffer cache. 如果页当前不在缓冲区高速缓存中,物理读取将首先将页从磁盘复制到缓存中。If the page is not currently in the buffer cache, a physical read first copies the page from disk into the cache.

数据库引擎Database Engine 实例生成的读取请求由关系引擎控制,并由存储引擎优化。The read requests generated by an instance of the 数据库引擎Database Engine are controlled by the relational engine and optimized by the storage engine. 关系引擎决定最有效的访问方法(例如,表扫描、索引扫描或键读取);存储引擎的访问方法和缓冲区管理器组件确定要执行的读取的常规模式,并对实现访问方法所需的读取进行优化。The relational engine determines the most effective access method (such as a table scan, an index scan, or a keyed read); the access methods and buffer manager components of the storage engine determine the general pattern of reads to perform, and optimize the reads required to implement the access method. 执行批处理的线程将安排读取。The thread executing the batch schedules the reads.

预读Read-Ahead

数据库引擎Database Engine 支持称为“预读”的性能优化机制。The 数据库引擎Database Engine supports a performance optimization mechanism called read-ahead. 预读首先预测执行查询执行计划所需的数据和索引页,然后在查询实际使用这些页之前将它们读入缓冲区高速缓存。Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query. 这样可以让计算和 I/O 重叠进行,从而充分利用 CPU 和磁盘。This allows computation and I/O to overlap, taking full advantage of both the CPU and the disk.

预读机制允许 数据库引擎Database Engine 从一个文件中读取最多 64 个连续页 (512KB)。The read-ahead mechanism allows the 数据库引擎Database Engine to read up to 64 contiguous pages (512KB) from one file. 该读取作为缓冲区高速缓存中相应数量(可能是非相邻的)缓冲区的一次散播-聚集读取来执行。The read is performed as a single scatter-gather read to the appropriate number of (probably non-contiguous) buffers in the buffer cache. 如果此范围内的任何页在缓冲区高速缓存中已存在,当读取完成时,所读取的相应页将被放弃。If any of the pages in the range are already present in the buffer cache, the corresponding page from the read will be discarded when the read completes. 如果相应页在缓存中已存在,也可以从任何一端“裁剪”页的范围。The range of pages may also be "trimmed" from either end if the corresponding pages are already present in the cache.

有两种类型的预读:一种用于数据页,一种用于索引页。There are two kinds of read-ahead: one for data pages and one for index pages.

读取数据页Reading Data Pages

用于读取数据页的表扫描在 数据库引擎Database Engine中非常有效。Table scans used to read data pages are very efficient in the 数据库引擎Database Engine. SQL Server 数据库中的索引分配映射 (IAM) 页列出了表或索引使用的区数。The index allocation map (IAM) pages in a SQL Server database list the extents used by a table or index. 存储引擎可以读取 IAM 以生成必须读取的磁盘地址的排序列表。The storage engine can read the IAM to build a sorted list of the disk addresses that must be read. 这使得存储引擎能够根据要读取的磁盘位置,将其 I/O 操作优化为按顺序执行的大型顺序读取。This allows the storage engine to optimize its I/Os as large sequential reads that are performed in sequence, based on their location on the disk. 有关 IAM 页的详细信息,请参阅 管理对象使用的空间For more information about IAM pages, see Managing Space Used by Objects.

读取索引页Reading Index Pages

存储引擎按键的顺序依次读取索引页。The storage engine reads index pages serially in key order. 例如,下图显示了一组叶级页的简化表示法,该组叶级页包含映射叶级页的键集和中间索引节点。For example, this illustration shows a simplified representation of a set of leaf pages that contains a set of keys and the intermediate index node mapping the leaf pages. 有关索引中页的结构的详细信息,请参阅 聚集索引结构For more information about the structure of pages in an index, see Clustered Index Structures.

Reading_Pages

存储引擎使用高于叶级的中间索引页上的信息为包含键的页安排序列预读。The storage engine uses the information in the intermediate index page above the leaf level to schedule serial read-aheads for the pages that contain the keys. 如果请求针对的是 ABC 到 DEF 之间的所有键,则存储引擎将首先读取高于叶级页的索引页,If a request is made for all the keys from ABC to DEF, the storage engine first reads the index page above the leaf page. 但它并不是仅仅按顺序读取页 504 到页 556(即指定范围内的包含键的最后一页)之间的每个数据页。However, it does not just read each data page in sequence from page 504 to page 556 (the last page with keys in the specified range). 相反,存储引擎将扫描中间索引页并生成必须要读取的叶级页的列表。Instead, the storage engine scans the intermediate index page and builds a list of the leaf pages that must be read. 然后,存储引擎会按键的顺序安排所有读取。The storage engine then schedules all the reads in key order. 存储引擎还会识别出页 504/505 以及页 527/528 是相邻页,并执行一次散播读取,从而在单个操作中检索这些相邻页。The storage engine also recognizes that pages 504/505 and 527/528 are contiguous and performs a single scatter read to retrieve the adjacent pages in a single operation. 如果在一个序列操作中要检索许多页,则存储引擎将一次安排一个读取块。When there are many pages to be retrieved in a serial operation, the storage engine schedules a block of reads at a time. 完成这些读取子集后,存储引擎将安排同等数量的新读取,直到安排完所需的全部读取。When a subset of these reads is completed, the storage engine schedules an equal number of new reads until all the required reads have been scheduled.

存储引擎使用预提取加快非聚集索引的基表查找。The storage engine uses prefetching to speed base table lookups from nonclustered indexes. 非聚集索引的叶级行包含指针,指向含有每个特定键值的数据行。The leaf rows of a nonclustered index contain pointers to the data rows that contain each specific key value. 存储引擎浏览非聚集索引的叶级页时,它也会开始计划异步读取已检索了其指针的数据行。As the storage engine reads through the leaf pages of the nonclustered index, it also starts scheduling asynchronous reads for the data rows whose pointers have already been retrieved. 这可以使存储引擎在完成非聚集索引的扫描之前从基础表中检索数据行。This allows the storage engine to retrieve data rows from the underlying table before it has completed the scan of the nonclustered index. 无论表是否有聚集索引,都会使用预提取。Prefetching is used regardless of whether the table has a clustered index. SQL Server Enterprise 比 SQL Server 其他版本使用更多的预提取,可以预读更多页。SQL Server Enterprise uses more prefetching than other editions of SQL Server, allowing more pages to be read ahead. 在任何版本中都无法配置预提取的级别。The level of prefetching is not configurable in any edition. 有关非聚集索引的详细信息,请参阅 非聚集索引结构For more information about nonclustered indexes, see Nonclustered Index Structures.

高级扫描Advanced Scanning

在 SQL Server Enterprise 中,高级扫描功能可以使多项任务共享完全表扫描。In SQL Server Enterprise, the advanced scan feature allows multiple tasks to share full table scans. 如果 Transact-SQL 语句的执行计划需要扫描表中的数据页,并且 数据库引擎Database Engine 检测到其他执行计划正在扫描该表,则 数据库引擎Database Engine 会在第二个扫描的当前位置将第二个扫描加入第一个扫描。If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the 数据库引擎Database Engine detects that the table is already being scanned for another execution plan, the 数据库引擎Database Engine joins the second scan to the first, at the current location of the second scan. 数据库引擎Database Engine 会一次读取一页,并将每一页的行传递给这两个执行计划。The 数据库引擎Database Engine reads each page one time and passes the rows from each page to both execution plans. 此操作将一直持续到该表的结尾处。This continues until the end of the table is reached.

此时,第一个执行计划已有完整的扫描结果,而第二个执行计划仍必须检索在它加入正在进行的扫描之前读取的数据页。At that point, the first execution plan has the complete results of a scan, but the second execution plan must still retrieve the data pages that were read before it joined the in-progress scan. 然后,第二个执行计划中的扫描将绕回到表的第一个数据页,并从这里向前扫描到它加入第一个扫描时所处的位置。The scan for the second execution plan then wraps back to the first data page of the table and scans forward to where it joined the first scan. 可以按这种方式组合任意数量的扫描。Any number of scans can be combined like this. 数据库引擎Database Engine 将循环遍历数据页,直到完成所有扫描。The 数据库引擎Database Engine will keep looping through the data pages until it has completed all the scans. 这种机制也称为“走马灯式扫描”,说明了为何在没有 ORDER BY 子句的情况下无法保证 SELECT 语句所返回结果的顺序。This mechanism is also called "merry-go-round scanning" and demonstrates why the order of the results returned from a SELECT statement cannot be guaranteed without an ORDER BY clause.

例如,假设某个表有 500,000 页。For example, assume that you have a table with 500,000 pages. UserA 执行了一条 Transact-SQL 语句,要求对该表进行扫描。UserA executes a Transact-SQL statement that requires a scan of the table. 当扫描已处理了 100,000 页时,UserB 执行了另一条 Transact-SQL 语句,要对同一个表进行扫描。When that scan has processed 100,000 pages, UserB executes another Transact-SQL statement that scans the same table. 数据库引擎Database Engine 将为页 100,001 之后的页安排一组读取请求,并将每页中的行同时传递回两个扫描。The 数据库引擎Database Engine schedules one set of read requests for pages after 100,001, and passes the rows from each page back to both scans. 当扫描到 200,000 页时,UserC 执行了另一条 Transact-SQL 语句,要对同一个表进行扫描。When the scan reaches the 200,000th page, UserC executes another Transact-SQL statement that scans the same table. 则从页 200,001 开始, 数据库引擎Database Engine 将把它读取的每一页中的行传递回所有三个扫描。Starting with page 200,001, the 数据库引擎Database Engine passes the rows from each page it reads back to all three scans. 当数据库引擎读取完第 500,000 行之后,UserA 的扫描就完成了,而 UserB 和 UserC 的扫描将绕回到页 1 开始读取。After it reads the 500,000th row, the scan for UserA is complete, and the scans for UserB and UserC wrap back and start to read the pages starting with page 1. 数据库引擎Database Engine 到达页 100,000 时,UserB 的扫描就完成了。When the 数据库引擎Database Engine gets to page 100,000, the scan for UserB is completed. 然后 UserC 的扫描将继续进行,直到它读取完页 200,000。The scan for UserC then keeps going alone until it reads page 200,000. 此时,所有扫描均告完成。At this point, all the scans have been completed.

在没有高级扫描的情况下,每个用户都必须要争用缓冲区空间并因此导致磁盘臂争用。Without advanced scanning, each user would have to compete for buffer space and cause disk arm contention. 然后,会分别为每个用户读取一次相同的页,而不是一次读取并由多个用户共享,这样会降低性能并加重资源负担。The same pages would then be read once for each user, instead of read one time and shared by multiple users, slowing down performance and taxing resources.

另请参阅See Also

页和区体系结构指南 Pages and Extents Architecture Guide
写入页Writing Pages