Disk I/O

The foundation for efficient query processing is efficient data transfer between disks and memory. SQL Server 7.0 incorporates many improvements in disk I/O.

Sequential I/O vs. Random I/O

Disk I/O is one of the more expensive computer operations. There are two types of disk I/O: sequential I/O, which reads data in the same order as it is stored on the disk, and random I/O, which reads data in random order, jumping from one location to another on the disk. Random I/O can be more expensive than sequential I/O, particularly when large amounts of data are involved.

Microsoft SQL Server 7.0 maintains an on-disk structure that minimizes random I/O and allows rapid scans of large heap tables. These are tables without a clustered index, meaning that the data rows are not stored in any particular order. This is an important feature for decision support queries. Such disk-order scans can also be employed for clustered and nonclustered indexes if the index’s sort order is not required in subsequent processing steps.

Earlier versions of SQL Server use a page chain, in which each page has a pointer to the next page holding data for the table. This results in random I/O and prevents read-ahead because, until the server reads a page, it does not have the location of the next page.

SQL Server 7.0 takes a different approach. An Index Allocation Map (IAM) maps the pages used by a table or index. The IAM is a bitmap, in disk order, of the data pages for a particular table. To read all the pages, the server scans the bitmap, determining which pages need to be read in what order. It then can use sequential I/O to retrieve the pages and issue read-aheads.

If the server can scan an index rather than reading the table, it will attempt to do so. This is useful if the index is a covering index (one that has all the fields necessary to satisfy the query). The server may satisfy a query by reading the B-tree index in disk order rather than in sort order. This results in sequential I/O and faster performance.

Large I/O

In SQL Server 7.0, all database pages are 8 KB and data is read in 64-KB extents (in earlier releases these figures were 2 KB and 16 KB, respectively). Both of these changes increase performance by allowing the server to read larger amounts of data in a single I/O request. This is particularly important for very large databases and decision support queries, in which a single request can process large numbers of rows.

Scanning Read-Ahead

SQL Server 7.0 takes increased advantage of striped disk sets by reading multiple extents ahead of the actual query processor request. This results in faster scans of heap tables and B-tree indexes.

Prefetch Hints

Disk-order scans speed up scans of large amounts of data. SQL Server 7.0 also speeds up fetching data using a nonclustered index.

When searching for data using a nonclustered index, the index is searched for a particular value. When that value is found, the index points to the disk address. The traditional approach is to issue immediately an I/O for that row, given the disk address. The result is one synchronous I/O per row and, at most, one disk at a time working to evaluate the query. This does not take advantage of striped disk sets.

SQL Server 7.0 takes a different approach. It continues looking for more record pointers in the nonclustered index. When it has collected a number of them, it provides the storage engine with prefetch hints. These hints tell the storage engine that the query processor needs these particular records soon. The storage engine then can issue several I/O requests simultaneously, taking advantage of striped disk sets.