The primary purpose of a database is to store and retrieve data, so performing a lot of disk reads and writes is one of the inherent attributes of a database server. Disk I/O operations consume many resources and take a relatively long time to complete. Much of the logic in relational database software concerns making the pattern of I/O usage highly efficient.
Microsoft® SQL Server™ allocates much of its virtual memory to a buffer cache and uses the cache to reduce physical I/O. Data is read from the database disk files into the buffer cache. Multiple logical reads of the data can be satisfied without requiring that the data be physically read again. The data remains in the cache until it has not been referenced for some time and the database needs the buffer area to read in more data. Data is written back to disk only if it is modified. Data can be changed multiple times by logical writes before a physical write transfers the new data back to disk.
The data in a SQL Server version 7.0 database is stored in 8 KB pages. Each group of eight contiguous pages is a 64 KB extent. The buffer cache is also divided into 8 KB pages.
SQL Server I/O is divided into logical and physical I/O. A logical read occurs every time the database server requests a page from the buffer cache. If the page is not currently in the buffer cache, a physical read is then performed to read the page into the buffer cache. If the page is currently in the cache, no physical read is generated; the buffer cache simply uses the page already in memory. A logical write occurs when data is modified in a page in memory. A physical write occurs when the page is written to disk. It is possible for a page to remain in memory long enough to have more than one logical write made before it is physically written to disk.
One of the basic performance optimization tasks for a SQL Server installation involves sizing the SQL Server memory. The goal is to make the buffer cache large enough to maximize the ratio of logical reads to physical reads, but not so large that excessive memory swapping starts generating physical I/O to the pagefile. SQL Server 7.0 does this automatically when running with its default configuration settings.
By maintaining a relatively large buffer cache in virtual memory, SQL Server can significantly reduce the number of physical disk reads it requires. After a frequently referenced page has been read into the buffer cache, it is likely to remain there, completely eliminating further reads.
SQL Server 7.0 uses two Microsoft Windows NT® features to improve its disk I/O performance:
Before scatter-gather I/O was introduced in Windows NT version 4.0 Service Pack 2, all of the data for a disk read or write on Windows NT had to be in a contiguous area of memory. If a read transferred in 64 KB of data, the read request had to specify the address of a contiguous area of 64 KB of memory. scatter-gather I/O allows a read or write to transfer data in to or out of discontiguous areas of memory.
If SQL Server 7.0 reads in a 64 KB extent, it does not have to allocate a single 64 KB area and then copy the individual pages to buffer cache pages. It can locate eight buffer pages, then do a single scatter-gather I/O specifying the address of the eight buffer pages. Windows NT places the eight pages directly into the buffer pages, eliminating the need for SQL Server to do a separate memory copy.
In an asynchronous I/O, an application requests a read or write operation from Windows NT. Windows NT immediately returns control to the application. The application can then perform additional work, and later test to see if the read or write has completed. By contrast, in a synchronous I/O, the operating system does not return control to the application until the read or write completes. Using asynchronous I/O allows SQL Server to maximize the work done by individual threads while they are processing a batch.
SQL Server supports multiple concurrent asynchronous I/O operations against each file. The maximum number of I/O operations for any file is controlled by the max async io configuration option. If max async io is left at its default of 32, then a maximum of 32 asynchronous I/O operations can be outstanding for each file at any point in time. The default setting should be sufficient for almost all environments.