Monitoring Disk Activity

Microsoft® SQL Server™ uses Microsoft Windows NT® I/O calls to perform disk reads and writes. SQL Server manages when and how disk I/O is performed, but relies on Windows NT to perform the underlying I/O operations. The I/O subsystem includes the system bus, disk controller cards, disks, tape drives, CD-ROM drive, and many other I/O devices. The disks are frequently the biggest bottleneck in a system.

Monitoring Disk I/O and Detecting Excess Paging

Two of the counters that can be monitored to determine disk activity include:

The PhysicalDisk: % Disk Time counter in Windows NT Performance Monitor monitors the percentage of time that the disk is busy with read/write activity. If the PhysicalDisk: % Disk Time counter is high (more than 90 percent), check the Physical Disk: Current Disk Queue Length counter to see how many system requests are waiting for disk access. The number of waiting I/O requests should be sustained at no more than 1.5 to 2 times the number of spindles making up the physical disk. Most disks have one spindle, although redundant array of independent disks (RAID) devices usually have more. A hardware RAID device appears as one physical disk in Windows NT Performance Monitor; RAID devices created through software appear as multiple instances.

Use the values of the Current Disk Queue Length and % Disk Time counters to detect bottlenecks within the disk subsystem. If Current Disk Queue Length and % Disk Time counter values are consistently high, consider using a faster disk drive, moving some files to an additional disk or server, or adding additional disks to a RAID array if one is being used.

If you are using a RAID device, the % Disk Time counter can indicate a value greater than 100 percent. If it does, use the PhysicalDisk: Avg. Disk Queue Length counter to determine how many system requests on average are waiting for disk access.

Applications and systems that are I/O-bound may keep the disk constantly active. This is called disk thrashing.

Monitor the Memory: Page Faults/sec counter to make sure that the disk activity is not caused by paging. In Windows NT, paging is caused by:

If you have more than one logical partition on the same hard disk, you should use the Logical Disk counters instead of the Physical Disk counters. Looking at the logical disk counters will help you determine which files are heavily accessed. Once you have found the disks with high levels of read/write activity, look at the read-specific and write-specific counters (for example, Logical Disk: Disk Write Bytes/sec) for the type of disk activity that is causing the load on each logical volume.

The PhysicalDisk: Avg. Disk Queue Length counter monitors the number of outstanding requests on the disk. Numbers that are consistently higher on one disk than other disks might indicate the need to redistribute data. This counter’s value is affected by the max async IO configuration option. If the counter value is consistently less than the configuration option value, it is an indication that the disk subsystem has enough capacity to handle the batched I/O being generated by SQL Server. In this situation, increasing the max async IO configuration value can improve performance. Conversely, if the counter is consistently greater than the configuration option value, then the disk subsystem is not keeping up with the I/O requests. If the overload is being caused by SQL Server I/O, then reducing the configuration value may be beneficial.


Note Because disk counters can increase disk access time on some older computers, Windows NT does not automatically activate the counters at system startup. To use these disk I/O counters, you must run diskperf -y (or diskperf -ye for RAID devices) from the command prompt. Then, you must shut down and restart the computer. For more information, see your Windows NT documentation.


Isolating Disk Activity Created by SQL Server

To determine the amount of I/O generated by SQL Server components, examine the following performance areas:

The number of page reads and writes that SQL Server performs can be monitored using the SQL Server: Buffer Manager Page Reads/sec and Page Writes/sec counters. If these values start to approach the capacity of the hardware I/O subsystem, you need to try and reduce the values by either tuning your application or database to reduce I/O operations (such as index coverage, better indexes, or normalization) or increasing the I/O capacity of the hardware.

  


(c) 1988-98 Microsoft Corporation. All Rights Reserved.