The EMC Disk I/O Tuning Scenario

For those implementing SQL Server database systems on the unique EMC Symmetrix Enterprise Storage Systems, some disk I/O balancing methods can help avoid disk I/O bottleneck problems and maximize performance.

Symmetrix storage systems contain up to 16 GB of RAM cache and contain internal processors within the disk array that help speed the I/O processing of data without using host server CPU resources. You must understand the four components in the Symmetrix storage system to balance disk I/O. One component is the 16-GB cache inside the Symmetrix. Up to 32 SA channels can be used to cable up to 32 SCSI cards from Windows NT-based host servers into the Symmetrix; all of these SA channels can be requesting data simultaneously from the 16-GB cache. Within the Symmetrix storage system, there are up to 32 connectors called DA controllers (internal SCSI controllers) that connect all of the internal disk drives within the Symmetrix into the 4-GB internal cache. And finally, there are the hard disk drives within the Symmetrix.

EMC hard disk drives are SCSI hard disk drives with the same I/O capability of the other SCSI drives referred to in this chapter. One feature commonly used with EMC technology is referred to as hyper-volumes. A hyper-volume is the logical division of an EMC hard disk drives, so the hyper-volume looks like another physical drive to the Windows NT Disk Administrator and can be manipulated with Windows NT Disk Administrator like any other disk drive. Multiple hyper-volumes can be defined on each physical drive. You should work closely with EMC field engineers to identify how hyper-volumes are defined when conducting database performance tuning on EMC storage. You can overload a physical drive with database I/O if you think two or more hyper-volumes are separate physical drives but actually are two or more hyper-volumes on the same physical drive.

SQL Server I/O activities should be divided evenly among distinct DA controllers because DA controllers are assigned to a defined set of hard disk drives. DA controllers are not likely to suffer an I/O bottleneck, but the set of hard disk drives associated with a DA controller may be more susceptible. SQL Server disk I/O balancing is accomplished the same way with DA controllers and their associated disk drives as with other vendors disk drives and controllers.

To monitor the I/O on a DA channel or on separate physical hard disk drives, get help from EMC technical support staff, because I/O activity occurs beneath the EMC internal cache and is not visible to Performance Monitor. EMC storage units have internal monitoring tools that allow an EMC technical support engineer to monitor I/O statistics within the Symmetrix. Performance Monitor can see I/O coming to and from an EMC storage unit only by the I/O coming from an SA channel. This is enough information to indicate that a SA channel is queuing disk I/O requests, but is not enough information to determine the disk or disks that are causing the disk queuing. If an SA channel is queuing, the bottleneck may be caused by the disk drives, not by the SA channel. One way to isolate the disk I/O bottleneck between the SA channels and the DA channels and drives is to add a SCSI card to the host server and connect it to another SA channel. If Performance Monitor indicates that I/O across both SA channels has not changed in volume and disk queuing is still occurring, then the SA channels are not causing the bottleneck. Another way to isolate the I/O bottleneck is to have an EMC engineer use EMC monitoring tools to monitor the EMC system and analyze the drives or DA channels that are bottlenecking.

Divide SQL Server activities evenly across as many disk drives as are available. When working with a smaller database that sustains a large amount of I/O and resides on EMC hardware using hypervolumes, you should continually monitor hyper-volumes definitions. Observing SQL Server activity will help you avoid overloading multiple hyper-volumes on one disk. For example, suppose the SQL Server consists of a 30-GB database. EMC hard disk drives can provide up to 23 GB in capacity, so you can fit the entire database onto 2 drives. For manageability and cost, this is appealing; but for I/O performance, it is not. An EMC storage unit can work with more than 100 internal drives. Involving only 2 drives for SQL Server can lead to I/O bottlenecks. Consider defining smaller hyper-volumes, perhaps of 2 GB each. Then approximately 12 hyper-volumes can be associated with a given 23-GB hard disk drive. Assuming 2-GB hyper-volumes, 15 hyper-volumes are required to store the database. Make sure that each hyper-volume is associated with a separate physical hard disk drive. Do not use 12 hyper-volumes from 1 physical drive and then 3 hyper-volumes associated on another physical drive, because this is the same as using 2 physical drives (150 nonsequential I/O / 300 sequential I/O across the two drives). But with 15 hyper-volumes, each of which is associated with a separate physical drive, SQL Server uses 15 physical drives for providing I/O (1125 nonsequential / 2250 sequential I/O activity per second across the 15 drives).

Also consider employing several SA channels from the host server to divide the I/O work across controllers for host servers that support more than a single PCI bus. Consider using one SA channel per host server PCI bus to divide I/O work across PCI buses as well as SA channels. On EMC storage systems, each SA channel is associated with a specific DA channel and a specific set of physical hard disk drives. Because SA channels read and write their data to and from the EMC internal cache, it is unlikely the SA channel is a point of I/O bottleneck. Because SCSI controller bottlenecks are not likely, it is probably best to concentrate on balancing SQL Server activities across physical drives rather than focus on how many SA channels to use.