INF: SQL Server and Caching Disk Controllers

ID: Q86903

The information in this article applies to:

Microsoft SQL Server versions 4.2x, 6.0, 6.5, 7.0

SUMMARY

Use of a write caching (also called write back caching) disk controller can improve SQL Server performance. Write caching controllers and disk subsystems are safe for SQL Server, if they are specifically designed for use in a data critical transactional database management system (DBMS) environment. These design features must preserve cached data if a system failure occurs. Using an external uninterruptible power supply (UPS) to achieve this is generally not sufficient, because failure modes that are unrelated to power can occur.

Caching controllers and disk subsystems can be safe for use by SQL Server. Most new purpose-built server platforms that incorporate these are safe. However, you should check with your hardware vendor to be sure that the disk subsystem has been specifically tested and approved for use in a data critical transactional relational database management system (RDBMS) environment.

MORE INFORMATION

SQL Server data modification statements generate logical page writes. This stream of writes can be pictured as going two places: the log and the database itself. For performance reasons, SQL Server defers writes to the database via its own cache buffer system. Writes to the log are only momentarily deferred until COMMIT time. They are not cached in the same manner as data writes. Because log writes for a given page always precede the page's data writes, the log is sometimes referred to as a "write-ahead" log.

Transactional integrity is one of the fundamental concepts of a relational database system. Transactions are considered to be atomic units of work that are either totally applied or totally rolled back. The SQL Server write-ahead transaction log is a vital component in implementing transactional integrity.

Any relational database system must also deal with a concept closely related to transactional integrity, which is recovery from unplanned system failure. A variety of non-ideal, real-world effects may cause this failure. On many database management systems, system failure may result in a lengthy human-directed manual recovery process.

In contrast, the SQL Server recovery mechanism is completely automatic and operates without human intervention. For example, SQL Server could be supporting a mission-critical production application, and experience a system failure due to a momentary power fluctuation. Upon restoration of power, the server hardware would restart, networking software would load and initialize, and SQL Server would restart. As SQL Server initializes, it will automatically run its recovery process based on data in the transaction log. This entire process occurs without human intervention. Whenever the client workstations restarted, users would find all of their data present, up to the last transaction they entered.

SQL Server transactional integrity and automatic recovery constitute a very powerful time-and-labor saving capability. If a write caching controller is not properly designed for use in a data critical transactional DBMS environment, it may compromise the ability of SQL Server to recover, hence corrupting the database. This can occur if the controller intercepts SQL Server transaction log writes and buffers them in a hardware cache on the controller board, but does not preserve these written pages during a system failure.

Most caching controllers perform write caching. The write caching function cannot always be disabled.

Even if the server uses a UPS, this does not guarantee the security of the cached writes. Many types of system failures can occur that a UPS does not address. For example, a memory parity error, an operating system trap, or a hardware glitch that causes a system reset can produce an uncontrolled system interruption. A memory failure in the hardware write cache can also result in the loss of vital log information.

Another possible problem related to a write-caching controller may occur at system shutdown. It is not uncommon to "cycle" the operating system or reboot the system during configuration changes. Even if a careful operator follows the operating system recommendation to wait until all disk activity has ceased before rebooting the system, cached writes can still be present in the controller. When the CTRL+ALT+DEL key combination is pressed, or the RESET button is pressed, cached writes can be discarded, potentially damaging the database.

It is possible to design a hardware write cache which takes into account all possible causes of discarding dirty cache data, which would thus be safe for use by a database server. Some of these design features would include intercepting the RST bus signal to avoid uncontrolled reset of the caching controller, on-board battery backup, and mirrored or ERC (error checking & correcting) memory. Check with your hardware vendor to ensure that the write cache includes these and any other features necessary to avoid data loss.

Additional query words: prodsms

Keywords : kb3rdparty kbhw SSrvServer
Version : winnt:4.2x,6.0,6.5,7.0
Platform : winnt
Issue type : kbinfo