INF: How to Improve DBCC Performance on SQL ServerLast reviewed: April 3, 1997Article ID: Q140569 |
The information in this article applies to:
SUMMARYThe DBCC (DataBase Consistency Checker) utilities are a collection of programs used to verify integrity of a SQL Server database. They are conceptually similar to file system checking programs such as CHKDSK in MS- DOS, Windows 95, and Windows NT, and fsck in UNIX. Like file system checking programs, DBCC can take a significant amount of time to run on large data sets. This article discusses ways to improve DBCC performance. References to 6.0 also apply to 6.5 unless otherwise noted.
MORE INFORMATION
Overview of DBCCMost file systems and database systems provide integrity checking utilities to periodically verify data structure consistency. For example the Rdb Management Utility (RMU) performs functions similar to DBCC on the Oracle Rdb database. Integrity checking utilities generally have a long running time when used on large data sets, and DBCC is no different. However there are several actions which can maximize the performance and minimize the impact of DBCC. First, a review of the DBCC utilities most commonly used for integrity checking: NEWALLOC and CHECKTABLE/CHECKDB. CHECKDB performs the same checks as CHECKTABLE, but for each table in the database in succession. NEWALLOC checks the integrity of all table and index page chains, and checks that each page is correctly allocated. It checks the integrity of all extent chains in the database, and verifies that each chain is correctly linked. NEWALLOC also checks for object ID consistency between the page, the extent, and sysindexes. In general, NEWALLOC checks things at a page and inter-page level. CHECKTABLE/CHECKDB by contrast spends most of its time checking things at an intra-page level. In addition to checking the page chain, it checks that indexes are in properly sorted order, that the data information inside each page is reasonable, that page offsets are reasonable, that the sysindexes entries for the table are correct, and that the data row count equals the leaf row count for non-clustered indexes. DBCC NEWALLOC and CHECKDB perform largely complementary checks and should generally both be run to get a comprehensive check of the database. By necessity DBCC is I/O intensive. For NEWALLOC, running time will be roughly proportional to the database space in use. This is because most of NEWALLOC's time is spent doing sequential reads. Time for this will scale linearly with the size of the database. For this same reason NEWALLOC can be significantly faster on version 6.0 because the parallel read-ahead facility increases the scanning rate. Running time for CHECKDB is more dependant on the number of indexes in the database and number of rows per table, so the time cannot be predicted by the database size alone. Version 6.0 uses parallel threads for CHECKTABLE, one to check each index. When combined with the 6.0 read-ahead facility, on computers with sufficient memory, CHECKTABLE/CHECKDB can be significantly faster on 6.0 than on 4.2. The amount of memory needed to ensure this would be approximately the amount where the SQL page cache (visible with dbcc memusage) is larger than the largest individual table in the database. On computers with less memory than this, you may want to experiment with disabling read ahead for the CHECKTABLE/CHECKDB session by using trace flag 653. You can disable parallel CHECKTABLE by using trace flag 2508. Setting the sp_configure "RA worker threads" parameter to 0 disables both read ahead and parallel CHECKTABLE. See the version 6.0 online documentation for details on how to use trace flags. for Tests indicate that SQL Server 6.0 can run DBCC NEWALLOC on a 2 GB database in less than 30 minutes, when running on a typical industry standard symmetric multiprocessor (SMP) platform. As for concurrency impact of DBCC, NEWALLOC and CHECKDB are both online checks in that the database can be in use while they run. There is concurrency impact, but in many cases this is acceptable. NEWALLOC impact is primarily the I/O and CPU load from the check. In addition to the I/O and CPU load, CHECKDB places a shared table lock on each table while it's being checked. While allowing SELECTs, this will prevent modification of the table until CHECKDB moves on to the next table in the database. The share lock is necessary to achieve a reliable check. It is usually best to run NEWALLOC when update activity in the database is at a minimum, since there is a possibility of spurious errors caused by in- progress changes during the check.
Specific Steps to Minimize DBCC Impact and Increase PerformanceGeneral Advice:
One good way to increase I/O capacity is by placing the database on a large RAID disk array, using as many physical drives as possible. (if you have text/image data) in conjunction with your database dumps. This verifies the state of the database at the time of the dump. However, if time does not permit all of these checks, the best single check to run is NEWALLOC. It is typically faster than CHECKDB, and provides a good overall check of the database. It is better to run NEWALLOC alone than to run no checks at all. NOTE: 6.5 Service Pack 2 NEWALLOC has been enhanced to avoid most spurious errors when running in multi-user mode. It is likely that you can avoid the need to place the database in single-user mode for NEWALLOC simply by running 6.5 Service Pack 2 or later. See the 6.5 Service Pack 2 readme file for details. consider running NEWALLOC as the more frequent check, owing to the non- blocking nature and quicker running time it provides. Then on a less- frequent basis, supplement this with CHECKDB.
DBCC CHECKDB(MYDB, NOINDEX) A DBCC run on the backup computer is just as valid and effective a test as when run on the main server. If you do not have a hot backup server, the combined benefits of the safety net it provides plus the DBCC solution can be strong arguments for getting one.How to Run CHECKTABLE While In Production: Sometimes you may need to check a specific table or its indexes. On large memory computers, there is a technique that can sometimes allow this with very little concurrency impact, even while the computer is at modest production activity levels. It basically consists of pre-loading the page cache with the table and index contents using a non-blocking SELECT with the NOLOCK optimizer hint, then checking a single index at a time to minimize the time the table is share locked. In some cases, it's possible to check a 200 MB table in less than 60 seconds using this technique. The following are the steps needed to achieve this:
and observe what queries are active. SELECT SPID, STATUS, HOSTNAME, PROGRAM_NAME, CMD, BLOCKED FROM MASTER..SYSPROCESSES WHERE STATUS <> 'SLEEPING' Table and index size in 2048-byte pages can be seen by issuing this query, where TAB1 represents your table name: SELECT NAME, INDID, DPAGES FROM SYSINDEXES WHERE ID=OBJECT_ID("TAB1") Execute DBCC MEMUSAGE and inspect the size of the returned "Page Cache" value. Only if the sum of your table and index size is less than the page cache size is it possible to use this technique. Ideally, page cache should be significantly larger to allow servicing of other requests during execution of DBCC, without depleting the cache holding the table being checked. NOLOCK and INDEX optimizer hints to ensure a non-blocking SELECT and that each index is covered. This could take several minutes, depending on table size and computer I/O bandwidth. select * from tab1 (nolock index=0) where key_id < 0 select * from tab1 (nolock index=1) where key_id < 0 select * from tab1 (nolock index=2) where key_id < 0 After all the data is pre-loaded, you can usually re-scan the entire table with one of the above commands within a few seconds, during which CPU should be high and I/O almost 0. Do this to verify the data is still in cache. (depending on the computer speed and table size) CHECKTABLE will finish checking a single index within 30 seconds to two minutes. For example: DBCC CHECKTABLE(TAB1, 2) Then proceed and check the other indexes on the table.
SELECT SPID, STATUS, HOSTNAME, PROGRAM_NAME, CMD, BLOCKED FROM MASTER..SYSPROCESSES WHERE BLOCKED > 0 Conclusion: Integrity checking programs for file systems or databases tend to have long running times when applied to large data sets. Performance of SQL Server 6.0 DBCC has been significantly improved via read ahead and parallel index checking. When SQL Server 6.0 is run on an adequately equipped platform, and the techniques mentioned in this article are used, it is usually possible to minimize the impact of DBCC to a reasonable level. Specific techniques include running only the necessary DBCC utilities, understanding the concurrency impact of each utility and hence when during the day it can be run, running DBCC on a "hot backup" server, and pre-loading the page cache prior to checking a specific table.
|
Additional query words: 6.00 sql6 winnt check chk DBCC performance
© 1998 Microsoft Corporation. All rights reserved. Terms of Use. |