INF: How to Troubleshoot Long Recovery on Databases
ID: Q170115
|
The information in this article applies to:
-
Microsoft SQL Server versions 6.0, 6.5
SUMMARY
Database recovery during SQL Server startup or loading transaction logs may
take a long period of time, during which the state of recovery is not easy
to determine. This article describes steps to troubleshoot this scenario.
MORE INFORMATION
When SQL Server recovers a database, transactions can roll forward or back.
You can reduce the amount of time required to roll forward transactions by
performing a normal shutdown. However, you cannot control the amount of
time required to roll back transactions. So, if a long running query is
aborted by shutting down SQL Server, the rollback of the transaction may
cause the recovery of the database to exceed the recovery interval setting
time.
During a normal shutdown, SQL Server checkpoints all databases where the
recovery interval is exceeded. SQL Server is normally shut down by the
following means:
- Stopping the service with SQL Service Manager, SQL Enterprise Manager, or Control Panel Services.
- Shutting down Windows NT.
- Running the net stop msssqlserver command.
- Running the SHUTDOWN query.
SQL Server is shut down without checkpointing databases by the following
means:
- Restarting the computer.
- Quitting SQL Server with CTRL+C when it was started from the command
line.
- Killing the SQL Server process by using a KILL command or the Windows NT Task Manager.
- Attaching to SQL Server with a debugger and closing the debugger.
- SQL Server encountering an exception that it does not handle (normally Dr. Watson would report this situation).
- Running the SHUTDOWN WITH NOWAIT query.
Additionally, loading transaction logs after loading a database dump may
also cause long recovery.
In a long recovery situation, there are three possibilities of transactions
that need to roll forward or back:
- Many small transactions
- One large transaction
- A combination of either of the above
In these cases, it is difficult to determine the amount of time that
recovery will take on a database. However, if you perform the following
steps, you can see that SQL Server is progressing in the database recovery:
- Determine the spid that is running recovery for the database by
examining the errorlog for a message such as the following (note that in
the example below, the spid is 10):
spid10 Recovering database 'pubs'
- From this point, there are three simple ways to check that database
recovery is still occurring:
- Look for the recovery spid number from step one in the sysprocesses table or sp_who output. If the spid is not in the output, the database recovery has not completed.
- Look at the status column of the sysdatabases system table to see if the "Database not recovered yet" status bit (64) is still enabled and that the "Database is suspect" status bit (256) is disabled. If so, the database recovery has not completed.
- Examine the error log for a transaction summary from the recovery
spid. If it is not there, the database recovery has not completed.
The following is an example of the transaction summary:
spid10 1 transactions rolled forward in dbid 4.
- If database recovery has not completed yet, examine the error log for the checkpoint message from the recovery spid. This message means that the database has entered the active part of recovery, where transactions are rolled forward and back. The following is an example:
spid10 Recovery dbid 4 ckpt (1128,31) oldest tran=(1128,0)
- Normally, during the active part of recovery, the hard disk drive that contains the database's transaction log will see heavy activity. You can see this activity by either looking at the drive activity light on the computer or by using Windows NT Performance Monitor and looking at the "LogicalDisk: % Disk Time" counter, if the disk counters were already enabled with the DISKPERF -Y command.
- Enable trace flag 3412, which reports when each transaction is rolled forward or back, and examine the error log for progress. However, you will not see any progress if SQL Server is rolling a large transaction forward or back. Additionally, this trace flag duplicates the sp_configure setting Recovery flags. The following is an example of the output:
spid10 Roll forward transaction 'user_transaction' in dbid 4.
If recovery still seems to be progressing, or if it cannot be determined,
you have the following three options:
- Allow recovery to continue until it eventually completes.
- Stop recovery and load the database from a backup.
- Stop recovery and truncate the transaction log. For additional information, please see the following article in the Microsoft Knowledge Base:
Q165918 INF: Bypass (Emergency) Mode and DUMP TRANSACTION WITH NO_LOG
Additional query words:
delay extended very period errorlog
Keywords : kbtshoot kbusage SSrvAdmin SSrvGen
Version : winnt:6.0,6.5
Platform : winnt
Issue type : kbhowto kbinfo
|