INF: Recovering from Media Failure in SQL Server

ID Number: Q46479

1.00 1.10 1.11 4.20

OS/2

Summary:

This article provides guidance on how to recover as much data as

possible when the disk media is physically damaged or becomes

unreadable. These procedures are sometimes called "disaster recovery"

or "media failure" procedures.

More Information:

To recover user databases from media failure, re-create the lost

device(s), then restore the affected database(s) from backups.

Anything done after the backups were made must be done again.

To restore a backup, the target database must already exist. The

target database does not have to occupy the same device(s), it only

needs to have enough space allocated to hold what was in that database

at the time it was dumped.

The steps necessary to restore a backup are as follows:

1. Re-create the lost device(s) as follows:

a. Bring up SQL Server in single-user mode.

b. Drop each database that has space allocated on a lost device

using the DBCC DBREPAIR (dbname,DROPDB) command.

The following SQL command will list the affected databases:

SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE

VSTART>=LOW AND VSTART<=HIGH AND

NAME IN ('devname1','devname2',...)

c. Drop lost device(s) with the sp_dropdevice() function.

d. Use the CHECKPOINT command and shut down SQL Server, then

bring up SQL Server in normal mode.

e. Re-create lost device(s) with the DISK INIT command using the

VDEVNO and SIZE of the original device(s).

2. Re-create and reload the affected database(s) as follows:

a. Re-create the affected database(s) using the same size and device

assignment as the original database(s).

b. Assign LOG device(s) if necessary with the sp_logdevice()

function.

c. LOAD the affected database(s) and transaction log(s) from the

latest backups.

Notes

-----

SQL Server version 4.2 allows for the dumping of the transaction log

if the device containing the data portion of a database has been

damaged and the device containing the log portion is still intact. The

transaction log can be dumped using "DUMP TRAN WITH NO_TRUNCATE" and

can then be reapplied to a prior database dump (after all previous

transaction log dumps have been applied) to bring the database back to

the state it was in immediately prior to the media failure.

DBCC will produce error messages about not being able to access the

lost database devices; however, it will still drop the rows from

SYSDATABASES and SYSUSAGES.

If the information required in steps 1e and 2a is not known, it can be

derived from the SYSDEVICES, SYSDATABASES, and SYSUSAGES tables in the

master database.

VDEVNO and device SIZE can be derived from LOW and HIGH in SYSDEVICES

as follows:

VDEVNO is LOW divided by 16777216 (0x01000000)

SIZE is HIGH - LOW (2K blocks)

SIZE and device assignment of databases can be derived from

SYSDEVICES, SYSUSAGES, and SYSDATABASES.

Each chunk of disk space assigned to a database is represented by a

row in SYSUSAGES. The SIZE of each chunk is in the row. The dbid of

the database to which the chunk is assigned is also in the row and the

corresponding database name can be obtained from SYSDATABASES. The

device containing the chunk of disk space can be deduced by noticing

which low/high range in SYSDEVICES contains the VSTART specified in

SYSUSAGES.

Whether or not a device is a log device can be determined from the

value of SEGMAP in SYSUSAGES. A 7 indicates LOG and data, 4 indicates

LOG only, and 3 indicates data only.

Example 1

---------

Assume two user databases in addition to the master database and two

database devices in addition to the master device. Device 1 is 4 MB

and was created first (VDEVNO 1), while device 2 is 6 MB and was

created after device 1 (VDEVNO 2). User database 1 was allocated with

2 MB on device 1, 2 MB on MASTER.DAT, and a 1 MB log on device 2. User

database 2 was allocated with 2 MB on device 2 and 2 MB on device 1.

For the first disaster scenario, assume that user database device 1

has been lost and the other devices are intact. According to the

procedure described above, the following should be performed:

1. Re-create the lost device(s) as follows:

a. Bring up SQL Server in single-user mode (-m option).

b. Determine which databases are affected as follows:

SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE

VSTART>=LOW AND VSTART<=HIGH AND

NAME IN ('device1')

Drop the affected databases:

DBCC DBREPAIR(database1,DROPDB)

DBCC DBREPAIR(database2,DROPDB)

c. Drop device 1 with the sp_dropdevice() function.

d. CHECKPOINT and shut down the server, then bring it back up

normally.

e. Re-create device 1 with DISK INIT with a SIZE = 4 MB and

VDEVNO = 1.

2. Re-create and reload the affected database(s) as follows:

a. Re-create user database 1 with 2 MB on device 1, 2 MB on

MASTER.DAT, and 1 MB on device 2.

Re-create user database 2 with 2 MB on device 2 and 2 MB on

device 1.

b. Use the sp_logdevice() function to designate the log for

database 1.

c. Load database 1 and its transaction log from backup. Load

database 2 from backup.

Example 2

---------

For the second disaster scenario, assume that both user database

devices have been lost and MASTER.DAT is intact. The same procedure

applies:

1. Re-create the lost device(s) as follows:

a. Bring up SQL Server in single-user mode (-m option).

b. Determine which databases are affected:

SELECT DISTINCT DB_NAME(DBID) FROM SYSUSAGES,SYSDEVICES WHERE

VSTART>=LOW AND VSTART<=HIGH AND

NAME IN ('device1','device2')

Drop the affected databases:

DBCC DBREPAIR(database1,DROPDB)

DBCC DBREPAIR(database2,DROPDB)

c. Drop devices 1 and 2 with the sp_dropdevice() function.

d. CHECKPOINT and shut down the server, then bring it back up

normally.

e. Re-create device 1 with DISK INIT with SIZE = 4 MB and

VDEVNO = 1.

Re-create device 2 with DISK INIT with SIZE = 6 MB and

VDEVNO = 2.

2. Re-create and reload the affected database(s) as follows:

a. Re-create user database 1 with 2 MB on device 1, 2 MB on

MASTER.DAT, and 1 MB on device 2.

Re-create user database 2 with 2 MB on device 2 and 2 MB on

device 1.

b. Use the sp_logdevice() function to designate the log for

database 1.

c. Load database 1 and its transaction log from backup.

Load database 2 from backup.

Example 3

---------

For the third disaster scenario, assume that all three database

devices have been lost. In this case, MASTER.DAT must be restored

first.

Run BLDMASTR to re-create MASTER.DAT using the original size and

case-sensitivity option.

Bring up SQL Server in single-user mode and restore the latest backup

of the master database. If the dump of the master was made with a

user-supplied dump device, that dump device must be re-added to

SYSDEVICES before the restore can take place.

The quickest way to do this is to INSERT a row into SYSDEVICES. It can

also be done with the sp_addumpdevice() function; however, that

approach requires the INSTMSTR.SQL script to be run first to re-add

the stored procedures.

The server will shut itself down when the restore is complete.

The remainder of the procedure is identical to the second example.

Example 4

---------

For the fourth disaster scenario, assume that only MASTER.DAT has been

lost.

1. Run BLDMASTR to re-create MASTER.DAT using the original size and

case-sensitivity option.

2. Bring up SQL Server in single-user mode and restore the latest

backup of the master database. The same considerations apply as in

the third disaster scenario. The server will shut itself down when

the restore is complete.

If no changes were made to the master database after the most recent

dump was taken, then recovery is complete. Restart the server

normally. This is why it is a good idea to dump the master after such

changes.

If not, then SYSDATABASES, SYSDEVICES, and SYSUSAGES must be brought

up to date. If this must be done, bring the server up in single-user

mode again.

Use the DISK REINIT command to re-create rows in SYSDEVICES for all

database devices added after the most recent dump was taken. The DISK

REINIT command is used because it updates SYSDEVICES just as the DISK

INIT command does; however, it does not format the physical disk file,

therefore the existing data is preserved.

Use the DISK REFIT command to re-create rows in SYSUSAGES and

SYSDATABASES for all CREATE and ALTER DATABASE commands that were

performed after the most recent dump was taken. The DISK REFIT command

scans the physical file associated with each database device listed in

SYSDEVICES. It adds entries in SYSUSAGES to define the space that is

allocated to databases. It also adds the corresponding SYSDATABASES

entries. Some of the information is not reconstructed perfectly. For

example, the original VDEVNO is not assigned because it is not known.

Instead, VDEVNOs are assigned sequentially. The database owner is not

extracted while scanning the physical files; it is just set to "sa".

It is not possible to determine how many SYSUSAGES entries originally

existed. The DISK REFIT command inserts a separate entry for each

different segment type.

When this is done, correct the entries made by DISK REFIT to

SYSDATABASES and SYSUSAGES (if desired) and also add to SYSLOGINS any

login-ids that were not captured by the most recent dump. Then, shut

down the server and bring it back up normally.

Additional reference words: 1.00 1.10 1.11 4.20