Chapter 11: Backup and Recovery

You should back up your databases and transaction logs on a regular basis. In case of system problems, you can restore the backed up files and return to work more quickly than if you had neglected to back up your database. Additionally, every time Microsoft SQL Server starts, recovery is performed on all system and user databases.

The following sections address frequently asked questions about SQL Server 7.0 backup and restore functionality, as well as problems you may encounter when backing up and restoring databases and transaction logs. Additionally, information about resolving performance and insufficient disk space problems related to recovery is included.

Frequently Asked Questions

Here are some frequently asked questions regarding backup and restore functionality, as well as problems you may encounter when backing up and restoring databases and transaction logs.

Q: What are the conditions under which deferred updates occur in SQL Server 7.0?

A: In SQL Server 7.0, all updates are in-place (direct), provided that the column or columns participating in the clustered index key are not changed. If a change is made to a UNIQUE clustered key, the update plan runs in a hybrid mode where the query processor combines direct and deferred updates. If the clustered key is not UNIQUE, all of the updates are run as deferred.

You can use SHOWPLAN to see examples of this behavior. Look for the SPLIT and COLLAPSE operators in the plan. If you find a SPLIT below the clustered index update, one of the clustering keys has been changed. If a COLLAPSE operator is found, the update is running in a hybrid mode. SQL Server collapses delete and insert to the same key values into an in-place update.

Note that this behavior holds true for any index.

Q: Does enabling the torn page detection database option add any measurable performance overhead to a server?

A: The torn page detection option does not add much CPU cost, but it can increase contention on hot pages. With torn page detection off, a page can be accessed while it is being written to disk. This is not true if torn page detection is on.

Q: When does SQL Server check for torn pages?

A: torn page detection is performed whenever a page is read from disk. In practice, this is likely to happen during recovery, because any page on which the write did not complete during normal operations is very likely to be read by recovery (except for nonlogged operations, such as index creation, bulk copy, and so on).

Q: What happens when SQL Server detects a torn page?

A: When a torn page is detected, a severe I/O error is raised. This error closes the connection. The database is marked suspect only if the torn page is detected during recovery.

Q: How can I recover from torn pages?

A: Restoring the database from a backup and rolling the transaction log forward should correct the problem with no data loss.

Q: What situations are most likely to cause torn pages?

A: Lab tests have shown that torn pages are quite likely to happen when disk writes are occurring and power is lost. If you do not have a battery backup or Uninterruptible Power Supply (UPS), you should consider enabling this option.

Q: Will my query result sets be returned in order if I am running in SQL Server 6.5 compatibility mode?

A: If you do not provide a GROUP BY clause explicitly while in SQL Server 6.5 compatibility mode, the query processor adds one. In SQL Server 7.0 compatibility mode, not having an ORDER BY clause means that any ordering is acceptable to the user or application.

GROUP BY is always sorted in SQL Server 6.5 because the only way that SQL Server 6.5 can form groups is by first sorting the data. However, SQL Server 7.0 has other algorithms for grouping data (most involving hashing of some sort) that can work many times faster than sorting the data to form the groups. This prevents SQL Server from paying the penalty for a slower sort if one is not needed.

Adding an index hint does not force order; it only forces a scan of that particular index. If there is no ORDER BY clause in the query, the query processor is free to decide the cheapest execution strategy. This can be either a logical order scan of the index or a physical order scan of the index. The latter scan may return rows out of order for the index. If ordering is required, use the ORDER BY clause.

Q: Why is my reported log space never 0 (zero), even after truncating the log?

A: In SQL Server 7.0, the log truncation granularity is the virtual log file; in SQL Server 6.x it is a page. Consider an example in which a log configuration consists of four virtual log files. Even if there are no outstanding or unreplicated transactions that prevent truncation of the log after backup, at least 25 percent of the log is always in use, thereby causing at least one virtual log file to be marked as busy.

Efficiency is one of the primary reasons for implementing this schema in SQL Server 7.0. In SQL Server 6.x, truncating the log (even to throw it away) requires scanning through the page chain and deallocating pages. In SQL Server 7.0, truncating the log is as simple as changing the status on a virtual log file from a "used but doesn't contain active log" state to "usable."

It is still possible to have a process back up the log when it reaches some level of being full. However, SQL Server 7.0 differs from SQL Server 6.x in that the smallest fullness level that can be achieved through a transaction log backup and truncation is (1/n * 100) percent, where n is the number of virtual log files in the database configuration.

Q: Why does using SET QUOTED_IDENTIFIER ON within my stored procedures not affect the stored procedure's behavior?

A: Stored procedures maintain the QUOTED_IDENTIFIER setting that was active at the time the stored procedures were compiled. Stored procedures go through distinct compile and execute phases, and the entire stored procedure is compiled as a unit. This means that by the time the SET QUOTED_IDENTIFIER statement is executed, the entire stored procedure has already been compiled. Therefore, changes to the setting cannot affect the stored procedure.

Furthermore, because the QUOTED_IDENTIFIER setting actually affects parsing, a change to the setting cannot be caught midway through compilation and enforce a mode switch.

Pushing the unit of compilation down to individual statements can potentially resolve this situation. In this case, doing so would require that all subsequent statements be recompiled when a SET QUOTED_IDENTIFIER statement is encountered, thereby negating the performance benefits of a stored procedure.

Q: How does SQL Server decide whether to use indexes?

A: SQL Server is a cost-based query optimizer, not a rule-based system. Being cost-based, SQL Server is syntax independent and literally analyzes the cost of each execution strategy based on the projected number and size of the result sets. If you want to force table scans or index strategies, you can guarantee their use only by using index hints. This is generally not recommended, although at times it may become necessary.

It is difficult to generalize and specify a basic set of rules under which the query processor will always pick a table scan or index seek. In general, the use of an index access strategy is favored over table scans unless the choice is very clear, for example, if all rows are wanted. Table scans acquire shared locks and thereby can reduce concurrency (that is, multiuser access) greatly. As a result, table scans are avoided whenever possible.

Scan decisions are based on anticipated execution costs, so there is no size limit below which indexes are ignored. However, if the entire table fits on a single page, there are very few cases, if any, in which indexes will be of value.

Q: DBCC SHRINKDB is not shrinking my log. Why?

A: DBCC SHRINKDB shrinks data files on a per-file basis, but shrinks log files as if all the log files existed in one contiguous log pool. The shrinking of log files is not immediate and does not occur until the active portion of the log moves. As updates are performed on the database, the shrink operation occurs at checkpoints or during transaction log backups. Each log file is marked with the target_percent for the shrink operation. Each subsequent log backup or log truncation attempts to shrink the file as close as possible to the target_percent size. Because a log file can be shrunk only to a virtual log file boundary, it may not be possible to shrink a log file to a size smaller than the size of a virtual log file, even if it is not being used.

Q: Why does the creation of a UNIQUE index stop on multiple null values in a column?

A: SQL Server does not support the occurrence of multiple null values in a UNIQUE index. For the purposes of the index key, a NULL is considered a value and can occur only once in a given UNIQUE index.

Q: What is the bulk copy TABLOCK hint or property?

A: The TABLOCK hint or property increases the performance of bulk copy operations by reducing the locking contention of the operation. SQL Server 7.0 introduces a new bulk update (BU) lock type. BU locks have reduced contention with other types of locks acquired by non-bulk copy operations. Bulk copy operations use bulk update locks by specifying the TABLOCK hint or property on either the bcp command prompt utility, the BULK INSERT Transact-SQL statement, the bulk copy API supported by the SQL Server ODBC driver, or the IRowsetFastLoad interface supported by the OLE DB Provider for SQL Server. If TABLOCK is not specified in a bulk copy operation, it acquires row locks on the bulk copied rows. These row locks have more overhead than a BU lock taken at the table level.

Tips for Using Backup and Restore

Several differences between the backup and restore functionality of Microsoft SQL Server 7.0 and earlier versions warrant a brief explanation and tips about how to exploit their benefits fully.

In SQL Server 7.0, database creation and backup
operations use parallelism

The ALTER DATABASE statement uses parallelism to extend each file on a distinct logical drive. Likewise, the BACKUP DATABASE statement uses a separate thread per logical drive. Although this is a useful performance feature in certain circumstances, it does not mean that creating many logical drives is generally preferred over combining those drives in a RAID array. SQL Server uses asynchronous I/O and can usually leverage all available I/O capacity in a multidisk RAID array. Combining most available drive spindles to a large RAID array (of the appropriate type) often provides the best performance for the broadest circumstances.

SQL Server 7.0 documents in the error log
backups with INIT or FORMAT

It may appear that SQL Server backup no longer documents whether a backup device has been initialized in the error log. Actually, this is documented, but not in the same format as in SQL Server 6.5.

The FILE= attribute in the error log provides the necessary information. For example, if the error log says FILE = 1, either an INIT or a FORMAT command was issued.

The following examples show SQL Server 7.0 messages in the error log:

1998-11-17 18:17:47.98 backup   Database backed up with following information: Database: bck_db, creation date and time: 11/17/98(18:09:42), pages dumped: 107, first LSN: 4:32:1, last LSN: 4:34:1, sort order: 52, striped: 0, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'c:\temp\fb1.bck'}).

  

1998-11-17 18:33:54.28 backup   Database backed up with following information: Database: bck_db, creation date and time: 11/17/98(18:09:42), pages dumped: 106, first LSN: 4:42:1, last LSN: 4:44:1, sort order: 52, striped: 0, number of dump devices: 1, device information: (FILE=2, TYPE=DISK: {'c:\temp\doit.bck'}).

  

The following examples show SQL Server 6.5 messages in the error log:

98/12/07 10:54:00.30 backup   DATABASE dumped with following info: Database Name:pubs, Creation Date and Time:Dec 7, 98(10:53), Pages dumped:161, Current Sequence:36075 11572328, Sort Order:52, Striped:NO, Number of Dump Devices:1, device info:(VOLID=SQL0001 NAME=D:\temp\65pubs.dat TYPE=DISK FILE=1)

  

98/12/07 10:54:13.34 backup   DATABASE dumped with following info: Database Name:pubs, Creation Date and Time:Dec 7, 98(10:54), Pages dumped:161, Current Sequence:36075 11572328, Sort Order:52, Striped:NO, Number of Dump Devices:1, device info:(VOLID=SQL0001 NAME=D:\temp\65pubs.dat TYPE=DISK FILE=2)

  

Differential backup time considerations

A differential backup records only changes made to the database since the last full backup. Although typically much faster than a full database backup, elapsed time for SQL Server 7.0 differential backup is roughly proportional to the allocated space in the database. Consequently, a differential backup on a very large database may take longer than expected, even if few changes have occurred since the last full database backup.

If the amount of time required for a differential backup is too large, you should consider doing log backups instead.

Common Backup and Restore Problems

Here are the solutions to some problems you may encounter when backing up and restoring databases and transaction logs:

Troubleshooting Orphaned Users

When restoring a database backup to another server, you may experience a problem with orphaned users. This scenario displays and resolves the problem:

  1. Alias the login janetl to dbo by executing sp_addlogin.

    sp_addlogin 'janetl', 'dbo'

      
  2. Back up a database. In this example, back up Northwind.

    BACKUP DATABASE Northwind

    TO DISK = 'c:\mssql7\backup\northwnd'

      
  3. Drop the database that was just backed up.

    DROP DATABASE Northwind

      
  4. Drop the login.

    sp_droplogin 'janetl'

      
  5. Restore the backed up database.

    RESTORE DATABASE Northwind

    FROM DISK = 'c:\mssql7\backup\northwnd'

      

    The janetl login is not allowed into the Northwind database unless the guest login is allowed. Even though the janetl login has been deleted, it still shows up (as an orphaned row) in the sysusers table:

    USE Northwind

    SELECT *

    FROM sysusers

    WHERE name = 'janetl'

      

To resolve orphaned users

  1. Add a temporary login using sp_addlogin. Specify the security identifier (SID) (from sysusers) for the orphaned user.

    sp_addlogin @loginame = 'nancyd',

    @sid = 0x32C864A70427D211B4DD00104B9E8A00

      
  2. Drop the temporary alias that belongs to the aliased SID using sp_dropalias.

    sp_dropalias 'nancyd'

      
  3. Drop the original user (now orphaned) using sp_dropuser.

    sp_dropuser 'janetl'

      
  4. Drop the original login using sp_droplogin.

    sp_droplogin 'nancyd'

      

Recovery Performance

Recovery time is determined by how much work has been done since the last checkpoint, and by how much work has been done by all active transactions at the time of the server crash. Microsoft SQL Server uses the recovery interval configuration option to set the maximum number of minutes per database that SQL Server needs to recover databases. This recovery interval option controls checkpoint frequency. For an online transaction processing (OLTP) system (using short transactions), recovery interval is the primary factor determining recovery time.

After installation, SQL Server sets recovery interval to a default value of 0. As long as recovery interval is at the default setting and long-running transactions are not present, recovery for each database should take approximately one minute or less. If long-running transactions were active at the time of the server crash, recovery time is controlled by the time it takes to roll back the effects of these transactions.

If recovery routinely takes significantly longer than one minute for a database, recovery interval has a value of 0, and there are no long-running transactions to roll back, consider contacting your primary support provider to resolve the recovery performance problem.

Recovery reports progress (based on the virtual log files for a database). At the beginning of recovery, recovery analyzes and scans the log since the last checkpoint. Based on the analysis phase, recovery estimates how much log will be read during recovery. The amount of log read is used to report recovery progress.

If recovery interval is changed from the default value, database recovery takes that many times longer to complete. For example, if recovery interval is changed to 10, recovery would take approximately 10 times longer to complete than if recovery interval remained at the default setting of 0.

When growing the log, use larger chunks to ensure a shorter startup time for SQL Server. The greater the number of small chunks, the longer SQL Server takes to initialize them.

If a long-running transaction is terminated, let the server finish the rollback process. If you are concerned about the length of the rollback process, ask your system administrator to confirm that activity is taking place on the server. Terminating the server process during the rollback of a long-running transaction results in long recovery time.

If you have a long-running transaction and a crash occurs during this transaction, SQL Server begins the recovery process. This may take some time. If you are concerned that this recovery process is taking too long and you believe it is halted, contact your system administrator.

Insufficient Disk Space

During recovery, it is a rare but possible occurrence for the server to require additional log or data space. If additional space is unavailable and either the log or data files cannot grow, the server:

To resolve the 9002 error message and bring the database online

  1. Free disk space on any disk drive containing the log file for the related database. Freeing disk space allows the recovery system to grow the log file automatically.
  2. Reset the suspect status by executing sp_resetstatus.
  3. Run recovery by executing DBCC DBRECOVER (database).

    Or

  4. Free disk space on a different disk drive.
  5. Move the transaction log files with an insufficient amount of free disk space to the disk drive in Step 1.
  6. Detach the database by executing sp_detach_db.
  7. Attach the database by executing sp_attach_db, pointing to the moved files.

    Or

To resolve the 1105 error message and bring the database online

  1. Free disk space on any disk containing a file in the filegroup mentioned in the 1105 error message. Freeing disk space allows the files in the filegroup to grow.
  2. Reset the suspect status by executing sp_resetstatus.
  3. Run recovery by executing DBCC DBRECOVER (database).

    Or

  4. Free disk space on a different disk drive.
  5. Move the data files in the filegroup with an insufficient amount of free disk space to the disk drive in Step 1.
  6. Detach the database by executing sp_detach_db.
  7. Attach the database by executing sp_attach_db, pointing to the moved files.

    Or