Suspect Database, Registry, and RAISERROR

Kalen Delaney

Yesterday, when I tried to connect to SQL Server using SQL Enterprise Manager (EM), one of my databases was grayed out and had the word “suspect” next to it. What does this mean, and how can I fix it?

SQL Server will mark a database “suspect” when—for any reason—the database can’t be accessed. SQL Server marks a database suspect by turning on one of the bits in the status column of the sysdatabases table in the row for the suspected database. If you look in the documentation for the sysdatabases table, you’ll see an explanation of most of the bits in the status column.

There are some very obvious reasons why a database might be marked suspect—and some not so obvious. One of the easiest reasons to troubleshoot is the case where one of the devices that the database uses isn’t available. This can happen when the physical file has been moved or renamed at the NT level. It can also happen if the device is on a drive with its own power supply and the drive hasn’t been powered on when SQL Server is started. In either of these cases, as SQL Server starts up, it will record (in the error log) that the device can’t be opened. You should inspect your SQL Server error log for any message indicating this.

If it does turn out that a device is unavailable, you’re generally in luck—the fix is easy. You must first make sure that the device is available, either by renaming the physical file to the expected name or by turning on the drive’s power. Then you must manually update the status column to turn off the status bit. An easy way to do that is to set the status to 0 so that all special bits are off, but it’s actually safer to just turn off the suspect (256) bit using the bitwise AND and NOT operators, as shown below:

update sysdatabases
set status = status & ~256
where name = ‘<name of suspect database>’

If you’re interested, you can find out more about “Bitwise Operators” in Books Online (BOL). Also remember that in order to manually update a system table, you must enable such updates by turning on the configuration option “Allow Updates.” And, by the same token, remember to turn the option off when you’re finished modifying the table. After turning off the suspect bit, you should stop and restart SQL Server. If the problem was only that a device was unavailable, and the device is now available, you should see the database appear normally, without the suspect flag.

If the database is still marked suspect, though, or if there was never a device offline, the problem is more serious. You can check the error log to see if any other possible reasons are recorded as the SQL Server attempts to recover the database, but in many cases you won’t be able to make use of any such information. Typically, the problem is due to some kind of internal corruption of the transaction log that keeps that database from being recovered.

If this is the case, you have two options. The safest option is to restore from a recent backup of the database. This gives you the greatest chance, guaranteeing the integrity of your data. If your latest backup isn’t recent enough, the other option is to put the database into “Emergency Mode.” Emergency Mode is enabled by setting another bit in the status column of sysdatabases, but the actual value corresponding to this bit setting is misdocumented. BOL says that the value for a database in Emergency mode is 32768, but it really needs to be a negative value—that is, -32768. Once again, you’ll need to enable updates to system tables and then turn on this Emergency mode bit using the bitwise OR operator:

update sysdatabases
set status = status | -32768
where name = ‘<name of suspect database>’

Stop and restart SQL Server. The Emergency Mode bit will tell SQL Server not to try to recover this database, but simply to make it available. If the corruption is severe, you still might not be able to access the database at all—but in many cases, you will be able to get to your database. Then you can use bcp, the Transfer Management Interface, or a SELECT/INTO operation to get the data to another database.

Keep in mind that the data hasn’t been through recovery and its integrity isn’t guaranteed. If there were transactions in progress in this database when the server came down (before it was marked suspect), the tables involved might reflect partial transactions or invalid data. Furthermore, there’s no way to tell what’s good data and what isn’t. If getting some of the data out is more important than a (possibly slight) loss of integrity, you can try using Emergency Mode. It should never be your first response to a suspect database, however, because of potential integrity problems.

I tried to look up my sa password in the Registry, as you described in your November 1997 column (see “SQL Essentials: Keys, Devices, and In-Place Updates”), but I couldn’t find it. Are there some cases for which this “trick” won’t work?

Yes, there are a couple of additional considerations when using the Registry key HKEY_CURRENT_USER/SOFTWARE/MICROSOFT/MSSQLSERVER/SQLEW/Registered Servers/SQL6.5 to find your sa password.

First of all, if you’ve registered your SQL Server using integrated security, the password won’t be stored at all. It will only be stored in the Registry if you’ve actually registered your server by typing in a password for sa.

In addition, you should be aware that you can configure how “private” your EM configuration information is. If you’ve defined several server groups and registered several servers, do you want everyone who uses this particular machine to use the same configuration, or should every user logged in separately maintain his or her own registrations in EM? The default is to have all users of the same machine share this information, but that can be changed. In EM, from the Tools Menu, choose Preferences/Configure, and you’ll get a three-tabbed dialog box. In the first tab (Application), you can see that the default setting for Server Registration Information is to “Store User Independent.”

If you deselect this box, each separate login will maintain its own registration information, and in this case, the information will be stored in a different Registry key. If you’ve unchecked this box, you might need to look in HKEY_LOCAL_MACHINE/SOFTWARE/MICROSOFT/MSSQLSERVER/SQLEW/Registered Servers/SQL6.5 to find the registration information. I haven’t been able to determine exactly when it shows up one place or the other, and sometimes I’ve found it in both places. Just be sure you look under both Registry keys before you give up. (Thanks to Martin Widyono for the suggestion to use this second Registry key!)

I’m trying to find a way to stop a script from executing when a particular error condition is encountered. I read in BOL that generating an error with a severity of 127 will cause the script to be terminated. I’m trying to test this using isql/w and I can’t get it to work. Can you help?

There a couple of different things going on here. In BOL, the discussion of severity 127 refers only to the text-based isql and not to isql/w. In addition, there’s a typo in the description. If you look at the example of using this functionality, you’ll see the following:

RAISERROR(50001, 10, 127)

If you refer back to the description of the RAISERROR command, you’ll see that the third argument is for the “state” and not for the severity. So, you need to set the state to 127—not the severity. In order to run the previous command, you should have previously executed sp_addmessage to define text for error 50001. Alternatively, you can simply supply a literal message instead of the error number. If you put the previous command in a script, and run isql from a command prompt with this script as input, you’ll see that after the message is returned, the rest of the script isn’t executed and you’re back at the command prompt. You can try something like this to see for yourself. Create this script and save it in a file:

use pubs
go
raiserror (‘Exiting script NOW’, 10, 127)
go
select * from publishers
go

Now run the script from a command prompt:

C:\isql /Usa /P<password> /S<servername> /i<location of               
   script file> 

You should see that the error message is returned, but the contents of the publishers table is never displayed. You can also stop a script from within isql/w by setting the severity (not the state) to a value between 20 and 25. There’s also an error in BOL in the description of this functionality, as it’s reported that you can use a severity between 19 and 25. This is incorrect; severity 19 won’t work. From an isql/w window, you can type a script very similar to the previous one.

use pubs
go
raiserror (‘Exiting script NOW’, 20, 1) with log
go
select * from publishers
go

SQL Server does require that any RAISERROR with a severity over 18 be written to the error log (and NT application log). If you execute the previous script, you’ll see that the connection is terminated by the RAISERROR, but it’s terminated before the error message can be returned to the client. The client will simply get a message that the connection is broken (and, of course, won’t get any data from the publishers table), and the only way to see what happened is to look in the SQL Server error log or use the NT Event Viewer.

Can you tell me why the initial value for procedure cache is set at 30 percent? This seems high to me, but how can I tell if it’s really too big?

The configuration value for procedure cache tells SQL Server how to divide memory between procedure and data cache at the time SQL Server is started. You’ve configured a total amount of memory to give to SQL Server, and from this some space is used for the SQL Server executable code, some is used for special internal structures, and some is used to allow for your configured number of open objects, open databases, locks, and user connections. The remaining memory is split between data cache—where all data, index, and log pages will be manipulated—and procedure cache. Procedure cache holds compiled plans for any stored procedure that needs to be executed and will also temporarily hold plans for ad hoc queries and server cursors. In addition, pages needed for sorting will be taken from procedure cache. (The number of pages taken at a time for sorting is configurable using the “sort pages” configuration option.) The value configured for “procedure cache” is a percentage of this remaining memory that’s to be allocated to procedure cache, and the rest of the memory will be used for data cache.

The initial value of 30 percent was chosen because that’s the minimum amount of procedure cache needed to run EM if your SQL Server is started with the minimum amount of total memory. As your total memory increases, your procedure cache size will increase almost propor-tionally. For higher amounts of total memory, you might find that you’ve over-configured for procedure cache and you can reduce the percentage value. Reducing the percentage given to procedure cache means that the percentage given to data cache will increase, and this is almost always a good thing. You can determine the actual size of procedure and data cache by running the command:

DBCC MEMUSAGE

The first section of output will show you exactly how memory is divided up, and the figures are given in three different units: MB, pages (2K each until SQL Server 7.0 ships) and bytes. Here’s some sample output:

Memory Usage:
   Meg.   2K Blks   Bytes
 
   Configured Memory:   16.00   8192   16777216
   Code size:   2.45   1254   2568192
   Static Structures:   0.26   131   267472
   Locks:   0.29   147   300000
   Open Objects:   0.11   59   120000
   Open Databases:   0.02   13   26080
   User Context Areas:   0.75   383   783870
   Page Cache:   8.40   4299   8803456
   Proc Headers:   0.20   104   212036
   Proc Cache Bufs:   3.40   1739   3561472

The size of the data cache is listed as Page Cache, and in the above report, it’s using 4299 pages. Procedure cache is the total of the Proc Headers and Proc Cache Bufs values, so in this report, procedure cache is using 1843 pages. Prior to SQL Server 6.5, it was very difficult to know whether procedure cache space was being wasted. This latest release, however, added several Performance Monitor counters to allow you to determine whether you’ve allocated too much space for procedure cache. Since SQL Server will usually keep procedure plans in cache even when they aren’t being executed, there are Perf Mon counters to show us how many procedure cache buffers are active (contain currently running procedures) and how many are used, which includes plans not currently running, but using space in procedure cache. In PerfMon, if you look under the SQL Server-Procedure Cache Object, you’ll see counters for buffers used and buffers active—as well as for the maximum buffers used and active. These Max counters are high-water marks and show you the maximum value ever reached during a monitoring session. You can select the single counter Max Procedure Cache Buffers Active %, and watch it on a PerfMon chart. If the value never goes above 75 percent, I’d say you have too much space for procedure cache. You can try lowering SQL Server’s configured value for procedure cache until you find one where the Performance Monitor value for Max Procedure Cache Buffers Active % gets up to 75 percent. I’d leave this extra 25 percent available so there’s room for some inactive procedures to remain in cache and allow faster access to those procedures.

Kalen Delaney started working with SQL Server in 1987, when she worked for Sybase in Tech Support. Since then, she’s done SQL Server training both for Sybase and for Microsoft. Delaney currently provides independent training and consulting from her Seattle-area consultancy, Solid Quality Learning, Inc. (www.sqlinc.com). Kalen_Delaney@compuserve.com.