Prepare (Now) to Rebuild Your Server

Andrew Zanevsky

You'll want to keep this article. Andrew provides an invaluable checklist of items that you'll need to rebuild a SQL Server from scratch. Unfortunately, database dumps alone are usually insufficient to reincarnate your system. You need additional pieces of information and files in order to do a complete restoration. Read this month's column carefully and profit from Andrew's wisdom.

If aliens from another star system abducted your SQL Server tonight, what would you do tomorrow? Do you have a sound plan in case of a complete loss of your database server? Hopefully, you perform periodic database backups according to a rigorous schedule and keep backup media either offsite or at least somewhere other than the same room as your SQL Server. But will you be able to completely recover from these backups? Why risk finding out that a small but important piece of information is missing only after you begin the recovery? Your career could be on the line.

It's quite possible that after successfully restoring all databases you'll spend several days figuring out other missing bits and pieces, such as command files, T-SQL scripts, scheduled tasks, network shares, and so on. In order to avoid this and simplify your SQL Server recovery process, I invite you to review the items that I mention in this article and create your own checklist. [You might also want to review Paul Munkenbeck's February 1997 article "A DBA's Remote Control Toolbox Explored." -- Ed.]

SQL Server emergency kit

I think it's a good idea to prepare an emergency kit in case you ever need to completely rebuild your SQL Server from the ground level. One of my clients has put all the documentation and software he'd need in one box and stored it in a secure location. It contains:

We haven't had to use this box yet -- aliens undoubtedly found more valuable SQL Servers to abduct!  But recovery procedures have come handy on a number of occasions -- we rebuilt two development SQL Servers, moved a production server onto a new hardware system, and set up a new machine once. Every time, I essentially had a chance to verify the accuracy of my recovery documentation and further refine it.

Hopefully, you'll never experience a loss of SQL Server due to a theft, fire, or natural disasters. But you'll certainly appreciate having good recovery procedures in situations like the ones I mentioned. Upgrading NT to a new version or moving all databases to a bigger and faster server might turn out to be a disaster unless you're prepared.

Know your hardware

Your recovery documentation should describe the hardware requirements for your SQL Server. In case the hardware is physically lost or destroyed and you need to replace it, you'll want to have detailed specification notes readily available to help you purchase or borrow a new server right away.

Also, add a section on minimal requirements for your server. It behooves you to know the absolute minimum configuration that could get you (or your client) up and running fast. It's my experience that disk space is the most restrictive parameter. You can't load your databases onto a smaller storage system, but you can operate on a server with fewer processors or slower ones. Also bear in mind that a CD-ROM drive isn't an absolute necessity if you can access one across the network.

When you're documenting your hardware specification, make sure to include not only the total amount of disk space, but also information about disk partitioning and RAID configuration. Although a replacement storage system doesn't have to be set up precisely the same way on the new server, disk partitions and names are very important. If your new server has different drive letters and partition sizes deviate from those on the original server, then you'll have to do extra work changing physical device locations and correcting any hard-coded references to drive letters. This manual work creates a potential for errors and slows down the recovery process. Ideally, you need a new server with partitions named the same as on the old one and partition sizes the same or larger. In other words, be sure to document these details.

Strictly speaking, the amount of memory installed on SQL Server doesn't have to be the same as on the old machine. A server with less memory might still allow you to work, though it might be significantly slower.

Versions and service packs

SQL Server 6.0 and 6.5 currently each have four service packs (and remember, SPs are cumulative, so if your server is up to SP3, all you need is SP3 –– you don't need to apply SP1 and SP2 first). When you consider the combinations and permutations of SQL Server with Windows NT and e-mail clients, you'll be surprised what a large number you get. If your site has numerous SQL Servers, some of them might use older versions or don't have most recent service packs applied. When you restore SQL Server, you'll obviously want to use the same software as was installed on the lost server. Therefore, you need to keep track of what's installed where –– perhaps in easy-to-read table format.

You might execute extended stored procedure master..xp_msver in order to obtain information about NT and SQL Server versions as well as the number of processors, processor type, and the total amount of memory. Unfortunately, it won't tell you which service packs have been applied in "plain" English, so you'll have to interpret the version number. The ERRORLOG file also contains SQL Server version information. [Another approach is to piece together the NT version info from the Help/About dialog box and execute a SELECT @@version to get SQL Server version information. However, you still need to be able to interpret the output in the form of "Microsoft SQL Server 6.50 - 6.50.240 (Intel X86)" as SQL Server 6.5 with SP2. -- Ed.]

Have version numbers documented in your recovery manual.

E-mail connectivity

If your SQL Server is configured to send or receive e-mail messages, it can be very difficult (and time-consuming) to duplicate the setup on a new server. Document all of the steps that you've taken in order to configure and activate your SQL Mail client and remember that you need a separate mail account to run SQL Mail. Make sure that the name of the account appears in the recovery manual and that key people remember the password. The name and password are required during SQL Server installation.

Setup and configuration options

Your recovery manual should include all setup and configuration choices that you need to make during SQL Server installation. Here's the checklist:

Devices and databases

You might have learned by painful experience that just having a database dump file isn't enough to successfully restore it. First, you need to create a database of the same size or larger, with the same order, type, and size of data and log segments. And even if you create a database large enough to load the backup, you'll probably still end up with error 2529 once you restore it. Information about segments is stored in system table master..sysusages. You might include it in your recovery documentation. I also include output produced by system stored procedures master..sp_helpdatabases, master..sp_helpdevices, and master..sp_helpservers (in case there are any remote servers).

In addition to extracts from system tables, I recommend saving the output of system stored procedure master..sp_help_revdatabase and also including it as an appendix to your recovery manual. That way, you'll have both a hard copy (in the appendix) and a text file of all the T-SQL commands needed to recreate each device, database, place data, and logs on the right segments. It will also show which database options are set. When you need to rebuild a server, the script generated by this procedure is a lifesaver. The only deficiency of this script is that it doesn't provide an option to build new databases FOR LOAD. I always manually modify all CREATE DATABASE and ALTER DATABASE commands to include the FOR LOAD option. It saves me several hours during recovery.

Logins and passwords

Consider making a complete backup of the master..syslogins table every day via bcp. It's much easier to allow updates on system tables and reload logins than to set them up manually. I use a command similar to the following to make a backup:

exec master..xp_cmdshell 
'bcp master..syslogins out 
c:\temp\syslogins.bcp -c -Usa -P'

I chose to use xp_cmdshell because I run this as part of a stored procedure, but you might as well execute bcp directly. In this example, I use C:\TEMP directory and my SQL Server is set up with no password for sa (don't worry, it's just SQL Workstation on my notebook PC, so there's no danger that a hacker will get in). You have to substitute your own settings. Remember that the syslogins content is sensitive and you should store it in a secure directory where only the DBA has access. It contains all user passwords in an encrypted form, but it only takes between several hours and a couple of days to decrypt a password by a brute-force attack.

When you need to reload syslogins, you can execute a script similar to the following:

exec master..sp_configure 'allow updates', 1
go
reconfigure with override
go
-- create a table with the same 
-- structure as syslogins
select * 
into   ##logins 
from   master..syslogins 
where  1=0

-- restore syslogins into a temporary table
exec   master..xp_cmdshell 
  'bcp tempdb..##logins in 
  c:\temp\syslogins.bcp -c -Usa -P'

-- report logins that cannot be restored
select 'cannot setup login', a.name, 
       'because suid is used by', b.name
from   ##logins a, master..syslogins b 
where  a.suid = b.suid 
and    a.name != b.name

-- restore syslogins
insert master..syslogins 
select * 
from   ##logins a 
where  not exists ( 
    select * 
    from   master..syslogins b 
    where  b.suid = a.suid )
msdb gotcha

You can restore the msdb database from a backup just as you would any user database, and this is very convenient when you need to restore all scheduled tasks. But you might encounter a problem if the server that you're rebuilding was upgraded through several SQL Server versions. In old versions, msdb used to be smaller -- 2M for data and 2M for transaction log. In SQL Server 6.5, it's 6M for data and 2M for log. When you upgraded an old server, msdb size increased. The problem is that an upgraded msdb has three segments: 2M data segment, 2M log segment, and a new 4M data segment added during the upgrade. If you restore a backup of this database into a freshly built msdb on a newly installed SQL Server, you'll get data and log segments overlap. New servers have just two segments in msdb -- 6M data and 2M log.

This is another example of something you wouldn't expect to have to deal with during a database recovery, but it's a nasty issue that can bite you if you're not careful. One possible solution to this problem is to drop the genuine msdb, then create a new database under the same name with a 2M data segment and a 2M log segment. Then you need to alter it by adding another 4M data segment. This database might be used to load an old msdb dump.

Another way to handle the problem is to create a dummy database of 8M, load your msdb backup there, and then copy the tables' contents directly from this database into the msdb created at installation. You'll need to copy tables MSwebtasks, MSWork, sysalerts, sysappname, syshistory, syshostname, sysnotifications, sysoperators, sysservermessages, systasks, and sysvolumelabel, and some of them might be empty. If you take this approach, don't forget to stop SQLExecutive service until you're done.

If you only have a few scheduled tasks, you might decide to manually reschedule them in case of recovery. If you opt for this approach, remember that you need to preserve the list of tasks and their frequency, and also that you'll lose task execution history if you reschedule. Try to avoid msdb hassles by anticipating potential problems and carefully documenting the desired restore action for msdb in the recovery manual.

Customized objects

You probably have your own system stored procedures installed in the master database. Some DBAs even create their own tables or other objects in master. Ask yourself what happens to all of these objects when you need to recover. Do you want to restore master from a backup? It's possible to avoid performing this task if you make certain preparations in advance.

First, don't create any custom-made tables in master. Limit your own objects there to system stored procedures (since you have no other choice with them). If you need a special table, create a separate database for DBA purposes.

Second, keep a complete T-SQL script needed to recreate all your own system stored procedures in master. Once you rebuild your SQL Server, execute this script to restore your procedures.

Things outside of SQL Server

Most SQL Servers have a certain number of command files, data files, and executables stored outside of SQL Server databases. They might be used for DBA maintenance tasks, data loads, or different application purposes. You need to back up these files together with your database dumps. If you need to rebuild server on a new box, you'll want all of your command and data files restored as well as databases.

Don't forget DLLs with your extended stored procedures. Back them up because they aren't a part of SQL Server databases and aren't protected from a loss. Keep the list of extended stored procedures, because you'll need to re-register them once you restore the server.

Specific directories and shares on the SQL Server and the network might also be critical to your applications and scheduled jobs functioning. Keep a list of those in the recovery document. It's also necessary to note permissions on these shares and directories (folders) that might be important for SQL Server.

If your T-SQL scripts or command files contain references to specific drive letters, you need to document every such place in the recovery manual and keep it current. If you must restore to a server with different partitions and drive names, it will allow you to quickly find all places that might need correction.

Even though SQL Server provides a very good scheduling mechanism that's more flexible than the Schedule service on Windows NT, some installations use the NT scheduler or a third-party scheduling package. If you have any database-related tasks scheduled outside of SQL Server, be sure to include their descriptions and frequency in the recovery manual. In order to get the list of tasks scheduled on NT, you can simply execute the at command from the command prompt.

If you're lucky, you won't ever have to go through the recovery process. Extra-terrestrials might go elsewhere and steal databases from other people. But you must have a plan in case anything destroys your SQL Server. It'll pay you back if you must use it even once. And even if it only collects dust, you'll sleep better knowing you've got the bases covered.

View sample code for this article here

Andrew Zanevsky is a Microsoft SQL Server and Sybase DBA, database and data warehouse designer and developer. He's writing a book, Transact-SQL Programming, with two co-authors that's scheduled for publication in the secpnd quarter of this year by O'Reilly & Associates. He provides his consulting services in the Chicago area through his firm AZ Databases, Inc. 708-609-8783, fax 847-419-0190, zanevsky@azdatabases.com.