Coding T-SQL Like an Expert

Paul Munkenbeck

To gain competence in a language like Transact-SQL (T-SQL), it's not enough just to know the syntax and grammar. Knowledge of good techniques comes from experience and a good mentor. In this article, Paul describes some of his favorite tips and tricks as used by the experts who developed SQL Server's system procedures. Shoulders of giants and all that . . .

With more than a decade of DBA experience under my belt, I've written a lot of code, but I've also read a lot of code. Frankly, I consider the code in the system stored procedures to be a real treasure chest. Quite often I've come across a scrap of code that's made me think, "I didn't know you could do it like that!" and ultimately, I've found a "home" for the idea. I thought it would be worthwhile to share some of my favorites with you. Although the research for this article was from the version 6.5 procedures, I've checked that the techniques are still perfectly valid under version 7.0.

Studying (and mining) the system stored procedures
SQL Server comes with many stored procedures for retrieving or manipulating data stored in the system tables. Most of these are created in the master database, although those that manage scheduled tasks for SQL Executive can be found in the msdb database. The system stored procedures are well-documented in Books Online. But not everyone realizes that you can view the code of any of these procedures under Enterprise Manager (EM) just like the stored procedures you've developed yourself.

System stored procedures all have names that start with the prefix sp_ (in lower case). SQL Server recognizes these and gives them some special characteristics. You can also create your own stored procedures with names starting with sp_, and they'll inherit these special properties:


As well as the sp_ procedures, there are also some system extended stored procedures, whose names are prefixed with xp_. Each of these carries out its processing using a dynamic link library (DLL) rather than interpreted T-SQL code.

But now let's look at some of the techniques of the experts.

EXECUTE()
The EXECUTE() statement, along with the character string you pass it, works hand in glove with system stored procedures, and you're likely to find yourself using it time and again. You can construct the character string from any number of literals and string variables. At runtime, EXECUTE() resolves all the values and executes the string as a command.

For instance, the final report in procedure sp_helprotect contains a SELECT statement with an ORDER BY clause. The column used in the ORDER BY depends on the parameters chosen by the user, so it's specified as a variable that's substituted for a real column name at runtime. Here's an example that illustrates the technique using the authors table in the pubs database:

 DECLARE @OrderBy varchar(255)
 SELECT @OrderBy = 'au_lname'
 EXECUTE('
 SELECT au_fname, au_lname, phone
 FROM authors
 WHERE state = ''CA''
 order by ' + @OrderBy + '
 ')


Note that if you need to include a literal string like `CA' within the EXECUTE string, then you need to enclose it in pairs of single quotes. Within a delimited string, SQL Server interprets two delimiters together as the character itself rather than an end of string marker.

Reports with dynamic column widths
How often have you produced a report that contains variable length column data and ended up with mostly white space? Isn't it amazing how many of your varchar(255) columns end up with fewer than 15 characters in them? Wouldn't it be convenient if you could shrink the column width in the report so it just fits the longest piece of real data? Well, system stored procedure sp_lock2 shows you how to do it.

Once again, I've taken the principle behind the stored procedure code and used it in a more typical report, using the pubs database. Suppose you're enhancing the previous report against the authors table, showing first name (au_fname), last name (au_lname), and phone number for all authors based in California. The name fields are both varchar, whereas phone is fixed length. First, determine the longest first and last names for the selected authors:

 DECLARE @MaxLen_lname int, 
   @MaxLen_fname int
 SELECT @MaxLen_lname 
   = isnull(max(datalength(au_lname)),8),
   @MaxLen_fname 
   = isnull(max(datalength(au_fname)),8)
 FROM authors
 WHERE state = 'CA'


Once this has run, the @MaxLen variables hold the widths of the longest names. The isnull function copes with the possibility of the max() function returning a NULL, and, in this case, it sets the width to 8. This is simply the length of the column title -- au_lname. Guess how we're going to code a report that uses variables to set column widths dynamically? That's right, with EXECUTE():

 -- convert width values to strings for EXECUTE:
 DECLARE @width_fn varchar(10),
   @width_ln varchar(10)
 SELECT @width_fn=convert(varchar,@MaxLen_fname),
   @width_ln=convert(varchar,@MaxLen_lname)
 -- run the report:
 EXECUTE ('
 SELECT au_fname=SUBSTRING(au_fname,1,'+@width_fn+'),
   au_lname=SUBSTRING(au_lname,1,'+@width_ln+'),
   phone
 FROM authors 
 WHERE state = ''CA''
 ')


Although I've called the report columns au_lname and au_fname -- the same as the original columns -- they are, in fact, new columns derived from a computed expression. Because of this, SQL Server calculates their widths as the lengths of the result of the substring expressions. By the time the EXECUTE() string has been resolved, the length of each substring is exactly the value of the @MaxLen variables. The result of the query looks something like the following:

 au_fname au_lname   phone        
 -------- ---------- ------------ 
 Johnson  White      408 496-7223 
 Marjorie Green      415 986-7020 
 Cheryl   Carson     415 548-7723 
 Michael  O'Leary    408 286-2428 
 Dean     Straight   415 834-2919 
 -- other names omitted


Isn't that neat? Try it for yourself with different WHERE conditions, and see the column widths change dynamically. The full code can be found in script widths.sql in the accompanying Download file.

Return codes and row counts
Until I actually inspected the system stored procedure sp_certify_removable, I thought I understood the power of the EXECUTE() syntax. But in reading the sp's code, I realized that you could include a whole batch of commands in one EXECUTE. Now it seems so obvious that I'm hesitant to admit to not knowing it all along. Nevertheless, I'll bravely forge forward with some explicit examples under the assumption that some of you might have missed the technique as well.

Within a single EXECUTE() batch, sp_certify_removable switches to the relevant database, runs a nested stored procedure, checks its success, and passes this back to the main code through clever use of the @@ROWCOUNT global variable. It relies on the following properties of an EXECUTE() batch:


For illustrative purposes, I've extracted the pertinent code, fixed a minor bug (an unnecessary RETURN statement that caused an immediate exit even when run from within an EXECUTE() batch), and used it as the basis for a stored procedure I've called sp_object_test. Given a database name, all this does is determine whether or not a certain object (test_object) is set up in the database.

 CREATE PROCEDURE sp_object_test (
 @dbname varchar(30) ) AS 
 execute('
 use ' + @dbname + '
 declare @inx_ret_value int ,@int1 int
 select  @inx_ret_value = 1
 exec @inx_ret_value = sp_help ''test_object''
 set nocount off
 if @inx_ret_value <> 0	--bad
   begin
   select @int1 = suid from master.dbo.syslogins 
   where suid=1 --SA
   end
 else
   begin
   select @int1 = suid from master.dbo.syslogins 
   where suid=1 and 1=2
   end
   ')
 if @@rowcount > 0
   begin
   raiserror ('non-zero return from sp',16,1)
         return (1)  --Error was returned
   end


If the nested stored procedure fails (yet is coded to allow processing to continue for error handling), then @inx_ret_value will be set non-zero and suid=1 is selected from syslogins. Since suid=1 refers to the sa who is always present, the number of rows affected by the select is 1 -- setting @@rowcount to 1. If, instead, the stored procedure is successful, then the select from syslogins includes the test "and 1=2". As this is always false, the number of rows returned will be zero, and so this becomes the value of @@rowcount. The full code for this example can be found in script obj_test.sql in the accompanying Download file.

Reducing temporary table names
Procedure sp_lock2 demonstrates an elegant technique for displaying temporary table names in reports. When displaying lock information against tables in tempdb, sp_lock2 reduces the temporary table names by stripping off the system-generated suffix. You probably know that when you create a temporary table -- called, say, #mywork -- then SQL Server actually creates a new table in the tempdb database.

To ensure that this table has a unique name, it pads it out to the maximum length (30 in version 6.x, 128 in 7.0) with a suffix consisting of underscore characters followed by a unique number (10 digits in version 6.x, 12 in 7.0) at the end. This is why the rules for creating temporary tables state that the user-defined name, including the # character, shouldn't exceed a certain length (20 in version 6.x, 116 in 7.0). So in version 6.5, temporary table #mywork will have a physical table name something like `#mywork_____________0000000005'.

If you choose a short temporary table name, then there will be at least two underscore characters together, and these will mark the end of your chosen name. Like sp_lock2, you can use this to derive the original temporary table name from its real physical name, as follows:

 SELECT Physical_Name = name,
 Name=SUBSTRING(name,1,PATINDEX('%[_][_]%',name)-1)
 FROM   tempdb..sysobjects
 WHERE  name like '#%[_][_]%'


The PATINDEX function searches for the starting position of one string within another, using a LIKE expression to define the characters to search for. In this case, both the PATINDEX and LIKE are looking for two underscore characters together. The complication is that an underscore has a special meaning in LIKE expressions, and therefore each instance of it has to be enclosed in square brackets. This convention tells SQL Server to ignore the special meaning and look for the underscore character value itself. Hence '#%[_][_]%' means "names that start with a hash sign and then two underscores together anywhere in the string." Having found the position of the double underscore, the SUBSTRING function then pulls off all the characters of the name up to this point. This is exactly the original temporary table name defined by the user.

Testing for object existence
Along with several other system stored procedures, sp_helprotect uses an interesting technique for determining whether an object exists in a database. When you use Enterprise Manager to generate the script for dropping an object, it builds code like this:

 if exists (select * from sysobjects 
   where id = object_id('dbo.authors') 
   and sysstat & 0xf = 3)
 drop table dbo.authors
 GO


The WHERE clause against systat contains the expression "& 0xf", which tests the low byte of the sysstat column. The value of this byte indicates the type of database object; in this case, it's type 3, which means a user table (see sysobjects in Books Online for other values). When writing my own code, I often want to do something similar to this, but I can never remember the exact syntax or the various values of sysstat! I can understand that system-generated scripts have to do the check just in case the same name has been given to two objects of different types. But my object naming conventions ensure that each object has a completely unique name within the database, so I can dispense with the object type test in my own code. I was delighted to see a simplified version of the syntax in system stored procedure sp_helprotect. The technique is to test for a NULL object_id value rather than an existence test in sysobjects:

 IF (OBJECT_ID('dbo.authors') IS NOT NULL)
   DROP TABLE dbo.authors


The OBJECT_ID function returns the identifier of any object in sysobjects. If the object doesn't exist, then it returns a NULL. Of course, you can use this whenever you need to test for an object's existence, and not just because you want to drop it. Interestingly, the technique also works for temporary tables, and you don't need to know the full system-generated name (as described in the section "Reducing Temporary Table Names"). The test can be coded as follows:

 IF (object_id('tempdb..#mytemp') IS not Null)
             Drop Table #mytemp


Although `tempdb..#authors_temp' is a truncation of the actual physical name of the object as held in tempdb..sysobjects, somehow OBJECT_ID() recognizes that you mean "your" temporary table called #mytemp.

Reading the Registry
Not all the system stored procedures have been described in the SQL Server manuals. I don't usually like to rely on undocumented features. However, when Microsoft's own developers have seen fit to use them in the system stored procedures, then I feel more confident that they're just documentation omissions rather than components that are likely to change in future releases.

With this in mind, I've been happy to exploit the undocumented extended stored procedure xp_regread. I first saw this in the expertly written sp_certify_removable. The xp_regread procedure navigates the local system Registry and returns the value for an entry that you specify. In sp_certify_removable, it's used to determine the pathname under which SQL Server has been installed:

 declare @SQLPath varchar(255)
 exec master..xp_regread
        'HKEY_LOCAL_MACHINE'
       ,'SOFTWARE\Microsoft\MSSQLServer\Setup'
       ,'SQLPath',@SQLPath   OUTPUT


The four parameters it takes are:


Note that there's also a complementary procedure xp_regwrite for updating the Registry. However, this sort of thing always makes me extremely nervous, and I'm delighted to say I haven't yet found an application for it. The structure and names of the parts of the Registry are explained in the NT Resource Kit resource guides, and various SQL Server Registry entries are described in the MS SQL Server Resource Guide. The main Registry sub-trees for SQL Server 6.x are:

 \HKEY_LOCAL_MACHINE
      \SOFTWARE
           \Microsoft
                \MSSQLServer
 \HKEY_LOCAL_MACHINE
      \SYSTEM
           \CurrentControlSet
                 \Services
                       \MSSQLServer
 \HKEY_LOCAL_MACHINE
      \SYSTEM
            \CurrentControlSet
                  \Services
                        \SQL Executive


Archiving or deleting files created before a certain date
I found xp_regread to be invaluable recently while developing a housekeeping application to archive external files that were older than a certain number of days. My preferred way of doing this is to generate the cut-off date using T-SQL and then use this as a parameter to the archiving utility PKZIP. PKZIP has a "move" mode that deletes files after archiving them, and this performs the housekeeping function I need. So to completely remove all those old files, all I need to do is just delete the ZIP archive file.

The problem I had to overcome is one that might be unknown to those of you who develop applications just for North America, but it's one that the rest of the world has to suffer -- namely, the localization of settings such as dreaded date formats! PKZIP interprets the value in its date selection parameter according to the date layout as specified in the Control Panel International or Regional Settings.

Where I work, we try to ensure that all of our servers are configured to use the English U.K. settings -- that is, with a date format of "ddmmyy." Unfortunately, some of our servers inadvertently get set up with the default U.S. English format "mmddyy." For my archiving application, I need to know which configuration has been used. The only answer was to delve into the Registry, and so I turned to xp_regread, as follows:

 declare @sShortDate varchar(255)
 exec master.dbo.xp_regread 'HKEY_CURRENT_USER',
   'Control Panel\International',
   'sShortDate',@sShortDate OUTPUT


This query works as long as you're running it from a service started under a logged in account. If not (that is, if you're running under local system account), then there will be no current user, and the HKEY_CURRENT_USER sub-tree won't exist. Then the call will fail with the message "RegConnectRegistry - #6 The handle is invalid." If this is the case, then you need to interrogate the "default" user sub-tree instead:

 exec master.dbo.xp_regread 'HKEY_USERS',
   '.DEFAULT\Control Panel\International',
   'sShortDate',@sShortDate OUTPUT


The sShortDate key in the Registry holds the server date format as a "date picture" string, which is described in the Windows Developer Guide. I know that if the date picture starts with a "d," then the server was configured for a U.K. date format; otherwise, it was left to the default U.S. layout. With my cut-off date held in a datetime variable, I need to use the CONVERT function to get it into a string variable with the nearest layout acceptable to PKZIP. The following code carries out the necessary date manipulation:

 -- archive files more than 2 weeks old:
 declare @cutoffdate datetime
 select @cutoffdate = dateadd(day,-14,getdate())
 -- interpret sShortDate as convert date style:
 DECLARE @style int
 SELECT @style= CASE substring(@sShortDate,1,1)
        WHEN 'd' THEN 103 -- dd/mm/yyyy
                 ELSE 101 -- mm/dd/yyyy
        END
 -- convert cut-off date into string in local format:
 DECLARE @cutoff char(10)
 SELECT @cutoff = CONVERT(char(10),@cutoffdate,@style)
 -- strip out separators for pkzip format:
 DECLARE @pkdate char(6)
 SELECT @pkdate = SUBSTRING(@cutoff,1,2)
        + SUBSTRING(@cutoff,4,2)
        + SUBSTRING(@cutoff,9,2)


Note that I always use century date formats, even if the century digits need to be stripped off for the target application. At least my code fragments are year 2000 compatible! Finally, I build the PKZIP command line, using the -T parameter (select files older than a date) and the -m parameter (delete files after archiving). As I run this script from command line isql, I use the PRINT statement to output the PKZIP command into a command file. I can then switch to the appropriate directory containing the files to archive and then CALL the generated PKZIP command file to execute it.

 -- build PKZIP command string:
 DECLARE @command varchar(255)
 select @command = 'pkzip.exe -T ' + @pkdate 
        + ' -m archive.zip *.* ' 
 PRINT @command
 go


The full code for this example can be found in script archive.sql in the accompanying Download file.

Writing external text files
Recently I needed to create a DOS file from within a T-SQL application. It had to contain certain information such as the host SQL Server name, which I use to set the isqlserver environment variable. If no server name is specified on a batch isql.exe command, then it defaults to the value in isqlserver. Luckily I had seen an example of how to do this in my favorite source for this article -- namely, system stored procedure sp_certify_removable. The trick is to use the NT command ECHO from within xp_cmdshell as follows:

 USE master
 GO
 DECLARE @sql varchar(255)
 SELECT @sql =  'exec xp_cmdshell "ECHO'+ 
        ' REM Set server variable'+
        ' > c:\mssql\data\SQLSET.CMD"'
 EXEC (@sql)
 SELECT @sql =  'exec xp_cmdshell "ECHO'+ 
        ' set isqlserver='+@@servername+
        ' >> c:\mssql\data\SQLSET.CMD"'
 EXEC (@sql)


Within a DOS window, the NT command ECHO normally sends output to the screen. Alternatively, the redirection symbol > can be used to send the output direct to a printer or a file. The single redirection symbol indicates that the output overwrites the existing contents of the target file, if any. The double redirection symbol causes the output to be appended to the end of the file.

I'm sure you won't be surprised to see the preceding use of EXECUTE(). In this case, it resolves the @@servername variable into a string to be output by ECHO. So, on a server named MYSERVER, the preceding code will create the following command file:

 REM Set server variable
 set isqlserver=MYSERVER


Using this method, you can easily use T-SQL to produce command files that are customized to the environment you want. The code for this example can be found in script DosFile.sql in the accompanying Download file.

Smarten up your RAISERROR output
RAISERROR is a very flexible tool for reporting error conditions back to an application that executed the T-SQL code. To quote from the T-SQL Reference: "RAISERROR lets the client retrieve an entry from sysmessages table or build a message dynamically with user-specified severity and state information."

Note: For a full description of RAISERROR syntax and the meanings of the severity levels, see Books Online. The examples in this article use a simplified version of the RAISERROR syntax, as follows:

 RAISERROR ({msg_id | msg_str}, severity, state) 


where msg_id is a user-defined error message stored in the sysmessages table, msg_str is an ad hoc message string, severity is the severity level associated with this message, and state is an integer that identifies the source of a message if it can be issued from more than one place.

Many system stored procedures use RAISERROR to report information messages, and it is this technique that I want to explore here. Take a look at the following code:

 RAISERROR('Hello!',-1,-1)
 RAISERROR('',0,1)
 RAISERROR('My message',2,4)
 RAISERROR('',0,1)
 RAISERROR('Goodbye!',-1,-1)


This produces output similar to the following:

 Hello!
 Msg 50000, Level 2, State...
 My message
 Goodbye!


First of all, note that RAISERROR(' ',0,1) produces a single blank line in the message output. This can be very handy for spacing out a long message to make it more readable. Next, notice that the lines with the low severity numbers display just the message text, whereas higher severity numbers cause the additional line `Msg xxxxx, Level y, State z.' It's most interesting to experiment and find which severity values will suppress this additional line. In fact, it's any negative number, zero, one, and 10! (Apparently 10 is included for backward compatibility.)

There's one more interesting trick when using RAISERROR with pre-defined messages. In sysmessages, these will be defined with a default severity level. This is normally overridden when issuing RAISERROR. But what if you want to use the default value? One way is to look it up in sysmessages and code it explicitly. The system stored procedures show us an easier way -- just code a negative severity level, and RAISERROR will automatically revert to the default. If the state is coded as a negative, then it defaults to 1. So this:

 RAISERROR (14027,-1,-1,'My object') 

produces the message:

 Msg 14027, Level 11, State 1
 My object does not exist in the current database.


With these tips in mind, you can now control exactly how your information messages appear.

Follow the leaders
I hope you've found these expert techniques -- no, I'm not tooting my own horn, just acknowledging the value of the code in SQL Server's own stored procedures -- to be valuable. They've certainly helped to make my T-SQL coding more productive. Now, when I hear about a new stored procedure in a SQL Server upgrade, I check out the manual and then go straight to the code to see what new methods the experts have come up with. I think I'm going to be very busy for a while with version 7.0! [In the new SQL7 Books Online, the "System Stored Procedures (T-SQL)" topic categorizes SQL7's system and extended stored procedures into 12 categories. Why not use this as a "table of contents" for additional code reading on your own? -- Ed.]

Download MUNKSQL.exe

Paul Munkenbeck has been a DBA for more than 15 years. He has worked on all types of database, from desktop to multi-gigabyte mainframe. He was first exposed to Microsoft SQL Server as a database consultant on a project early in 1995. Having gained two years' experience of strategic SQL Server applications with a leading U.K. real estate company, Paul is now DBA for a services company that builds and runs a variety of database applications for its clients. MUNKY@compuserve.com.