Coding T-SQL Like an Expert
Paul Munkenbeck
To gain competence in a language
like Transact-SQL (T-SQL), it's not enough just to know the syntax and grammar.
Knowledge of good techniques comes from experience and a good mentor. In
this article, Paul describes some of his favorite tips and tricks as used
by the experts who developed SQL Server's system procedures. Shoulders of
giants and all that . . .
With more than a decade of DBA experience under my belt, I've written a
lot of code, but I've also read a lot of code. Frankly, I consider
the code in the system stored procedures to be a real treasure chest. Quite
often I've come across a scrap of code that's made me think, "I didn't
know you could do it like that!" and ultimately, I've found a "home"
for the idea. I thought it would be worthwhile to share some of my favorites
with you. Although the research for this article was from the version 6.5
procedures, I've checked that the techniques are still perfectly valid under
version 7.0.
Studying (and mining) the system stored
procedures
SQL Server comes with many stored procedures
for retrieving or manipulating data stored in the system tables. Most of
these are created in the master database, although those that manage scheduled
tasks for SQL Executive can be found in the msdb database. The system stored
procedures are well-documented in Books Online. But not everyone
realizes that you can view the code of any of these procedures under Enterprise
Manager (EM) just like the stored procedures you've developed yourself.
System stored procedures all have names that start with the prefix sp_ (in
lower case). SQL Server recognizes these and gives them some special characteristics.
You can also create your own stored procedures with names starting with
sp_, and they'll inherit these special properties:
- sp_ procedures can be executed from any database.
- SQL Server 6.x first looks for the procedure
in the current database; if it can't find it, then it tries the master
database.
- SQL Server 7.0 first looks for the procedure
in master database; if it can't find it, then it tries the master database
-- hence, you can no longer override system procedures with your own versions
unless you qualify them with the database name.
- If tables referenced within an sp_ procedure
aren't found in the current database, then SQL Server looks for them in
master.
- Objects referenced within an sp_ procedure aren't
resolved at compile time, but only when it's executed.
As well as the sp_ procedures, there are also some system extended stored
procedures, whose names are prefixed with xp_. Each of these carries out
its processing using a dynamic link library (DLL) rather than interpreted
T-SQL code.
But now let's look at some of the techniques of the experts.
EXECUTE()
The EXECUTE() statement, along with the
character string you pass it, works hand in glove with system stored procedures,
and you're likely to find yourself using it time and again. You can construct
the character string from any number of literals and string variables. At
runtime, EXECUTE() resolves all the values and executes the string as a
command.
For instance, the final report in procedure sp_helprotect contains a SELECT
statement with an ORDER BY clause. The column used in the ORDER BY depends
on the parameters chosen by the user, so it's specified as a variable that's
substituted for a real column name at runtime. Here's an example that illustrates
the technique using the authors table in the pubs database:
DECLARE @OrderBy varchar(255)
SELECT @OrderBy = 'au_lname'
EXECUTE('
SELECT au_fname, au_lname, phone
FROM authors
WHERE state = ''CA''
order by ' + @OrderBy + '
')
|
Note that if you need to include a literal string like `CA' within the EXECUTE
string, then you need to enclose it in pairs of single quotes. Within a
delimited string, SQL Server interprets two delimiters together as the character
itself rather than an end of string marker.
Reports with dynamic column widths
How often have you produced a report that
contains variable length column data and ended up with mostly white space?
Isn't it amazing how many of your varchar(255) columns end up with fewer
than 15 characters in them? Wouldn't it be convenient if you could shrink
the column width in the report so it just fits the longest piece of real
data? Well, system stored procedure sp_lock2 shows you how to do it.
Once again, I've taken the principle behind the stored procedure code and
used it in a more typical report, using the pubs database. Suppose you're
enhancing the previous report against the authors table, showing first name
(au_fname), last name (au_lname), and phone number for all authors based
in California. The name fields are both varchar, whereas phone is fixed
length. First, determine the longest first and last names for the selected
authors:
DECLARE @MaxLen_lname int,
@MaxLen_fname int
SELECT @MaxLen_lname
= isnull(max(datalength(au_lname)),8),
@MaxLen_fname
= isnull(max(datalength(au_fname)),8)
FROM authors
WHERE state = 'CA'
|
Once this has run, the @MaxLen variables hold the widths of the longest
names. The isnull function copes with the possibility of the max() function
returning a NULL, and, in this case, it sets the width to 8. This is simply
the length of the column title -- au_lname. Guess how we're going to code
a report that uses variables to set column widths dynamically? That's right,
with EXECUTE():
-- convert width values to strings for EXECUTE:
DECLARE @width_fn varchar(10),
@width_ln varchar(10)
SELECT @width_fn=convert(varchar,@MaxLen_fname),
@width_ln=convert(varchar,@MaxLen_lname)
-- run the report:
EXECUTE ('
SELECT au_fname=SUBSTRING(au_fname,1,'+@width_fn+'),
au_lname=SUBSTRING(au_lname,1,'+@width_ln+'),
phone
FROM authors
WHERE state = ''CA''
')
|
Although I've called the report columns au_lname and au_fname -- the same
as the original columns -- they are, in fact, new columns derived
from a computed expression. Because of this, SQL Server calculates their
widths as the lengths of the result of the substring expressions. By the
time the EXECUTE() string has been resolved, the length of each substring
is exactly the value of the @MaxLen variables. The result of the query looks
something like the following:
au_fname au_lname phone
-------- ---------- ------------
Johnson White 408 496-7223
Marjorie Green 415 986-7020
Cheryl Carson 415 548-7723
Michael O'Leary 408 286-2428
Dean Straight 415 834-2919
-- other names omitted
|
Isn't that neat? Try it for yourself with different WHERE conditions, and
see the column widths change dynamically. The full code can be found in
script widths.sql in the accompanying Download
file.
Return codes and row counts
Until I actually inspected the system stored
procedure sp_certify_removable, I thought I understood the power of the
EXECUTE() syntax. But in reading the sp's code, I realized that you could
include a whole batch of commands in one EXECUTE. Now it seems so
obvious that I'm hesitant to admit to not knowing it all along. Nevertheless,
I'll bravely forge forward with some explicit examples under the assumption
that some of you might have missed the technique as well.
Within a single EXECUTE() batch, sp_certify_removable switches to the relevant
database, runs a nested stored procedure, checks its success, and passes
this back to the main code through clever use of the @@ROWCOUNT global variable.
It relies on the following properties of an EXECUTE() batch:
- You can declare and use local @variables within
a batch, but they're destroyed on return from the batch.
- You can execute stored procedures from within
the batch and test the return code as usual.
- The count of rows affected by each statement
in the batch populates the @@rowcount global variable as usual.
- The value of @@rowcount is preserved through
the end of the batch.
For illustrative purposes, I've extracted the pertinent code, fixed a minor
bug (an unnecessary RETURN statement that caused an immediate exit even
when run from within an EXECUTE() batch), and used it as the basis for a
stored procedure I've called sp_object_test. Given a database name, all
this does is determine whether or not a certain object (test_object) is
set up in the database.
CREATE PROCEDURE sp_object_test (
@dbname varchar(30) ) AS
execute('
use ' + @dbname + '
declare @inx_ret_value int ,@int1 int
select @inx_ret_value = 1
exec @inx_ret_value = sp_help ''test_object''
set nocount off
if @inx_ret_value <> 0 --bad
begin
select @int1 = suid from master.dbo.syslogins
where suid=1 --SA
end
else
begin
select @int1 = suid from master.dbo.syslogins
where suid=1 and 1=2
end
')
if @@rowcount > 0
begin
raiserror ('non-zero return from sp',16,1)
return (1) --Error was returned
end
|
If the nested stored procedure fails (yet is coded to allow processing to
continue for error handling), then @inx_ret_value will be set non-zero and
suid=1 is selected from syslogins. Since suid=1 refers to the sa who is
always present, the number of rows affected by the select is 1 -- setting
@@rowcount to 1. If, instead, the stored procedure is successful, then the
select from syslogins includes the test "and 1=2". As this is
always false, the number of rows returned will be zero, and so this becomes
the value of @@rowcount. The full code for this example can be found in
script obj_test.sql in the accompanying Download
file.
Reducing temporary table names
Procedure sp_lock2 demonstrates an elegant
technique for displaying temporary table names in reports. When displaying
lock information against tables in tempdb, sp_lock2 reduces the temporary
table names by stripping off the system-generated suffix. You probably know
that when you create a temporary table -- called, say, #mywork -- then SQL
Server actually creates a new table in the tempdb database.
To ensure that this table has a unique name, it pads it out to the maximum
length (30 in version 6.x, 128 in 7.0) with a suffix consisting of underscore
characters followed by a unique number (10 digits in version 6.x, 12 in
7.0) at the end. This is why the rules for creating temporary tables state
that the user-defined name, including the # character, shouldn't exceed
a certain length (20 in version 6.x, 116 in 7.0). So in version 6.5, temporary
table #mywork will have a physical table name something like `#mywork_____________0000000005'.
If you choose a short temporary table name, then there will be at least
two underscore characters together, and these will mark the end of your
chosen name. Like sp_lock2, you can use this to derive the original temporary
table name from its real physical name, as follows:
SELECT Physical_Name = name,
Name=SUBSTRING(name,1,PATINDEX('%[_][_]%',name)-1)
FROM tempdb..sysobjects
WHERE name like '#%[_][_]%'
|
The PATINDEX function searches for the starting position of one string within
another, using a LIKE expression to define the characters to search for.
In this case, both the PATINDEX and LIKE are looking for two underscore
characters together. The complication is that an underscore has a special
meaning in LIKE expressions, and therefore each instance of it has to be
enclosed in square brackets. This convention tells SQL Server to ignore
the special meaning and look for the underscore character value itself.
Hence '#%[_][_]%' means "names that start with a hash sign and then
two underscores together anywhere in the string." Having found the
position of the double underscore, the SUBSTRING function then pulls off
all the characters of the name up to this point. This is exactly the original
temporary table name defined by the user.
Testing for object existence
Along with several other system stored
procedures, sp_helprotect uses an interesting technique for determining
whether an object exists in a database. When you use Enterprise Manager
to generate the script for dropping an object, it builds code like this:
if exists (select * from sysobjects
where id = object_id('dbo.authors')
and sysstat & 0xf = 3)
drop table dbo.authors
GO
|
The WHERE clause against systat contains the expression "& 0xf",
which tests the low byte of the sysstat column. The value of this byte indicates
the type of database object; in this case, it's type 3, which means a user
table (see sysobjects in Books Online for other values). When
writing my own code, I often want to do something similar to this, but I
can never remember the exact syntax or the various values of sysstat! I
can understand that system-generated scripts have to do the check just in
case the same name has been given to two objects of different types. But
my object naming conventions ensure that each object has a completely unique
name within the database, so I can dispense with the object type test in
my own code. I was delighted to see a simplified version of the syntax in
system stored procedure sp_helprotect. The technique is to test for a NULL
object_id value rather than an existence test in sysobjects:
IF (OBJECT_ID('dbo.authors') IS NOT NULL)
DROP TABLE dbo.authors
|
The OBJECT_ID function returns the identifier of any object in sysobjects.
If the object doesn't exist, then it returns a NULL. Of course, you can
use this whenever you need to test for an object's existence, and not just
because you want to drop it. Interestingly, the technique also works for
temporary tables, and you don't need to know the full system-generated name
(as described in the section "Reducing Temporary Table Names").
The test can be coded as follows:
IF (object_id('tempdb..#mytemp') IS not Null)
Drop Table #mytemp
|
Although `tempdb..#authors_temp' is a truncation of the actual physical
name of the object as held in tempdb..sysobjects, somehow OBJECT_ID() recognizes
that you mean "your" temporary table called #mytemp.
Reading the Registry
Not all the system stored procedures have
been described in the SQL Server manuals. I don't usually like to rely on
undocumented features. However, when Microsoft's own developers have seen
fit to use them in the system stored procedures, then I feel more confident
that they're just documentation omissions rather than components that are
likely to change in future releases.
With this in mind, I've been happy to exploit the undocumented extended
stored procedure xp_regread. I first saw this in the expertly written sp_certify_removable.
The xp_regread procedure navigates the local system Registry and returns
the value for an entry that you specify. In sp_certify_removable, it's used
to determine the pathname under which SQL Server has been installed:
declare @SQLPath varchar(255)
exec master..xp_regread
'HKEY_LOCAL_MACHINE'
,'SOFTWARE\Microsoft\MSSQLServer\Setup'
,'SQLPath',@SQLPath OUTPUT
|
The four parameters it takes are:
- the name of a sub-tree
- the path to a Registry key within that sub-tree
- the value entry name to be interrogated
- an output variable to contain the value found
Note that there's also a complementary procedure xp_regwrite for updating
the Registry. However, this sort of thing always makes me extremely nervous,
and I'm delighted to say I haven't yet found an application for it. The
structure and names of the parts of the Registry are explained in the NT
Resource Kit resource guides, and various SQL Server Registry entries
are described in the MS SQL Server Resource Guide. The main Registry
sub-trees for SQL Server 6.x are:
\HKEY_LOCAL_MACHINE
\SOFTWARE
\Microsoft
\MSSQLServer
\HKEY_LOCAL_MACHINE
\SYSTEM
\CurrentControlSet
\Services
\MSSQLServer
\HKEY_LOCAL_MACHINE
\SYSTEM
\CurrentControlSet
\Services
\SQL Executive
|
Archiving or deleting files created before a certain
date
I found xp_regread to be invaluable recently
while developing a housekeeping application to archive external files that
were older than a certain number of days. My preferred way of doing this
is to generate the cut-off date using T-SQL and then use this as a parameter
to the archiving utility PKZIP. PKZIP has a "move" mode that deletes
files after archiving them, and this performs the housekeeping function
I need. So to completely remove all those old files, all I need to do is
just delete the ZIP archive file.
The problem I had to overcome is one that might be unknown to those of you
who develop applications just for North America, but it's one that the rest
of the world has to suffer -- namely, the localization of settings such
as dreaded date formats! PKZIP interprets the value in its date selection
parameter according to the date layout as specified in the Control Panel
International or Regional Settings.
Where I work, we try to ensure that all of our servers are configured to
use the English U.K. settings -- that is, with a date format of "ddmmyy."
Unfortunately, some of our servers inadvertently get set up with the default
U.S. English format "mmddyy." For my archiving application, I
need to know which configuration has been used. The only answer was to delve
into the Registry, and so I turned to xp_regread, as follows:
declare @sShortDate varchar(255)
exec master.dbo.xp_regread 'HKEY_CURRENT_USER',
'Control Panel\International',
'sShortDate',@sShortDate OUTPUT
|
This query works as long as you're running it from a service started under
a logged in account. If not (that is, if you're running under local system
account), then there will be no current user, and the HKEY_CURRENT_USER
sub-tree won't exist. Then the call will fail with the message "RegConnectRegistry
- #6 The handle is invalid." If this is the case, then you need to
interrogate the "default" user sub-tree instead:
exec master.dbo.xp_regread 'HKEY_USERS',
'.DEFAULT\Control Panel\International',
'sShortDate',@sShortDate OUTPUT
|
The sShortDate key in the Registry holds the server date format as a "date
picture" string, which is described in the Windows Developer Guide.
I know that if the date picture starts with a "d," then the server
was configured for a U.K. date format; otherwise, it was left to the default
U.S. layout. With my cut-off date held in a datetime variable, I need to
use the CONVERT function to get it into a string variable with the nearest
layout acceptable to PKZIP. The following code carries out the necessary
date manipulation:
-- archive files more than 2 weeks old:
declare @cutoffdate datetime
select @cutoffdate = dateadd(day,-14,getdate())
-- interpret sShortDate as convert date style:
DECLARE @style int
SELECT @style= CASE substring(@sShortDate,1,1)
WHEN 'd' THEN 103 -- dd/mm/yyyy
ELSE 101 -- mm/dd/yyyy
END
-- convert cut-off date into string in local format:
DECLARE @cutoff char(10)
SELECT @cutoff = CONVERT(char(10),@cutoffdate,@style)
-- strip out separators for pkzip format:
DECLARE @pkdate char(6)
SELECT @pkdate = SUBSTRING(@cutoff,1,2)
+ SUBSTRING(@cutoff,4,2)
+ SUBSTRING(@cutoff,9,2)
|
Note that I always use century date formats, even if the century digits
need to be stripped off for the target application. At least my code fragments
are year 2000 compatible! Finally, I build the PKZIP command line, using
the -T parameter (select files older than a date) and the -m parameter (delete
files after archiving). As I run this script from command line isql, I use
the PRINT statement to output the PKZIP command into a command file. I can
then switch to the appropriate directory containing the files to archive
and then CALL the generated PKZIP command file to execute it.
-- build PKZIP command string:
DECLARE @command varchar(255)
select @command = 'pkzip.exe -T ' + @pkdate
+ ' -m archive.zip *.* '
PRINT @command
go
|
The full code for this example can be found in script archive.sql in the
accompanying Download
file.
Writing external text files
Recently I needed to create a DOS file
from within a T-SQL application. It had to contain certain information such
as the host SQL Server name, which I use to set the isqlserver environment
variable. If no server name is specified on a batch isql.exe command, then
it defaults to the value in isqlserver. Luckily I had seen an example of
how to do this in my favorite source for this article -- namely, system
stored procedure sp_certify_removable. The trick is to use the NT command
ECHO from within xp_cmdshell as follows:
USE master
GO
DECLARE @sql varchar(255)
SELECT @sql = 'exec xp_cmdshell "ECHO'+
' REM Set server variable'+
' > c:\mssql\data\SQLSET.CMD"'
EXEC (@sql)
SELECT @sql = 'exec xp_cmdshell "ECHO'+
' set isqlserver='+@@servername+
' >> c:\mssql\data\SQLSET.CMD"'
EXEC (@sql)
|
Within a DOS window, the NT command ECHO normally sends output to the screen.
Alternatively, the redirection symbol > can be used to send the output
direct to a printer or a file. The single redirection symbol indicates that
the output overwrites the existing contents of the target file, if any.
The double redirection symbol causes the output to be appended to the end
of the file.
I'm sure you won't be surprised to see the preceding use of EXECUTE(). In
this case, it resolves the @@servername variable into a string to be output
by ECHO. So, on a server named MYSERVER, the preceding code will create
the following command file:
REM Set server variable
set isqlserver=MYSERVER
|
Using this method, you can easily use T-SQL to produce command files that
are customized to the environment you want. The code for this example can
be found in script DosFile.sql in the accompanying Download
file.
Smarten up your RAISERROR output
RAISERROR is a very flexible tool for reporting
error conditions back to an application that executed the T-SQL code. To
quote from the T-SQL Reference: "RAISERROR lets the client retrieve
an entry from sysmessages table or build a message dynamically with
user-specified severity and state information."
Note: For a full description of RAISERROR syntax and the meanings of the
severity levels, see Books Online. The examples in this article use
a simplified version of the RAISERROR syntax, as follows:
RAISERROR ({msg_id | msg_str}, severity, state)
|
where msg_id is a user-defined error message
stored in the sysmessages table, msg_str is an ad hoc message
string, severity is the severity level associated with this message,
and state is an integer that identifies the source of a message if
it can be issued from more than one place.
Many system stored procedures use RAISERROR to report information messages,
and it is this technique that I want to explore here. Take a look at the
following code:
RAISERROR('Hello!',-1,-1)
RAISERROR('',0,1)
RAISERROR('My message',2,4)
RAISERROR('',0,1)
RAISERROR('Goodbye!',-1,-1)
|
This produces output similar to the following:
Hello!
Msg 50000, Level 2, State...
My message
Goodbye!
|
First of all, note that RAISERROR(' ',0,1) produces a single blank line
in the message output. This can be very handy for spacing out a long message
to make it more readable. Next, notice that the lines with the low severity
numbers display just the message text, whereas higher severity numbers cause
the additional line `Msg xxxxx, Level y, State z.' It's most interesting
to experiment and find which severity values will suppress this additional
line. In fact, it's any negative number, zero, one, and 10! (Apparently
10 is included for backward compatibility.)
There's one more interesting trick when using RAISERROR with pre-defined
messages. In sysmessages, these will be defined with a default severity
level. This is normally overridden when issuing RAISERROR. But what if you
want to use the default value? One way is to look it up in sysmessages
and code it explicitly. The system stored procedures show us an easier way
-- just code a negative severity level, and RAISERROR will automatically
revert to the default. If the state is coded as a negative, then it defaults
to 1. So this:
RAISERROR (14027,-1,-1,'My object')
|
produces the message:
Msg 14027, Level 11, State 1
My object does not exist in the current database.
|
With these tips in mind, you can now control exactly how your information
messages appear.
Follow the leaders
I hope you've found these expert techniques
-- no, I'm not tooting my own horn, just acknowledging the value of the
code in SQL Server's own stored procedures -- to be valuable. They've certainly
helped to make my T-SQL coding more productive. Now, when I hear
about a new stored procedure in a SQL Server upgrade, I check out the manual
and then go straight to the code to see what new methods the experts have
come up with. I think I'm going to be very busy for a while with version
7.0! [In the new SQL7 Books Online, the "System Stored Procedures
(T-SQL)" topic categorizes SQL7's system and extended stored procedures
into 12 categories. Why not use this as a "table of contents"
for additional code reading on your own? -- Ed.]
Download MUNKSQL.exe
Paul Munkenbeck has been a DBA
for more than 15 years. He has worked on all types of database, from desktop
to multi-gigabyte mainframe. He was first exposed to Microsoft SQL Server
as a database consultant on a project early in 1995. Having gained two years'
experience of strategic SQL Server applications with a leading U.K. real
estate company, Paul is now DBA for a services company that builds and runs
a variety of database applications for its clients. MUNKY@compuserve.com.