This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.
|
Using Stored Procedures in SQL Server
Marco Tabini |
If you're a programmer, you know that SQL is becoming more and more prevalent. Here's a guide to one of its basic building blocks the stored procedure.
We live in a world of data-driven applications. Even the simplest Web page, with nothing but a bunch of text and some pictures, can effectively distribute information, making it a data-driven site.
Because databases are, by definition, applications that have been designed to manage information, their marriage to Web sites was inevitable. Of course, this inevitability is also the result of increasingly affordable enterprise-level database applications like Microsoft® SQL Server. Only a few years ago, the cost of creating a database-driven Web site on a mainframe or minicomputer would have been so high as to make the investment difficult to justify. You can now set up a very good Web server with less than $10,000 worth of hardware and software, so customers these days request data-driven Web applications even if they don't fully understand what they are. All they know is that it's the latest cool thing and they must have it. In many cases, if you decide on using Microsoft products as your Web development and deployment platform, your DBMS of choice will be SQL Server. Microsoft Access, although viable for small-scale solutions, is not a good choice for high-traffic sites, since it does not provide all the flexibility and enterprise-level reliability and functionality that SQL Server offers. The Right Tools The choice of what development program to use is yours, but here are some suggestions that you might find useful. SQL Server comes with a set of client tools for developing database applications. Query Analyzer (the successor to iSQL/W) lets you open several connections to different databases on different servers at the same time, write SQL statements, execute them, save them, or load them from disk (see Figure 1). This tool is a little crude when it comes to development managementit offers no clear overview of the content of a database, no user-friendly interface for modifying the structure declaration of a table, and so on. Query Analyzer is a rather simple interface to the database engine that simply opens a connection, sends in the text that you type for execution, and diligently returns the output from SQL Server. If you write ASP code, you probably are using Microsoft Visual InterDev 6.0. If you aren't, I strongly recommend that you consider adopting itVisual InterDev offers an integrated development environment that is complete, user-friendly, and powerful for database work as well as Web development. As you can see from Figure 2, you have the whole database structure at your fingertips at any time. In addition, any errors in your code are reported in a way that makes them easy to find and explains them well. The only drawback to Visual InterDev, from my point of view, is that when you execute a query of any kind, the results appear in a tiny child window at the bottom of your screen. This leaves you scrolling up and down checking for the query's result in a fairly limited space. If you make the window bigger, it becomes difficult to edit your code afterwards. Query Analyzer, on the other hand, adopts a multipanel approach, giving each window the same real estate. This makes scrolling through the results much easier. I use both programs. Query Analyzer's use of the screen's real estate makes for a great prototyping tool. I can do a lot of trial-and-error work with it, and the help, although not very user friendly, is always at hand. Visual InterDev is better when some code has to become part of the database. In that scenario, having the ability to browse through the database's full structure makes life a lot easier. ADO Under the Hood If you develop using ASP, you probably access databases with the ActiveX Data Objects (ADO) interface. ADO offers a wide range of functions that can be used as a uniform interface to access any databaseincluding SQL Server, Oracle, and Microsoft Access. But how does it really work? ADO relies on a relatively recent Microsoft technology known as OLE DB, which was designed to provide data- base access through COM-based interfaces. OLE DB, in turn, superseded ODBC. OLE DB has a number of advantages over ODBC, particularly thanks to certain Win32®-friendly design decisions. For instance, OLE DB provides mandatory support for multithreading (a firm requirement for COM objects). Where ODBC relies on the functionality provided by specific drivers to access the various DBMSs, OLE DB works by delegating this task to a provider. A provider is simply a piece of software that has been developed to be compatible with OLE DB and supply it with data. Because of the vast quantity of ODBC legacy code, ODBC is an OLE DB provider. ODBC isn't always a safe choice, particularly because it does not guarantee its drivers to be thread-safe, but you should be okay if you use the latest SQL Server, Microsoft Access, or Oracle drivers. As you can see, there are quite a few things going on behind the scenes when you use ADO to access a database. Your requests go through up to four different pieces of middleware before they actually end up on the database engine. Stored Procedures So what happens when your queries actually get to the database server? The answer is quite astonishing (or at least it was to me when I first learned about it): they are compiled. Most developers use SQL as if it were an interpreted language; they send the whole query text to the server every single time, where it's compiled and then executed. If you think about it, this is not the ideal approach. It's a bit like shipping the source code for a C++ application, then recompiling it every time it has to run. (This is an extreme example; the kind of compilation performed by the server is not the same as compiling a C++ program.) A much better solution would be to have the database server compile a query once, store it in a compiled format, and run it on request without having to recompile it every single time. This is what stored procedures do. The structure of a stored procedure is relatively simple. It can be declared using the following syntax: |
|
Procedure-name is an arbitrary name that you can use to recall the procedure later on. Owner is the SQL Server user who ownsor createdthe procedure. Parameter1 through parametern are parameters that can be passed at runtime rather than at compilation time. Datatype is the data type of each parameter ( "int", "char (100)", and so on), and default-value is the value that they must assume by default.
The simplest useful stored procedure is a query. For example, let's consider this query: |
|
If you were to run the query by passing it to the server every time, this is what would happen:
Using the same query syntax, you can create a stored procedure that will look something like this: |
|
From ASP, you can call a stored procedure by simply passing its name, rather than the complete query text. Just use the ADODB.Connection object's Execute method.
When a user calls a stored procedure, the server doesn't create a new data access plan when retrieving information. Clearly, when a procedure references a database table and the table's structure changes, the plan has to change, too. Unfortunately, SQL Server does not keep a complete database of dependencies and is not able to determine which procedures must be recompiled when a structural change takes place. The only way to avoid doing it yourself is to have SQL Server recompile the procedure every time that it is run. This is accomplished by using the keywords WITH RECOMPILE as part of your procedure declaration: |
|
This approach might seem like overkilland with good reason, since the advantage of substituting queries with stored procedures is almost completely lost. If you find yourself in a situation in which you have a considerable number of stored procedures and your table structures change quite often (say you are prototyping a new system and are using a trial-and-error approach), keeping track of all the dependencies can be pretty painful. You might want to force a just-in-time compilation of the procedures.
You will also want to check that your OLE DB providerODBC in this caseis not playing tricks on you behind your back. In fact, in an attempt to optimize the efficiency of generic SQL statements, the ODBC driver for SQL Server will convert all the queries it receives into temporary stored procedures, and then translate similar queries into calls to the procedures it has created. You might want to turn off this behavior to avoid the unnecessary overhead caused by the driver trying to convert everything you feed it into stored procedures. (The driver will not actually do this, but it will try to.) Just go to the Windows Control Panel and double-click on ODBC. Select the datasource that you will be using to access the database, click on Next until the configuration wizard asks if you want to create temporary stored procedures for prepared statements (see Figure 3), and make sure that checkbox is unselected. |
Figure 3: Deselecting Temporary Stored Procedures |
Although you might not be interested in temporary stored procedures, temporary entities such as tables can be very useful when you're working with stored procedures. Let's assume, for example, that your database contains two tables: AsiaBooks and AmericaBooks. These tables contain the titles, authors, and prices of books coming from Asia and from the U.S. Since the contents of the two tables are so similar, it would make a lot of sense to extract data from them using a single statement. One possible way to do this is to create a temporary table and then store the data from both tables in it.
A temporary table is created exactly the same way as a normal table. The only difference is that the name is preceded by one or two # characters and cannot be longer than 20 characters. A single # character means that the table should be available to the owner of a specific user session (in the local temporary table), while two # characters mean that the table should be available to all users (in the global temporary table). Local temporary tables are dropped from the database automatically when the user session that created them is closed; global temporary tables are killed when the last user session using them ends. Keep in mind that "user session" means a connection to the database, not the ASP user session that you may be used to. SQL doesn't really know that it's being called by a Web applicationand doesn't really care.
Variables
There are times when queries alone are not enough to retrieve the information you want from the database. This might occur when you need to search for particular information, or when you need to modify the data before it reaches the ASP application. This annoying problem is usually solved by writing long scripts that create incredibly complex SQL queries. The result is usually something that doesn't really work too well. Remember that when you decide to process the information on the ASP side, you force the transmission of data that you probably don't need from the SQL server to the Web server. If you were to do all the calculations on the SQL server, then return only the data the Web server needs, you could experience significant gains in speed and efficiency.
When creating complex procedures, variables are a helpful tool. Like any other programming language, a SQL variable's function is to store values of various types. Similar to C and C++, SQL is very strict about a variable's type; you can't ever store a string value in a variable that has been declared as integer (unless, of course, you first typecast the value to a string). SQL doesn't support a variant type, which is important to keep in mind when working from ASP, where all variables are variants.
To declare a variable, you can use the following syntax:
|
Varname is the name of the variable, and type is a valid SQL type name (like varchar). Remember that variables must always be preceded by an @ character. SQL Server also defines a number of global variables that are available to all users. Global variables can only be read and are available without being declared. They are identified by double @ characters preceding their names (for example, @@VERSION).
To store a value in a variable, you should use the select keyword, which should be familiar from creating queries. |
|
As you can see, the syntax is a little different in this case: variable is the name of a variable, while value is a valid value for the variable's type. So, for example |
|
is a valid statement that stores the value 10 in the variable @i. On the other hand |
|
will produce an error because the maximum length of the @s variable is five characters.
Variables can be manipulated using standard arithmetical operators, like addition (+), subtraction (-), multiplication (*), and division (/), as well as bitwise operators for the logical AND (&), OR (|), exclusive OR (^), and NOT (~). In addition, SQL Server offers a wide variety of functions that can be used to manipulate variables and their content. (Check the help files that come with SQL Server for more information.) Execution Flow Control Now that I've added variables to the repertoire of development tools for stored procedures, I need to be able to control the flow of execution in response to certain conditions. This is an essential aspect of software programming. Note that you can group portions of SQL code into blocks. If you develop in VBScript, you are probably used to identifying blocks of code, for example, as part of an if-then-else statement. In fact, the following |
|
works just as well as this: |
|
Although the first expression is more elegant than the second, they produce the same results. Of course, you will want to use the second form if you need to execute more than one statement in the true condition of the statement.
SQL uses a similar approach to let you identify a block of commands and distinguish between a succession of instructions and a group of instructions. The keywords BEGIN and END are used for this purpose: |
|
SQL also includes support for if-then-else statements using a syntax that is very similar to what VBScript uses: |
|
Here, condition is any valid logical comparison. SQL supports the same exact set of comparison operators used by VBScript, as you can see in Figure 4. Instruction-block is a group of one or more instructions that must be executed if condition is true or false, depending on where they are located. If only one instruction must be executed, it can appear on the line right after the condition: |
|
Otherwise, the BEGIN/END keyword pair must be used: |
|
As in VBScript, you can nest if-then-else statements within others. There is no limit to the number of nesting levels that can be used. Another interesting construct that's related to if-then-else is the case expression. Unlike VBScript, where this construct has a strong relationship to if-then-else, SQL Server supports case statements a bit differently: |
|
This type of case construct can be used to replace one expression dynamically with another in response to certain conditions. For example, let's consider this query: |
|
In the Books table, the column Type is an integer that contains 0 for fiction books and 1 for nonfiction books. If you were to print out this query, you would only see a list of numbersnot very useful from a user-friendliness point of view. You could rewrite the query like this: |
|
This modified query will produce a much more readable result with very little effort. Suppose you also want to add a column that tells you the year when the book was published, as well as another column to categorize the year as follows:
You obviously can't resolve this problem with the case construct. However, SQL supports a different kind of case construct that just might help:
|
|
The results here are actually triggered by a Boolean expression, usually the result of a comparison. In this case, you can change the query, as shown in Figure 5.
Finally, the last flow control construct to discuss is WHILE, which can be used to create loops inside a stored procedure. Its syntax is the following: |
|
The WHILE construct repeats its block of instructions until the expression boolean-expression becomes false. This means that you can create infinite loops inside stored procedures. Usually, you can stop an infinite loop by closing the database connection that initiated it, even though doing so may lead to locking problems within the database's tables.
The BREAK and CONTINUE keywords can be used to forcibly exit the loop and short-circuit it, respectively. BREAK will cause the loop to conclude and will move the execution pointer to the first instruction after the WHILE statement. CONTINUE will force the execution pointer to move to the first statement inside the WHILE block, therefore restarting the next instance of the loop. You can generally use both instructions in response to an if-then-else condition (that is, "if we are finished with the analysis, end the loop"). Navigating a Table One of the most important advantages of writing procedures on a database system is that you can use the data stored in tables. Until now, all that I have done with it is use a select statement to either return a recordset to the user or transfer records from one table to another. Using a cursor, it is possible to browse through the contents of a tableor the resultset of a stored procedure, for that matterwhile accessing records randomly. The syntax for declaring a cursor is a little complex (see Figure 6). As you can see, there are many ways in which you can create a cursor, although you will generally choose between scrollable or forward-only and read-only or updateable cursors. Once a cursor has been declared, it must be opened using the OPEN statement: |
|
Once a cursor has been opened, you can check the number of rows that have been returned by your select statement by checking the value of the @@CURSOR_ROWS global variable. If @@CURSOR_ROWS contains a negative value, the server decided to start a new asynchronous execution thread to handle a large number of return rows. This allows the cursor to be opened immediately, even though its resultset has not been fully generated yet. In this case, the absolute value of @@CURSOR_ROWS corresponds to the number of rows currently available in the cursor.
To retrieve the values of a particular row in local variables, you can use the FETCH statement: |
|
As you may have noticed, there are several ways to access a specific record. Just remember two things. First, unless you've declared the cursor as scrollable, you will not be able to use any access method other than NEXT. Second, you must make sure that the variables named in the INTO section correspond to the columns of the table in the order in which they will come out of the select statement. If they don't, or if there is an incompatibility in the data types reported by the statements and your variables, SQL Server will generate an error.
To fetch all the available rows in a cursor, you can use one of two methods. The first is to create a WHILE loop that is executed as many times as the value of the @@CURSOR_ ROWS variable. This approach has two disadvantages. First, @@CURSOR_ROWS does not always return a correct representation of the select statement's return set. Second, @@CURSOR_ROWS is modified every time a new cursor is open, which can lead to a plethora of problems. A far better method is to create a WHILE statement that checks the value of the @@FETCH_STATUS global variable. This variable contains 0 if a valid row was retrieved by the last FETCH operation, -1 if the end of the recordset has been reached, and -2 if the current row is invalid (it has been deleted). If you have declared an updateable cursor, you can change the values of a row in a table referred to by a cursor by using the familiar UPDATE statement. All you have to do is add these keywords: |
|
Finally, when you are finished with your cursor, you can close it using the CLOSE statement: |
|
You also have to destroy all the structures that are allocated by SQL Server when you create the cursor by using the DEALLOCATE statement: |
For More Information
For some reason, I haven't found many books about how to program with SQL Server. The best source of information (at least for version 6.5) is the incredibly complete SQL Server manuals. You'll also find a lot of assistance in the help file for SQL programming that comes with Query Analyzer. The help file is especially handy when you have to understand why a query that would so easily solve all your problems just doesn't want to work. Finally, your best choice for things that you really can't figure out is the Microsoft Knowledge Base, which you can find at
http://support.microsoft.com. There are a lot of entries about SQL Server that illustrate the solutions to several bugs and explain how to solve some really hairy problems. |
|
From the March 1999 issue of Microsoft Internet Developer.