INSIDE SQL
SERVER Stored
Procedure Plans Improve performance through
caching
In SQL Server 7.0, reusing compiled plans for stored procedures
can make a dramatic difference in overall system performance. (For
more information on plan reuse, see "SQL Server 7.0 Plan Caching,"
September 1999.) The greater complexity of the Query Optimizer saves
parse-and-compile time (which includes query optimization). But
reuse of compiled plans isn't always the most important reason for
using stored procedures. With earlier SQL Server releases, the
savings aren't great, and SQL Server 7.0 includes other methods of
saving and reusing query plans. So why should you use stored
procedures?
Stored procedures let you develop SQL code in programmable units,
which anyone working on the application can use. Stored procedures
allow controlled access to data, so a user can have permission to
run a procedure without having ad hoc access to the object that the
stored procedure references. From a performance standpoint, the
savings in network traffic of not needing to send the entire block
of code from client to server can be an important reason for using
stored procedures. So let's look closer at stored procedure plan
caching and how best to use it.
Stored Procedure Caching
vs. Plan Caching SQL Server uses several mechanisms
to save and reuse plans for ad hoc or autoparameterizable queries.
Most of these techniques deal with optimizing a SQL statement. With
stored procedure plan caching, you're dealing with multiple
plans-one plan for each statement in the procedure. Generally, all
statements in a procedure are compiled at the same time, so you can
consider the procedure's plan to be the combination of the plans for
all the statements.
Procedures can be parameterized, and in general, the plan for a
procedure is based on the arguments that are passed the first time
the procedure is called. Regardless of the value or values passed
in, subsequent executions of the same procedure use the same plan,
which isn't always a good idea. For example, suppose you have a
stored procedure that computes information about customers'
purchasing habits by country. If the procedure accepts a parameter
for the country, it might contain a statement such as SELECT country(*) FROM sales
WHERE country = @country
Suppose you have a nonclustered index on the country column. If,
the first time the procedure is called, the country you supply as
the parameter occurs only a few times in the table, the optimizer
might use the index. (Nonclustered indexes are typically a good
choice only when the rows that need to be accessed form a small
percentage of the table.) After that, the saved plan for this stored
procedure always tells SQL Server to use the nonclustered index on
the country column. If someone executes the procedure with a country
value that occurs in many rows, the original plan is still used,
even though it's not an optimal plan for this second case. The
mechanisms that SQL Server uses to autoparameterize queries that
aren't stored procedures are very conservative, and SQL Server
doesn't autoparameterize any query in which a parameter change could
lead to a different plan. However, when writing stored procedures,
you can control when a stored procedure is recompiled.
The mechanisms for recompiling procedures are well-described in
the online documentation, so I won't go into detail here. If you
check the syntax for the CREATE PROCEDURE statement, you'll see that
you can add the option WITH RECOMPILE to the procedure definition.
This option causes SQL Server to compile and optimize the procedure
every time it runs. If you know that different parameters might
require vastly different plans, consider this option. Also, you can
recompile one execution of a procedure by using the WITH RECOMPILE
option with the EXECUTE statement. Any time a procedure is executed
with this option, SQL Server doesn't check for any existing plans in
cache, and the new plan developed for this single execution is saved
in cache.
The option of recompiling a stored procedure is different from
the new SQL Server 7.0 capability of automatically updating
statistics. (See "SQL Server Statistics: A Useful Query Optimizer
Tool," August 1999, for details on automatic updating of
statistics.) Automatic updating of statistics doesn't mean that
plans automatically recompile. The statistics update only if SQL
Server recompiles and reoptimizes a plan or if the query optimizer
detects that the statistics are out of date. However, to take
advantage of updated statistics after SQL Server saves a procedure
plan, you need to specify one of the recompile options to invoke the
optimizer.
Another feature of stored procedures in SQL Server 7.0 sometimes
can cause a recompilation. If you build a temporary table, then
create an index on that table, and later in the same procedure add
data to the table, your stored procedure might be compiled several
times. If SQL Server tries to compile the entire plan at once before
executing the statements in the procedure that populate the table,
the optimizer has no way to know what the data will be when you
execute the procedure. In fact, if a procedure contains the creation
of a table, SQL Server compiles it only up to the first reference to
the table. At execution time, it executes the plan up to the first
reference to the table and recompiles from that point. As you add
more data to the table, the procedure might be recompiled again. If
SQL Server detects that a table was empty when the procedure was
first compiled (e.g., if you'd just created the table), the query
processor sets a low recompilation threshold for the table. At
present, this threshold is five data-modification statements
(INSERT, UPDATE, or DELETE). So every time five or more changes are
made to the newly created table, the procedure recompiles.
You can use SQL Server Profiler to watch automatic recompilation
occur: Trace the event called SP:Recompile in the Stored Procedures
category. If you also trace the event called SP:StmtStarting, you
can see at what points the procedure is recompiling. Note that this
event shows up in the Profiler's event list only if you enable the
Profiler option to show all events. One way to avoid repeated
recompilation is to include, at the beginning of the procedure, all
data-definition statements dealing with temporary tables and
inserting rows into the temporary tables. So if the procedure must
be recompiled, the compilation won't happen more than once. Another
way to prevent recompilation is to include the query hint KEEP PLAN
in your statements that access the temporary tables. This hint is
part of the OPTION clause and looks something like SELECT <some columns>
FROM #newly_built_temp_table
OPTION (KEEP PLAN)
This hint tells SQL Server to use any plan that has been saved
for this statement, even if you've made several changes to the
temporary table.
Limiting the Number of
Cached Plans SQL Server limits the number of plans
for each stored procedure. Because plans are reentrant in SQL Server
7.0, limiting plans is much easier than in previous releases. The
online documentation states that any procedure can have, at most,
two compiled plans (one for parallel plans-those that execute on
multiple processors-and one for nonparallel plans), but there are
exceptions. Certain situations cause SQL Server to store multiple
plans for the same procedure. The most likely situation is a
difference in certain SET options, database options, or
configuration options. For example, a stored procedure that
concatenates strings might compile the concatenation differently
depending on whether the option CONCAT_NULL_YIELDS_NULL is on, or
whether the corresponding database option is true. If a user
executes the procedure with an option on, that execution uses a
different plan from the one it would use if the option were off.
As with ad hoc and autoparameterized queries, the system table
syscacheobjects can help you see when new plans are created and when
existing ones are reused. You can get this information from the
following query: SELECT sql, cacheobjtype, usecounts
FROM master..syscacheobjects
WHERE objtype = 'Proc'
The value in the sql column is the name of the stored procedure.
The value of the cacheobjtype column identifies whether the plan is
a compiled plan (the part of a plan that is reentrant) or an
executable plan (an instance of the compiled plan that contains
information describing a process that is executing the query). In
most cases, both compiled and executable plans remain in the memory
cache, but you'll find exceptions if you run tests. In particular,
if your stored procedure involves any sorting operations, you'll see
the compiled plans in cache, but the executable plans aren't saved.
Watching the usecounts result column from this query shows you how
often a particular plan is reused. In addition, if running this
query shows you multiple plans for the same procedure (i.e., the
same procedure name shows up more than once with the same
cacheobjtype value), you might want to return the value of the
setopts column. This value is a bitmap that tells you which session
and database options are enabled for the procedure. Two plans for
the same procedure usually have different values for setopts.
Although many people assume that recompilation is something to
avoid, this assumption is not always valid. If you know that updated
statistics can improve the query plan, or if you know that you have
wildly different possible parameter values, recompilation can be a
good thing. Using stored procedures when developing your
applications lets you determine when SQL Server reuses an existing
query plan and when SQL Server creates a new one.
|