Microsoft Corporation
March 1999
Summary: This article is a guide to storage engine architecture and recommends configuration parameters, best hardware to purchase with SQL Server 7.0, and innovative ways to store large amounts of data through files and file groups. This article also discusses new features in the SQL Server 7.0 storage engine. (29 printed pages)
Storage Engine Capacity Planning Overview
SQL Server 7.0 Storage Engine Highlights
Files and Filegroups
Assigning Files to Filegroups for Maintenance Reasons
Assigning Files to Filegroups for Performance Reasons
Hardware I/O Layout for Files and Filegroups
SQL Server Memory Planning
SQL Server 7.0 Processor Planning
The I/O subsystem (storage engine) is a key component of any relational database management system (RDBMS). A successful RDBMS implementation requires careful planning at the early stages of the project. The storage engine comprises much of this planning, including what hardware to purchase, how to store data on the hardware, and how to set the configuration parameters appropriately. To accomplish these tasks requires an extensive understanding of the RDBMS architecture.
This article includes:
Microsoft SQL Server 7.0 reduces the configuration and tuning required to implement and run database applications. A major goal of SQL Server 7.0 is to minimize the need for database expertise. This is accomplished by new features, including but not limited to: on-demand memory, on-demand disk, and dynamic tuning of configuration parameters. Many users can now implement a database application successfully without knowledge about the internal architecture of the database system; however, a small percentage of high-end applications always require more detailed knowledge.
The following table describes the changes and new features included in the Microsoft SQL Server 7.0 storage engine:
Table 1. Microsoft SQL Server 7.0 storage engine features
New features | Description |
Performance improvements | Performance for both OLTP and decision support is improved by implementing new features and improving existing systems. |
Complete row-level locking | The benefits of dynamic locking (choosing the right level of lock: row, key range, page, multiple pages, or table on the fly) are extended to all database operations: updates, deletes, and reads. |
Scalable storage, including VLDB support | The on-disk format and the storage subsystem provide storage that is scalable from a very small database (one that fits on a floppy disk) to very large databases (VLDB). |
Improved text and image support | SQL Server has complete content text query capability with new full-text search engine additions, which include full-text indexing. |
In Microsoft SQL Server version 6.5, devices and segments are used to physically allocate space for storing data, indexes, and logs. This can be cumbersome for applications with large amounts of data. In Microsoft SQL Server version 7.0, files and filegroups are used to allocate space for data, indexes, and logs, giving you many more choices for setting up data.
A file is a physical allocation of space and can be one of three types: primary, secondary, or log.
A primary file is the starting point of the database and contains the pointers to the rest of the files in the database. Every database has one primary data file. The recommended file extension for primary data files is .mdf (for example, E:\Mssql7\Data\Master.mdf). Secondary data files hold all the data that does not fit in the primary data file. Some databases may not have any secondary data files, and others may have multiple secondary data files. The recommended file extension for secondary data files is .ndf (for example, E:\Mssql7\Data\Mydata1.ndf). Log files hold all of the log information used to recover the database. There must be at least one log file for each database, although there can be more than one. The recommended file extension for log files is .ldf (for example, E:\Mssql7\Data\Mylog.ldf).
Database objects can be grouped in filegroups for allocation, performance, and administration purposes. There are two types of filegroups: user-defined filegroups and the primary filegroup. The primary filegroup or any of the user filegroups can be the default filegroup. The primary file is assigned to the primary filegroup. Secondary files can be assigned to user filegroups or the primary filegroup. Log files are never a part of a filegroup. Log space is managed separately from data space.
Note The preceding illustration shows a primary filegroup assigned as the default.
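For example, the following Transact-SQL sketch (the database name, file names, and paths are hypothetical) creates a database whose primary file is in the primary filegroup, whose two secondary files are in a user-defined filegroup on separate drives, and whose log file stands apart from any filegroup:

CREATE DATABASE Sales
ON PRIMARY
   (NAME = SalesPrimary,
    FILENAME = 'E:\Mssql7\Data\SalesPrimary.mdf',
    SIZE = 50MB),
FILEGROUP SalesData
   (NAME = SalesData1,
    FILENAME = 'F:\Mssql7\Data\SalesData1.ndf',
    SIZE = 100MB),
   (NAME = SalesData2,
    FILENAME = 'G:\Mssql7\Data\SalesData2.ndf',
    SIZE = 100MB)
LOG ON
   (NAME = SalesLog,
    FILENAME = 'H:\Mssql7\Data\SalesLog.ldf',
    SIZE = 50MB)
GO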
Care must be taken with the size of the primary filegroup, especially if it is also the default filegroup. Because the primary filegroup contains all the system tables, if it runs out of space, no new catalog information can be added to the system tables. In contrast, if a user-defined filegroup fills up, only the user tables specifically allocated to that filegroup are affected. The primary filegroup can fill up if the autogrow feature is turned off or if the disk holding the primary filegroup runs out of space. If either of these occurs, turn autogrow on or move other files off the disk to free more space.
When creating objects such as tables and indexes, you can specify the filegroup to which the tables and indexes are assigned. A table can be assigned to one filegroup, and that table’s index(es) can be assigned to a different filegroup. If you do not specify a filegroup, the table or index is assigned to the default filegroup.
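For example, assuming the hypothetical Sales database above has a SalesData filegroup for tables and a SalesIndexes filegroup for indexes, a table and its nonclustered index can be placed on different filegroups:

CREATE TABLE Orders
   (OrderID    int      NOT NULL,
    CustomerID int      NOT NULL,
    OrderDate  datetime NOT NULL)
ON SalesData
GO

CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
   ON Orders (CustomerID)
   ON SalesIndexes
GO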
Recommendations:
Several questions that you may have about files and filegroups are:
The answers to these questions depend on three factors:
Maintenance operations such as backup and restore can be performed at the file or filegroup level. It may be appropriate to perform some operations such as backup, restore, update statistics, or DBCC against some objects (tables) more often than others. By placing tables with similar maintenance requirements in the same filegroup, the maintenance operations can be executed against this specific filegroup.
Recommendations:
Setting up database objects on files and filegroups for either maintenance or performance reasons is not necessarily mutually exclusive. You can set the database for maintenance and still exploit file setup for performance.
Considerations for files and filegroups include:
Multiple files can be created on separate disk drives and then assigned to the same filegroup. A table is then created and assigned to a filegroup. Filegroups use a proportional fill strategy across all the files within each filegroup. As data is written to the filegroup, an amount proportional to the free space in the file is written to each file within the filegroup, rather than writing all the data to the first file until it is full, and then proceeding to the next file. For example, if file 1 has 100 MB free, and file 2 has 200 MB free, one extent is allocated from file 1, and two extents from file 2, and so on. This way both files become full at about the same time and simple striping is achieved.
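Continuing the hypothetical Sales database, a file on another physical drive can be added to the same filegroup; SQL Server then proportionally fills the new file along with the existing ones:

ALTER DATABASE Sales
ADD FILE
   (NAME = SalesData3,
    FILENAME = 'I:\Mssql7\Data\SalesData3.ndf',
    SIZE = 100MB)
TO FILEGROUP SalesData
GO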
Each file is physically placed on a disk or set of disks (if the hardware disks are striped). SQL Server maintains a map of each file’s location on the disk. If one file is created across a hardware stripe of four disks, one map points to the location of data on all four disks. If four files are created across a hardware stripe of four disks, four maps point to the location of the data on all four disks, one map in each file for each object in that file.
Whenever a data object (table) is accessed sequentially, a separate thread is created for each file in parallel. Therefore, a tablescan for a table that is assigned to a filegroup with four files uses four separate threads to read the data in parallel.
Due to the parallel data access from multiple files in a filegroup, a general rule is that more files are better. Eventually, a saturation point is reached at which too many files are created, resulting in too many parallel threads and bottlenecks on the I/O subsystem. You can identify these bottlenecks by using Windows NT Performance Monitor to check the Disk Queue Length counter of the Physical Disk object. If Disk Queue Length is above 3, consider reducing the number of files. (Disk Queue Length can be affected by other things, such as insufficient physical disks or poor placement of data on disks.)
Recommendations:
Prior to the creation of RAID, a big performance boost was attained by physically separating data objects such as tables and indexes on disparate physical devices. SQL Server 7.0 accomplishes the same task with files and filegroups. A possible performance gain can be realized if you separate indexes from tables and read both objects in parallel.
Here is an example involving two filegroups (Filegroup 1 and Filegroup 2): Filegroup 1 is assigned to files on one stripe set with four disks and Filegroup 2 is assigned to files on another stripe set also with four disks. Table A is created and put in Filegroup 2 and the index for table A is put on Filegroup 1. If both table A and index A are being used for the same query, good performance results as both filegroups are being fully used with no contention. On the other hand, if only table A is being scanned and index A is not being used, Filegroup 1 is used and Filegroup 2 is doing nothing. The best I/O performance comes from all disks being accessed at one time. If you cannot predict the type of access that will take place and when, it is a safer decision to place the objects across all disks as shown in the following illustration. This guarantees that all disks are being accessed because all data and indexes are spread evenly across all disks, no matter which way the data is accessed. This is also a simpler approach for database administrators.
Separating tables and indexes into different filegroups can provide potential performance improvements, but spreading all objects across as many physical disks as possible is easier and more reliable.
Recommendation:
Spreading data across as many physical drives as possible is advantageous because it improves throughput by using parallel data access with multiple files. With hardware striping, all files are evenly spread across all disks in the stripe set. The data objects are assigned to a filegroup that proportionately fills each file.
This is an easy setup for the database administrator because it is one big logical drive. All files are created on the same large logical drive, but physically they are spread across all disks.
In the previous section, the illustration of hardware striping shows one disk controller for all of the disks. Multiple disk controllers potentially can improve I/O throughput. The illustration of separating data objects for performance shows an I/O subsystem setup with multiple disk controllers. You can set up only one hardware stripe set per controller, therefore the illustration shows two separate stripe sets.
To spread the data evenly across multiple controllers requires one of three options:
Use hardware striping only (and/or Microsoft Windows NT® striping). This configuration requires a Windows NT stripe on top of a hardware stripe for each controller to make it look like one logical drive.
Use SQL Server to stripe. The capabilities of SQL Server files and filegroups allow you to spread the data evenly across all disks without hardware striping. If a physical file is created on each disk and a filegroup is assigned to each of these files, SQL Server uses proportional fill to evenly distribute the data across all of the files and all of the disks. The following illustration shows this option. Some disadvantages of this configuration are:
Because this configuration can be accomplished by using hardware striping with fewer disks and easier administration, this is not recommended.
Use a combination of SQL Server and hardware striping. This configuration uses the best features of both options.
For example, two controllers can point to two hardware stripe sets. As an alternative to the Windows NT striping method, which assigns data from all objects to all disks, a filegroup is assigned to all files on both stripe sets. This option spreads the data evenly across all disks while keeping administration simple, unlike options 1 and 2.
Recommendation:
One of the bigger challenges with SQL Server 6.5 and all other RDBMSs is the management of the physical space allocated to database objects such as tables and indexes. In SQL Server 6.5, if a database runs out of space, the database is inoperable until more space is added.
In SQL Server 7.0, files can grow automatically (autogrow) from their originally specified size. When you define a file, you can specify a growth increment. Each time the file fills, it increases its size by the growth increment set by the database creator. If there are multiple files in a filegroup, none of them grows automatically until all are full. Growth then occurs in a round-robin algorithm.
Each file can also have a specified maximum size. If a maximum size is not specified, the file continues to grow until it has used all available space on the disk. The lack of a maximum size is especially useful if a SQL Server database is embedded in an application for which no system administrator is readily accessible or if SQL Server is implemented in a distributed location without a database administrator. The files can grow automatically, reducing the administrative burden of monitoring the amount of free space in the database and of allocating additional space manually.
For example, a Sales database is created as a 50 MB data file. The autogrow feature is on, and it is set for 10 percent growth to a 500 MB maximum.
As soon as this database reaches 50 MB, 5 MB is added (50 MB + 50 MB*.10 = 55 MB). As soon as the database reaches 55 MB, 5.5 MB is added. This process continues until 500 MB is reached or until the database shrinks.
Performance degrades when a database must be altered to grow. If sufficient space is not initially assigned to a database, the database could grow continuously and performance would degrade slightly. The default autogrow setting is 10 percent of the size of the data file. For example, a database created at 1 MB with a maximum size of 500 MB requires many activations of the autogrow feature if you add a large amount of new data. The database would start at 1 MB, grow to 1.1 MB, then to 1.21 MB, then to 1.33 MB, and so on.
Although this method works, performance is improved if the initial file size and the percent growth are set to a reasonable size to avoid the frequent activation of the autogrow feature. In the previous example, a growth set to an increment of 5 MB would be better.
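For example, the following sketch (using the hypothetical file name from the earlier Sales example) changes the growth increment of a data file to a fixed 5 MB and caps the file at 500 MB:

ALTER DATABASE Sales
MODIFY FILE
   (NAME = SalesPrimary,
    FILEGROWTH = 5MB,
    MAXSIZE = 500MB)
GO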
Recommendations:
You must decide whether to implement the autoshrink feature or manually shrink your data. The autoshrink feature is on by default for the Desktop Edition of SQL Server 7.0 and off by default for the Standard and Enterprise editions. The autoshrink feature can be set by using the sp_dboption stored procedure. When autoshrink is on, it executes every 30 minutes and shrinks in increments of 25-percent free space.
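For example, to turn the autoshrink feature on for a database named UserDB:

EXEC sp_dboption 'UserDB', 'autoshrink', 'true'
GO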
When sp_dboption is used to shrink the database automatically (autoshrink), shrinking occurs whenever a significant amount of free space is available in the database. However, the free space to be removed automatically cannot be configured. To remove a certain amount of free space, such as only 50 percent of the current free space in the database, you must run either the DBCC SHRINKFILE statement or DBCC SHRINKDATABASE statement manually.
For example, to manually decrease the size of the files in the UserDB database to allow 10-percent free space in the files of UserDB, use:
DBCC SHRINKDATABASE (UserDB, 10)
GO
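Similarly, DBCC SHRINKFILE can shrink an individual file to a target size in megabytes (the logical file name here is hypothetical):

DBCC SHRINKFILE (UserDB_Data, 100)
GO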
When the autoshrink feature or manual shrink is started, the best point to shrink to is calculated by using the free space setting as a guideline. The user cannot configure the point to shrink to when using the autoshrink feature. The shrinking task shrinks to the original or altered size of the created file only, although this can be overridden with the manual shrink file command. The following illustration shows the movement of each individual heap page (nonclustered index pages) row by row to create free space. B-tree nodes (for example, clustered index pages) are moved as a page to create free space. This shrinking task is a low priority background thread, but it can still affect performance on a busy system due to the movement and locking of pages. Avoid continuously autoshrinking and then autogrowing. Nonclustered index maintenance may be required to point to the new location of the data. Also, clustered indexes have the potential to fragment the clustered index pages, and you might consider reorganizing the clustered indexes periodically after shrinking.
Recommendations:
A Microsoft SQL Server database has at least one data file and one transaction log file. Data and transaction log information is never mixed on the same file, and individual files are used by only one database.
SQL Server uses the transaction log of each database to recover transactions. The transaction log is a serial record of all modifications that have occurred in the database, and which transaction performed each modification. The log records the start of each transaction, the changes made to the data, and enough information to undo the modifications, if necessary. The log grows continuously as logged operations occur in the database. For some large operations, such as CREATE INDEX, the log records the fact that the operation took place. The log records the allocation and deallocation of pages, and the commit or rollback of each transaction. This allows SQL Server either to restore or to back out of each transaction:
At a checkpoint, SQL Server ensures that all log records and all database pages that have been modified are written to disk. During the recovery process of each database that occurs when SQL Server is restarted, a transaction must be rolled forward only when it is not known whether all the data modifications in the transaction were actually written from the SQL Server buffer cache to disk. Because a checkpoint forces all modified pages to disk, it represents the point at which the startup recovery must start rolling transactions forward. All pages that were modified before the checkpoint are guaranteed to be on disk, so there is no need to roll forward any transaction that was completed before the checkpoint.
Each log file is divided logically into smaller segments called virtual log files. Virtual log files are the unit of truncation for the transaction log. When a virtual log file no longer contains log records for active transactions, it can be truncated and the space becomes available to log new transactions.
The smallest size for a virtual log file is 256 KB. The minimum size for a transaction log is 512 KB, which provides two virtual log files of 256 KB each. The number and size of the virtual log files in a transaction log increase as the size of the log file increases. A small log file has a small number of small virtual log files (for example, a 5-MB log file might consist of 5 virtual log files of 1 MB each). A very large log file has larger virtual log files (for example, a 500-MB log file might consist of 10 virtual log files of 50 MB each).
SQL Server avoids having many small virtual log files. When SQL Server recovers a database, it must read the header of each virtual log file, and each header costs one page of I/O. The more virtual log files there are, the longer the database takes to start. Approximately 50 to 100 virtual log files is acceptable; 1,000 may be too many. The number of virtual log files grows much more slowly than their sizes. If a log file grows in small increments, it tends to have many small virtual log files. If a log file grows in larger increments, SQL Server creates fewer, larger virtual log files. For example, if the transaction log grows in 1-MB increments, the virtual log files are smaller and more numerous than those of a transaction log that grows in 50-MB increments.
By default, a log file automatically grows (autogrow) unless you specified otherwise when you created it. The log grows by the increment set at creation time. The increment can be a percentage of the file or a specific MB/GB value. Every time the autogrow feature activates, more virtual files are created. To avoid creating too many virtual files by preventing frequent activation of the autogrow feature, set the autogrow increment to a reasonable size. For example, if the log is 200 MB and grows to 400 MB daily, set the autogrow increment to 50 or 100 MB rather than to 1 MB.
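For example, assuming the hypothetical Sales database and log file name used earlier, the log growth increment can be raised so that fewer, larger virtual log files are created:

ALTER DATABASE Sales
MODIFY FILE
   (NAME = SalesLog,
    FILEGROWTH = 50MB)
GO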
The autogrow feature also pauses transaction log activity while space is added; as a rough estimate, space is added at about 3 MB per second. As records are written to the log, the end of the log grows, moving from one virtual log file to the next. If there is more than one physical log file for a database, the end of the log grows, moving through each virtual log file in each physical file before it circles back to the first virtual log file in the first physical file. Only when all existing log files are full does the log begin to grow automatically.
The log can also be set to shrink automatically (autoshrink). By default, autoshrink is on for the SQL Server Desktop Edition and off for the Standard and Enterprise editions.
The autoshrink feature does not shrink logs smaller than their original sizes set by the database administrator. If the autoshrink feature is on, a calculation is performed every 30 minutes to determine how much of the log has been used. If the log size exceeds the original size, the log is marked for shrinkage, assuming available free space. The actual shrinking of the log takes place with the next log truncation or BACKUP TRANSACTION LOG statement. The shrinking of a log incurs a performance cost.
Recommendations:
In SQL Server 6.5, sp_configure options for improving the performance of the I/O subsystem include: max async IO, max lazywrite, RA cache hit limit, RA cache miss limit, RA delay, RA prefetches, RA slots, RA worker threads, recovery interval, and logwrite sleep. SQL Server 7.0 minimizes the manual tuning required of the database administrator when configuring a system. Most of the system configuration options in SQL Server 7.0 are self-tuning.
Table 3. Sp_configure options: SQL Server 6.5 compared to SQL Server 7.0
Configuration options | SQL Server 6.5 Min. | SQL Server 6.5 Max. | SQL Server 7.0 Min. | SQL Server 7.0 Max. |
max async IO | 1 | 1,024 | 1 | 255 |
max lazywrite | 1 | 1,024 | N/A | N/A |
RA cache hit limit | 1 | 255 | N/A | N/A |
RA cache miss limit | 1 | 255 | N/A | N/A |
RA delay | 0 | 500 | N/A | N/A |
RA prefetches | 1 | 1,000 | N/A | N/A |
RA slots | 1 | 255 | N/A | N/A |
RA worker threads | 0 | 255 | N/A | N/A |
recovery interval | 1 | 32,767 | 0 | 32,767 |
logwrite sleep | 1 | 500 | N/A | N/A |
In SQL Server 7.0, the max lazywrite option is no longer available. The SQL Server memory manager calculates the right time to flush the data cache based on free buffers and activity.
In SQL Server 7.0, max async IO is still available but has changed to represent the maximum number of outstanding asynchronous disk input/output (I/O) requests that the entire server can issue per file (not per database). Therefore, you must know how many files are in the database. The default has changed from 8 in SQL Server 6.5 to 32 in SQL Server 7.0 because of the vast improvements in hardware speeds. As in SQL Server 6.5, you should monitor disk activity to determine whether this parameter should be raised or lowered. If disk activity is low, this parameter can be raised. If there are spikes in disk activity showing 100 percent disk time and a queue length greater than 3, consider lowering max async IO.
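For example, the following sketch raises max async IO (an advanced option; the value 64 is illustrative, and a restart of SQL Server may be required before the change takes effect):

EXEC sp_configure 'show advanced options', 1
GO
RECONFIGURE
GO
EXEC sp_configure 'max async IO', 64
GO
RECONFIGURE
GO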
Recommendations:
Though read-ahead (RA), or prefetch, has been dramatically improved in SQL Server 7.0, all of its parameters have been removed. Read-ahead is tuned automatically based on data access. By using file maps, called Index Allocation Maps (IAM), each file is prefetched in parallel on sequential reads. Because the optimizer informs SQL Server when data needs to be scanned, SQL Server automatically activates read-ahead without tuning configuration parameters.
The logwrite sleep option has also been removed from SQL Server 7.0. logwrite sleep is rarely used in SQL Server 6.5, and it is recommended that the default remain unchanged. The logwrite sleep option delays the 16-KB buffer written from memory to the log to ensure that the buffer is being filled prior to writing to the log. In SQL Server 7.0, the log write operation has been improved dramatically; this configuration option is no longer necessary.
The recovery interval option controls when SQL Server issues a checkpoint to each database. Checkpoints are issued on a per-database basis. At a checkpoint, SQL Server ensures all log information and all modified pages are flushed from memory to disk, reducing the time needed for recovery by limiting the number of transactions rolled forward. Modifications performed before the checkpoint do not need to be rolled forward because they were flushed to disk at the checkpoint. The actual checkpoints are controlled by a combination of the recovery interval setting and how much data was written to the log during that interval. For example, if the recovery interval is set to 5 minutes and very little log activity has taken place after 5 minutes, the system checkpoint will not trigger.
In SQL Server 6.5, the default for recovery interval is 5 minutes. In SQL Server 7.0, the default is 0, which means SQL Server automatically configures the option. The recovery process is much quicker and the system checkpoint is less intrusive, so SQL Server may checkpoint more often to guarantee faster recovery after a failure. Monitor the disk writes for the data to determine whether this is the appropriate setting. If you see spikes of activity that cause heavy use of the disk, you can change this parameter to checkpoint less often. If you are going to change this parameter, you may want to change it to 5 and continue monitoring.
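For example, to change recovery interval to 5 minutes and continue monitoring (recovery interval is an advanced option, so show advanced options must already be enabled):

EXEC sp_configure 'recovery interval', 5
GO
RECONFIGURE
GO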
Recommendations:
The tempdb database holds all temporary tables and work tables in queries and handles any other temporary storage needs such as sorting, joining, and aggregation of data for queries. The tempdb database is a global resource: The temporary tables and stored procedures for all users connected to the system are stored there. The tempdb database is re-created every time SQL Server starts so the system starts with a clean copy of the database. Because temporary tables and stored procedures are automatically dropped on disconnect, and no connections are active when the system is shut down, there is never anything in tempdb to be saved from one session of SQL Server to another.
Because of enhancements in the query processor, the tempdb database is used more often in SQL Server 7.0 than it is in earlier versions. As a guideline, expect the size of tempdb to be 25 to 50 percent larger, depending on the size and complexity of the queries.
Calculating an appropriate size for tempdb is challenging in SQL Server 6.5. When tempdb runs out of space, the active query requesting tempdb terminates. Therefore, you must make tempdb very large, even if only 1 to 5 percent of your activity requires a large database. SQL Server 7.0 addresses this issue with the autogrow feature of tempdb. The tempdb database automatically grows as needed. Each time the system is started, tempdb is reset to its default size. Automatically growing tempdb results in some performance degradation, just like automatically growing other user databases and logs. Set a reasonable size for tempdb and the autogrow increment to ensure that the autogrow feature is not frequently activated.
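For example, assuming the default logical file name tempdev for the tempdb data file, the following sketch sets a larger initial size and a larger growth increment so that the autogrow feature activates less often:

ALTER DATABASE tempdb
MODIFY FILE
   (NAME = tempdev,
    SIZE = 100MB,
    FILEGROWTH = 50MB)
GO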
The tempdb database is on disk, therefore your disk configuration can affect performance. If tempdb is stored on a single physical disk and multiple requests are made for tempdb use, requests are quickly queued on that disk. A simple solution is to stripe tempdb across the same drives as the data and indexes. A larger stripe set with availability (either RAID 1 or RAID 5) results in adequate performance for tempdb unless the disks are already busy. The fastest performance for tempdb may result from storing tempdb alone on a separate set of RAID 0 disks: there is no contention from other databases, and each write is a single physical write because RAID 0 provides no redundancy. You can run tempdb without availability features because only the currently active transactions are lost if a disk fails. tempdb is not persistent and is never recovered, therefore availability features on tempdb disks are not needed. The disadvantage is that if a physical drive fails, the whole server is inaccessible to users while you replace it. The trade-off is the availability of the server versus the performance of tempdb.
Recommendations:
In SQL Server 7.0, individual text, ntext, and image pages are not limited to holding data for only one occurrence of a text, ntext, or image column. A text, ntext, or image page can hold data from multiple rows; the page can even have a mix of text, ntext, and image data. This feature along with larger page sizes (8 KB) helps to reduce the space requirements for storing text and image data.
text, ntext, and image values are not stored as part of the data row but in a separate collection of pages. For each text, ntext, or image value, only a 16-byte pointer is stored in the data row. Each pointer indicates the location of the text, ntext, or image data. A row containing multiple text, ntext, or image columns has one pointer for each text, ntext, or image column.
Although you always work with text, ntext, and image data as if it were a single long string of bytes, the data is not stored in that format. The data is stored in a collection of 8-KB pages that are not necessarily located next to each other. In SQL Server 7.0, the pages are logically organized in a B-tree structure, and in earlier versions of SQL Server they are linked in a page chain. The advantage of the method used by SQL Server 7.0 is that operations starting in the middle of the string are more efficient. SQL Server 7.0 can quickly navigate the B-tree; earlier versions of SQL Server have to scan through the page chain. The structure of the B-tree differs slightly depending on whether there is less than 32 KB of data or more.
Although this B-tree storage method for text data may require more I/Os than a normal char or varchar data retrieval, it provides much faster text and content text search than in SQL Server 6.5.
Because SQL Server pages have increased from 2 KB to 8 KB, the char and varchar data types now allow up to 8,000 bytes of character data. If you have a text field with less than 8,000 bytes of data, you can use varchar instead of the text data type to improve query performance by avoiding traversing the text B-tree.
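For example, a hypothetical table whose descriptions never exceed 8,000 characters can use varchar instead of text:

CREATE TABLE Products
   (ProductID   int           NOT NULL,
    Description varchar(8000) NULL)
GO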
Recommendation:
This section describes the architecture of Microsoft SQL Server memory algorithms. It makes recommendations for exploiting the new enhancements in memory, including: dynamic memory configuration, stored procedures, log, and other configuration settings.
The memory algorithms and use of memory by SQL Server objects are major changes in SQL Server 7.0 that improve the performance of the database and also minimize the work the database administrator must do to configure memory for good performance.
In Microsoft SQL Server 6.5, memory is segmented and manually managed. The database administrator must first determine how much memory SQL Server should use versus the operating system. For example, with 256 MB of memory, SQL Server may get 200 MB and leave 56 MB for the operating system. This in itself is an art, not a science. It is very difficult to plan how much the database alone needs, much less plan what the operating system and other applications, such as Web servers, running on the same computer might need. Use of memory is not static; it is possible that SQL Server may need more memory from 8 A.M. to 5 P.M., and the operating system may need more memory from 5 P.M. to 8 A.M. to run nightly batch work. Changing the memory configuration requires a shutdown and startup of SQL Server 6.5.
After memory is allocated to SQL Server 6.5, it is segmented into four sections: static structures, data cache, stored procedure cache, and dynamic allocations (a subset of data cache).
Static structures are the preallocation of SQL Server memory to SQL Server components such as user connections, locks, open objects, and worker threads. The static structure memory is allocated upon SQL Server startup. The dynamic structures are allocations of memory for users, locks, and objects added above and beyond the static structure allocation and take memory away from the data cache. The rest of the memory is divided between the data cache and the stored procedure cache. The SQL Server 6.5 procedure cache configuration option sets the percentage of available memory for the procedure cache, and the remainder is assigned to data cache.
As an example of SQL Server 6.5 memory allocation, assume that the system has 256 MB. The SQL Server memory option allocates 200 MB to SQL Server, leaving 56 MB for Windows NT. Of the 200 MB of SQL Server memory, static structures require 10 MB, leaving 190 MB. The procedure cache is set to 30 percent; therefore, stored procedure memory receives 57 MB (190 * .3) and data cache receives 133 MB (190 - 57). Any further dynamic allocations are taken from the data cache.
SQL Server 6.5 memory manages data with a LRU (least recently used) and MRU (most recently used) link list. Data in memory that is active (being read or written to) links to the MRU part of the link list. Data in memory buffers that has not been accessed for a while is shuffled to the LRU part of the link list. The MRU/LRU link list is constantly being modified.
Microsoft SQL Server 7.0 has dramatically improved the way memory is allocated and accessed. Unlike SQL Server 6.5, in which memory is managed by the database administrator with configuration settings, SQL Server 7.0 has a memory manager to eliminate manual memory management.
SQL Server 6.5 has a memory configuration option that allocates a fixed amount of memory on startup. If the parameter is set too high, SQL Server cannot start. When SQL Server 7.0 starts, its dynamic memory allocation determines how much memory to allocate based on how much memory the Windows NT operating system and applications for Windows NT are using. For example, assume that Windows NT has a total of 512 MB of memory. When SQL Server starts up, Windows NT and the applications running on Windows NT are using 72 MB of memory. SQL Server uses the available memory, leaving 5 MB free; therefore, SQL Server uses 435 MB of memory (512 MB total – 72 MB for active Windows NT – 5 MB of free memory = 435 MB for SQL Server). If another Windows NT-based application is started and uses the 5 MB of free space, SQL Server proactively releases memory to ensure that 5 MB always remains free. Conversely, if Windows NT releases memory so that more than 5 MB is free, SQL Server acquires that memory and uses it for database operations.
This dynamic memory algorithm has many advantages. You no longer need to guess the correct memory percentages for Windows NT, Windows NT-based applications, and SQL Server. You can also avoid Windows NT paging during times of heavy Windows NT usage, and you can use Windows NT free memory during times of light Windows NT usage.
The memory algorithm for SQL Server 7.0 Desktop Edition works differently. Rather than taking memory when it is free, it gives memory back to the operating system when it is not needed. This is because it is more likely that the Desktop Edition is running other applications.
In SQL Server 7.0, the min server memory configuration option ensures that SQL Server starts with a minimum value and does not release memory below this value. SQL Server never goes below the min server memory value. If the value is set at 0, SQL Server memory manager manages this for you. The value of this option should be set based on the size and activity of the computer running SQL Server.
The SQL Server 7.0 max server memory configuration option determines the maximum amount of memory SQL Server can allocate at startup and during execution. A max server memory value of 0 indicates that there is no maximum and enables dynamic memory to determine the value. If max server memory is set to a value, SQL Server does not allocate memory above that value. Set the max server memory option if multiple applications are running on Windows NT and you want to guarantee these applications the appropriate amount of memory.
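For example, the following sketch (illustrative values, in megabytes; both are advanced options) reserves a floor of 100 MB for SQL Server and caps it at 256 MB so that other applications on the same computer keep the rest:

EXEC sp_configure 'min server memory', 100
GO
EXEC sp_configure 'max server memory', 256
GO
RECONFIGURE
GO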
Recommendations:
The SQL Server 7.0 data access method uses a clock algorithm. Unlike the MRU/LRU link list in which pages are constantly linked and relinked, SQL Server 7.0 sweeps through the buffers and keeps two values in each buffer: Last Touched and Last Checked. Last Checked is the last time the clock has checked on this buffer. Last Touched is the last time this buffer page was active. Pages are marked as part of the free buffer list when Last Touched has not changed in some time. If Last Touched has changed (for example, data was read or written to this buffer page), this page stays out of the free buffer list for a longer period of time. The reason the clock algorithm is superior to the SQL Server 6.5 LRU/MRU link list is that links do not have to be updated, only the Last Touched and Last Checked values.
Because SQL Server 6.5 does not have a true memory manager like SQL Server 7.0, you can set several sp_configure options in SQL Server 6.5 to improve memory performance. These configuration options include: free buffers, hash buckets, locks, logLRU buffers, max lazywrite IO, max worker threads, memory, open databases, open objects, procedure cache, recovery interval, set working set size, sort pages, and tempdb in RAM. SQL Server 7.0 minimizes the amount of tuning needed by the database administrator for system configurations. Most of the system configuration parameters in SQL Server 7.0 are self-tuning.
Table 4. Sp_configure options: SQL Server 6.5 vs. SQL Server 7.0
Configuration options | SQL Server 6.5 Min. | SQL Server 6.5 Max. | SQL Server 7.0 Min. | SQL Server 7.0 Max. |
free buffers | 20 | 524,288 | N/A | N/A |
hash buckets | 4,999 | 65,003 | N/A | N/A |
locks | 5,000 | 2,147,483,647 | 0 | 2,147,483,647 |
max lazywrite IO | 1 | 1,024 | N/A | N/A |
max worker threads | 10 | 1,024 | 10 | 1,024 |
memory | 2,800 | 1,048,576 | N/A | N/A |
open databases | 5 | 32,767 | N/A | N/A |
open objects | 100 | 2,147,483,647 | 0 | 2,147,483,647 |
procedure cache | 1 | 99 | N/A | N/A |
recovery interval | 1 | 32,767 | 0 | 32,767 |
set working set size | 0 | 1 | 0 | 1 |
sort pages | 64 | 511 | N/A | N/A |
tempdb in RAM | 0 | 2,044 | N/A | N/A |
In SQL Server 6.5, the free buffers option determines the size of the free buffer list, which is 5 percent of SQL Server memory by default. The free buffers option is no longer tunable in SQL Server 7.0. With dynamic memory set on and memory size changing dynamically, it would be difficult for any database administrator to estimate the right size for free buffers.
The value of the hash buckets option is a prime number used as input into a memory-hashing algorithm. The larger the memory allocation, the larger the hash bucket value should be. Because dynamic memory changes the size of SQL Server memory, this option is now set automatically by the memory manager and no longer appears as a configurable option in SQL Server 7.0.
Locks are a perfect example of the power of the SQL Server 7.0 memory manager. In SQL Server 6.5, you must specify enough locks to prevent running out. If you run out of locks, the active thread requesting locks terminates. If you specify too many locks, the memory is preallocated and wasted. Each lock requires approximately 30 to 60 bytes of memory. In SQL Server 7.0, you can specify the configured value as 0 to allow SQL Server to manage it. In the case of locks set to 0, SQL Server 7.0 memory manager ensures you never run out. It adds more locks as you need them.
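For example, to let the memory manager handle lock allocation dynamically (locks is an advanced option, and a restart of SQL Server may be required for the change to take effect):

EXEC sp_configure 'locks', 0
GO
RECONFIGURE
GO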
Recommendation:
The max lazywrite option is set by the database administrator to help the SQL Server 6.5 memory determine how often to flush the cache. The SQL Server 7.0 memory manager manages the lazywrite process automatically, and the option has been removed from the sp_configure stored procedure.
The max worker threads option indicates the number of internal SQL Server threads available to service connected users. The default setting for this option, 255, is slightly too high for most systems. Each worker thread that is allocated but not active uses resources, such as memory, that could be better used by other operations. This configuration value should be close to the number of concurrent user connections to SQL Server, but cannot exceed 1,024.
In SQL Server 6.5, the memory option indicates a fixed allocation of memory for SQL Server. SQL Server 7.0 memory allocation is dynamic at start time. SQL Server allocates as much memory as is available, reserving 5 MB as free memory.
The open databases option, specifying the number of active databases SQL Server allows at any one time, has been removed for SQL Server 7.0. There is no longer a limit.
The open objects option indicates the number of actively open objects SQL Server allows at any point in time. Open objects include all tables, views, stored procedures, extended stored procedures, triggers, rules, defaults, and constraints. If you run out of open objects, the active thread requesting an object terminates. In SQL Server 6.5, this option should be set high enough to avoid running out, but not so high as to waste preallocated memory for objects. Each object in SQL Server 6.5 uses 240 bytes of memory. In SQL Server 7.0, you can set open objects to 0 so that SQL Server manages objects for you and dynamically adds them when you need them.
Recommendation:
Recommendation:
In SQL Server 6.5, the sort pages option indicates the amount of memory allocated per user for sorting operations. The larger the allocation, the faster the sort performs. This option was used for all sorting including query sorts and create index sorts. The query processor now manages queries, and index sorts are managed by the new index create memory configuration option. In SQL Server 7.0, the sort pages option is no longer valid.
All join and sort operations that are not performed in memory take place in the tempdb database. In SQL Server 6.5, you can move the entire tempdb into memory. Having the tempdb database in RAM is not a benefit because it takes away memory from other data access uses. Because SQL Server 6.5 memory allows only minimal size for sorting (64 2-KB pages to 511 2-KB pages), using the tempdb in RAM option is the only way to sort at RAM speeds. SQL Server 7.0 allows a much higher allocation of memory for sorting and the tempdb in RAM option is no longer a valid option.
SQL Server 7.0 also introduces new sp_configure options that set minimum and maximum memory values for specific operations:
The extended memory size option is only available for Microsoft SQL Server 7.0, Enterprise Edition running under future versions of Microsoft Windows NT, Enterprise Edition, on an Alpha platform. This option indicates the number of megabytes of memory to use as a disk cache in addition to the conventional buffer pool. For example, on a computer with 8 GB of memory, a reasonable value for extended memory size might be in the range of 5,000 to 6,000. This allows 2 GB of conventional memory usage for SQL Server and most of the rest for use as an extended memory cache.
When SQL Server Enterprise Edition is used on future versions of Windows NT, Enterprise Edition, if the system administrator has configured the 3-GB switch in Boot.ini (referred to in Windows NT as 4GT tuning), the conventional memory can be a maximum of 3 GB instead of 2 GB. In this situation, the default value of 0 (self-configuring) for max server memory would cause the server to use up to 3 GB of conventional memory. Therefore, the absolute upper end of the extended memory range (for the hypothetical 8-GB system) would be 5,000.
The index create memory option controls the amount of memory used by index creation sorts. If you build large indexes, you may want to experiment with increasing the value of this setting from its default. Memory sort allocations are controlled through the min memory per query option. Therefore, if you request 2,000 KB for index create memory and min memory per query is set to 2,000 KB, the create index operation waits to execute until it can get that much memory. On a large scale production server, creating indexes is not a frequent task. Most often it is a scheduled job at off-peak times because of the intrusiveness of the create index process. Therefore, if you are running create index operations infrequently and at off-peak hours, increase this number significantly. Remember to keep min memory per query at a lower number so the job will activate even if all the requested memory is not available.
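For example, the following sketch (illustrative values, in kilobytes; both are advanced options) raises index create memory for an off-peak index build while keeping min memory per query lower so the job can start even if all of the requested memory is not available:

EXEC sp_configure 'index create memory', 4000
GO
EXEC sp_configure 'min memory per query', 1024
GO
RECONFIGURE
GO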
Recommendations:
Use the min memory per query option to specify the minimum memory (in kilobytes) to be allocated for the execution of a query. For example, if min memory per query is set to 2,048 KB, the query is guaranteed to receive at least that much total memory. You can set the min memory per query option to any value from 0 to 2,147,483,647 KB. The default is 1,024 KB.
Increasing the value of query memory generally improves the performance of queries that use hashing or sorting operations, particularly when there is significant memory available and few concurrent queries. min memory per query includes memory allocated for sorting and replaces the sort pages option in earlier versions of Microsoft SQL Server. Do not set min memory per query too high, especially on very busy systems, because the query must wait until it can get the minimum memory requested or until the query wait setting is reached. (query wait is another sp_configure option that specifies how long to wait for resources before canceling the query altogether.)
Recommendations:
In SQL Server 6.5, stored procedure cache is a fixed segment of memory based on the procedure cache option of the sp_configure stored procedure. In SQL Server 7.0, this option has been eliminated. SQL Server 7.0 memory manager dynamically manages how much memory stored procedures require and allocates it to them.
How stored procedures are used in memory has also changed. In SQL Server 6.5, if 100 users want to use a precompiled stored procedure at the same time, SQL Server must bring up 100 copies of that stored procedure in memory. In SQL Server 7.0, only one instance of that stored procedure is in memory and is used by all 100 users. Stored procedures often get passed variables and these variables may affect the query plan. The SQL Server 7.0 memory manager determines when to recompile that stored procedure and bring in a new plan based on the input variables.
In this example, stored procedure A has two variables passed to it that are inserted into a SQL statement as part of an equal clause:
"SELECT * FROM Table WHERE field1=&variable1 AND field2=&variable2"
The plan remains the same because it follows an equal predicate and the cached stored procedure is used.
In this example, stored procedure A has two variables passed to it that are inserted into a BETWEEN clause:
"SELECT * FROM Table WHERE field1 BETWEEN &variable1 AND &variable2"
The stored procedure may have to be recompiled to get a new plan because the variables may dictate a new plan.
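SQL Server decides automatically when to recompile, but you can also force a fresh plan on every execution. The following hypothetical procedure uses a BETWEEN clause like the one above and is created WITH RECOMPILE so that each set of input variables gets its own plan:

CREATE PROCEDURE GetRowsByRange
   @variable1 int,
   @variable2 int
WITH RECOMPILE
AS
SELECT *
FROM MyTable
WHERE field1 BETWEEN @variable1 AND @variable2
GO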
Microsoft SQL Server 7.0 now caches the plan for dynamic Transact-SQL statements as well as static statements. This can be advantageous in an ad hoc environment in which many users are executing the same dynamic Transact-SQL statement. The memory manager gives a lower priority to dynamically cached Transact-SQL than it does to static Transact-SQL, such as stored procedures; therefore, the stored procedure stays in the buffer cache longer.
The SQL Server: Buffer Manager object in Windows NT Performance Monitor provides counters to monitor how SQL Server uses memory to store data pages and internal data structures. Monitoring the memory used by SQL Server can help determine whether bottlenecks exist due to a lack of available physical memory for storing frequently accessed data in cache, in which case SQL Server must retrieve the data from disk. By monitoring memory, you can determine if query performance can be improved by adding more memory, or making more memory available, to the data cache or SQL Server internal structures.
The following table describes some of the SQL Server: Buffer Manager counters:
Table 5. The SQL Server: Buffer Manager counters
SQL Server: Buffer Manager counters | Description |
Buffer Cache Hit Ratio | Percentage of pages that were found in the buffer cache without having to read from disk. This number is an accumulation from when SQL Server is started. |
Committed Pages | Number of buffer pages committed. |
ExtendedMem Cache Hit Ratio | Percentage of page requests that were satisfied from the extended memory cache. |
ExtendedMem Cache Migrations | Number of pages migrated into the extended memory cache region. |
ExtendedMem Requests | Number of requests for pages from large memory region. |
Free Buffers | Number of free buffers available. |
Lazy Writes | Number of buffers written by buffer manager’s lazy writer. |
Page Requests | Number of requests for buffer pages. |
Readahead Pages | Number of requests to asynchronously prefetch pages before they are actually encountered. |
Reserved Page Count | Number of buffer cache reserved pages. |
Stolen Page Count | Number of buffer cache pages that have been stolen to satisfy other server memory requests. |
Microsoft SQL Server 7.0 uses multiple processors for single tasks such as queries, utilities, and scanning data. This section describes how SQL Server exploits hardware processors. It can help you determine the hardware processors you need to get the most out of SQL Server as well as describe the threading model and parameters that you can set to get the best possible performance.
The threading model used by SQL Server affects how efficiently the processors are used. This includes how many concurrent users it can handle and how quickly each of the users gets processing power.
SQL Server 6.5 uses native Windows NT threads. The Windows NT kernel uses its own scheduler to manage the SQL Server thread scheduling as well as synchronization services. Although this approach has produced high concurrency on a small number of processors, it does have limitations. At high loads, the thread context switching has some overhead (although the implementation of I/O completion ports helped significantly here). Because SQL Server lets Windows NT manage the threads, it has little control over which threads are preempted on and off the processor and how often. The Windows NT kernel scheduler and its dispatcher lock periodically see contention when there are a large number of threads. Even so, SQL Server 6.5 has achieved more than 5,000 concurrent users on a four-way processor in TPC-C testing.
SQL Server 7.0 has improved the threading model. The most important enhancement is the replacement of the Windows NT kernel scheduler with SQL Server schedulers called User Mode Schedulers (UMS). SQL Server has one UMS for every processor, and this UMS controls the scheduling of fiber and thread requests.
Fibers are a new Windows NT feature exploited in SQL Server 7.0, and they are useful on CPU-busy systems with high context switching. Fibers are lightweight and run on top of native Windows NT threads as shown in the preceding illustration. SQL Server then uses only one Windows NT thread per processor and many lightweight fibers on top of that thread. Fibers automatically assume the identity of the Windows NT thread they are running on and are nonpreemptive with respect to other SQL Server threads running on the other processors. Fiber mode, set with the lightweight pooling option, should be turned on only if CPU usage is near 100 percent and context switching is high.
The most CPU-intensive database applications benefit from this added thread scheduling flexibility. You now have the ability to move to an even higher number of concurrent users and achieve greater transaction throughput.
Recommendation:
The Microsoft SQL Server 7.0 query processor has been redesigned to support the large databases and complex queries found in decision support, data warehouse, and OLAP applications. The query processor includes several new execution strategies that can improve the performance of complex queries. One of the significant improvements is in the area of parallel queries.
SQL Server 7.0 supports parallel execution of a single query across multiple processors. A CPU-bound query that must examine a large number of rows often benefits if portions of its execution plan run in parallel. SQL Server 7.0 automatically determines which queries will benefit from parallelism and generates a parallel execution plan. If multiple processors are available when the query begins executing, the work is divided across the processors. Parallel query execution is enabled by default.
During query optimization, SQL Server looks for queries that might benefit from parallel execution. For these queries, SQL Server inserts exchange operators into the query execution plan to prepare the query for parallel execution. An exchange operator is an operator in a query execution plan that provides process management, data redistribution, and flow control. After exchange operators are inserted, a parallel query execution plan results. A parallel query execution plan can use more than one thread, whereas a serial execution plan, used by a nonparallel query, uses only a single thread for its execution. The actual number of threads used by a parallel query is determined at query plan execution and initialization and is called the degree of parallelism.
The advantage of the exchange operator is that parallel query is enabled for all aspects of query execution (insert, delete, and select operations), and is enabled in one place efficiently and reliably. This is more effective than implementing parallel query multiple times in multiple operations.
Microsoft SQL Server automatically detects the best degree of parallelism for each instance of a parallel query execution by considering the following:
Only computers with more than one processor can take advantage of parallel queries.
SQL Server monitors its CPU usage and adjusts the degree of parallelism at query startup time. Lower degrees of parallelism are chosen if the CPUs are already busy.
Each query requires a certain amount of memory to execute. Executing a parallel query requires more memory than a nonparallel query. The amount of memory required for executing a parallel query increases with the degree of parallelism. If the memory requirement of the parallel plan for a given degree of parallelism cannot be satisfied, SQL Server automatically decreases the degree of parallelism or completely abandons the parallel plan for the query in the given workload context and executes the serial plan.
Queries that consume a large number of CPU cycles are the best candidates for a parallel query, for example, joins of large tables, substantial aggregations, and sorting of large result sets. For simple queries, often found in transaction processing applications, the additional coordination required to execute a query in parallel outweighs the potential performance boost. To distinguish between queries that benefit from parallelism and those that do not, SQL Server compares the estimated cost of executing the query with the cost threshold for parallelism value. Although not recommended, you can change the default value of 5 by using sp_configure.
If the optimizer determines the number of rows in a stream is too low, it does not introduce exchange operators to distribute the stream. Consequently the operators in this stream are executed serially. Executing the operators in a serial plan avoids scenarios when the startup, distribution, and coordination cost exceeds the gains achieved by parallel operator execution.
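If you do need to adjust parallel query behavior, two advanced sp_configure options control it. The following sketch shows their default values (changing them is generally not recommended):

EXEC sp_configure 'cost threshold for parallelism', 5
GO
EXEC sp_configure 'max degree of parallelism', 0   -- 0 = use all available processors
GO
RECONFIGURE
GO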
Recommendations:
Microsoft SQL Server 7.0 can read sequential data in parallel by using files and filegroups. Having multiple CPUs available to you allows execution of these parallel data scans to take place quickly.
Recommendation:
Microsoft SQL Server 7.0 server operations, such as backup, restore, DBCC, bulk copy, and CREATE INDEX run much faster and have less impact on server operations. Performance for these operations has improved for two reasons: the way each of these operations uses the SQL Server 7.0 optimizer and the ability of SQL Server to run many of these operations in parallel.
Multiple backup devices can be used for backup and restore operations. This allows SQL Server to use parallel I/O to increase the speed of backup and restore operations because each backup device can be written to or read from concurrently with other backup devices. For example, if it takes four hours to back up a database to a single tape drive, then the backup speed when two tape drives are used is likely to be two hours. For enterprises with very large databases, using many backup devices can decrease the time required for backup and restore operations.
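For example, a backup can be written to two hypothetical disk devices in parallel:

BACKUP DATABASE Sales
TO DISK = 'E:\Backup\Sales1.bak',
   DISK = 'F:\Backup\Sales2.bak'
GO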
Multiple nonclustered indexes can be scheduled in parallel, significantly reducing the time it takes to create multiple nonclustered indexes on a single table. Multiple CPUs can help run these create index operations faster.
Recommendation:
SQL Server 7.0 exploits multiple processors better than SQL Server 6.5. SQL Server 7.0 does not require you to upgrade your computers to more processors, but it can achieve much better performance if multiple processors are available.
In SQL Server 7.0, three sp_configure stored procedure options that relate to processor use are:
In Windows NT, an activity or thread in a process can migrate from processor to processor, with each migration reloading the processor cache. Under heavy system loads, specifying which processor should run a specific thread can improve performance by reducing the number of times the processor cache is reloaded. The association between a processor and a thread is called processor affinity.
You can use the affinity mask option to increase performance on SMP systems with more than four microprocessors operating under heavy load. You can associate a thread with a specific processor and specify which processor(s) SQL Server will use. You can exclude SQL Server activity from processors that are given specific workload assignments by the Windows NT operating system.
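For example, the following sketch (an illustrative value; affinity mask is an advanced option, and a restart may be required) restricts SQL Server to the first four processors of an eight-processor computer, leaving the rest for other workloads:

EXEC sp_configure 'affinity mask', 15   -- bitmap 0x0F = processors 0 through 3
GO
RECONFIGURE
GO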
Recommendations:
The lightweight pooling option provides a means of reducing the system overhead associated with the excessive context switching sometimes seen in SMP environments. When excessive context switching is present, lightweight pooling may provide better throughput by performing the context switching inline, thus helping to reduce user/kernel ring transitions. You can find the Context Switches/sec counter in Windows NT Performance Monitor under the Thread object.
Setting the lightweight pooling option to 1 causes SQL Server to switch to fiber mode scheduling. The default value for this option is 0, or threading.
Recommendation:
The priority boost option is used to specify whether SQL Server should run at a higher Windows NT scheduling priority than other processes on the same computer. If you set this option to 1, SQL Server runs at a higher priority in the Windows NT scheduler. The default is 0 (run at priority base 7 on a single processor computer and at priority base 15 on an SMP computer). Setting the priority boost option to 1 changes the Windows NT priority base to 15 on a single processor computer and 24 on an SMP computer.
Although priority boost may seem like a useful option, it can cause more harm than good. By setting this option to 1 instead of 0, you increase the Windows NT priority for SQL Server threads over normal Windows NT user threads. This seems beneficial, but it can degrade the performance of users who are not SQL Server users.
If SQL Server is running on a computer with other applications and you want to ensure that all the applications get equal processor use, you can set the affinity mask option so that SQL Server gets only specific processors, and then set priority boost to 1, which will give SQL Server high priority on the specific processors and leave the other processors available for other applications.
Recommendation:
When several applications are running on the same server, you can set the affinity mask option so SQL Server gets only specific processors, and leaves processors free for the other applications. In this case, set the priority boost option to 1.
--------------------------------------------
The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.
This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS DOCUMENT.
©1999 Microsoft Corporation. All rights reserved.
Microsoft and Windows NT are either registered trademarks or trademarks of Microsoft Corporation in the United States and/or other countries.
Other product and company names mentioned herein may be the trademarks of their respective owners.
Part number: 098-83775