Kevin Collins
Jet Program Management
February 26, 1996
The primary goal for Jet version 3.0 was to increase performance significantly. To accomplish this, very few features were considered for this version. Particular attention was given to analyzing the bottlenecks present in Jet 2.x and to benchmarks of Jet against the competition. Below is a summary of the key performance enhancements. Benchmark timings and more detail on the performance enhancements follow.
Jet 3.0 employs threads internally to enhance performance and provide background services, such as:
By default, Jet will use up to three threads. The user can increase this number by modifying a registry setting. A user may want to increase the number of threads if there is a large amount of activity in their database. By default, Jet 3.0 does not add registry values to the registry but simply uses default values. If a user wants to change the default threading, or any other setting discussed in this paper, they will need to create the key HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\JET\3.0\ENGINES\JET in the system registry and add a value called Threads with a dataype of DWORD.
Jet 2.x pre-allocated a default of 512KB for its buffer with an upper limit of 4 MB RAM. Jet 3.0 allocates memory, on an as needed basis, up to an internally calculated high water mark (MaxBufferSize). This allows efficient use of memory in large RAM systems without having to adjust the registry settings. The high water mark is calculated by the following formula: ((Total RAM in Mb - 12)/4 + 512 KB). For a system with 32 Mb RAM, Jet 3.0 will use a calculated MaxBufferSize of 5632 KB. The default behavior can be overridden by setting the MaxBufferSize value in the system registry.
The minimum value that Jet 3.0 will use by default is 512 KB, but the minimum value the user could set is 128kb.
Unlike Jet 2.x, Jet 3.0 can exceed the MaxBufferSize. When this happens, Jet starts up a background thread to start flushing pages to bring the buffer pool down to the designated MaxBufferSize.
Jet 3.0 alters the interpretation of the page time-out parameter. In Jet 2.x, setting the PageTimeOut parameter to two seconds insured that a shared buffer would be kept at least two seconds and probably longer. In Jet 3.0, the buffer is kept no more than two seconds, giving the user precise control over the currency of the data returned. The default value for Jet 2.x is five tenths of a second and the default value for Jet 3.0 is 5000 milliseconds. This means that users might not immediately see other users' changes.
Further, Jet 3.0 implements Fox's strategy of recognizing when shared databases are not being updated, and suppresses buffer refreshing. Thus, performance on shared databases that are infrequently changed may approach the performance of databases opened exclusively.
Jet 3.0 has the ability to write changes to the database either synchronously or asynchronously.
In synchronous mode, the changes are written to the database before control returns to the application code. If transactions are used, the changes are written when the transaction commits; if transactions are not used, the changes are written at every Update or other DML statement. This is the only mode present in Jet 2.x.
In asynchronous mode, changes are stored up in memory for eventual writing to the database in another thread. The changes will be written when one of two things happens:
Asynchronous (or background) writes may improve performance in several ways:
Jet 2.x developers who have faithfully used transactions to achieve performance (as prescribed) should immediately recognize the utility of asynchronous writes in Jet 3.0. They may now remove transactions used solely for performance, and allow Jet to write changes out as needed. In fact, users may see a significant performance improvement by removing explicit transactions when dealing with DAO code and large sets of data. This is due to the fact that in Jet 2.x modified pages that were in the explicit transaction would spill to a temporary database if the cache defined by MaxBufferSize was exceeded. In Jet 3.0, this situation is eliminated when not using explicit transaction by always flushing the modified pages to the database, thus significantly reducing the number of writes to disk. While users may experience performance improvements with asynchronous processing, there are behavioral changes that may cause the application to function differently, especially in a multi-user environment (see examples below).
Note: The shipping version of Jet 3.0 incorrectly reverses the Boolean sense of the ImplicitCommitSync and UserCommitSync keys when specified in the registry; that is, a setting of "no" will be interpreted as "yes." They are discussed here in their non-reversed (normal) sense.
Sub DAOUpdate() Dim db As Database Dim rs As Recordset Dim sAddress As String Set db = OpenDatabase("NWind.mdb", True, False) Set rs = db.OpenRecordset _ ("SELECT * FROM Customers", dbOpenDynaset) While Not rs.EOF rs.Edit sAddress = rs!Address rs!Address = sAddress rs.Update rs.MoveNext Wend End Sub Sub DAOSQLUpdate() Dim db As Database Set db = OpenDatabase("NWind.mdb", False, False) db.Execute _ "UPDATE Customers SET Address = Address", _ dbFailOnError End Sub Sub DAOSQLUpdateTrans() Dim db As Database Dim ws as Workspace Set ws as Workspaces(0) Set db = OpenDatabase("NWind.mdb", False, False) ws.begintrans db.Execute _ "UPDATE Customers SET Address = Address", _ dbFailOnError ws.committrans End Sub
Jet 3.0 now ships as a 32-bit product to work more efficiently with Windows NT and Windows 95.
All database access—whether for user databases or for the temporary database—uses common buffering code. This eliminates the situation in Jet 2.x where pages are needlessly written to disk. It also allows all databases to benefit from performance enhancements in the core code, including heuristic multi-page I/O, usage-sensitive replacement policy, and asynchronous I/O.
The fixed size read-ahead buffer that Jet 2.x allocated for each open database is eliminated as well.
Jet 3.0 supports reading as many as 32 pages (64 KB) in one read.
Jet 2.x has a first-in, first-out buffer replacement policy. This means that a page might be discarded from memory even though it is being referenced frequently. Jet 3.0 implements a least-recently used replacement policy. Pages that are being used frequently by the user or the application remain in memory.
Jet version 2.x uses three separate index structures—and their associated support routines—for unique keys, short record lists, and long record lists. Jet 3.0 uses a single, common index structure throughout.
The largest benefit from this common index structure is that Jet 3.0 is much more efficient when building an index over many duplicate keys.
Jet 3.0 indexes supports leading key compression. Any bytes that are common to all key pairs in the node will be recorded only once. This is done in a way that still allows a node to be searched in a binary fashion. Compression allows duplicate keys to be stored in the same manner as unique keys, and achieves greater densities than was possible in Jet 2.X.
An example of this can be found when converting a 139 MB Jet 2.x database that contains only data to a Jet 3.0 database. The resulting file is 116 MB, a 23MB file savings! However, if the MDB file contains Access forms, reports or modules, the user may see the MDB file grow, because VBA (Visual Basic for Applications, the programming language in Access) requires approximately four times as much space as Access version 2.0.
Jet 2.x uses several mechanisms to monitor free and used space inside the database. Jet 3.0 introduces an entirely new mechanism for managing space in the database that allows for the following performance improvements:
It is now possible to grow tables in contiguous chunks of pages, avoiding most of the fragmentation problem in Jet 2.X. Thus, it is more likely that scanning a table will be accomplished by visiting the pages in increasing order of page number, avoiding the "bouncing around" problem in Jet 2.X, and by capitalizing upon the sequential heuristics of many of today's disk controllers. Furthermore, multi-page read ahead is more likely to hit pages related to the table being scanned. This differs from Jet 2.X, where multi-page reads are done by guessing, and may return pages for a number of different tables.
Each table now can track pages that are candidates to receive additional rows. When an insert is done, Jet attempts to lock these pages in succession (only one attempt) until a lock is obtained. If none of the candidates can be locked—or if there is insufficient room on the locked page—another page can be allocated to the table.
Jet 2.x required that the table header page and last data page both be locked in order to add a new page to a table. This prevented multiple users from inserting rows into the same table at the same time. Jet 3.0 no longer imposes this limitation, thus allowing for many users to insert rows into the same table at the same time. However, if indexes are present on a table, especially unique indexes, then locking conflicts can occur on the index page that needs to be maintained.
Jet 2.x maintains data pages by having them doubly-linked to each other. Thus, deleting all the rows in a table would require Jet to traverse through every page in the table. Jet 3.0 no longer maintains doubly-linked data pages, thus allowing for extremely fast deletions since every data page will not have to be visited. Substantial performance increases are seen when the table is dropped via a DDL DROP TABLE statement, or the table does not contain a referenced primary key and a DML DELETE statement that does not have a predicate.
In Jet 2.X, index pages which are freed were available for immediate reuse. For instance, a table with indexes that has all of the rows deleted can be freed and then allocated as a data page.
Jet 3.0 defers page reuse until all users of the database have closed it. This is a key element to Jet 3.0 performance as this greatly reduced the amount of read locks, thus increasing concurrency and reducing network traffic.
DBEngine.CompactDatabase is altered in the way it copies tables. Jet 2.x copies in physical order. Jet 3.0 copies in primary key order (if present). This effectively provides the equivalent of non-maintained clustered indexes. However, it is important to note that Jet does not maintain a clustered index and a true clustered index format is only achieved after doing a COMPACT. By doing frequent COMPACT's of the database you will ensure optimal performance for your application and correct any issues with pages being corrupted.
In the case where compact is being used to upgrade a database from Jet 2.x format to Jet 3.0 format, rows will be copied in physical order. This solves an upgrade problem for Access, which currently presumes that physical order and insertion order are the same. Once the database is in a Jet 3.0 format, compact begins to copy rows in primary key order. In order to disable this feature the user will need to add a key called CompactByPKey with a DWORD value of zero in the registry.
Jet 3.0 greatly reduces the number of read locks from Jet 2.x by only placing and holding them on long value data types (Memo/OLE) that exceed one page, and on index pages when they are being utilized to enforce referential integrity. This greatly reduces the amount of network traffic and reduces concurrency issues that users could see with Jet 2.X. In Jet 2.X, read locks were released arbitrarily, causing them to be somewhat defeating, and could either time out or be cleared by utilizing FreeLocks/DBEngine.Idle DBFreeLocks. Jet 3.0 read locks are held only as long as necessary, making FreeLocks/DBEngine.Idle DBFreeLocks obsolete.
Jet 2.x used a locking algorithm to determine what other user had a lock. By walking a byte range in a manner that would cause it to frequently hit bytes that were already locked. The user locking the page would be determined when Jet 2.x could successfully place a lock. The process of trying to place a lock and not succeeding proved to be a very costly operation. Jet 3.0 modifies this process by reversing the locking order and determining the user locking that page when a lock attempt fails.
Jet 2.x stored all of its long value data (Memo/OLE data types) in a special hidden system table. An example of this could be when users receiving locking conflicts when one user was adding a row with a LV data type in one table while another was adding a row with a LV row in a different table. Jet 3.0 eliminates this concurrency issue by creating individual sets of pages for each long value column.
Inside the DAO object and Jet model there are a functions that the Jet performance team used to help analyze and tune the performance of .MDB files. The reason this is hidden and not documented in the product documentation is that this function is not supported as an official part of DAO or Jet. While these functions currently work in Jet 3.0, they may not be supported in future releases of Jet. Also, the results that are returned may vary depending on various factors. Needless to say, you cannot expect to get support information about using these functions.
Even with these important caveats, if you are a true "database performance techie" you may find the following functionality useful when trying to substantiate timing results based on various ways of manipulating data in the database. Again note that these only apply to .MDB files.
ISAMStats( (StatNum as Long [, Reset as Boolean]) As Long
ISAMStats returns information about the "raw" disk reads, writes, locks and caching. This method returns the value of a given engine statistic as defined by StatNum, defined below. If the optional "Reset" argument is supplied then the statistic defined by StatNum is reset and no value is returned. If Reset is supplied and False, then it is equivalent to not supplying the argument at all. These statistics returned apply to the whole engine, regardless of how many databases or sessions are active, including temporary databases. The following table describes the valid settings for StatNum
StatNum |
Description |
0 |
Number of disk reads |
1 |
Number of disk writes |
2 |
Number of reads from cache |
3 |
Number of reads from read ahead cache |
4 |
Number of locks placed |
5 |
Number of release lock calls |
The following notes may help you interpret these statistics.
A typical scenario that you might use this function is when you're trying to tune a query and you're not sure which way SQL will result in less network traffic. Another scenario might be looking at the affect of placing explicit transactions vs. letting Jet use its internal transactions. A user can also adjust the registry settings and see how they affect the number of reads and writes to the database. Since timing can vary with a varying environment (i.e., increased network traffic, multiple processes, disk fragmentation) these statistics can help validate timing results.
One of the reasons that these statistics are not supported and will be ignored by support engineers if you start quoting them, is that using them in a meaningful way typically requires an extremely controlled environment. For example, when we use these in the test lab, we always reformat all the hard disks and reinstall the operating system, we run on an isolated LAN, we rebuild all databases from scratch (for example, import from another database) and we have carefully predetermined memory configurations and system settings (e.g. temp directory). Even after you do this and thus isolate many of the system variations, there are still many subtleties in your database design that can produce variations. For example, the existence of relationships, referential integrity, indexes, previously deleted fields or tables and attached tables will all influence your results.
In summary, we don't expect that this functionality will be useful to most users of DAO. However, we hope that those few people who do need this fine granularity of information will find it useful.
Private Sub cmdTestStats () Dim db As Database ' If you are using VB, make sure that the database is open ' or that the engine is initiliazed before issuing ' any ISAMStatus calls. DBEngine.Version is used ' to satisfy the purpose in this example gVersion = DBEngine.Version ' Explicitly set the counters to zero gDiskRead = DBEngine.ISAMStats(0, True) gDiskWrite = DBEngine.ISAMStats(1, True) gCacheRead = DBEngine.ISAMStats(2, True) gCacheReadAheadCache = DBEngine.ISAMStats(3, True) gLocksPlaced = DBEngine.ISAMStats(4, True) gLocksRelease = DBEngine.ISAMStats(5, True) Set db = OpenDatabase("c:\office95\access\samples\northwind.mdb", False, False) sSQL = InputBox("Enter a SQL string", "SQL Box", "UPDATE Customers SET ContactName = ContactName") db.Execute sSQL, dbFailOnError ' The following ISAMStats calls will retrieve the latest ' values. The values will accumulate until they are reset gDiskRead = DBEngine.ISAMStats(0) gDiskWrite = DBEngine.ISAMStats(1) gCacheRead = DBEngine.ISAMStats(2) gCacheReadAheadCache = DBEngine.ISAMStats(3) gLocksPlaced = DBEngine.ISAMStats(4) gLocksRelease = DBEngine.ISAMStats(5) End Sub ShowPlan
The Microsoft Jet query engine implements a cost-based query optimizer. When a query is compiled, the query engine create a plan of execution. This plan is used internally to find the quickest way to execute a query. This information is not exposed to the user of Microsoft Jet. You can, however, using the ShowPlan key in the Registry, cause the engine to create a text file containing query execution plans.
To activate ShowPlan, add the following key to the Registry:
\\HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\Jet\3.0\Engines\Debug
Under this key, add a string data type called JETSHOWPLAN (make sure you use all capital letters). To turn on ShowPlan, set the value of the new data type to ON. To turn it off, set the value to OFF.
When ShowPlan is turned on, Microsoft Jet appends text to a file called SHOWPLAN.OUT every time a query is compiled. This text contains the plan that the query optimizer has generated for that query.
For example, the following text shows the query plan for the Invoices query stored in the NORTHWIND.MDB database that ships with Access for Windows 95:
--- Invoices --- 01) Sort table 'Orders' 02) Inner Join table 'Shippers' to result of '01)' using temporary index join expression "Shippers.ShipperID=Orders.ShipVia" 03) Sort table 'Employees' 04) Inner Join result of '02)' to result of '03)' using temporary index join expression "Orders.EmployeeID=Employees.EmployeeID" 05) Inner Join result of '04)' to table 'Customers' using index 'Customers!PrimaryKey' join expression "Orders.CustomerID=Customers.CustomerID" 06) Inner Join result of '05)' to table 'Order Details' using index 'Order Details!OrdersOrder Details' join expression "Orders.OrderID=[Order Details].OrderID" 07) Sort table 'Products' 08) Inner Join result of '06)' to result of '07)' using temporary index join expression "[Order Details].ProductID=Products.ProductID"
Note that if you have ShowPlan turned on, Microsoft Jet will append the query plans to the SHOWPLAN.OUT text file every time the query is compiled. As such, the file can grow quite large. You should only turn this feature on during your development and debugging process. Never leave it on for production database applications.
During the course of the Jet 3.0 development, a large effort was organized to implement a suite of benchmarks to gauge the performance of Jet 3.0. Below are the results from some of those tests.
The tests were conducted on 36 identically configured P560 machines with 32 MB RAM that was, in most instances, configured down to 5 MB RAM, 540 MB IDE and 1.2 GB EIDE hard drives and PCI NIC's (Network Interface Card). The tests were run using only DAO/SQL commands on either NT 3.51 or Windows 95. When run in a network environment the Network OS was Netware 4.1 on a Dell XPE PowerEdge P90 with four 1GB RAID drives running off a dedicated EISA RAID SCSI host adapter, two four port PCI full-duplexed Ethernet NIC's and 32 MB RAM.
The following test loads one million rows from a 227 MB ASCII delimited file based off the SetQuery benchmark. All the reported SetQuery benchmarks were run with 16 MB RAM on Windows 95. The following commands create this benchmark with the database opened shared:
CREATE TABLE Bench (KSeq LONG, K500K LONG, K250K LONG, K100K LONG, K40K LONG, K10K SMALL INT, K1K SMALL INT, K100 BYTE, K25 BYTE, K10 BYTE, K5 BYTE, K4 BYTE, K2 BYTE, S1 CHAR (8), S2 CHAR (20), S3 CHAR (20), S4 CHAR (20), S5 CHAR (20), S6 CHAR (20), S7 CHAR (20), S8 CHAR (20))
The number values off of the columns starting with K represent the cardinality of the data. KSeq represent a primary key and K2 represents two unique values.
INSERT INTO Bench SELECT * FROM [SetQuery.DAT] IN ''[TEXT; Database=e:\]
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
5980 |
1569 |
4411 |
3.81 |
The following tests are based on the previous Load Times—ASCII files
CREATE INDEX KSeq ON Bench (KSeq) WITH PRIMARY
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
339 |
242 |
96 |
1.4 |
CREATE INDEX K500K ON Bench (K500K)
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
346 |
256 |
89 |
1.35 |
CREATE INDEX K1K ON Bench (K1K)
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
335 |
242 |
93 |
1.38 |
CREATE INDEX K25 ON Bench (K25)
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
393.37 |
240 |
153 |
1.64 |
CREATE INDEX K2 ON Bench (K2)
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
4179 |
234 |
3944 |
17.79 |
Using the SetQuery benchmark above, these times show improvement in deleting data without predicates.
DELETE * FROM Bench
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
383 |
33 |
350 |
11.52 |
Using the SetQuery benchmark above, these times show improvements in moving data within a Jet database.
SELECT * INTO Temp FROM Bench
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
8451 |
626 |
7824 |
13.48 |
Using the SetQuery benchmark above, these times show improvement over Jet 2.x.
SELECT COUNT(*) AS Q2B INTO Q2B FROM Bench WHERE K2 = 2 AND K100 <> 3
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
271 |
185 |
82 |
1.45 |
Using the SetQuery benchmark above, these times show improvements to Jet 2.X's Rushmore algorithms in the way that it retrieves bookmarks or pointers to actual data.
SELECT KSeq, K500K INTO Q4B FROM Bench WHERE K100 > 80 AND K10K BETWEEN 2000 AND 3000 AND K5 = 3 AND (K25 = 11 OR K25 = 19) AND K4 = 3
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
11 |
4 |
7 |
2.79 |
Using the SetQuery benchmark above, these times show the performance improvement without explicit transactions.
Set rs = db.openrecordset("Bench", dbOpenTable) While Not RS.EOF RS.Edit RS!S8 = "1234567890987654321" RS.Update RS.MoveNext Wend
Jet 2.x sec |
Jet 3.0 sec |
Difference sec |
Times Faster |
2419 |
1151 |
1268 |
2.1 |
One of the greatest performance improvements comes with the ability to have multiple workstations insert rows into the same table at the same time.