Harnessing the Power of Updatable Queries

Neil W. Black
Microsoft Corporation

February 1996

Neil is a Program Manager in the Data Access and Retrieval Technologies (DART) Group at Microsoft. He has worked at Microsoft for over six years as both a Test Lead and as a Program Manager, and he is currently the Lead Program Manager for the Jet database engine. Neil has a B.S. in Computer Science from Drexel University in Philadelphia.

Abstract

This paper discusses a feature of the Microsoft® Jet database engine known as updatable queries or updatable views. It is one of the most powerful features of the Jet database engine, but is also one of the least understood. This paper will help describe how updatable queries work and how you can use them to your advantage in applications.

Much of the text of this document comes from Chapter 4, "Queries," of the Microsoft Press® book Microsoft Jet Database Engine Programmer's Guide.

Introduction

Since version 1.0, the Microsoft® Jet database engine has provided the ability to update the result of a multitable join without writing any code. This is a feature that is found in few, if any, other desktop database systems.

In the first version of Jet, the following rules applied to query updatability:

While this seems like a simple set of rules, it turned out to cause much confusion among users of Microsoft Jet. To simplify this confusion, we dramatically increased the updatability rules in Jet version 2.0. However, while the flexibility of updatable queries was increased in Jet 2.0 to allow either side of a multitable join to be updated, the complexity of the rules increased as well. Because of this, many developers do not take full advantage of this powerful feature.

Terminology

Before describing the rules behind updatable queries, it is necessary to understand some basic terminology.

Referential Integrity

Referential integrity is a system of rules that the Microsoft Jet database engine will use to ensure that relationships between records in related tables are valid and that you don't accidentally delete or change related data. You can set referential integrity when all of the following conditions are met:

When referential integrity is enforced, you must observe the following rules:

If you want the Jet database engine to enforce these rules for a relationship, select the Enforce Referential Integrity check box when you create the relationship using the Relationships window in the Microsoft Access user interface. If referential integrity is enforced and you break one of the rules with related tables, Microsoft Access displays a message and doesn't allow the change.

You can override the restrictions against deleting or changing related records and still preserve referential integrity by selecting the Cascade Update Related Fields and Cascade Delete Related Records check boxes. When the Cascade Update Related Fields check box is selected, changing a primary key value in the primary table automatically updates the matching value in all related records. When the Cascade Delete Related Records check box is selected, deleting a record in the primary table deletes any related records in the related table.

Updating Your Data with Queries

This section describes the rules governing query updatability. You can use the dynamic selecting and sorting capabilities of the query engine for more than just static searches: After a result set is retrieved, the data can also be updated through a program such as Microsoft Access or a custom Microsoft Visual Basic® program.

Single-Table Query Updatability

A query based on a single table can select some or all records, and some or all columns, from that table. In the simplest case, the query returns all rows and all columns:

SELECT * FROM Customers;

Every row retrieved, and every column in every row, is accessible and subject to change (except calculated columns based on expressions). Every row can be deleted, and new rows can be inserted.

If you were to use Microsoft Access to open a datasheet based on this query, you would be able to freely browse forward and backward, making any changes you like (subject to security restrictions, validation rules, and referential integrity requirements). Any changes made to the result set created by this query would automatically be reflected in the underlying table.

A query can also select and sort rows from the original table and remain updatable:

SELECT *
FROM Customers
WHERE (((Customers.ContactTitle)="Owner"))
ORDER BY Customers.CompanyName;

An updatable single-table query like this can be useful in a data entry situation in which you would like to present the rows to the user in a certain order or hide certain rows based on the selection criteria.

One thing to be aware of, however, is that the Microsoft Jet database engine does not prevent the user from adding a new row through a query's result set that would not have met the original selection criteria. For example, the above query selects only rows in which the ContactTitle is "Owner". The user could add a new row and specify a value other than "Owner" as the ContactTitle, in which case the user would add a row that the query, if subsequently rerun, would no longer select. It is up to your application to enforce insert restrictions such as this.

Single-table query updatability restrictions

A query can select specific columns rather than selecting all columns with the asterisk (*) character:

SELECT
    Customers.ContactName,
    Customers.ContactTitle,
    Customers.Address,
    Customers.City,
    Customers.Region,
    Customers.PostalCode
FROM Customers;

This can be a useful technique to hide certain columns from users, while still allowing them access to the information they need. The columns made available through a query like this are fully updatable. It may, however, be impossible to add a new row through this query, either because the fields not included in the output are specified as "required" at the table level or because the primary key or foreign keys can't be created with default values.

Another restriction is on nonnative tables, such as those from Borland® Paradox®. The Microsoft Jet database engine requires that a primary key be defined for Paradox tables. Btrieve® and Xbase data sources do not require a unique index, but Open Database Connectivity (ODBC) tables do. Any queries based on ODBC tables lacking a unique index are not updatable.

Multitable Query Updatability

In addition to allowing updates to single-table queries, the Microsoft Jet database engine supports updatable multitable joins. This is extremely useful, because it enables you to combine data from the main table with lookup information from other tables (a many-to-one join) or to join a master table with a related detail table (a one-to-many join) and still have an updatable result set.

Many-to-one joins

This example joins the Products table (the focus of the query) with the Suppliers table, which provides lookup information, including the Supplier, CompanyName, and City:

SELECT DISTINCTROW
    Products.*,
    Suppliers.CompanyName,
    Suppliers.City
FROM Suppliers INNER JOIN Products
ON Suppliers.SupplierID = Products.SupplierID;

The user can change data in any column from the Products table, including the SupplierID that links the Products table to the Suppliers table. If you change the value of the linking field through a datasheet or a form in a Microsoft Access application, you will also see that the corresponding "lookup" information from the "one" table is automatically retrieved and redisplayed. This technique is known as row fix up or AutoLookup.

Although the focus of this many-to-one query is the Products table, it's also possible (though perhaps not desirable) to change the values in columns retrieved from the lookup table, such as the CompanyName or City field in Suppliers. The user may attempt to alter the CompanyName value on one row of the result set created from this query, under the impression that the change will affect only the current row. However, since the value is actually stored back in the Suppliers lookup table, the value is changed for every row in the result set.

One-to-many joins

There is no real logical difference between a many-to-one join, as described above, and a one-to-many join, except from the point of view of the user. A one-to-many join is sometimes referred to as a master-detail relationship. A single row in the "one" or "master" table is related to one or more rows in the "many" or "detail" table. Updatable multitable joins are especially useful with these types of one-to-many relationships.

The following query joins the Orders table to the Order Details table in a classic one-to-many relationship:

SELECT DISTINCTROW
    Orders.*,
    [Order Details].*
FROM Orders INNER JOIN [Order Details]
ON Orders.OrderID = [Order Details].OrderID;

Columns in the Orders table focus on the order itself: the customer who placed the order, the employee who took the order, the date the order was taken, and so on. Columns derived from the Order Details table specify the actual items that were ordered: the product ID and pricing details. Just as with the many-to-one example shown above, changes to columns from the "one" table on any given row are automatically made for all other rows based on the same value in the "one" table. For example, if the user changes the CustomerID field, which is drawn from the "master" Orders table, on any given row, the change is automatically reflected in all other rows for this same order.

Updatable multitable joins are not limited to a single-level hierarchy. For example, the following query links from Employees to Orders to Order Details:

SELECT DISTINCTROW
    Employees.EmployeeID,
    Employees.LastName,
    Orders.OrderID,
    Orders.OrderDate,
    [Order Details].ProductID
FROM (Employees INNER JOIN Orders
        ON Employees.EmployeeID = Orders.EmployeeID)
    INNER JOIN [Order Details]
        ON Orders.OrderID = [Order Details].OrderID;

Columns from all three of these joined tables are updatable in the resulting result set.

Inserting rows into a multitable query

When inserting a row into a result set based on a multitable join, rows can be added to one, more than one, or all of the tables included in the join. It is easy to see that rows from the "many" side of the join are simply appended to the "many" table as needed. Rows from the "one" side of the join can be added as well, if they do not violate referential integrity constraints.

The following query joins the SalesReps (many) table to the Division (one) table:

SELECT DISTINCTROW
    SalesReps.EmployeeID,
    SalesReps.LastName,
    SalesReps.FirstName,
    SalesReps.DivID,
    Division.DivID,
    Division.Description
FROM SalesReps INNER JOIN Division
ON SalesReps.DivID = Division.DivID;

If, while adding a new SalesReps entry, the user specifies a DivID value that is already present in the Division table, only the SalesRep row is added. AutoLookup occurs: Values from the corresponding Division record are retrieved.

It is possible to add a new DivID value to the SalesRep table and have that new value automatically added to the Division table if the primary key from the "one" table is included in the join, as in the example above.

Updatable one-to-many outer joins

As described above in the terminology section, an outer join selects all rows from one table in a multitable join, while only selecting rows with matching values from another table. For rows in which there is no match in one table, artificial NULL column values are supplied. The Microsoft Jet database engine allows you to "fill in the blanks" in these artificial NULL rows.

Consider the following OUTER JOIN and its result:

SELECT
  Cust.CustName,
  Cust.CustID,
  Orders.CustID,
  Orders.OrderID
FROM Customers AS Cust LEFT JOIN Orders
ON Cust.CustID = Orders.CustID;
Cust.CustName Cust.CustID Orders.CustID Orders.OrderID
Johnson 1000 1000 1
Johnson 1000 1000 2
Smith 1001 1001 3
Smith 1001 1001 4
Blair 1002 1002 5
Anderson 1003    

In this example, the Customers table is outer joined with the Orders table to show all customers, regardless of whether or not they have placed an order. Customer "Anderson" (with CustID 1003) has not placed an order. The "fill in the blank with key propagation" rule states that the user can add an order for this customer by filling in the Orders.OrderID field (and any other Order information except OrderID). The value in Cust.CustID is automatically propagated into Orders.CustID. In this update scenario, the child key (Orders.CustID) is read-only. The act of "filling in the blank" locks the new Order row to Customer 1003.

Multitable Query Updatability Restrictions

There are several requirements that the query must meet for it to be fully updatable.

JOIN clauses must be used

You must specify an explicit INNER or OUTER JOIN between the tables. Joins created implicitly in the WHERE clause of the SELECT statement are not updatable. For example, the following join is not updatable:

SELECT
  Products.ProductID,
  Products.ProductName,
  Categories.CategoryID,
  Categories.CategoryName
FROM Categories, Products
WHERE Products.CategoryID = Categories.CategoryID;

Queries with aggregated data are not updatable

Summary (GROUP BY), UNION, DISTINCT, and crosstab queries are never updatable. Queries joined to one or more summary queries are not updatable, even if you do not attempt to modify fields from an otherwise updatable table.

Subqueries with aggregated data are updatable

However, a query may be updatable if it refers to a summary query in a sub-SELECT statement, as in the following example:

SELECT DISTINCTROW Orders.*
FROM Orders
WHERE (((Orders.Freight)>
  (SELECT DISTINCTROW Avg(Orders.Freight) AS AvgOfFreight
   FROM Orders;)));

Columns from the Orders table are updatable.

All joined fields must be output in order to insert new records

To be able to insert new rows into a table in any query, all primary key columns must be present.

Many-side fields may be nonupdatable during updates to the one-side

While updating a single query row, changes to certain fields may render certain other fields nonupdatable until the row edit is either saved or canceled. After the user edits data on the "one" side of a query, the join key on the "many" side can no longer be modified. Usually, the "many" side join key is updatable. However, because data on the "one" side was modified first, this column is temporarily rendered unmodifiable because AutoLookup would discard your changes to the "one" side data. As soon as the change to the "one" side of the query is committed or canceled, the "many" side join key becomes updatable again.

Many-side records cannot be orphaned

A change to a multitable query must not create "orphaned" records. You can change the join key in the "many" table to a value already present in the "one" table, but you cannot specify a nonexistent value, except in the case of outer joins.