Physical Database Modeling

Classically, database architects would build a logical model of the domain, including the entities that exist and the relationships between them. They captured much of the knowledge that we now capture in the static object model. They would then map these entities and their relationships into a set of database tables.

For each table, the database designer must decide on the order of the fields in the rows. Typically, a field corresponds to a single data member in an object. One of the first decisions the database designer confronts is deciding which field (or fields) should be the primary key for the table. The primary key must be unique in a table and will be indexed for fast access. Typically, the OID will be the primary key for any given table.

The database designer may also add secondary indexes. These speed up retrieval, at the cost of slowing down inserts, deletes, and updates to the indexed fields.
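The primary key and secondary index decisions can be sketched with Python's built-in sqlite3 module. This is only an illustration: the customer table, its columns, and the index name idx_customer_name are assumptions made for the example, not part of any schema described here.

```python
import sqlite3

# Hypothetical schema: table, column, and index names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        oid   INTEGER PRIMARY KEY,  -- object identifier: unique and indexed
        name  TEXT NOT NULL,
        phone TEXT                  -- no NOT NULL, so this field may be NULL
    )
    """
)
# Secondary index: speeds up lookups by name, adds work to every write.
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

conn.execute("INSERT INTO customer (oid, name) VALUES (1, 'Ada')")


def duplicate_oid_rejected() -> bool:
    """Return True if a second row with oid 1 is refused by the primary key."""
    try:
        conn.execute("INSERT INTO customer (oid, name) VALUES (1, 'Grace')")
        return False
    except sqlite3.IntegrityError:
        return True


# Ask SQLite how it would execute a lookup by name.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT oid FROM customer WHERE name = ?", ("Ada",)
).fetchall()
# The last column of each plan row is a human-readable description.
uses_index = any("idx_customer_name" in row[-1] for row in plan)
```

The query plan is one way to confirm that a secondary index is actually being used; the exact wording of the plan differs between SQLite versions, which is why the sketch only looks for the index name.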

Once the designer has created the fields in the table, he must decide on the appropriate data types and designate which fields may have NULL values. He may also include foreign keys. Remember that a foreign key is a field that holds the primary key of a row in another table; it is used to join and search across tables.

Finally, the designer must allocate sufficient space for the tables and then add the necessary constraints to the system to ensure referential integrity.

Use of Referential Integrity

One type of referential integrity constraint checks that every foreign key value in use really exists: the value of the foreign key in this table must be the primary key of a record in another table. This prevents the database equivalent of "dangling pointers".
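As a minimal sketch of this check, the following uses Python's built-in sqlite3 module. The customer and account tables and the helper functions are hypothetical names chosen for the example. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so each
# statement below is checked and committed on its own.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforcement is off by default

# Hypothetical tables: every account row must point at an existing customer.
conn.execute("CREATE TABLE customer (oid INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE account ("
    " oid INTEGER PRIMARY KEY,"
    " customer_oid INTEGER NOT NULL REFERENCES customer (oid))"
)


def insert_account(account_oid: int, customer_oid: int) -> bool:
    """Try to store an account; return False if the foreign key dangles."""
    try:
        conn.execute(
            "INSERT INTO account (oid, customer_oid) VALUES (?, ?)",
            (account_oid, customer_oid),
        )
        return True
    except sqlite3.IntegrityError:
        return False


def delete_customer(customer_oid: int) -> bool:
    """Try to delete a customer; return False if accounts still refer to it."""
    try:
        conn.execute("DELETE FROM customer WHERE oid = ?", (customer_oid,))
        return True
    except sqlite3.IntegrityError:
        return False


conn.execute("INSERT INTO customer (oid, name) VALUES (1, 'Ada')")
valid = insert_account(100, 1)       # customer 1 exists
dangling = insert_account(101, 999)  # customer 999 does not exist
orphaning = delete_customer(1)       # account 100 still refers to customer 1
```

Here the first insert is accepted, while the dangling insert and the delete that would orphan account 100 are both refused by the database.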

We can also use referential integrity to enforce the multiplicity constraints indicated in the object model. Let's say we want to refer to a row of another table from within a given row in the current table. We can include columns in the current table that have the same definition as the columns used as the primary key in the second table. Since we use an OID as the primary key for all objects, we'll use the OID as the foreign key of the current table.

If the object model says that an object of class A has an association to objects of class B with a multiplicity of 1 on the A end and 1..* on the B end, then an attempt to store an A object that has no B objects associated with it will fail. An attempt to delete the last B object associated with an A object will also fail. Referential integrity will be enforced by the foreign key constraints in the database tables.
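One possible way to get this behavior from foreign keys alone is sketched below with Python's built-in sqlite3 module. It is an assumption of this example, not a design prescribed by the text: each A row carries a mandatory reference to one designated B row (the hypothetical column first_b_oid), each B row refers back to its A, and both constraints are deferred so that an A and its first B can be stored together in a single transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical schema. Deferred constraints are checked at COMMIT, which lets
# an A and its first B be inserted in either order within one transaction.
conn.execute(
    "CREATE TABLE a ("
    " oid INTEGER PRIMARY KEY,"
    " first_b_oid INTEGER NOT NULL"
    "   REFERENCES b (oid) DEFERRABLE INITIALLY DEFERRED)"
)
conn.execute(
    "CREATE TABLE b ("
    " oid INTEGER PRIMARY KEY,"
    " a_oid INTEGER NOT NULL"
    "   REFERENCES a (oid) DEFERRABLE INITIALLY DEFERRED)"
)


def run_in_transaction(statements) -> bool:
    """Run (sql, params) pairs as one transaction; False if integrity fails."""
    conn.execute("BEGIN")
    try:
        for sql, params in statements:
            conn.execute(sql, params)
        conn.execute("COMMIT")  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.execute("ROLLBACK")
        return False


INSERT_A = "INSERT INTO a (oid, first_b_oid) VALUES (?, ?)"
INSERT_B = "INSERT INTO b (oid, a_oid) VALUES (?, ?)"
DELETE_B = "DELETE FROM b WHERE oid = ?"

# An A whose designated B does not exist cannot be committed.
stored_without_b = run_in_transaction([(INSERT_A, (1, 10))])
# An A stored together with its first B is accepted.
stored_with_b = run_in_transaction([(INSERT_A, (1, 10)), (INSERT_B, (10, 1))])
# A second B can be added and removed freely.
added_spare = run_in_transaction([(INSERT_B, (11, 1))])
deleted_spare = run_in_transaction([(DELETE_B, (11,))])
# Deleting the remaining B would leave A pointing at nothing, so it is refused.
deleted_last = run_in_transaction([(DELETE_B, (10,))])
```

This only approximates the 1..* rule. The designated B cannot be deleted even when other B rows exist unless first_b_oid is re-pointed in the same transaction, and nothing in the sketch verifies that the designated B refers back to the same A.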

Scenario Development and Referential Integrity

Referential integrity does not have a sense of time. It is absolute. However, all of our business processes start, flow and stop. Sometimes they are interrupted in the middle. How can your system maintain referential integrity if it is interrupted in the middle of its work?

Let's look at a simple example in the form of a bank account application. Say there is a customer object, which must have an association to one or more account objects. A bank employee is on the phone with a customer who wants to open an account. The bank employee gathers all the necessary information about the customer, including his name, social security number, phone number and so forth. He is about to gather information about the type of account the customer wants to open, when the customer announces that he must run off to a meeting and will call back later.

It would be unfortunate if the bank employee had to throw away the records he has built so far, but if he saves the customer without opening at least one account, he will violate the referential integrity of the system. The system is not referentially correct, yet we need to save its current (temporary) state.

One solution is to change the multiplicity of the account end of the relationship from 1..* to 0..*. Unfortunately, this sacrifices referential integrity (in this case, that each customer must have an account) to support a boundary condition. Moreover, this boundary condition is unstable; it should resolve either to a fully opened account relationship or it should, eventually, be closed.

A second possibility is to create a temporary set of tables for "work in progress" that enforce fewer referential integrity constraints. This works, but creates a certain amount of duplication in the system.
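The work-in-progress approach might look roughly like the following sketch, again using Python's built-in sqlite3 module. The customer_wip table and the save_draft and promote helpers are hypothetical names. The rule that a customer only reaches the main tables together with a first account is enforced here by the promote function rather than by the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit BEGIN/COMMIT
conn.execute("PRAGMA foreign_keys = ON")

# Main tables with the full rules (hypothetical schema).
conn.execute(
    "CREATE TABLE customer ("
    " oid INTEGER PRIMARY KEY, name TEXT NOT NULL, ssn TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE account ("
    " oid INTEGER PRIMARY KEY,"
    " customer_oid INTEGER NOT NULL REFERENCES customer (oid),"
    " kind TEXT NOT NULL)"
)
# Work-in-progress copy: the same columns again (the duplication the text
# mentions), but nullable and with no account required.
conn.execute("CREATE TABLE customer_wip (oid INTEGER PRIMARY KEY, name TEXT, ssn TEXT)")


def save_draft(oid, name=None, ssn=None) -> None:
    """Park a partially entered customer outside the main tables."""
    conn.execute(
        "INSERT INTO customer_wip (oid, name, ssn) VALUES (?, ?, ?)", (oid, name, ssn)
    )


def promote(oid, account_oid, kind) -> bool:
    """Move a draft into the main tables together with its first account."""
    conn.execute("BEGIN")
    try:
        row = conn.execute(
            "SELECT name, ssn FROM customer_wip WHERE oid = ?", (oid,)
        ).fetchone()
        if row is None:
            raise LookupError(f"no draft customer with oid {oid}")
        conn.execute(
            "INSERT INTO customer (oid, name, ssn) VALUES (?, ?, ?)", (oid, *row)
        )
        conn.execute(
            "INSERT INTO account (oid, customer_oid, kind) VALUES (?, ?, ?)",
            (account_oid, oid, kind),
        )
        conn.execute("DELETE FROM customer_wip WHERE oid = ?", (oid,))
        conn.execute("COMMIT")
        return True
    except (sqlite3.IntegrityError, LookupError):
        conn.execute("ROLLBACK")  # the draft stays where it was
        return False


save_draft(1, name="Ada", ssn="000-00-0000")  # the customer hangs up here
save_draft(2, name="Grace")                   # social security number missing
promoted_complete = promote(1, 100, "checking")
promoted_incomplete = promote(2, 101, "savings")
```

Applications that read only customer and account never see the drafts, and an incomplete draft stays in customer_wip until the missing information arrives.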

A better solution, perhaps, is to factor out the referential integrity from the business logic and build a framework to enforce referential integrity among the business objects. This framework could take into consideration the work in progress, but segregate it from the rest of the processing so that it is not visible to applications that would not understand the implications of it being work in progress.
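As a rough illustration of what such a framework might look like at the business object level, the sketch below keeps work in progress in a separate area and checks the multiplicity rule only when an object is committed. All of the names (Customer, Account, CustomerStore, IntegrityViolation) are hypothetical, and an in-memory dictionary stands in for the database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Account:
    oid: int
    kind: str


@dataclass
class Customer:
    oid: int
    name: str
    accounts: List[Account] = field(default_factory=list)


class IntegrityViolation(Exception):
    """Raised when an object breaks a rule from the object model."""


class CustomerStore:
    """Hypothetical framework layer that owns the referential integrity rules."""

    def __init__(self) -> None:
        self._committed: Dict[int, Customer] = {}
        self._in_progress: Dict[int, Customer] = {}

    @staticmethod
    def violations(customer: Customer) -> List[str]:
        """List the rules this customer currently breaks."""
        problems = []
        if not customer.accounts:
            problems.append("customer must have at least one account (1..*)")
        return problems

    def save_in_progress(self, customer: Customer) -> None:
        """Keep an incomplete customer without checking the rules."""
        self._in_progress[customer.oid] = customer

    def commit(self, customer: Customer) -> None:
        """Make a customer visible to ordinary applications, rules enforced."""
        problems = self.violations(customer)
        if problems:
            raise IntegrityViolation("; ".join(problems))
        self._in_progress.pop(customer.oid, None)
        self._committed[customer.oid] = customer

    def find(self, oid: int) -> Optional[Customer]:
        """What ordinary applications see: committed customers only."""
        return self._committed.get(oid)

    def find_in_progress(self, oid: int) -> Optional[Customer]:
        """For applications that understand work in progress."""
        return self._in_progress.get(oid)


store = CustomerStore()
caller = Customer(oid=1, name="Ada")
store.save_in_progress(caller)        # the customer runs off to a meeting
visible_early = store.find(1)         # ordinary applications see nothing yet

caller.accounts.append(Account(oid=100, kind="checking"))  # he calls back
store.commit(caller)
visible_later = store.find(1)
```

The design choice here is that the 1..* rule lives in one place (the violations method), so it is never relaxed for completed objects, while incomplete ones remain reachable only through find_in_progress.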