Development-time Issues

In order to support persistence, each class must know if it is persistent and if so, it must be able to chain up to its parent class in support of persistence. Query functions will probably be implemented as static member functions. The query functions must know which attributes persist and whether or not their value can be NULL. In addition, the query functions must know the type and length of each member variable to be persisted. This kind of knowledge is known as meta-data: it describes how we manage the data.
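As a minimal sketch of what such meta-data could look like in C++, the following assumes a hypothetical AttributeMeta record and two illustrative classes, Person and Employee. None of these names come from a real framework. Each class reports whether it is persistent, and the derived class chains up to its parent before adding its own attributes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical storage types; a real framework would define its own set.
enum class ColumnType { Integer, Real, Text };

// Hypothetical meta-data record for one persistent attribute.
struct AttributeMeta {
    std::string name;      // attribute (column) name
    ColumnType  type;      // storage type
    std::size_t length;    // length in bytes or characters
    bool        nullable;  // whether the value may be NULL
};

// Illustrative base class: persistent, with two attributes of its own.
class Person {
public:
    static bool isPersistent() { return true; }
    static std::vector<AttributeMeta> persistentAttributes() {
        return {
            {"oid",  ColumnType::Integer, 8,  false},
            {"name", ColumnType::Text,    40, false},
        };
    }
};

// The derived class chains up to its parent, then appends its own attributes.
class Employee : public Person {
public:
    static bool isPersistent() { return true; }
    static std::vector<AttributeMeta> persistentAttributes() {
        std::vector<AttributeMeta> attrs = Person::persistentAttributes();
        attrs.push_back({"salary",  ColumnType::Real,    8, true});
        attrs.push_back({"manager", ColumnType::Integer, 8, true});
        return attrs;
    }
};
```

Calling Employee::persistentAttributes() yields the two inherited attributes followed by the two that Employee defines, which is the information a query function would need to build its column list.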

A decision has to be made as to where this (meta-) data will be kept. The place of choice is in the static object model. If you use a CASE tool, you may be able to extend its metamodel to include the meta-data at the class and attribute level.

Generating the Code for each Class

Some of the code that will be written implements the persistence framework. The rest of the code is the class-specific code for the CRUD (Create, Read, Update and Delete) queries and other functions that allow the class to interact with the rest of the persistence framework. In a large application, there will be hundreds or thousands of persistent objects. Writing all of this code by hand is a large undertaking. Furthermore, this code will change. Keeping it all tested and coordinated is not going to be easy. For this reason, the class-specific code is a good candidate for a code generator.

There are several advantages to basing the code generator on a CASE-tool supplied metamodel:

If you extend the metamodel, you can include the class and attribute-specific persistence-related information that we need to capture directly in the model.
The CASE tool probably supplies a mechanism to iterate through all of the classes, all of the attributes in a class and so on. Either there is a scripting language, in which case it includes these iteration constructs, or there is an API to get at the model information, in which case you can use the API to iterate.
The CASE tool probably comes with a set of GUI screens that are used to enter the model information.
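The iteration described above can be sketched in C++ as follows. ModelClass and ModelAttribute are hypothetical in-memory stand-ins for whatever a CASE tool's API or scripting language exposes, and the generated text is only the SQL for the Create query, using the class name as the table name. This is an illustration of the loop structure, not of any particular tool.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for the model information a CASE tool exposes.
struct ModelAttribute {
    std::string name;
    bool persistent;  // persistence flag added by extending the metamodel
};

struct ModelClass {
    std::string name;
    std::vector<ModelAttribute> attributes;
};

// Build the SQL text of the "Create" query for one class by iterating
// over its attributes. Returns an empty string if nothing persists.
std::string generateInsertSql(const ModelClass& cls) {
    std::string columns;
    std::string placeholders;
    for (const ModelAttribute& attr : cls.attributes) {
        if (!attr.persistent) continue;  // transient attributes are skipped
        if (!columns.empty()) {
            columns += ", ";
            placeholders += ", ";
        }
        columns += attr.name;
        placeholders += "?";
    }
    if (columns.empty()) return "";
    return "INSERT INTO " + cls.name + " (" + columns + ") VALUES (" +
           placeholders + ")";
}

// Outer loop of the generator: visit every class in the model and
// collect the statements for those that have persistent attributes.
std::vector<std::string> generateAll(const std::vector<ModelClass>& model) {
    std::vector<std::string> statements;
    for (const ModelClass& cls : model) {
        std::string sql = generateInsertSql(cls);
        if (!sql.empty()) statements.push_back(sql);
    }
    return statements;
}
```

A real generator would emit the Read, Update and Delete queries and the member function bodies in the same way, driven by the same two nested loops.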

Validating the Object Model

The object model is the most important intellectual capital that your development efforts will generate. As requirements change and as hardware comes and goes, the object model will provide continuity to your requirements and design.

As such, it needs to be accurate and up to date. It is obvious that if you are generating code from the static object model, the model must be valid, or you will not be able to build and run your application successfully. What is more difficult is keeping your model up to date when you've gone over to coding. The inevitable outcome is that your code evolves while the model remains fixed, and they drift out of synch with one another.

The issue of documentation going out of phase with code is not new. Typically, there is a failure in the procedures that are put in place to guarantee the validity of documentation. It is not unusual for developers to be under enormous pressure to get the code out the door on time, which results in the documentation work taking second place.

This is the most compelling reason to use round-trip CASE tools — if you always generate the code from the model, then it is impossible for the code and the model to go out of phase with one another. Unfortunately, for many small development organizations, these tools are prohibitively expensive.

To ensure that your model matches your database design and ultimately your code, you will want to create a checklist that you can use in design reviews. Here is a list of some of the things that you should check:

For each member function, each parameter in the signature is "reachable" from the class (there is an association path from this class to the class mentioned in the parameter)
Syntax restrictions on the names of classes, attributes etc. are followed
All of the member functions that are queries are static
All of the CRUD member functions are non-static
The characters used in the names in the model are all legal to the C++ compiler
The types of all attributes are legal
The lengths are all non-negative
The names of classes used in the interaction or sequence diagrams all correspond to classes in the static object model
Canonical methods — constructor, destructor, copy constructor and assignment operator — are in place
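Several of the checklist items above are mechanical enough to automate. The sketch below assumes a hypothetical, simplified view of the model (ClassSpec, AttributeSpec and FunctionSpec are invented for illustration) and covers four of the checks: legal names, non-negative lengths, static queries and non-static CRUD functions. It checks names only for legal identifier characters, not for C++ reserved words.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical, simplified view of the model; a real metamodel holds more.
enum class FunctionKind { Query, Crud, Other };

struct AttributeSpec {
    std::string name;
    long length;
};

struct FunctionSpec {
    std::string name;
    FunctionKind kind;
    bool isStatic;
};

struct ClassSpec {
    std::string name;
    std::vector<AttributeSpec> attributes;
    std::vector<FunctionSpec> functions;
};

// True if s uses only characters that are legal in a C++ identifier and
// does not start with a digit. Reserved words are not checked here.
bool isLegalIdentifier(const std::string& s) {
    if (s.empty() || std::isdigit(static_cast<unsigned char>(s[0]))) return false;
    for (unsigned char c : s) {
        if (!std::isalnum(c) && c != '_') return false;
    }
    return true;
}

// Returns one message per violated checklist item; empty means the class passed.
std::vector<std::string> validateClass(const ClassSpec& cls) {
    std::vector<std::string> problems;
    if (!isLegalIdentifier(cls.name))
        problems.push_back(cls.name + ": illegal class name");
    for (const AttributeSpec& a : cls.attributes) {
        if (!isLegalIdentifier(a.name))
            problems.push_back(cls.name + "." + a.name + ": illegal attribute name");
        if (a.length < 0)
            problems.push_back(cls.name + "." + a.name + ": negative length");
    }
    for (const FunctionSpec& f : cls.functions) {
        if (f.kind == FunctionKind::Query && !f.isStatic)
            problems.push_back(cls.name + "::" + f.name + ": query must be static");
        if (f.kind == FunctionKind::Crud && f.isStatic)
            problems.push_back(cls.name + "::" + f.name + ": CRUD function must be non-static");
    }
    return problems;
}
```

The remaining items, such as reachability of parameter types and consistency with the sequence diagrams, need association and diagram data that this simplified model does not carry.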

When you've validated your model against your code, you will then want to flag certain design characteristics as possibly representing design problems. These are not necessarily problems, but they merit further investigation:

List all classes that have no parents
List all classes that have no children
List all classes that have no queries
List all classes that have no associations
List all diamonds in the inheritance structure where the base class has data members
List all uses of multiple inheritance where more than one base class has data members
List all classes with more than 20 data members or more than 20 member functions
List all non-virtual functions that shadow virtual functions
List all virtual functions whose attributes differ from those in a parent class
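As an illustration of how one of these reports might be produced, the following sketch lists diamonds in the inheritance structure where the shared base class has data members. The Model map and ClassInfo record are hypothetical. The approach is a plain ancestor walk, offered as one possible way to do it: for each class, collect the ancestors reachable through each direct parent and report any ancestor reached through more than one of them.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model record: direct parents and number of data members.
struct ClassInfo {
    std::vector<std::string> parents;
    int dataMemberCount = 0;
};

using Model = std::map<std::string, ClassInfo>;

// Collect cls and all of its ancestors into out.
void collectAncestors(const Model& model, const std::string& cls,
                      std::set<std::string>& out) {
    if (!out.insert(cls).second) return;  // already visited
    auto it = model.find(cls);
    if (it == model.end()) return;        // class not in the model
    for (const std::string& parent : it->second.parents) {
        collectAncestors(model, parent, out);
    }
}

// Returns (derived, sharedBase) pairs where sharedBase has data members and
// is reachable from derived through more than one direct parent.
std::set<std::pair<std::string, std::string>> findDiamonds(const Model& model) {
    std::set<std::pair<std::string, std::string>> result;
    for (const auto& entry : model) {
        std::set<std::string> seen;  // ancestors reached via earlier parents
        for (const std::string& parent : entry.second.parents) {
            std::set<std::string> viaThisParent;
            collectAncestors(model, parent, viaThisParent);
            for (const std::string& ancestor : viaThisParent) {
                auto base = model.find(ancestor);
                bool hasData = base != model.end() &&
                               base->second.dataMemberCount > 0;
                if (hasData && seen.count(ancestor)) {
                    result.emplace(entry.first, ancestor);
                }
            }
            seen.insert(viaThisParent.begin(), viaThisParent.end());
        }
    }
    return result;
}
```

Each reported pair is only a candidate for review. Whether the shared base is a problem depends on how the inheritance is declared and used, which this sketch does not model.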

The Development Lifecycle

If only it were the case that we just define the static object model, generate the code and submit the bill. Instead we iterate. The requirements evolve, and this then changes the design. The design evolves as we learn more and as we write more. Changes in the static object model impact the code. Changes in the implementation of member functions feed back into the design.

The entire system churns and heaves. In a healthy system the overall movement is forward, wriggling up out of the primeval soup of design until it takes flight in working code. In an unhealthy system your project begins to feed upon itself, with diseased parts of the design infecting the implementation, until the system becomes a boiling writhing wretched mass of convoluted code.

Adding Operations

If a tool is used to generate code from the static object model, it usually has some way for you to take the files it outputs and add the implementations of the member functions that you write by hand. The goal is to allow you to work both through the tool and also manually, but at the same time to keep the code and the model in synch with one another.

Promulgating Model Changes to the Database

Synchronizing model changes with code changes is not too difficult if the code is generated from the model. Otherwise, it is a time-consuming error-prone manual process. That said, most of us do at least some of it by hand. It's not just that CASE tools which generate code are expensive, or that they never quite do it all — many of us have a deep-seated distrust of system-generated code.

This presents a problem: how do you unit test a change to a persistent class? You can't persist the changed class until the DDL has been changed to match it. If you change the DDL in a shared database, then no one else can use the database until their code matches the new layout. One solution is to use a hierarchical set of databases. One database is the project master. You also need a database for each development team and perhaps even one for each developer. That way, DDL can be generated and put into effect in synchronization with the changes in the model and the code, without affecting the other developers. As the code is unit tested, the changes to the DDL can percolate up through the database hierarchy, until they are available at the project master.

In a traditional development environment, the configuration management system allows another programmer to pick up a new copy of your module more or less when he chooses. There are always one or two back versions available, so he doesn't have to interrupt his development to integrate changes made by other developers. The freedom to choose when to pick up new versions of classes will be reduced by this interaction with the database layout.

You must also ensure that the data remains usable as the DDL changes. There are utilities that will copy tables, and some of these will allow you to define default values for newly created columns. Of course, if the new column corresponds to an OID, there may not be any good default value to use. This strategy works in the traditional database world, but is insufficient in the object world.

Rearranging the object hierarchy may change the nature of the relationships between the tables in the database. This happens, for example, any time more than one table is used to store all of the data attributes of an object, both those it directly defines and those it inherits from its ancestors. It would be difficult to automate these types of changes in a replication tool, even one with some scripting capabilities.

One solution is to write a database program that reads the old versions of the tables and writes the new version of the tables. Another solution is to retrieve all of the objects that are of the class that has changed, write them to their new object format, and then persist them. You would do this by defining a class that represents the old format and a class that represents the new format, give them separate names, and provide a conversion function as a member of the old class.
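The second approach can be sketched as follows. CustomerV1 and CustomerV2 are hypothetical names for the old and new formats of one class, and the schema change (splitting one name attribute into two) is invented purely for illustration. Retrieving the old objects and persisting the new ones are left to the persistence framework and are not shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// New format: the name is stored as two attributes (illustrative change).
struct CustomerV2 {
    long oid;
    std::string firstName;
    std::string lastName;
};

// Old format, kept under a separate name for the duration of the migration.
struct CustomerV1 {
    long oid;
    std::string fullName;

    // Conversion function provided as a member of the old class.
    // Splits at the first space; a name without a space becomes the last name.
    CustomerV2 toNewFormat() const {
        CustomerV2 converted;
        converted.oid = oid;  // the object identifier is carried over unchanged
        std::string::size_type space = fullName.find(' ');
        if (space == std::string::npos) {
            converted.lastName = fullName;
        } else {
            converted.firstName = fullName.substr(0, space);
            converted.lastName = fullName.substr(space + 1);
        }
        return converted;
    }
};

// Convert every retrieved old-format object; the caller then persists the results.
std::vector<CustomerV2> convertAll(const std::vector<CustomerV1>& oldObjects) {
    std::vector<CustomerV2> newObjects;
    newObjects.reserve(oldObjects.size());
    for (const CustomerV1& old : oldObjects) {
        newObjects.push_back(old.toNewFormat());
    }
    return newObjects;
}
```

Keeping the conversion on the old class means the new class carries no knowledge of its predecessor, so the old class and its conversion function can be deleted once the migration has run everywhere.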