The Aggregate Relationship - Combining Objects

There are many times when an object is composed of other objects, in whole or in part. For instance, our Invoice object is partially composed of the Customer object. Without the Customer object, our Invoice doesn't have enough information to be useful. The Customer is subordinate to the Invoice, but the Invoice can't exist without the Customer.

There are two main approaches that we can use to handle this situation. The first is simple aggregation; the second involves combining interfaces. We'll also look at a third technique, combining data, which is often a better fit in a client/server setting.

Our discussion will therefore run as follows:

We'll look at a simple aggregation example
We'll alter that example to use a combined interface technique
We'll investigate how one object might expose another object's data as a read-only property

Technique 1 - Simple Aggregation

The first approach we'll look at is simple aggregation. With this approach, we can simply expose the subordinate objects as properties of the top-level object. For instance, look at the following code:

Option Explicit

Private objCustomer As Customer

Private Sub Class_Initialize()
  Set objCustomer = New Customer
End Sub

Public Function Customer() As Customer
  Set Customer = objCustomer
End Function

All we've done here is create a private Customer object and allow the calling program to gain access to it through our Customer method. This is very simple to implement, and it gives the calling program full access to any of the objects we've aggregated into our top-level object.

There are, however, a couple of drawbacks to simple aggregation. The most obvious problem is that the calling program can do virtually anything to the aggregated object. In the example above, the calling program has full access to the Customer object, so it can change any properties or call any methods. In some cases this might be fine, but we'll often want greater control over what the calling program can do, so that we can validate changes and handle errors ourselves.
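To see the problem concretely, here's a minimal sketch of what a calling program could do. We're assuming that the class shown above is named Invoice and that Customer has a writable Name property; the procedure and variable names are purely for illustration.

```vb
' Hypothetical client code - assumes the class above is named Invoice
' and that Customer exposes a writable Name property
Public Sub Main()
  Dim objInvoice As Invoice

  Set objInvoice = New Invoice

  ' The assignment goes straight to the Customer object, so
  ' nothing in Invoice gets a chance to check the value
  objInvoice.Customer.Name = ""
End Sub
```

Because the call is made directly on the Customer object, the Invoice never sees the change and has no opportunity to reject it.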

When we're designing an object model, it's important to assume that everything will get misused or called incorrectly. Objects need to protect themselves, and if an object is composed of other objects then it needs to protect the subordinate objects as well.

Technique 2 - Combining Object Interfaces

Now that we've seen how easy it is to implement simple aggregation, let's look at an approach that gives us more control over how the client code can use the aggregated object.

Although this doesn't follow true object-oriented design, it's often a better idea to simply merge the properties and methods of all the objects into a single object interface. With this approach, we can still have the aggregated objects instantiated inside our top-level object, but we don't make them directly available to the calling program.

Consider the following code from the Invoice class:

Option Explicit

Private objCustomer As Customer
Private curTax As Currency

Private Sub Class_Initialize()
  Set objCustomer = New Customer
End Sub

Public Property Let Name(strValue As String)
  objCustomer.Name = strValue
End Property

Public Property Get Name() As String
  Name = objCustomer.Name
End Property

Public Property Get Tax() As Currency
  Tax = curTax
End Property

Note that this class is made up of both its own data and the Customer object; but rather than exposing the Customer object itself, this implementation has its own Name property that just calls the Customer object's Name property. This technique passes a lot of control to the Invoice object - at the expense of object-oriented philosophy.
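As a sketch of the kind of control this gives us, the Property Let in the Invoice class could check the value before passing it on to the Customer object. The rule that a name can't be empty, and the error number used here, are assumptions made purely for illustration.

```vb
' Possible replacement for the Property Let Name shown above.
' The "name is required" rule and the error number are assumptions.
Public Property Let Name(strValue As String)
  If Len(Trim$(strValue)) = 0 Then
    ' Reject the value before it ever reaches the Customer object
    Err.Raise vbObjectError + 1001, "Invoice", "A customer name is required"
  End If

  objCustomer.Name = strValue
End Property
```

The calling program can now trap this error with its usual error handling, and the aggregated Customer object is never given an invalid name through the Invoice.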

Of course, this approach also has some drawbacks. If an aggregated object's interface changes, perhaps by adding a new property, then the new interface elements aren't immediately available to client programs. In that situation, we would need to add the new elements to the top-level object's interface first. With simple aggregation, on the other hand, where we would expose the aggregated object as a property, this problem would never arise.
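For instance, suppose the Customer class gained a Phone property (a hypothetical addition, not part of our example so far). Client programs couldn't reach it through the Invoice until we added a matching pass-through property to the Invoice class, along these lines:

```vb
' Hypothetical addition to the Invoice class, needed only because
' we're assuming Customer has gained a new Phone property
Public Property Get Phone() As String
  Phone = objCustomer.Phone
End Property

Public Property Let Phone(strValue As String)
  objCustomer.Phone = strValue
End Property
```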

Technique 3 - Combining Data

As we've seen, simple aggregation and aggregation through combined interfaces are very good ways to solve the problem where one object relies on another for data or functionality. Unfortunately, neither solution is truly ideal in a client/server setting.

Let's diverge a bit from object theory, and look at a common scenario in business programming. Most of us work with data that is stored in relational tables. Even if we choose to represent that data as objects, such as Customer or Invoice, we'll often run into cases where the data needs to be consolidated (or aggregated) together across object boundaries.

With our current example, the Invoice object needs data from the Customer object to function - in our case, the Customer object's Name property. Essentially, we're trying to consolidate some of the Customer object's data into the Invoice. So far, we've done that using a couple of different techniques to implement aggregation.

Unfortunately, this can be very inefficient. In order to get the Name property from the Customer object, our Invoice object has to load the entire Customer object from the database. This means that we're loading two whole objects - one of them just to get a single property value. Most relational database engines can pull information together much faster than we can when we load individual objects and put them together ourselves.
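One way to avoid loading the second object, sketched here under some assumptions, is for the Invoice to hold the customer's name as part of its own data and expose it as a read-only property. The LoadData method and the CustomerName property are hypothetical names; we're assuming that some data access code fetches the invoice and the customer's name together in a single query and hands the values to the object.

```vb
Option Explicit

' Sketch of an Invoice class that carries the customer's name as data
' rather than aggregating a whole Customer object
Private strCustomerName As String
Private curTax As Currency

' Hypothetical method, called by data access code in the same project
' with values that came back from one joined query
Friend Sub LoadData(ByVal strName As String, ByVal curTaxAmount As Currency)
  strCustomerName = strName
  curTax = curTaxAmount
End Sub

' Read-only: there is no Property Let, so client code can't change it
Public Property Get CustomerName() As String
  CustomerName = strCustomerName
End Property

Public Property Get Tax() As Currency
  Tax = curTax
End Property
```

With no Property Let, the calling program can read the name but can't change the customer's data through the Invoice.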

Let's take an example. Here's a functional use case from the Rental Analysis requirements:

Consolidated Reporting

This use case covers the specific steps used to consolidate the rental counts and prices for each video.

The system must scan all invoice detail lines for each video. Each invoice for a video will count as a single rental, and so the system must count the invoices. For this analysis, the system must also calculate an average rental price, based on each invoice for the video. Prices may change over the life of the video, and so the analysis needs to use the average price.

Once the system has calculated a video's rental count and average price, it needs to provide these numbers - along with the name of the producing studio and the video's category for analysis. The studio and category are stored with the other video information.

We'll discuss two different ways of solving this business situation. First, let's look at how we might solve it using pure aggregation of objects; then, we'll look at an alternative solution that provides much better performance.

Object Method

Looking at this problem with a pure object focus, we can quickly pick out the LineItem and Video objects as containing the data we need. We'll add in a VideoData object, which will contain the business logic to accumulate all the data together. So far, we have the objects and properties shown in the table below.

Using objects, our VideoData object would contain the logic to load all the LineItem objects for each invoice, and accumulate a rental count and average price for each Video. It would also aggregate the Video object to get the Studio and Category data.
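The accumulation logic might look something like the following sketch. The Accumulate method, the private variable names, and the idea of passing the LineItem objects in a Collection are all assumptions for illustration; the point is that every LineItem must already have been loaded as an object before we can count it.

```vb
Option Explicit

' Sketch of part of a VideoData class - all names are assumptions
Private lngRentalCount As Long
Private curAveragePrice As Currency

' colLineItems is assumed to hold the LineItem objects for one video,
' each of them already loaded from the database
Public Sub Accumulate(colLineItems As Collection)
  Dim objItem As LineItem
  Dim curTotal As Currency

  lngRentalCount = 0
  curAveragePrice = 0

  ' Each line item counts as one rental
  For Each objItem In colLineItems
    lngRentalCount = lngRentalCount + 1
    curTotal = curTotal + objItem.Price
  Next objItem

  ' Avoid dividing by zero when the video has never been rented
  If lngRentalCount > 0 Then
    curAveragePrice = curTotal / lngRentalCount
  End If
End Sub

Public Property Get RentalCount() As Long
  RentalCount = lngRentalCount
End Property

Public Property Get AveragePrice() As Currency
  AveragePrice = curAveragePrice
End Property
```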

In Chapter 4, we'll discuss a couple of efficient techniques for loading and saving object data in a database. As good as these are, they don't compare to the performance we can get by processing a lot of data in the data tier.

Object	Properties
LineItem	VideoID, Price
Video	VideoID, Studio, Category
VideoData	VideoID, RentalCount, AveragePrice, Studio, Category

It's relatively inefficient for the VideoData object to load all the LineItem objects from the database. The rental count and average price can be calculated with a simple SQL statement in the database itself, and this would be much faster. With an extra JOIN, the SQL statement can pull in the Studio and Category data without even having to load a separate Video object.

Database Method

Following a database train of thought, we would have one type of object: VideoData. This object would be loaded directly from the database using a SQL statement to pull together all the data at once, very efficiently. The following is an example SQL statement that we could use in Microsoft Access to achieve this:

SELECT InvoiceDetail.VideoID,
  Count(InvoiceDetail.VideoID) AS CountOfVideoID,
  Avg(InvoiceDetail.Price) AS AvgOfPrice,
  Videos.Studio,
  Videos.Category
FROM InvoiceDetail INNER JOIN Videos
  ON InvoiceDetail.VideoID = Videos.VideoID
GROUP BY InvoiceDetail.VideoID, Videos.Studio, Videos.Category;

The following figure (overleaf) shows this query in Access.
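Once the query has run, a VideoData object could fill itself from a single row of the results. The following is only a sketch: it assumes that the project references Microsoft ActiveX Data Objects (ADO), that the Fetch method (a hypothetical name) is handed a Recordset positioned on the row for one video, that VideoID is a numeric field, and that Studio and Category never contain Null.

```vb
Option Explicit

' Sketch of a VideoData class loaded straight from the query results.
' Assumes a project reference to ADO; all names are assumptions.
Private lngVideoID As Long
Private lngRentalCount As Long
Private curAveragePrice As Currency
Private strStudio As String
Private strCategory As String

' rsData is assumed to be positioned on one row of the query above
Friend Sub Fetch(rsData As ADODB.Recordset)
  lngVideoID = rsData.Fields("VideoID").Value
  lngRentalCount = rsData.Fields("CountOfVideoID").Value
  curAveragePrice = rsData.Fields("AvgOfPrice").Value
  strStudio = rsData.Fields("Studio").Value
  strCategory = rsData.Fields("Category").Value
End Sub

Public Property Get RentalCount() As Long
  RentalCount = lngRentalCount
End Property

Public Property Get AveragePrice() As Currency
  AveragePrice = curAveragePrice
End Property

Public Property Get Studio() As String
  Studio = strStudio
End Property
```

Notice that no LineItem or Video objects are created at all; the counting, averaging, and joining have already been done by the database.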

Of course, there are some drawbacks to building objects this way rather than through regular aggregation. Suppose the LineItem object had business logic behind the Price property. By pulling the price directly from the database, we will have bypassed that logic. This might force us to replicate the code inside the VideoData object. In that case, we'd want to think hard about which solution is the most appropriate.