OLAP Services: Semiadditive Measures and Inventory Snapshots

Amir Netz
Microsoft Corporation

April 1, 1999

Contents

Introduction
The Inventory Problem
Average over Time
Opening and Closing Balances
Period Comparisons: Consecutive and Parallel
Minimum and Maximum Values in a Time Span
Relative Contribution to Total Value
Avoiding the Display of Incorrect Data
Conclusion
For More Information

Today's organizations rely on highly sophisticated and complex analysis of the information stored in their databases. In the past, a uniform and easy calculation was common—for example, one that tracks the movement of products to and from a single warehouse. Now, database analysis in business domains, such as securities, account balances, budgets, and insurance policies and claims, is semiadditive—straightforward with some factors and variable with others, notably when tracking data over time. For instance, an organization might first request monthly quantities for specific products, and then demand quarterly totals consolidated across warehouses and subsidiaries. The calculated members in Microsoft® SQL Server™ 7.0 OLAP Services provide the capabilities to implement the most demanding semiadditive measures. Focusing on a classic inventory problem, this article describes the implementation techniques of semiadditive measures in online analytical processing (OLAP) Services.

Introduction

All of the base measures in Microsoft® SQL Server™ 7.0 OLAP Services are fully additive and include aggregation types of SUM, MIN, MAX, and COUNT. These base measures are:

Uniformity

In the classic case of Fact tables that contain transactions, such as sales, uniform aggregation is not problematic. The sum of the amounts and quantities will sum across all of the dimensions.

However, for other applications, uniform aggregation across all dimensions is not desirable. For example, in an inventory application, instead of storing the inventory transactions (such as arrival and departure of products from and to the warehouse), a snapshot of the inventory is taken each month. The following illustration shows a simplified, typical record.

Product
Warehouse
Time
Quantity
Stock Value

A three-dimensional (Product, Warehouse, and Time) cube can be created with two measures (Quantity and Stock Value.) The measures can be summed up across products (for example, the stock value of Product A and Product B is the sum of the stock value of Product A with the stock value of Product B) and across warehouses (for example, the total quantity of Product A is the sum of its quantities in each warehouse). However, the measures cannot be summed across a time period.

Summing the quantities across a time period will yield incorrect results. If, in January, the quantity in stock for Product A was 10, in February, the quantity was 20, and in March, the quantity was 15, the overall quantity in stock for the first quarter is not 45. The quantity for that quarter may be the average of the monthly quantities, the closing balance (the quantity of March), or the opening balance (the quantity of January), but certainly not the sum of all months.

These kinds of measures, which are additive on some of the dimensions and not on other dimensions, are called semiadditive measures. Semiadditive measures are very common in many business scenarios beyond simple inventory maintenance. Every cube based on snapshots of balances over time shares this problem. You will find these snapshots in applications dealing with securities, account balances, budgeting, human resources, insurance policies and claims, and many other business domains.

Financial applications can have even more demanding semiadditive measures. Consolidation of accounts balances and other financial results may sometimes involve more complex expressions over the Accounts dimension. For example, consolidating the financial results of an enterprise that has several subsidiaries (each with a different percentage holding position) is a complex matter that requires some advanced techniques.

Although the basic measures in OLAP Services do not provide direct support for semiadditive measures, calculated members provide the capabilities to implement even the most sophisticated and complex semiadditive measures.

This article provides information about the implementation techniques of semiadditive measures in the OLAP Services, focusing on the classic inventory class problem. It is assumed that the reader has good understanding of online analytical processing (OLAP) in general, OLAP Services in particular, and the multidimensional expressions (MDX) language.

The Inventory Problem

The inventory problem is one of the most typical semiadditive measure problems. The inventory represents a whole class of semiadditive measure problems that share a common attribute: keeping track of data snapshots. Snapshots are typical in applications in which balances are involved, such as account balances, stock tickers, head count tracking, contact management, and active insurance policies management. Data snapshots (as opposed to pure transactions) disable the ability to aggregate data over time correctly. Although this article refers to the inventory problem, remember that the applicability of the proposed techniques is broad and can be adapted without change to a wide variety of business problems.

Inventory Example

The inventory cube has this structure:

As explained earlier, summing the quantities and values over the time dimension yields incorrect results. These are the items that must be analyzed:

End users should not see any data generated by summing over the time dimension.

Average over Time

To calculate the average of the quantities and values for a given period, you must divide the total monthly quantities and values by the number of months contained in the given period of time.

The calculated member [Measures].[Average Monthly Quantity] is defined as:

([Time].CurrentMember, [Measures].[Quantity]) / 
      Count(Descendants([Time].CurrentMember, [Months]))

The expression indicates that the total quantity in the current time period will be divided by the number of months that are descendants of (that is, contained in) this time period. Because the current member in a dimension is the default member, the expression can be simplified to:

[Measures].[Quantity] / 
            Count(Descendants([Time].CurrentMember, [Months]))

Based on the same principle, [Measures].[Average Monthly Stock Value] is defined as:

[Measures].[Value] / 
            Count(Descendants([Time].CurrentMember, [Months]))

In some cases, the preceding expressions may be simplistic. For example, if a new item is introduced to the system in February, the expression calculates the average monthly stock value of the new item by summing the quantities in the months January through March and then divides the sum by three. The results may be misleading. Included in the calculation is the month of January, in which the item did not exist at all. Therefore, the average appears lower than it actually was.

To calculate the average only for the months in which the product actually existed in the system, the counting clause in the expression should be modified, for example:

Count(CrossJoin({[Measures].[Quantity]},
     Descendants([Time].CurrentMember, [Months]), ExcludeEmpty)

This expression will help ensure that only months that are not empty (Quantity<>NULL) are counted.

The performance of the preceding expression will be much poorer than the simple count described in the previous section. In this expression, the ExcludeEmpty() function is used, and the cell values of the quantities per month have to be evaluated. When the simpler Count(Descendants(… function is used, the metadata of the dimension structure is enough to evaluate the denominator.

Opening and Closing Balances

In many cases, getting the opening balance and especially the closing balance is the most basic piece of information the cube must provide. For example, the closing balance of a customer account shows the current state of the account, and the closing balance of the inventory quantity shows the quantity presently in the warehouse.

To get the opening balance of period, you must get the value of the first month contained in the given time period. For example, the opening quantity balance for 1997 is the opening quantity balance for January 1997. The opening quantity balance for 1997, Quarter 2, is the opening quantity balance for April 1997.

Similarly, to get the closing balance for a given period, you must get the value of the last leaf member (that is, a member of the last level in the hierarchy) contained in the given time period. For example, the closing balance for 1997 is December 1997 and for 1997, Quarter 3, it is September 1997.

To get the first month contained in the current time period, you can use:

Head(Descendants([Time].CurrentMember, [Month]),1)

For the last month contained in the current time period, use:

Tail(Descendants([Time].CurrentMember, [Month]),1)

MDX provides a more intuitive and effective shorthand syntax for these two expressions:

OpeningPeriod( [Month], [Time].CurrentMember ) 

and

ClosingPeriod( [Month], [Time].CurrentMember)

The syntax for OpeningPeriod() and ClosingPeriod() allows for an even shorter syntax form. The second parameter can be omitted if you are seeking the opening or closing period of the current member of the time dimension.

Thus, the following four calculated members should be declared to obtain the opening and closing balances:

Measures. [Opening Quantity]: 
            (Measures.[Quantity], OpeningPeriod([Month]))
Measures. [Closing Quantity]: 
            (Measures.[Quantity], ClosingPeriod([Month]))
Measures.[Opening Value]: 
            (Measures.[Value], OpeningPeriod([Month]))
Measures.[Closing Value]: 
            (Measures.[Value], ClosingPeriod([Month]))

A common real-world problem can arise during implementation of the closing balance expressions. In many implementations, the Time dimension is defined with future time period members already contained in the dimension. The ClosingPeriod([Month]) function does not analyze future and past time periods. It only traverses the members' hierarchy tree to find the last leaf descendant under the given member.

For example, take the example of implementing a closing balance expression with a current date of mid-October 1998. Asking for the closing balance for 1998 arguably should provide the data from the last snapshot of the year, or the October snapshot. However, the ClosingPeriod([Month]) function returns December 1998 as the closing period of 1998. Because no snapshot exists for December 1998, the closing balances return NULL.

This more sophisticated expression solves the problem:

Measures.[Last Non Empty Value]: 
               IIf(IsEmpty((Measures.[Value], Time.CurrentMember) ,
                  (Measures.[Last Non Empty                             Value],Time.CurrentMember.PrevMember),                           Measures.[Value])
Measures.[Closing Value]: (Measures.[Last Non Empty Value], ClosingPeriod([Month]))

In this example, the [Last Non Empty Value] measure has a recursive expression that checks to see whether the value of the current cell is empty. If it is empty, the expression moves back to the previous period on the time dimension and checks the [Last Non Empty Value] of the previous period. The function continues to go back in time until a nonempty value is found. The [Closing Value] measure can then use the [Last Non Empty Value]. This recursive behavior helps ensure that, for the last year, the values of the last snapshot are returned. The less common CoalesceEmpty() function performs this exercise more efficiently. This function is equivalent to the expression demonstrated earlier:

Measures.[Last Non Empty Value]: 
   CoalesceEmpty((Measures.[Value], Time.CurrentMember ) ,
   (Measures.[Last Non Empty Value],Time.CurrentMember.PrevMember))

Another useful way to get the right closing period is to provide the relevant dates as properties of the leaf node of the time dimension properties and deal with the future periods by using date comparison functions and string functions, for example:

Measures.[Closing Value]:
   IIf(CDate(ClosingPeriod().Properties("Closing Date"))>Now()),
   (Measures.[Value],StrToTuple(Format(Now(),"\[yyyy\]\.\[\Qq\]\.\[mmmm\]"))),
   (Measures.[Value], ClosingPeriod([Month])))

This expression extensively uses Microsoft Visual Basic® for Applications functions to manage dates and strings. (Remember that the Visual Basic for Applications function library is registered automatically for use in MDX expressions.) The function checks to see if the value of the user-defined property "Closing Date" shows a time period in the future (greater than Now()). If so, the function constructs the member name of the last snapshot time period by using the Visual Basic for Applications Format() function with the Visual Basic for Applications Now() function. Then, the string is converted to a tuple to return the [Closing Value].

Period Comparisons: Consecutive and Parallel

The comparison of inventory levels between time periods provides additional useful information. Periods that are usually compared are consecutive periods (for example, 1998, Q3 versus 1998, Q2) and parallel periods (for example, 1998, Q3 versus 1997, Q3). To calculate the change in the inventory levels between the periods is one of the most useful ways to compare periods.

To calculate the change between consecutive periods, you can use the PreviousMember property, for example:

Measures.[Quantity Change from Previous Period]:
               (Measures.[Quantity], Time.CurrentMember) – 
            (Measures.[Quantity], Time.CurrentMember.PreviousMember)

The expression subtracts the quantity of the previous time period from the quantity of the current time period.

To calculate the change between parallel periods, MDX provides the ParallelPeriod() function. The definition of a "parallel period" may vary depending on the business problem. For example, a parallel period may be the same day in the previous month, or it may be the same day in the previous year. Using the ParallelPeriod() function, you can specify the time span of the parallel period, for example:

Measures.[Quantity Change from Last Year]:
               (Measures.[Quantity], Time.CurrentMember) – 
            (Measures.[Quantity], ParallelPeriod(Time.Year))

Minimum and Maximum Values in a Time Span

Knowing the highest and lowest inventory values within a time period is often useful.

The MIN and MAX aggregate functions do not provide the desired results because they are applied at the transaction level. A MAX measure applied to the Stock Value column of the Inventory table will return the maximum value of the column in the Fact table (for the given time period). This means that it will provide the stock value from a single row in the Inventory Snapshot table, and it will be the stock value of a single item in a single warehouse in a single month.

However, this information may not be as useful as the maximum total inventory value of a day within a year. For example, the question "What was the maximum value of the inventory held on a single day in the last year?" cannot be answered correctly using a measure aggregated with the MAX function.

To apply MIN and MAX calculations on aggregated values, use the MDX functions of MIN and MAX (the same principle applies to AVG, STD, VAR, and the other aggregate functions.)

This expression provides the maximum aggregated inventory value within the current period:

Measures.[Maximum Inventory Value]:
      Max(Descendants(Time.CurrentMember, Time.Month), Measures.[Value])

The MAX function scans all months under the current time period (for example, all of the months in 1998) and returns the value of the month that contains the greatest inventory value. A similar expression can be constructed using a MIN function.

Relative Contribution to Total Value

Seeing the relative contribution of an item or a warehouse as a percentage to total inventory value is another common requirement. In general, to measure the contribution of a cell to the grand total, divide the current cell value by the cell that represents the grand total of the cube, for example:

Measures.[Value Contribution to Total]:
            Measures.[Value] / 
         (Measures.[Value], Time.[All Time], Products.[All Products],…)

This expression divides the current value with the value of the cell that contains the grand total.

In cases in which there are semiadditive measures, care must be taken not to divide by the absolute grand total, because the grand total contains results of summing across the time dimension. In the inventory sample, the grand total cell contains the aggregate, over time, of inventory snapshot values. As shown in the "Introduction" of this article, this value is incorrect.

The solution is simple: Avoid asking for the total over the time dimension. For example, in the inventory cube, the expression should look like the following:

Measures.[Value Contribution to Total]:
            Measures.[Value] / 
  (Measures.[Value], Products.[All Products],Warehouses.[All Warehouses])

Note   When the current time period is not a leaf member (a month in the example), the contribution calculated is the average contribution during the time period. For example, over a year, the expression divides the 12-month sum of the stock value of the current product in the current warehouse by the 12-month sum of the overall stock value of all products in all of the warehouses. The expression is exactly the average contribution during that 12-month period.

In many cases, the most interesting analysis on relative contribution occurs when a cell value is compared to the total only on a single dimension. For example: What percentage was product A of the stock value of all products in warehouse X during December 1998? This expression calculates the relative contribution only on the product's dimension:

Measures.[Product Contribution to Total Products]:
               (Measures.[Value] , Products.CurrentMember) / 
            (Measures.[Value], Products.[All Products])

Avoiding the Display of Incorrect Data

The basic measures that calculate the sum of the stock value and quantity contain incorrect data because these measures sum the values over the time dimension. Although by themselves these measures contain incorrect data, they are useful. This article demonstrates how to create calculated members using MDX expressions to derive the correct data from the base measures.

These basic measures should be considered as interim results, as they are a calculation step on the way to the final answer. However, by themselves, they are meaningless to the end user. Because these are only interim results, these measures should not be exposed to end-users. OLAP Services provides a dedicated property for basic measures, IsInternal, to hide the measures from the client application.

An internal measure can be used inside an expression of a calculated member, but will act as a hidden member as far as the client application is concerned. End users cannot select these hidden measures for display.

To hide a measure, use the Advanced tab of the Property pane in the Cube editor of the OLAP Manager.

Conclusion

In OLAP Services, calculated members can be used to work effectively with semiadditive measures. This article explored different scenarios in which you may be working with semiadditive measures. This article also provided examples of ways to use calculated members to address the problems that can arise in these scenarios.

For More Information

For more information about MDX, see Microsoft SQL Server OLAP Services online Help.

For more information about the inventory problem, see Ralph Kimball's book The Data Warehouse Toolkit, ISBN: 0471153370.