Data Analysis Solutions with Pivot Tables in Microsoft Excel

Presented by: Steve Harshbarger

Steve Harshbarger is a director at Micro Modeling Associates (MMA), Inc., a Solution Provider Partner developing custom business applications for the Fortune 1000 using Microsoft technology. He currently manages MMA's Washington, DC practice, and will be heading up the new MMA San Francisco office in mid-1997. Steve is co-author of Microsoft Excel 97 Developer's Handbook (Microsoft Press, 1997), Official Microsoft Intranet Solutions with Office 97 and FrontPage 97 (Microsoft Press, 1997), and Excel Professional Techniques (Que, 1994). He is a frequent speaker at Microsoft-sponsored events including Tech-Ed, VBITS, Developer Days and various product launches. Steve is also a Certified Public Accountant which gives him a unique perspective on business solutions. Prior to joining Micro Modeling, Steve was a senior consultant at KPMG Peat Marwick in Washington, DC.

Phone: (301)657-1385

Fax: (301)657-0025

Email: harshbargers@micromodeling.com

Introduction

Microsoft® Excel 97 pivot tables are a powerful data analysis and reporting tool. Through Microsoft® Visual Basic® code, the full power of pivot tables can be realized. This paper assumes you are familiar with pivot tables from and end user point of view, and understand Visual Basic development and the Microsoft Excel object model in general. The following specific topics will be covered:

The Pivot Table Object Model

Pivot tables are manipulated by using several objects in Microsoft Excel’s object model. The first major concept to grasp is the difference between the PivotTable and PivotCache. To understand this, we need to understand how Microsoft Excel manages pivot tables under the hood. It works like this:

The following diagram shows the relationship between the PivotCache and PivotTable objects. A particular PivotCache can be accessed through the PivotCaches collection of the workbook object, or as a property of the PivotTable which is derived from it. Note that is possible for many pivot tables to share the same cache. Hence, we have the ability to access PivotCaches through a collection independent from the PivotTable object.

The PivotTable object itself contains a hierarchy of related objects which represent the structure of the table. The PivotField object represents “columns” of data from the original data source, which may be displayed in a number of orientation in the pivot table. The PivotItem object represents discreet values in a given PivotField.

Creating a PivotTable in Code

To create a pivot table in code, call the PivotTableWizard method of the worksheet object. This method will build the PivotCache and display the PivotTable in worksheet. This method has the following major arguments (see the Microsoft Excel help file for a complete listing of additional arguments):

SourceType

SourceType determines what type of data will feed the pivot table. Choose one of the following constants:

xlConsolidation Multiple consolidation ranges

xlDatabase Microsoft Excel list or database

xlExternal Data from another application

xlPivotTable Same source as another PivotTable

SourceData

SourceData can be one of the following:

TableDestination

TableDestination is the range object indicating where to place the pivot table.

TableName

TableName is a string specifying a name for the new pivot table.

BackgroundQuery

BackgroundQuery determines if the query used to build the pivot table will be performed asynchronously. In general, you should set this to false. If you do not, your Visual Basic code will continue running before the pivot table is built and may lead to unexpected results.

RowGrand and ColumnGrand

Set RowGrand and ColumnGrand to True to create row and column grand totals.

HasAutoFormat

Set HasAutoFormat to True to have Microsoft Excel automatically format the PivotTable when it is refreshed or when fields are moved.

The PivotTable Object

The PivotTable object represents the visible pivot table as a whole, and can be used to manipulate the pivot table. It is accessed through the PivotTables collection of the worksheet object like this:

Worksheets(1).PivotTables(“MyPivotTable”)

Controlling the Display

The following properties control overall display characteristics of the pivot table:

RowGrand and ColumnGrand

Set RowGrand and ColumnGrand to True to display row and column grand totals.

NullString & DisplayNullString

NullString is a value to display in the pivot table in place of null values in the underlying data. If DisplayNullString is set to True, the NullString value will be used; if False, it will not.

ErrorString & DisplayErrorString

ErrorString is a value to display in the pivot table in place of error values in the underlying data. If DisplayErrorString is set to True, the ErrorString value will be used; if False, it will not.

HasAutoFormat

Set to True to have Microsoft Excel automatically format the pivot table when it is refreshed or when fields are moved.

SelectionMode

SelectionMode returns or sets the pivot table structured selection mode. Can be one of the following XlPTSelectionMode constants: xlLabelOnly, xlDataAndLabel, or xlDataOnly

PageFieldOrder, PageWrapCount

PageFieldOrder sets the order in which page fields are displayed in the pivot table (xlDownThenOver or xlOverThenDown). PageWrapCount determines how many page fields to display before a new column (or row) of page fields is started.

Referencing Ranges

The following properties return Range objects representing subsets of the visible pivot table:

ColumnRange

ColumnRange represents the column area of the pivot table.

RowRange

RowRange represents the row area of the pivot table.

PageRange

PageRange represents the page area of the pivot table.

DataBodyRange

DataBodyRange represents the data area of the pivot table.

DataLabelRange

DataLabelRange represents the area containing labels for the data area of the pivot table.

TableRange1

TableRange1 represents the entire pivot table, but does not include page fields. By using the SetSourceData method of a chart to connect the chart to this range, the chart will always display the current data shown in the pivot table.

TableRange2

TableRange2 represents the entire pivot table, including the page fields.

Protecting the PivotTable

There are a number of properties which you can use to control what the user can do to the pivot table at run time.

EnableDrillDown

Set EnableDrillDown to False to prevent the user from using the drilldown feature.

EnableFieldDialog

Set EnableFieldDialog to False to prevent the user from displaying the pivot table field dialog which can be used to reformat, reposition, and otherwise change fields.

EnableWizard

Set EnableWizard to False to prevent the user from displaying the pivot table Wizard and changing its structure.

The PivotCache Object

As discussed above, the PivotCache object represents the cached underlying data stored in RAM. This data is independent of the visible pivot table. It can be accessed as a property of the PivotTable which is derived from it:

PivotTables(“My PivotTable”).PivotCache

Or from the PivotTables collection of the workbook object:

ActiveWorkbook.PivotCaches(1)

Refreshing the PivotCache

You can refresh the PivotCache (i.e., requery the underlying data and rebuilt the cache) in two ways. Do it indirectly by calling the RefreshTable method of the PivotTable object, which refreshes the underlying cache. Or do it directly by calling the Refresh method of the PivotCache object.

PivotCache Informational Properties

The following are properties of the PivotCache which return interesting information about the size and state of the cache:

Protecting the PivotCache

You can set the EnableRefresh property of the PivotCache to False to prevent the user from refreshing the cache.

The PivotField Object

The PivotField object represents fields in the pivot table which correspond to columns in the underlying data set. To reference PivotFields in code, you access them through the PivotFields property of the PivotTable object:

PivotTables(“My PivotTable”).PivotFields(“Region”)

PivotField Orientation

PivotFields can be oriented in a number of positions on the pivot table (hence the name “pivot” table-you are literally pivoting columns of data into other positions). The possible orientations are:

Orientation property

You can change the orientation of any field by setting its Orientation property to any of the constants described above. For example:

With PivotTables(1)
.PivotFields(“Region”).Orientation = xlPageField
.PivotFields(“Quarter”).Orientation = xlRowField
.PivotFields(“Product”).Orientation = xlHidden
End With

AddFields method

You can also use the AddFields method of the PivotTable object to place multiple PivotFields in the page, row, and column positions in one step.

Other collections of PivotFields

The PivotTable object also provides properties representing subsets of the entire PivotFields collection. These are:

Useful PivotField Properties

DataRange

DataRange returns a Range object representing the values in a PivotField.

CurrentPage

For PivotFields in the page position, CurrentPage returns or sets the value currently displayed in the combo box.

Subtotals

For PivotFields in the row or column position, Subtotals determines if subtotals should be show (set to True or False).

NumberFormat

For PivotFields in the data position, NumberFormat determines the formatting to be applied to numbers.

Function

For PivotFields in the data position, determines the type of function used to summarize the data. Function can be one of the following:

Calculation

For PivotFields in the data position, determines the type of calculation done by the specified pivot field. Calculation can be one of the following:

Grouping Data

You can further summarize data in a pivot table by grouping values of PivotFields in the page, column, or row position. For date/time data, Microsoft Excel offers automatic grouping into useful time periods like months and quarters. You can also manually group any set of items into your own groups.

Grouping in accomplished by calling the Group method of a Range object. In the case of pivot tables, the range object to use is the DataRange property of the PivotField.

Protecting PivotFields

There are a number of properties which you can use to control what the user can do to a PivotField at run time.

DragToColumn

If set to False, the PivotField cannot be placed into the column orientation.

DragToRow

If set to False, the PivotField cannot be placed into the row orientation.

DragToPage

If set to False, the PivotField cannot be placed into the page orientation.

DragToHide

If set to False, the PivotField cannot be removed from the pivot table

EnableDrillDown

Set EnableDrillDown to False to prevent the user from using the drilldown feature on the PivotField. This overrides the “global” EnableDrillDown property at the PivotTable object level.

EnableFieldDialog

Set EnableFieldDialog to False to prevent the user from displaying the pivot table field dialog which can be used to reformat, reposition, and otherwise change fields on the PivotField. This overrides the “global” EnableFieldDialog property at the PivotTable object level.

PivotItems

If PivotFields represent fields in the pivot table, PivotItems represented individual values in a PivotField. To reference PivotItemss in code, you access them through the PivotItems property of the PivotField object:

PivotTables(1).PivotFields(“Region”).PivotItems(1)

Useful PivotItem Properties

Visible

Visible determines of the PivotItem is visible. Set to False to hide.

Value & Name

These properties are synonymous, and return or set the displayed value of the item. Note that is possible to change the value/name and Microsoft Excel will remember it until the PivotCache is refreshed and the pivot table rebuilt.

SourceName

This is the original value of the item prior to any changes to the Value/Name property in code.

Position

The ordinal position of the item. Can be changed to reorder items.

Other Collections of PivotFields

The PivotField object also provides properties representing subsets of the entire PivotItems collection. These are:

Calculations in Pivot Tables

You can derive entirely new PivotFields in a pivot table by defining a calculated PivotField. The calculated PivotField is defined by a formula which can reference among other things, existing PivotFields. To do this, you use the Add method of the CalculatedItems collection, which is accessed through the PivotTable object. The Add method takes two arguments:

For example, to derive a PivotField which calculated net income from revenue and expense, you would use the following syntax:

PivotTables(1).CalculatedItems.Add “NetIncome”, _
“=Revenue – Expense”

Once this has been done, the new PivotField can be placed in any orientation just as if it was a normal PivotField.

(Note: You can also create calculated PivotItems in a similar way. Refer to the online help for examples of this.)

Managing Large Datasets

Since pivot tables store all the underlying data in RAM in the PivotCache, performance can degrade with very large data sets. There are two techniques available to manage large data sets.

Server-Based Page Fields

Pivot tables have a new feature called Server-Based Page fields which can be used to manage memory when dealing with large data sets. For PivotFields in the page position, setting the ServedBased property to True will cause the pivot table to only query data into its cache that matched the currently displayed value in the page field. Thus, each time the selection is changed in the page field, the query is re-executed and the cache rebuilt. For large data sets, this can help manage the size of the cache. Of course, for small data sets, this will probably slow performance since it incurs a database hit every time the selection is changed. Having this property gives you the option of tuning the pivot table cache according to the situation. Note that when a field is server based, the “(all)” selection is not available since the whole data set is never in memory at once.

Limiting Data Read into the PivotCache

Through thoughtful construction of the SQL statement that is used to build the PivotCache itself, you can limit the number of records read into RAM. For example, if you only need to analyze data for a particular region or for sales over a certain amount, restrict the data being read by using a WHERE clause on your original query.

Suggested Reading

For more information on Microsoft Excel 97 development, the Microsoft Excel 97 Developer’s Handbook by Eric Wells and Steve Harshbarger available from Microsoft Press.

For more information on developing with Microsoft Excel 97, Office 97, and other Microsoft tools for Intranet applications, try Official Microsoft Intranet Solutions by Micro Modeling Associates, Inc., also available from Microsoft Press.

© 1997 Microsoft Corporation. All rights reserved.

The information contained in this document represents the current view of Microsoft Corporation on the issues discussed as of the date of publication. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information presented after the date of publication.

This document is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.

Microsoft and Visual Basic are registered trademarks of Microsoft Corporation.

Other product or company names mentioned herein may be the trademarks of their respective owners.