Interacting with Microsoft Excel: A Case Study in Automation

Ruediger R. Asche
Microsoft Developer Network Technology Group

January 31, 1996

Click to view or copy the files in the THRDPERF project that accompanies this technical article.

Click to view or copy the XLCLIENT project files.

Click to view or copy the XLAUTO project files.

Abstract

Following up on the technical article "Win32 Multithreading Performance," this article discusses how Automation is used in the THRDPERF project to sample multithreading data using Microsoft® Excel.

Introduction

Automation? Sounds weird to a hard-core systems person like me—I'd rather booger down in the guts of an operating system than talk to lousy end-user applications like spreadsheets or word processors. Ditch it.

That was me a while ago. Then again, I kept away from the Microsoft® Foundation Class Library (MFC) for the longest time, until I discovered that MFC is a great tool that lets you focus on the essentials of an application instead of the user interface. Ever since I became a convert, I haven't used anything but MFC to write my applications.

My most recent MFC application tests multithreading performance on the Win32® application programming interface (API) (see "Win32 Multithreading Performance" for details). My application generates a whole lot of numbers that are completely incomprehensible unless visualized by some chart generator.

Oops, there it was. I couldn't escape Microsoft Excel anymore—after all, the last thing I wanted to do was to write my own chart software. The next problem then was how to stuff the numbers into Microsoft Excel? I could have gone the good old way and typed everything into a Microsoft Excel sheet by hand, but we are talking major numbers of numbers here, and repeated test runs. So there was no way around using Automation with Microsoft Excel, and I rolled up my sleeves and prepared for some gory work.

As was the case with my MFC conversion, I quickly learned to overcome my prejudices. Automation is a breeze to use from an MFC application. This article guides you through the steps you have to take to talk to Microsoft Excel from your MFC application. The first section covers the basics of Automation; the same steps also apply to communications with any Automation server other than OLE. In the second section I talk about Microsoft Excel objects and how to use them.

Note that there is already a lot of material available in Visual Basic® documentation about using Automation with Microsoft Excel. I do not want to add unnecessary noise, so I will keep the discussion about Microsoft Excel objects short. If you need to know more about how to use the objects, please refer to the material that is out there. One of the interesting things about Automation is that it is really language-independent, so it is sufficient for you to learn just enough Visual Basic to read object access code, and then you can simply transcribe that code to MFC.

Microsoft Product Support Services (PPS) has provided two sample application projects that demonstrate Automation, XLCLIENT and XLAUTO. The information in these samples complements the information in this article and in my own sample application, THRDPERF.

Automation from Visual C++

There are a couple of steps you need to perform in order to make sure that Automation is available for your application to use:

  1. Install the Automation server properly. Properly means that the system registry must be updated to contain the correct entries in the HKEY_CLASSES_ROOT subtree. Normally, the installation program takes care of this step, but at times your software may be installed without the necessary changes having been made. For example, I have a system that boots both Microsoft Windows NT™ and Windows® 95, and I installed Microsoft Excel under Windows NT. When I shut down and restarted the machine, Windows 95, of course, didn't have any idea about Microsoft Excel Automation, although Microsoft Excel worked just fine. I ended up reinstalling Microsoft Excel into the same directory, but there are ways to import subtrees into the registry, and some registration programs offer separate options to register Automation objects.

  2. Your MFC application should call the AfxOLEInit function to initialize OLE properly. When you generate your application from ClassWizard, this code should already be included in your project when you ask for OLE support.

Beginning from here, there are several ways to access Automation. You can either do it the hard way, that is, use OLE without Visual C++® wrappers (which requires that you know quite a bit about class IDs, class factories, dispatch objects, and the whole scary bit), or you can use type libraries, which make the job unbelievably easy. A type library is a little bit like a dynamic header file in that a type library contains information about the objects that are supported by the corresponding Automation server: the methods the objects support, the parameter lists, the values of symbolic constants, and so on. The ClassWizard from Visual C++ uses type libraries to generate C++ classes that encapsulate the OLE objects. Let's see how this works.

When you have created your application from AppWizard, you can call ClassWizard and add a class from a type library (you get there by clicking the Add Class button). Select the type library that goes with Microsoft Excel—in the version I used, this is XL5EN32.tlb, which normally resides in the directory that Microsoft Excel was installed in. Now ClassWizard will provide you a list of the objects that Microsoft Excel supports as an Automation server. Choose the objects you need, and ClassWizard will generate a prototype and implementation file for all the objects you chose. Now all that's left for you to do is to include the header file(s) that ClassWizard generated into the files of your projects that need to work with the Automation objects, and you're ready to rock 'n roll. Of course, now you are stuck with the problem of how to use the objects, and that's what we will look at next.

Using Microsoft Excel Objects

Several samples show how to control Microsoft Excel from a Visual C++ application. The XLCLIENT sample provided by Microsoft PSS is fairly elementary (just like my code) in that only a few data items are pumped into a Microsoft Excel sheet. XLCLIENT comes in both flavors, using type libraries or addressing OLE directly. The XLAUTO sample does use type libraries, but it is fairly complex in that a lot of objects are used, and the Microsoft Excel sheet is embedded in the application's client area.

In the code in my sample, THRDPERF, I wanted to do only the minimal work necessary to stuff some data into Microsoft Excel. Note that Automation is very powerful; I could have used Automation not only to pump the data into Microsoft Excel, but also to generate and format the charts, render the charts in the application window, execute macros, or even call a Win32 API routine from inside Microsoft Excel. I was content with having the data in a Microsoft Excel sheet though, so I used macros that I ran from within Microsoft Excel and a little manual work to do the rest.

Thus, my sample is about as simple as you can get. Using type libraries, using only the objects you minimally need, and using no visual embedding makes the code that communicates with Microsoft Excel very easy to work with.

Note that Automation servers implement "programming languages" of their own. That means that there is no predefined notion of what an object does and how the objects interact. This was kind of confusing to me because I knew how to access Microsoft Excel, and I had the names of the objects and their respective methods, but I didn't really have a clue about how to use the objects. Would I have to create a Books or Application object to be able to stuff data in a sheet, or is a Worksheet object enough? How would I have to use the methods? Are there any interactions between the methods, or are there side effects that calling one particular method might have on other methods? How would I access properties?

Fortunately, the Microsoft Excel object model is fairly intuitive, and there do not seem to be many dependencies between the objects. Thus, you use the objects on a "pay as you go" basis, creating and using only the objects you need.

Let's look at the code to see how Microsoft Excel is controlled from Threadlibtest.exe. All the code is located towards the bottom of Threaddlibtestview.cpp in CThreadLibtestView::StuffIntoExcel and the macros preceding that function.

First, we create an object of the Worksheet class (the class has been defined by ClassWizard). I use the default constructor, so the object does not have any connection to a real Microsoft Excel object yet. Thus, the next step is to associate the Visual C++ object with a "real" Microsoft Excel worksheet. There are several ways to accomplish this. I use the CreateDispatch member function. The parameter to CreateDispatch is the name of the object that we want to create; I choose Excel.Sheet as the parameter. What happens behind the curtains in the call to CreateDispatch is that the name Excel.Sheet is used to look up a class ID in the registry, and the class ID is used by the OLE run-time system to connect to Microsoft Excel and have Microsoft Excel create an object of type Worksheet:

Worksheet *ws = new Worksheet();  
if (!ws->CreateDispatch("Excel.Sheet"))
{
 AfxMessageBox("couldn't create sheet object...");
 return;
};  

You are probably a little bit confused about this two-step process. Isn't the Worksheet object already a worksheet? Why should we have to explicitly assign a "real" worksheet object when the Worksheet class already encapsulates a Microsoft Excel worksheet?

Keep in mind that the Worksheet object is first of all a representation of a Microsoft Excel worksheet. (In order to make this distinction clearer in this article, we will use the convention of formatting the Visual C++ objects in bold type and the corresponding OLE object in italic type.) OLE is an operating system extension that manages objects at run time, but Visual C++ requires objects to be known at compile time. The Visual C++ worksheet class is like a promise; it's as if the class were to say, "If you give me a real Microsoft Excel worksheet at run time, I know how to convert your compile-time parameters into something I can pass on to the object." We could have created our Worksheet object with a parameter that would already have specified the run-time worksheet object, but because the overloaded constructors for the COleDispatchDriver class (from which Worksheet is derived) cannot be passed an object name, we would have had to derive the class ID from the object name first before creating the Worksheet object. I think that using CreateDispatch is much easier.

Now that the worksheet object is created, we can stuff data into it. To do that, we need to know a little bit about Microsoft Excel objects and their meaning. In Microsoft Excel Automation, the object that represents cells is called a range. A range is basically a square of cells, and before we can access the cells, we must create a range object that represents the cells. The macro ALLOCRANGE does that for us:

#define ALLOCRANGE(cell) \
    VariantClear(&v); \
   V_VT(&v) = VT_BSTR; \
   V_BSTR(&v) = SysAllocString(cell); \
   vRet = ws->Range(v,v); \
   SysFreeString(V_BSTR(&v));

Note that in our sample code, we use range objects of only one cell at a time. In an application that requires efficient access to cells, you probably want to access a bigger range object wherever possible in order to avoid transferring the data in little chunks.

To obtain a range, you call the member function Range from a worksheet object. The arguments to Worksheet::Range are the upper-left and lower-right indices of the cell block you wish to access. Once more, in our case the two are the same. Note that the strings to pass must be Unicode™ strings (hence the OLESTR macro in which the string is wrapped when ALLOCRANGE is called). You must use the SysAllocString and SysFreeString functions from Oleaut32.dll to ensure that the strings are valid in the address spaces of both the Automation server and client.

After the Range has been allocated, it can be assigned a value using the SetValue member function:

#define PUTINT(cell, ival)  \
ALLOCRANGE(cell)  \
V_VT(&v) = VT_I4; \
V_I4(&v) = (int)ival; \
ptr = new Range(V_DISPATCH(&vRet)); \
ptr->SetValue(v); \
delete ptr;

This looks a little bit weird, so let me explain what happens here. The important data type for Automation is a VARIANT. A VARIANT is basically a pointer to an arbitrary object, along with an identifier that specifies what type the object has. We need a variant because Visual C++ requires that types be known at compile time, but OLE parameters are passed at run time. Thus, a VARIANT is a type that satisfies Visual C++ (because it is well defined) but can be used in OLE (because OLE can derive and decode the parameter at run time using the type information).

Note that a VARIANT in Visual C++ is different from a variant in Visual Basic. A Visual Basic variant allocates all the space necessary to fit the largest possible object as a whole, whereas a VARIANT in Visual C++ is pretty much a typed pointer.

The earlier call to Worksheet::Range returned a VARIANT to satisfy the C compiler. However, we need to know what the "real" return value is so that we can work with it. Worksheet::Range returns a pointer to an OLE dispatch object, which we have not discussed so far. Fortunately, we do not need to know what a dispatch object is; all we need to care about is that now we can use that object to create an object of type Range from. You may recall that when we created the Worksheet object earlier on, we used the default constructor, and therefore we had to use CreateDispatch to associate the Visual C++ object with a "real" OLE object. We create the Range object with the OLE dispatch object as the parameter, so our Range object is now associated with a "real" range object. We could have accomplished the same thing by creating the Range object with the default constructor and then using the CreateDispatch or AttachDispatch method to associate the Visual C++ object with the OLE object.

We then call the SetValue member of the Range object, and that does it. We are done with the object, and we can simply nuke it.

If you are familiar with Automation from Visual Basic, you will notice that there is a slight difference here when addressing values. Visual Basic uses the following code to set the value of a cell:

cells(x,y).Value=<value>

Visual Basic allows you to access the properties of an object directly using the dot notation. Properties roughly correspond to member variables in Visual C++; a property is a data value associated with an object. Using Visual C++, you cannot directly access properties because, once more, the Visual C++ representation of an object is different from the OLE object itself, so you can't obtain a pointer to the storage location easily. The SetValue member function of the Range object encapsulates code that accesses the Value property implicitly.

When we are finished stuffing the data into the worksheet, we use the SaveAs member of the Worksheet object to save the sheet to disk:

VARIANT v11,v12,v13,v14,v15,v16;
    V_VT(&v11) = VT_BSTR;
    V_BSTR(&v11)=SysAllocString(SHEETFILELOC);
    V_VT(&v12) = VT_I2;
    V_I2(&v12) = -4143;
    V_VT(&v13) = VT_BSTR;
    V_BSTR(&v13) = NULL;
    V_VT(&v14) = VT_BSTR;
    V_BSTR(&v14) = NULL;
    V_VT(&v15) = VT_BOOL;
    V_BOOL(&v15)=FALSE;
    V_VT(&v16) = VT_BOOL;
    V_BOOL(&v16) = FALSE;
     ws->SaveAs(v11,v12,v13,v14,v15,v16);
    SysFreeString(V_BSTR(&v11));
    delete ws;

The trick here is to know what the parameters to SaveAs are and how to pass them. Unfortunately, the parameter conventions for calling Microsoft Excel objects from Visual C++ are not documented, so you need to look into the Visual Basic documentation that is shipped with Microsoft Excel (see the file VBA_XL.HLP). The documentation for Worksheet::SaveAs reads as follows:

Syntax

object.SaveAs(filename, fileFormat, password, writeResPassword, readOnlyRecommended, createBackup)

Elements

The SaveAs method has the following object qualifier and named arguments:

object Required. The object to which this method applies.
filename Optional. A string indicating the name of the file to save. You can include a full path; if you do not, Microsoft Excel saves the file in the current directory or folder.
fileFormat Optional. The file format to use when you save the file. See the FileFormat property for a list of valid choices.
password Optional. A case-sensitive string indicating the protection password to be given to the file. Should be no more than 15 characters.
writeResPassword Optional. A string indicating the write-reservation password for this file. If a file is saved with the password and the password is not supplied when the file is opened, the file is opened as read-only.
readOnlyRecommended Optional. If True, when the file is opened, Microsoft Excel displays a message recommending that you open the file as read-only.
createBackup Optional. If True, Microsoft Excel creates a backup file; if False, no backup file is created; if omitted, the status is unchanged.

The first parameter is the name of the file to save the worksheet to. Once more, this needs to be a Unicode string, and the string must be passed to SysAllocString. The second parameter can be one of a number of choices; I chose xlNormal. This is a symbolic constant; how are we supposed to know what its value is?

Fortunately, the constants are stored in the type library. Thus, we next run a type library viewer such as Ole2vw32.exe on the type library (XL5EN32.tlb) and find that the value that corresponds to xlNormal is –4143. (Open the type library and look into the Constants TypeInfo list.) Note that there is no indication whether the value is supposed to be a short or a long integer; I chose VT_I2 (short or 2-byte integer).

The remaining parameters I chose to set to NULL. Once more, we need variants for each parameter. It is important not only to set the parameters but also to assign type values to them using the VT_xxx macros.

After the sheet has been saved, we can simply delete the worksheet object, and we are done. Note that once more, I avoid the icky issue of error handling here. It goes without saying that each OLE call may fail—just like any other call—and in a real-life application you are strongly advised to provide solid error checking.

Summary

Automation from Visual C++ using type libraries is conceptually very easy, as this minimal sample shows. The tricky part is to use the Automation objects properly.

Automation is the last bastion of software abstraction, because with Automation it is now possible to view an application (or, more generically, every Automation server) as a service provider. Thus, everything that an application makes public through Automation can be accessed from client applications. In my case, I wanted to use charting capabilities programmatically without having to write the software myself; by using Automation, I was able to put existing software to work.