Dr. GUI and COM Automation, Part 1

February 10, 1999

Dr. GUI's Bits and Bytes
Where We've Been; Where We're Going
Automation: And Now for Something Completely Different
Give It a Shot!
Where We've Been; Where We're Going

Dr. GUI's Bits and Bytes

It's a New Year, But No Resolutions or Predictions Here

Dr. GUI hereby breaks with the venerable columnist tradition of noting the new year by making resolutions and/or predictions for the upcoming year.

After the good doctor said he'd never buy Amazon.com at $100 because it was "overpriced," it shot up to nearly $200 after splitting 3 for 1. (That's equivalent to almost $600 before the split!) Having missed out on that run, the good doctor is in no mood to predict or resolve much of anything.

Dr. GUI still isn't comfortable with Amazon's stock price, but don't get the doc wrong: Amazon's a great company. Their Web site is amazing, they've got customer service that makes Nordstrom seem surly, and the company really knows how to pull together—everybody, including CEO Jeff Bezos, was working in the warehouse packing orders to keep up with the Christmas rush. So Dr. GUI will, after all, indulge in one prediction: Amazon.com will go far. They seem to be a company most of us can learn from.

Is Visual C++ 6.0 Cool, or What?

Dr. GUI hates to gloat (well, no he doesn't), but he wanted you to see Richard V. Dragan's review of Visual C++ 6.0 in PC Magazine's January 19 edition (http://www.zdnet.com/pcmag/pctech/content/18/02/tf1802.001.html). If you're not using VC++ 6.0 yet, you'll be drooling over it (use a napkin, please!) after you read this review.

For those of you who already have Visual Studio, Service Pack 2 is now available at http://msdn.microsoft.com/vstudio/sp/vs6sp3/. It's available for immediate download or on CD—or you may be getting it automatically (check the Web site to see).

Want to Use C++ Better?

Dr. GUI noticed that Scott Meyers's Effective C++ and More Effective C++ are now available as a single title: the Effective C++ CD. Sure, you'll need a computer to read it (it runs on Mac, UNIX, and even Windows), but having all this great information online makes it easy to cut and paste code to try out the features for yourself! And it's easy to refer to the CD: Scott added anchors to every paragraph to which you can link. Finally, the price is right: $29.95, less than either of the books in paper. (But you'll still want the paper versions for reading, right?) There's even a demo at http://meyerscd.awl.com/.

Y2K Roundup

So Dr. GUI made the mistake of writing about Y2K problems once. Then twice. Then…

In any case, the good doctor saw a couple of items on Y2K problems you might want to read.

Dave Barry's perspective on Y2K

Dr. GUI continues to be amazed and amused at Dave Barry's grasp of computers and computing. The good doctor, knowing that laughter is indeed the best medicine for all that ails you, heartily recommends Dave's thoughts in his Miami Herald column on Y2K—"Come the Millennium, Use the Stairs" (http://www.herald.com/archive/barry/1999/docs/jan3.htm). See what it is this time that Dave calls "the evil demon spawn of hell." (Hint: it's not Dr. GUI writing about Y2K.) And find out what will happen if you're in an elevator at midnight, January 1, 2000.

Microsoft Year 2000 Resource Center

Some of us, however, work on software, not elevators, and we have real Y2K problems to solve. If you do, you'll want to check out Microsoft's Year 2000 Resource Center (http://www.microsoft.com/technet/year2k/). You'll find information about Y2K compliance for Microsoft products, from DOS, Access, and Excel to Actimates Barney, Arthur, and DW. The site also contains product-by-product instructions for ensuring compliance, white papers, FAQs, tools, and more.

By the way, the Year 2000 Resource Center is also available on CD. Check out the URL above to get your copy.

VC++'s daylight savings time arrives late in 2001

VC++ may be cool, but it's not perfect. The localtime() routine in the run-time library thinks that daylight savings time for 2001 starts on April 8 rather than April 1. That means time calculations that rely on this routine can be an hour late for a week. (Dr. GUI wishes he could use that excuse for being late at other times.)

You can quickly find out whether your code could exercise this bug by searching for calls to localtime. The Win32 APIs for time aren't affected; neither are MFC's time routines because they use the Win32 API for time calculations. But if you use the standard library functions (for instance, if you've ported programs from UNIX systems), you should check to make sure you're not affected.

There will be a Visual Studio Service Pack to fix the library and a Windows update to fix the msvcrt.dll DLL version of the library sometime this spring.

There's more information at the Visual C++ site at http://www.msdn.microsoft.com/visualc/.

Where We've Been; Where We're Going

Last time we explored a more complicated ATL object with multiple interfaces, including interfaces that inherit from other interfaces; and we used that object from a variety of languages. This time and next, we're going to dive deep into the world of COM Automation. The good doctor was hoping to IDispatch this topic with one brief column, but it's going to take at least two. But relax: Brockschmidt spent well over 100 pages on the topic—we'll be done long before then. This time, we'll talk about how Automation (IDispatch) calls are made and what Automation objects have to do to handle them. Next time, we'll talk about special COM data types used for Automation and delve into dual interfaces.

Automation: And Now for Something Completely Different

Automation (formerly known as OLE Automation) is a totally different method for clients to call servers compared to the standard COM vtable interfaces that we've seen so far.

Automation uses the standard COM interface IDispatch to access the object's Automation interface. Therefore, we say that any object that implements IDispatch implements Automation. Note that our two previous ATL objects both use dual interfaces, in which a working IDispatch interface that matches the custom interface you write is provided to you by ATL and COM. Despite the fact that ATL makes Automation interfaces trivial to implement (can it be easier than selecting the right radio button?), it's still good to understand what's going on underneath the hood, and that's what this article is about.

Why Automation?

Automation was originally developed as a way for applications (such as Word and Excel) to expose their functionality to other applications, including scripting languages. The intention was to provide a simple way to access properties and call methods that put as little strain on the Automation client as possible—and allowed calls to be made without needing type information about the object being accessed.

It's not trivial to figure out the type information from a C++ header for an interface, figure out the vtable offset for the method, and, hardest of all, set up a proper C++ stack frame so you can do the method call correctly. All this is especially tricky for a text-based interpreted language.

If every scripting language had to do this tricky programming, few would be able to access COM objects. With Automation, objects can present a simplified Automation interface and scripting language authors only need to master IDispatch and a few COM APIs.

Visual Basic's first 32-bit version used Automation to access OLE controls (now called ActiveX controls), which replaced 16-bit Visual Basic's VBX controls. Visual Basic can still use Automation to access a control's properties and methods, but more recent versions also support using standard COM vtable interfaces. The examples we created in previous columns all used the early-bound vtable interfaces, which give better performance than Automation interfaces. The examples we create this time will use Automation interfaces.

Scripting languages, such as Visual Basic for Applications, VBScript, and JScript, use Automation exclusively. So if you want your object to be usable by scripting languages, you need to implement an Automation interface.

Objects and Properties and Methods, Oh My!

The Automation view of the world has three main concepts. Objects are the most important concept. Each object exposes properties and methods.

Figure 1. Properties and methods on an Automation object

Contrast this with the more complicated COM view of the world, in which interfaces, not objects, are primary, properties don't exist, and each object can have multiple interfaces containing multiple methods.

Figure 2. COM object, interfaces, methods (including unlabeled IUnknown)

Automation properties correspond to C++ data members or instance data (also called attributes), while methods correspond to C++ member functions. Notice that there's no separate concept of an interface—each object has only one Automation interface. Note further that COM interfaces don't have the concept of properties—they only have methods. (But we can simulate a property with a get/set pair of methods.)

How Are Automation Objects Created?

Creating an Automation object is a simple operation. I'll use Visual Basic as an example here, but the process is pretty much the same in any Automation-compatible language.

In Visual Basic, you'd first create an Object variable:

Dim Beeper as Object

…then set it to refer to a specific object:

Set Beeper = CreateObject("BeepCntMod.BeepCnt")

In this case, we've created a BeepCnt object. (See the first ATL article.)

We can then call methods on the object and manipulate its properties, as we'll see shortly.

But first, let's talk about what Visual Basic (or any Automation client) really has to do behind the scenes.

We already know that we're going to access the Automation object through the IDispatch standard COM interface. So the Dim statement just shown sets aside at least enough memory so that Visual Basic can store the IDispatch pointer for the object we'll soon create.

It's the CreateObject call that's a little trickier. First off, where's the GUID? How can we create an object without a GUID for its CLSID?

You might recall that we can also refer to object types by their ProgID. You might even recall that we register a key in the registry with the ProgID as the key name. This key has the CLSID as a subkey.

COM provides a function called CLSIDFromProgID that looks up the CLSID given a ProgID. Visual Basic calls this function using the string we passed to CreateObject. In this case, Visual Basic will pass "BeepCntMod.BeepCnt". CLSIDFromProgID looks up that key and returns the CLSID associated with it. (By the way, the first part of the ProgID is the module or application name, and the second part is the name of the object within that module or application.)

At that point Visual Basic calls our old friend CoCreateInstanceEx, passing the CLSID and asking for the IDispatch interface. If CoCreateInstanceEx succeeds, VB creates an object variable containing the IDispatch pointer it received from CoCreateInstanceEx and assigns it to our object variable.

If the creation fails for any reason—the object doesn't exist, or it doesn't implement IDispatch—the CreateObject call fails.

As you see, the overhead for Visual Basic (or any Automation client) is minimal—all it has to know about to create objects is two simple COM functions.
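The two-step creation sequence just described can be modeled in a few lines of portable C++. This is only a sketch, not real COM code: the table stands in for the registry, clsidFromProgId and createObject are hypothetical stand-ins for the ProgID lookup and the instance-creation call, and pairing this ProgID with the BeepCount coclass GUID from the IDL shown later in this column is an assumption made for illustration.

```cpp
#include <map>
#include <string>

// Stand-in for the registry: maps a ProgID to its CLSID (as text).
// Assumption: this ProgID/CLSID pairing is for illustration only.
const std::map<std::string, std::string>& progIdTable() {
    static const std::map<std::string, std::string> table = {
        {"BeepCntMod.BeepCnt", "{4F745310-3943-11D2-A2B5-00C04F8EE2AF}"},
    };
    return table;
}

// Step 1: models the ProgID-to-CLSID lookup.
// Returns false if the ProgID is not registered.
bool clsidFromProgId(const std::string& progId, std::string* clsid) {
    std::map<std::string, std::string>::const_iterator it =
        progIdTable().find(progId);
    if (it == progIdTable().end()) return false;
    *clsid = it->second;
    return true;
}

// Stand-in for the IDispatch pointer the client ends up holding.
struct DispatchHandle {
    std::string clsid;
};

// Step 2: models creating the object and asking for IDispatch.
// A failure in either step makes the whole "CreateObject" fail.
bool createObject(const std::string& progId, DispatchHandle* handle) {
    std::string clsid;
    if (!clsidFromProgId(progId, &clsid)) return false;
    handle->clsid = clsid;
    return true;
}
```

Calling createObject("BeepCntMod.BeepCnt", &handle) succeeds in this model, while an unregistered ProgID fails at the first step, which is the same shape of failure a scripting client would see.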

So How Do You Access Automation Properties and Methods?

The Visual Basic source code for accessing our object might look like this:

BC = Beeper.Count
Beeper.Count = 5 
Beeper.Beep

These three statements—accessing a property, setting a property, and calling a method—are all done using only two IDispatch methods: GetIDsOfNames and Invoke. IDispatch::GetIDsOfNames gets the integer ID associated with the text name of the method or property. Visual Basic calls it to discover that "Beep" corresponds with ID 1 and "Count" corresponds with ID 2. We'll need these IDs, called dispids, when we call IDispatch::Invoke.

All actual Automation property and method access is done via calls to IDispatch::Invoke. In other words, all your Automation client needs to know to access Automation objects is a few simple COM calls. If your implementation language is other than C or C++, you can write helpers for your run time that do those calls for you—so it's easy to use Automation from any program.

Easy, perhaps, but not trivial: IDispatch::Invoke takes a bunch of parameters, all of which have to be set up just right. The most important are:

An integer ID, called a dispid, that specifies the property or method being accessed (we got this by calling GetIDsOfNames with a string containing the name of the property or method).
A structure that contains a pointer to an array of parameters. (Each parameter is stored in a structure containing a type tag and a union called a variant.)
A flag that indicates what to do with the property (set it, get it, set it using a reference) or method (call it).
A return value parameter, also a variant, for property gets and methods that return values.

Oh, and both Invoke and GetIDsOfNames accept a locale ID in case you want to localize method, property, named parameter names, or parameter values.

Invoke also has a couple of other parameters for passing error information back to the Automation client. We'll assume a perfect world and skip these for now.

Variants are stored in 16 bytes. The first two bytes are a tag that contains a number representing the type of the variant, the next six bytes are padding, and the final eight bytes are the value of the variant. The format of the value depends on the value of the tag. In C/C++, we represent the value of the variant with a union. Variants can hold most of the C++ data types plus pointers, arrays, strings, dates, and currency objects. We'll do the full treatment on COM data types, including variants, next time.
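The 16-byte layout described above can be sketched as a tagged union. MiniVariant below is a simplified, hypothetical model and not the real VARIANT declaration; it keeps only three value types, and the tag values are chosen to match VT_EMPTY, VT_I2, VT_I4, and VT_R8.

```cpp
#include <cstddef>
#include <cstdint>

// Type tags; the numeric values match VT_EMPTY, VT_I2, VT_I4, and VT_R8.
enum MiniTag : std::uint16_t { kEmpty = 0, kI2 = 2, kI4 = 3, kR8 = 5 };

// Simplified model of a variant: 2-byte tag, 6 bytes of padding,
// and an 8-byte value whose meaning depends on the tag.
struct MiniVariant {
    std::uint16_t vt;
    std::uint16_t reserved[3];
    union {
        std::int16_t iVal;    // valid when vt == kI2
        std::int32_t lVal;    // valid when vt == kI4
        double       dblVal;  // valid when vt == kR8
    };
};

static_assert(sizeof(MiniVariant) == 16, "tag + padding + value = 16 bytes");
static_assert(offsetof(MiniVariant, dblVal) == 8, "value starts at byte 8");

// Store a 32-bit integer: set the tag, then the matching union member.
MiniVariant makeLong(std::int32_t value) {
    MiniVariant v = {};
    v.vt = kI4;
    v.lVal = value;
    return v;
}

// Read a 32-bit integer back, but only if the tag says one is stored.
bool readLong(const MiniVariant& v, std::int32_t* out) {
    if (v.vt != kI4) return false;
    *out = v.lVal;
    return true;
}
```

The point of the model is the discipline it enforces: always set the tag together with the union member that matches it, and always check the tag before reading.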

So Why Is This Easier than a Custom Interface?

Note the things that aren't necessary for doing a call through IDispatch::Invoke:

No vtable offsets—we use a dispid, which we got by asking the object itself.
No C/C++ parameter lists and calling convention—we use an array of variants.
No C/C++ header file to tell you about the above. (A type library can supply this information, but it's optional.)

It's not true that you can dispense with C/C++ altogether. Obviously, the four calls need to be done using the C/C++ calling convention. But that's the only place that you need to worry about it when you're an Automation client.

All you need to make the call, then, is an IDispatch pointer to the object, the name of the property or method you want to access, and a list of parameters.

By the same token, if you want to be able to write COM objects in your scripting language, it's easier for your language run time to implement only IDispatch (oh, and a way to create objects) instead of trying to deal with myriad details of an infinite variety of custom interfaces.

The Difference Between COM Interfaces and Automation

Just from the preceding description, you immediately can see a few ways in which Automation differs from COM interfaces:

Automation interfaces are not necessarily immutable, although you shouldn't change them on the fly because clients can cache dispids. But it's common to change Automation interfaces, especially to add methods, from version to version of an object. (If you delete methods or change parameters, you can break existing client code.)
Automation methods (and properties) can take variable-length parameter lists containing different types. It's the job of the implementation of IDispatch::Invoke to, at run time, parse the parameters and perform any necessary type conversions. (If it's not possible to convert the parameters, the object's implementation of IDispatch::Invoke returns an error HRESULT.)
Automation method and property access is late-bound; in other words, the determination of exactly which methods/properties are accessed is postponed until the call.
Because of all this late binding, Automation methods and properties are polymorphic in a way more like Smalltalk than C++: You can access any method or property on any object and it's the object's responsibility to reject bad calls. For instance, you could call a Print method on an arbitrary object. Any object that implements a method called Print would presumably print its value; any object that didn't would fail the call. In C++, you could only call Print if the object's class defined a Print method: Checking of the name and parameters occurs at compile time, not run time.
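The Print example in the last point can be modeled without any COM at all. In this sketch, LateBoundObject and both factory functions are hypothetical: members are looked up by name at run time, and an object that lacks the member simply fails the call.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// A toy late-bound object: members are found by name at run time.
class LateBoundObject {
public:
    void define(const std::string& name, std::function<std::string()> body) {
        members_[name] = std::move(body);
    }
    // Returns false when the object has no such member; a real
    // Automation object would report an error HRESULT instead.
    bool call(const std::string& name, std::string* result) const {
        std::map<std::string, std::function<std::string()> >::const_iterator
            it = members_.find(name);
        if (it == members_.end()) return false;
        *result = it->second();
        return true;
    }
private:
    std::map<std::string, std::function<std::string()> > members_;
};

// Hypothetical object that implements Print.
LateBoundObject makeInvoice() {
    LateBoundObject o;
    o.define("Print", [] { return std::string("invoice text"); });
    return o;
}

// Hypothetical object that implements Beep but not Print.
LateBoundObject makeBeeper() {
    LateBoundObject o;
    o.define("Beep", [] { return std::string("beep"); });
    return o;
}
```

The caller writes the same call("Print", ...) against both objects and compiles either way; only the run-time result tells it which object actually supports the member.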

So How 'bout Some Examples?

As a quick example, let's look at how the three calls just shown would be made.

Calling a method

Because there are no parameters to or return value from the Beep method, let's call it first. Remember that the call is written as:

Beeper.Beep

One thing you'll need to know first: The parameters pointer passed to Invoke is actually a pointer to a DISPPARAMS structure, which is defined as follows:

typedef struct FARSTRUCT tagDISPPARAMS{
    // Pointer to array of arguments, named and unnamed
     VARIANTARG FAR* rgvarg;            
    // Array of Dispatch IDs of named arguments
     DISPID FAR* rgdispidNamedArgs;
    // Total number of arguments, named and unnamed
     unsigned int cArgs;
    // Number of named arguments.
     unsigned int cNamedArgs;
} DISPPARAMS;

We're not going to use named arguments, so rgdispidNamedArgs will be NULL and cNamedArgs will be zero.

The code to make this simple call could be:

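HRESULT hresult;
DISPID dispid;
// OLECHAR is a wide character on Win32, so the name
// must be a wide string (OLESTR takes care of that)
OLECHAR szName[] = OLESTR("Beep");
OLECHAR * szMember = szName;
// No parameters, so no array
DISPPARAMS dispparamsNoArgs = {NULL, NULL, 0, 0};

// pdisp is an IDispatch pointer 
// to the Beeper object
hresult = pdisp->GetIDsOfNames(IID_NULL, &szMember, 1,
                LOCALE_USER_DEFAULT, &dispid);
// Only call Invoke if the name lookup worked
if (SUCCEEDED(hresult)) {
    hresult = pdisp->Invoke(
            dispid,
            IID_NULL,
            LOCALE_USER_DEFAULT,
            DISPATCH_METHOD,
            &dispparamsNoArgs, NULL, NULL, NULL);
}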

First, we get the dispid by calling GetIDsOfNames. The IID_NULL value in both calls is the value for a reserved parameter. We pass a pointer to the array of string pointers (in this case, only one string pointer), the number of pointers in the array, the locale (in case we want localized names), and a pointer to the array of dispids (again, only one dispid here). If the call succeeds, dispid will contain the dispid corresponding to "Beep."

We don't have to call GetIDsOfNames more than once if we provide a way to remember the dispid for "Beep."
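One way to remember dispids is a small name-to-ID cache in front of the lookup. The sketch below is a portable model under stated assumptions: DispidCache is a hypothetical helper, and the resolver callback stands in for a real call to IDispatch::GetIDsOfNames.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Caches name-to-dispid lookups so each name is resolved only once.
// The resolver is a stand-in for a call to IDispatch::GetIDsOfNames.
class DispidCache {
public:
    typedef std::function<bool(const std::string&, long*)> Resolver;
    explicit DispidCache(Resolver resolver) : resolver_(std::move(resolver)) {}

    bool lookup(const std::string& name, long* dispid) {
        std::map<std::string, long>::const_iterator it = cache_.find(name);
        if (it != cache_.end()) {            // already resolved earlier
            *dispid = it->second;
            return true;
        }
        if (!resolver_(name, dispid)) return false;  // unknown: don't cache
        cache_[name] = *dispid;
        return true;
    }
private:
    Resolver resolver_;
    std::map<std::string, long> cache_;
};

// Usage sketch: three lookups of "Beep" reach the resolver only once.
int resolverCallsForThreeLookups() {
    int calls = 0;
    DispidCache cache([&calls](const std::string& name, long* id) {
        ++calls;
        if (name == "Beep")  { *id = 1; return true; }
        if (name == "Count") { *id = 2; return true; }
        return false;
    });
    long id = 0;
    for (int i = 0; i < 3; ++i) cache.lookup("Beep", &id);
    return calls;
}
```

A cache like this would have to be kept per kind of object, because a dispid only means something to the interface that handed it out.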

Once we have the dispid, we can call Invoke.

You'll notice that even though the parameter list is empty, we still have to provide a minimal DISPPARAMS structure. Note also that we have to specify that we want a method call by passing DISPATCH_METHOD.

You'll also notice that this is a lot more complicated than just calling Beep(). Automation is easier for non-C/C++ clients to use, but it's always slower than just doing the calls directly. However, if the server is in another process or on another machine, the extra time for setting up (and executing) Automation calls becomes insignificant.

Getting a property's value

Getting a property is very similar to calling a method—the only difference is that we'll pay attention to the return value when the call returns.

The code in Visual Basic for getting a property's value is written as:

BC = Beeper.Count

The C++ code for calling Invoke could be:

VARIANT varResult;
VariantInit(&varResult);   // start out empty before Invoke fills it in
// No parameters, so no array
DISPPARAMS dispparamsNoArgs = {NULL, NULL, 0, 0};

// dispid set by call to GetIDsOfNames
// (omitted for brevity)

hresult = pdisp->Invoke(
        dispid,
        IID_NULL,
        LOCALE_USER_DEFAULT,
        DISPATCH_PROPERTYGET,
        &dispparamsNoArgs, &varResult, NULL, NULL);

// Property's value stored in varResult

It is easy to have parameterized properties (or "property arrays") by passing parameters, such as an index or lookup key, when getting (and setting) the property.

Note that the syntax of Visual Basic cannot distinguish between getting a property and calling a method that takes no parameters but does have a return value. In other words, it's impossible to tell from the syntax whether the call BC = Beeper.Count is accessing a property called Count or calling a method called Count.

As a result, some Automation clients will pass both flags for a property access—in the case of a property get, Visual Basic might pass DISPATCH_PROPERTYGET | DISPATCH_METHOD because it can't tell whether it's doing a property access or a method call. Automation objects have to deal with this by doing the right thing depending on whether the dispid refers to a property or a method.
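Here is a minimal sketch of how an object might sort out those combined flags. The flag values match DISPATCH_METHOD, DISPATCH_PROPERTYGET, DISPATCH_PROPERTYPUT, and DISPATCH_PROPERTYPUTREF; the MemberKind and Action types and the chooseAction function are hypothetical names invented for this example.

```cpp
// Values match the DISPATCH_* flags passed to IDispatch::Invoke.
const unsigned kMethod         = 0x1;
const unsigned kPropertyGet    = 0x2;
const unsigned kPropertyPut    = 0x4;
const unsigned kPropertyPutRef = 0x8;

// What the object knows about the member behind a dispid.
enum class MemberKind { Method, Property };

// What the object decides to do with the call.
enum class Action { CallMethod, GetProperty, PutProperty, Reject };

// Let the member's own kind decide, and use the flags only to confirm
// that the caller asked for something the member can do.
Action chooseAction(MemberKind kind, unsigned flags) {
    if (kind == MemberKind::Method) {
        return (flags & kMethod) ? Action::CallMethod : Action::Reject;
    }
    if (flags & kPropertyGet) return Action::GetProperty;
    if (flags & (kPropertyPut | kPropertyPutRef)) return Action::PutProperty;
    return Action::Reject;
}
```

With this approach, a call carrying both the get and method flags becomes a property get when the dispid names a property and a method call when it names a method.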

Setting a property's value

Setting a property is different in three ways from getting a property:

There is a parameter for the new value of the property.
This parameter is named using the dispid DISPID_PROPERTYPUT.
The return value parameter is ignored.

Setting a property is different from calling a method in that the parameter to which the property is being set is named with a special dispid.

Why do we have to go through this hassle? Recall that properties can be parameterized. Having a special name for the value of the property makes it easy for the Automation object to recognize which parameter is to be the value of the property.

(The good doctor really wanted to avoid using named parameters, but this is the one case in which they're absolutely required.)

Recall that the VB code to set the Count property on the Beeper object is:

Beeper.Count = 5

The C++ code for calling Invoke could be:

// parameter structure
DISPPARAMS dispparams;
// one-element array of parameter names
DISPID mydispid[1] = { DISPID_PROPERTYPUT };
// one-element array of parameters
VARIANTARG vararg[1];

dispparams.rgvarg = vararg; // 1-element array
VariantInit(&dispparams.rgvarg[0]);
dispparams.rgvarg[0].vt = VT_I4;   // 32-bit integer
dispparams.rgvarg[0].lVal = 5;   // here's our 5! (lVal goes with VT_I4)
dispparams.rgdispidNamedArgs = mydispid; // name array
dispparams.cArgs = 1;      // total args
dispparams.cNamedArgs = 1;   // named args

// dispid set by call to GetIDsOfNames
// (omitted for brevity)

hresult = pdisp->Invoke(
        dispid,
        IID_NULL,
        LOCALE_USER_DEFAULT,
        DISPATCH_PROPERTYPUT,
        &dispparams, NULL, NULL, NULL);

All of that code is for one little call!

You start to get an idea of how much code is required to do Automation calls and why Automation can be so slow. If there were more parameters, the three lines to set each variant:

VariantInit(&dispparams.rgvarg[0]);
dispparams.rgvarg[0].vt = VT_I4;   // 32-bit integer
dispparams.rgvarg[0].lVal = 5;   // here's our 5!

…would be repeated for each parameter—one statement to initialize the variant, one to set its type, and one to set its value. If the parameters were named, there would be an additional statement to set each parameter name in rgdispidNamedArgs. And the counts would have to be set correctly.

So Automation trades off code size of calling for flexibility. And it gives up one more thing: The only parameters you can pass are those that can be described in a variant. The biggest loss is that you cannot describe a struct using a variant.

We'll discuss variants next time. If you want to know more about Automation argument passing, check out your favorite COM reference.

Going to write your own IDispatch::Invoke calls? Read this…

You won't run into the issue we're about to discuss unless you are writing your own scripting language or calling IDispatch::Invoke directly, perhaps from a C++ client that's talking to an Automation-only object.

As he was doing his reading about optional arguments, Dr. GUI discovered a puzzling situation: The COM documentation for IDispatch::Invoke currently says, as does Brockschmidt, that you have to pass variants with the tag VT_ERROR for each omitted unnamed argument. If you skip an unnamed argument, as in object.method(a,,c), that's clearly true. But what if the optional arguments are at the end? How would a scripting language that doesn't use the type library ever know how many dummy arguments to pass? Answer: it can't.

The good doctor was wondering about this, so he checked with some folks on the COM team—and it turns out the docs are incorrect for this case. (They'll be fixed next time they're revised.)

If you want to omit unnamed optional arguments at the end, you can just omit them. Objects that use the COM implementation of IDispatch (more on this later) will handle this case properly by providing arguments to the dual interface function as specified in the IDL. But there's one gotcha: If the object you're calling implements IDispatch::Invoke itself rather than relying on COM, it may be written so it expects dummy arguments for all optional and default value arguments—so, if you can get the type library information to determine the number of parameters, you should so that you can provide the dummy arguments.
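To make the two cases concrete, here is a simplified model of an object that tolerates both skipped and trailing omitted arguments. Everything in it is an assumption for illustration: SuppliedArg and fillDefaults are hypothetical, plain longs stand in for variants, the omitted flag stands in for a VT_ERROR placeholder, and the arguments are listed in declaration order to keep the sketch short.

```cpp
#include <cstddef>
#include <vector>

// One supplied argument. 'omitted' models the VT_ERROR placeholder a
// client passes for an argument it skipped, as in object.method(a,,c).
struct SuppliedArg {
    bool omitted;
    long value;
};

// Builds the full argument list for a method whose parameters are all
// optional. 'defaults' has one entry per declared parameter.
// Trailing arguments may be left off entirely.
bool fillDefaults(const std::vector<SuppliedArg>& supplied,
                  const std::vector<long>& defaults,
                  std::vector<long>* full) {
    if (supplied.size() > defaults.size()) return false;  // too many args
    full->clear();
    for (std::size_t i = 0; i < defaults.size(); ++i) {
        bool present = i < supplied.size() && !supplied[i].omitted;
        full->push_back(present ? supplied[i].value : defaults[i]);
    }
    return true;
}
```

An object written this way accepts a short argument list and a list padded with placeholders alike, which is the forgiving behavior a hand-written Invoke would need to offer callers that have no type library to consult.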

Automation: The Server Side

We've seen now more than we ever wanted to know about how to call methods, set and get properties, access return values, and pass parameters. How does the Automation object deal with all this?

In short, the object deals with all this by implementing IDispatch::Invoke (and, of course, the rest of IDispatch, including GetIDsOfNames). But let's say you're implementing the object. How do you implement these methods?

Implementing IDispatch the Hard Way

If you don't have any parameters (or maybe only one parameter) for your methods and no parameterized properties, it's pretty easy to implement IDispatch::Invoke on your own. All it takes is a switch statement to call the right function for each combination of dispid and type. (You could use a call table, too.) You'll have to convert the parameters to the type you need, but it's easy to call VariantChangeType to convert the variant from whatever type it is to whatever type you need. (If the conversion fails, just return an error to the caller.)
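For the simple case just described, the switch statement might look something like the following portable model. It is a sketch under assumptions, not the real thing: BeepCounter, Value, and toLong are hypothetical, toLong stands in for VariantChangeType, bool results stand in for HRESULTs, and Beep only counts its calls instead of making noise. The dispids (1 for Beep, 2 for Count) are the ones used throughout this column.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>

// Values match DISPATCH_METHOD, DISPATCH_PROPERTYGET, DISPATCH_PROPERTYPUT.
const unsigned kMethod = 0x1, kPropertyGet = 0x2, kPropertyPut = 0x4;

// Toy argument: either a number or text that may hold a number.
struct Value {
    bool isText;
    std::string text;
    long number;
};

// Stand-in for VariantChangeType: coerce the argument to a long.
bool toLong(const Value& v, long* out) {
    if (!v.isText) { *out = v.number; return true; }
    char* end = 0;
    errno = 0;
    long n = std::strtol(v.text.c_str(), &end, 10);
    if (end == v.text.c_str() || *end != '\0' || errno == ERANGE) return false;
    *out = n;
    return true;
}

class BeepCounter {
public:
    BeepCounter() : count_(0), beeps_(0) {}

    // Models a hand-written IDispatch::Invoke: switch on the dispid,
    // then check the flags to see what the caller wants.
    bool invoke(long dispid, unsigned flags, const Value* arg, long* result) {
        switch (dispid) {
        case 1:  // Beep: a method with no parameters
            if (!(flags & kMethod)) return false;
            ++beeps_;  // stand-in for actually beeping
            return true;
        case 2:  // Count: a property
            if (flags & kPropertyGet) {
                if (!result) return false;
                *result = count_;
                return true;
            }
            if (flags & kPropertyPut) {
                long n = 0;
                if (!arg || !toLong(*arg, &n)) return false;  // bad argument
                count_ = n;
                return true;
            }
            return false;
        default:
            return false;  // unknown dispid
        }
    }
    int beeps() const { return beeps_; }
private:
    long count_;
    int beeps_;
};
```

Even in this tiny model, most of the code is checking and converting rather than doing work, which hints at how quickly the hand-written approach grows.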

But you can imagine the mess if you have multiple parameters. First off, the unnamed parameters are in reverse order in the array. If they're named, the order is determined by the array of dispids, meaning that you'll have to sort that out. But wait—that's not all—you might want to support optional parameters and optional parameters with default values. Figuring out all of this is a huge mess. It's not only hard—you're likely to make a mistake doing it. If you do, you'll probably end up reporting an error on calls you should have been able to handle—not a great way to inspire confidence among your users.
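The index arithmetic for reversed and named arguments is easy to get wrong, so here is a small model of it. MiniParams is a hypothetical, simplified DISPPARAMS in which plain longs stand in for variants; the sketch assumes the usual layout, in which named arguments come first in rgvarg (matched index for index with rgdispidNamedArgs) and positional arguments follow in reverse order.

```cpp
#include <cstddef>
#include <vector>

// Simplified DISPPARAMS: plain longs stand in for variants.
// rgdispidNamedArgs[i] names rgvarg[i]; the remaining entries of
// rgvarg are positional arguments, stored last-to-first.
struct MiniParams {
    std::vector<long> rgvarg;
    std::vector<long> rgdispidNamedArgs;
};

// Client side: pack positional arguments given in declaration order.
MiniParams packPositional(const std::vector<long>& inOrder) {
    MiniParams p;
    p.rgvarg.assign(inOrder.rbegin(), inOrder.rend());
    return p;
}

// Server side: fetch the k-th positional argument (0 = first declared).
bool positionalArg(const MiniParams& p, std::size_t k, long* out) {
    std::size_t cArgs = p.rgvarg.size();
    std::size_t cNamed = p.rgdispidNamedArgs.size();
    if (cNamed > cArgs || k >= cArgs - cNamed) return false;
    *out = p.rgvarg[cArgs - 1 - k];
    return true;
}

// Server side: fetch a named argument by its dispid.
bool namedArg(const MiniParams& p, long id, long* out) {
    for (std::size_t i = 0; i < p.rgdispidNamedArgs.size(); ++i) {
        if (p.rgdispidNamedArgs[i] == id && i < p.rgvarg.size()) {
            *out = p.rgvarg[i];
            return true;
        }
    }
    return false;
}
```

For example, packing the arguments 10, 20, 30 stores them as 30, 20, 10, and positionalArg with k equal to 0 hands back 10. A real implementation would still have to layer type conversion, optional parameters, and defaults on top of this.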

Is there an easier way? You bet there is.

Don't Do It the Hard Way. Let COM Do It for You!

If your server meets two relatively simple requirements, you can take advantage of COM's built-in implementation of IDispatch. The requirements are:

Your Automation interface must be a dual interface (as ATL generates), not a pure dispinterface.
You must generate and make available a type library to tell COM what the methods and properties are.

ATL doesn't even have support for incoming pure dispinterfaces, so your choices are dual (including Automation) and custom interfaces. The major difference between a dual interface and the equivalent custom interface is that the dual interface is derived from IDispatch rather than IUnknown. This means that in addition to QueryInterface, Release, and AddRef, dual interfaces also must implement all of the IDispatch methods (including GetIDsOfNames and Invoke). Just as ATL provides implementations of the IUnknown methods for you, it also provides implementations of the IDispatch methods.

IDL for dual interfaces

The IDL for a dual interface looks very much like the IDL for a custom interface. For the BeepCnt object, the IDL for the IBeepCount interface is:

   [
      object,
      uuid(4F74530F-3943-11D2-A2B5-00C04F8EE2AF),
      dual,
      helpstring("IBeepCount Interface"),
      pointer_default(unique)
   ]
   interface IBeepCount : IDispatch
   {
      [id(1), helpstring("method Beep")] HRESULT Beep();
      [propget, id(2), helpstring("property Count")]
         HRESULT Count([out, retval] long *pVal);
      [propput, id(2), helpstring("property Count")]
         HRESULT Count([in] long newVal);
   };

Note that the IID for the dual interface and the fact that the interface is a dual interface are specified in the interface attributes. Note also that the interface is derived from IDispatch, not IUnknown.

The IDs in the method attributes are the dispids for the automation interface. Note that methods that implement properties have a special attribute—in this case, the Count property has two methods, one to get the value and one to set it. When MIDL generates the C++ header file, the two methods will be named get_Count and put_Count.
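To show how a property turns into a get_/put_ pair of methods on the C++ side, here is a portable model of the interface's own three methods. It is a sketch, not the header MIDL generates: Result stands in for HRESULT, the class names are hypothetical, the IUnknown and IDispatch methods are left out, and what Beep does with the count is an assumption.

```cpp
// Stand-ins for S_OK and E_POINTER.
enum Result { kOk, kBadPointer };

// Model of the interface's own methods, using the names MIDL gives
// the property accessors.
class IBeepCountModel {
public:
    virtual ~IBeepCountModel() {}
    virtual Result Beep() = 0;
    virtual Result get_Count(long* pVal) = 0;   // from [propget]
    virtual Result put_Count(long newVal) = 0;  // from [propput]
};

class BeepCountModel : public IBeepCountModel {
public:
    BeepCountModel() : count_(1), beeps_(0) {}

    // Assumption: Beep "beeps" count_ times; here it only keeps a tally.
    Result Beep() override {
        beeps_ += count_;
        return kOk;
    }
    // The property value comes back through the [out, retval] pointer.
    Result get_Count(long* pVal) override {
        if (!pVal) return kBadPointer;
        *pVal = count_;
        return kOk;
    }
    Result put_Count(long newVal) override {
        count_ = newVal;
        return kOk;
    }
    long beeps() const { return beeps_; }
private:
    long count_;
    long beeps_;
};
```

A vtable client calls put_Count(5) directly, while an Automation client reaches the same method by invoking dispid 2 with the property-put flag.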

By the way, the vtable for this interface will have 10 entries: three for IUnknown, four for IDispatch, and the three IBeepCount methods.

The type library

The type library is generated by the library section of the IDL file:

[
   uuid(4F745303-3943-11D2-A2B5-00C04F8EE2AF),
   version(1.0),
   helpstring("BeepCnt 1.0 Type Library")
]
library BEEPCNTLib
{
   importlib("stdole32.tlb");
   importlib("stdole2.tlb");

   [
      uuid(4F745310-3943-11D2-A2B5-00C04F8EE2AF),
      helpstring("BeepCount Class")
   ]
   coclass BeepCount
   {
      [default] interface IBeepCount;
   };
};

The library attributes specify the LIBID, version, and a helpstring. The library imports standard type libraries, and then specifies the coclass for this object.

Note that the CLSID for the object is specified by the GUID in the coclass attribute list. Because this is a simple object, all that's left is to specify the interface this object implements. (IUnknown and IDispatch are covered by inheritance.)

MIDL generates the type library in a .tlb file. You can use this file separately, but you usually don't have to. This file is included in the DLL in the resource section, so it's built into the DLL.

The original use for type libraries is very important, too: They're what tools like Visual Basic, Visual J++, and Visual C++'s smart pointer (#import) use to figure out what methods and properties an object has.

How ATL Uses COM to Implement IDispatch

ATL's implementation of IDispatch is pretty simple at this point. First, in the DllMain function (which runs when the DLL is loaded), the Init method of the Module object loads the type library, among other things. COM allows us to access the type library through the standard COM interfaces ITypeLib and ITypeInfo. Init stores a pointer to the object's type library ITypeInfo interface.

Our good friend ITypeInfo

The ITypeInfo interface has methods called GetIDsOfNames and Invoke. ITypeInfo::GetIDsOfNames uses the type library information to get the correct dispatch IDs for the names you send. The implementation of this method is a built-in part of COM, so you don't have to write it—just provide the type library.

ITypeInfo::Invoke is considerably more interesting: it parses the dispid and parameters (and so forth) and actually makes a call to the proper C++ method via the vtable of your dual interface. It does all this using the type library information to determine the types to convert the parameters to; how to deal with default, optional, and named parameters; and what the vtable offset is. It then builds a stack frame for the parameters and makes the call.
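The idea of dispatching from a description of the interface, rather than from hand-written code, can be modeled with a lookup table. This sketch is only an analogy under stated assumptions: MemberRow, beeperTypeInfo, and invokeViaTable are hypothetical, the table plays the part of the type library, and std::function thunks replace the stack frame that the real implementation builds.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Values match DISPATCH_METHOD, DISPATCH_PROPERTYGET, DISPATCH_PROPERTYPUT.
const unsigned kMethod = 0x1, kPropertyGet = 0x2, kPropertyPut = 0x4;

// The object itself only implements ordinary C++ methods.
class Beeper {
public:
    Beeper() : count_(0), beeps_(0) {}
    void Beep() { ++beeps_; }
    long get_Count() const { return count_; }
    void put_Count(long v) { count_ = v; }
    int beeps() const { return beeps_; }
private:
    long count_;
    int beeps_;
};

// One row of "type information": which dispid and call kind map to
// which method. In real COM this comes from the type library.
struct MemberRow {
    long dispid;
    unsigned kind;
    std::function<void(Beeper&, const long* arg, long* result)> call;
};

std::vector<MemberRow> beeperTypeInfo() {
    std::vector<MemberRow> rows;
    rows.push_back({1, kMethod,
        [](Beeper& b, const long*, long*) { b.Beep(); }});
    rows.push_back({2, kPropertyGet,
        [](Beeper& b, const long*, long* r) { if (r) *r = b.get_Count(); }});
    rows.push_back({2, kPropertyPut,
        [](Beeper& b, const long* a, long*) { if (a) b.put_Count(*a); }});
    return rows;
}

// Generic dispatcher: knows nothing about Beeper beyond the table.
bool invokeViaTable(const std::vector<MemberRow>& info, Beeper& target,
                    long dispid, unsigned flags,
                    const long* arg, long* result) {
    for (std::size_t i = 0; i < info.size(); ++i) {
        if (info[i].dispid == dispid && (info[i].kind & flags)) {
            info[i].call(target, arg, result);
            return true;
        }
    }
    return false;  // no matching member
}
```

The dispatcher never changes when the object gains a method; only the table does, which is the same division of labor the type library gives you.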

Because COM does all this work for you, ATL's implementations of the IDispatch methods in IDispatchImpl are very simple: they just pass the call on to the type library via the ITypeInfo interface pointer stored when the object was initialized. For instance, the IDispatchImpl::Invoke code is simply:

   STDMETHOD(Invoke)(DISPID dispidMember, REFIID riid,
      LCID lcid, WORD wFlags, DISPPARAMS* pdispparams, 
      VARIANT* pvarResult, EXCEPINFO* pexcepinfo, UINT* puArgErr)
   {
      // Forward everything to _tih, which holds the ITypeInfo pointer.
      return _tih.Invoke((IDispatch*)this, dispidMember, riid, lcid,
         wFlags, pdispparams, pvarResult, pexcepinfo, puArgErr);
   }

The ITypeInfo pointer is stored in _tih. All we do is pass the pointer to the object and the parameters passed to us, and COM does the rest. It couldn't be simpler.

Okay, so what's the catch?

As you might be guessing, though, there is a speed penalty for using the IDispatch side of a dual interface: all this digging through the type library and setting up call stacks isn't free. But the COM implementation is optimized to minimize the overhead, and IDispatch calls are slow in any case because of all the work (such as array setup and call stack setup, as just described) necessary on the calling end. So the difference isn't that great, especially because if you implemented IDispatch::Invoke yourself you'd have to do about the same amount of work, minus the digging through the type library.

When Should Your Components Support Automation?

So when should and shouldn't you support Automation?

Obviously, if you want your components to be used by scripting languages and other Automation-only clients, you have to support Automation. So if your component will be used in a Web page, by the Windows Scripting Host, or by Visual Basic for Applications in Office apps and many other apps, you'll have to support Automation.

In other words, if you don't want to drastically cut the market for your components, you need to support Automation. And it's as easy as clicking the correct button in ATL, so there's not much excuse not to.

On the other hand, there are some components you'll only call from vtable-compatible languages (Visual C++, Visual J++, Visual Basic, and so on). A good example is a component you write because your system design uses componentization (a very good thing). Such components often can't be used outside of the system for which they were designed because they're specialized, so it might not make sense to use them, say, on a Web page.

What Will It Cost You to Support Automation?

There are some costs associated with supporting Automation.

Size and speed

There's a small increase in your ATL component's size if you offer a dual interface. Usually this isn't significant, but if you're in a situation where every byte counts, generate the component both ways and see if the difference is important.

IDispatch calls are slow, but if you use a dual interface, you give the client the choice of whether to use fast (vtable) or slow (Automation) calls. So performance isn't really a big issue.

But there is one subtle performance problem that slows your calls if marshalling is involved, as when the client and server are in different processes (or on different machines). Dual interfaces use COM's universal marshaller, which is driven by type library information. This marshaller is somewhat slower than the one generated for you by MIDL. However, the extra time spent marshalling the call is pretty insignificant compared to the time spent switching processes or communicating with another machine over the network.

Marshalling is normally not necessary for in-process (DLL) servers, so this issue doesn't affect performance in most cases. However, marshalling is used in in-process servers if the client and object are in different "apartments," as they often are in multithreaded applications. In the in-process cross-apartment case, the extra marshalling time can be significant.

Not as flexible as good ol' C++

The bigger problem has to do with the differences between the Automation interface model and the standard COM interface model. First off, there's normally only one Automation interface for a particular object. This can be a problem in some designs.

Worse yet, your parameters and return values are limited to those types that can be stuffed into a variant. We'll discuss variants in the next column, but for now you should know that most scalar types are supported, as are strings and arrays. So that covers most of what you'll want to do.

But structures are NOT supported for Automation interfaces. You can hack around this, but it's a pain. That means, by the way, that linked data structures are also difficult.

So if you want to pass sophisticated C++ data structures around, Automation is not for you. But what if you need to support Automation anyway?

The smart alternative

If you need to support Automation even though you've got some good reasons not to, there is a way out: Do both a dual interface and an equivalent (in functionality) custom interface.

The custom interface can take advantage of C++'s powerful data structures and won't require use of the universal marshaller—in fact, you can write a custom marshaller for the most optimized performance. This would be the interface that your C and C++ clients would use.

Everyone else would use the dual interface. You might have to do some fancy work to figure out how to translate a complex C++ data structure into something you can squeeze into a variant, but once you've done that you'll have an object any client can use.
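As a rough illustration of that "fancy work," here is a portable C++17 sketch of one object that offers both a custom interface taking a structure and an Automation-style interface taking only variant-compatible values. Everything in it is an assumption made for illustration: Value and ValueArray stand in for VARIANT and an array of variants, and Customer, ICustomerStore, ICustomerStoreAuto, CustomerStore, flatten, and unflatten are hypothetical names. No real COM plumbing is shown.

```cpp
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Stand-ins, not the real COM types: Value plays the role of VARIANT,
// ValueArray the role of an array of variants.
using Value = std::variant<long, double, std::string>;
using ValueArray = std::vector<Value>;

// Hypothetical structure that the custom interface passes directly.
struct Customer {
    std::string name;
    long id;
    double balance;
};

// Pack the structure into a fixed layout: [name, id, balance].
ValueArray flatten(const Customer& c) {
    return ValueArray{Value{c.name}, Value{c.id}, Value{c.balance}};
}

// Unpack, rejecting arrays with the wrong length or element types.
std::optional<Customer> unflatten(const ValueArray& a) {
    if (a.size() != 3) return std::nullopt;
    const auto* name = std::get_if<std::string>(&a[0]);
    const auto* id = std::get_if<long>(&a[1]);
    const auto* balance = std::get_if<double>(&a[2]);
    if (!name || !id || !balance) return std::nullopt;
    return Customer{*name, *id, *balance};
}

// Custom (vtable-only) interface: rich C++ types are allowed.
struct ICustomerStore {
    virtual ~ICustomerStore() = default;
    virtual void Add(const Customer& c) = 0;
    virtual double TotalBalance() const = 0;
};

// Automation-style interface: only variant-compatible values cross it.
struct ICustomerStoreAuto {
    virtual ~ICustomerStoreAuto() = default;
    virtual bool AddFlat(const ValueArray& fields) = 0;
};

// One object, both interfaces, one shared implementation.
class CustomerStore : public ICustomerStore, public ICustomerStoreAuto {
public:
    void Add(const Customer& c) override { customers_.push_back(c); }

    double TotalBalance() const override {
        double total = 0.0;
        for (const auto& c : customers_) total += c.balance;
        return total;
    }

    // The Automation side validates and unpacks, then reuses Add.
    bool AddFlat(const ValueArray& fields) override {
        auto c = unflatten(fields);
        if (!c) return false;
        Add(*c);
        return true;
    }

private:
    std::vector<Customer> customers_;
};
```

The design point is that the translation lives in one place (flatten and unflatten), and the Automation-style method is a thin wrapper over the same implementation the custom interface uses. C++ clients call Add with a Customer; everyone else hands over a flat array and gets a success or failure result back.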

Give It a Shot!

Dr. GUI knows full well that you won't know whether you really know what we've talked about until you try it—so give it a shot!

Write a simple app that calls a simple COM Automation component by coding the IDispatch::Invoke call (and related code) yourself. You don't want to do this often, but it'll be good to do it once—if for no other reason than to help you appreciate what your scripting language does for you.
Write a simple Automation component, implementing IDispatch, including the dreaded Invoke method, yourself. There are a couple of simple methods we didn't mention—check the documentation to find out about those. If you like, modify your solution to use COM's standard IDispatch implementation.

Where We've Been; Where We're Going

This time, we made a down payment on the mysteries of COM Automation by discussing how Automation properties and methods are used. Next time, we'll talk about Automation data types and delve more into dual interfaces.
