June 1996
Don Box is a co-founder of DevelopMentor where he manages the COM curriculum. Don is currently breathing deep sighs of relief as his new book, Essential COM (Addison-Wesley), is finally complete. Don can be reached at http://www.develop.com/dbox/default.asp.
Q
I am using dual interfaces to expose my objects to both C++ and Visual Basic® clients. Designing the interfaces was relatively painless once I passed simple data types as method parameters. I now need to pass user-defined structures as parameters and can't get it to work. Any suggestions?
A
Your problem stems from the fact that dual interfaces suffer from the same limitation as normal dispatch interfaces: all parameters must be VARIANT-compatible. Both dispatch and dual interfaces (which ultimately are just an extension to dispatch interfaces) are meant to expose functionality to clients written in scripting languages, many of which are interpreted. To allow scripting clients to pass method parameters whose type may not be known until run time, you must use some alternative mechanism for passing parameters. This is the purpose of the VARIANT from IDispatch.
This array is now a suitable parameter list for use by IDispatch.
The type tag's value must be chosen from a limited set of values (see Figure 2). This list is fixed and is not user-extensible. If your data type is not found in the list, then you must somehow coerce your parameter into a data type supported by the VARIANT. While the VARIANT does contain a union member of type void * (selected by specifying VT_PTR), this member cannot be used when the object is located on a different thread or in a different process; the standard marshaler for IDispatch has no idea how to marshal the data that is referred to by the pointer.
Figure 2 VARTYPEs
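The idea of a type tag that selects one member of a union can be sketched in portable C++. This is only an illustration under stated assumptions: MiniVariant, Tag, makeI4, makeR8, getI4, and packCoords are hypothetical names invented here, the tag set is deliberately tiny, and none of it is the real VARIANT declaration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; these are NOT the VARTYPE and
// VARIANT declarations from the OLE Automation headers.
enum class Tag : std::uint16_t { Empty, I4, R8, Bool };

struct MiniVariant {
    Tag tag;                    // selects which union member is valid
    union {
        std::int32_t lVal;      // valid when tag == Tag::I4
        double       dblVal;    // valid when tag == Tag::R8
        bool         boolVal;   // valid when tag == Tag::Bool
    };
    MiniVariant() : tag(Tag::Empty), lVal(0) {}
};

inline MiniVariant makeI4(std::int32_t v) {
    MiniVariant m;
    m.tag = Tag::I4;
    m.lVal = v;
    return m;
}

inline MiniVariant makeR8(double v) {
    MiniVariant m;
    m.tag = Tag::R8;
    m.dblVal = v;
    return m;
}

// Reads a 32-bit integer only if the tag says one is stored.
inline bool getI4(const MiniVariant& v, std::int32_t& out) {
    if (v.tag != Tag::I4) return false;
    out = v.lVal;
    return true;
}

// Packs the logical parameters of a two-argument call into one array.
inline std::vector<MiniVariant> packCoords(std::int32_t x, std::int32_t y) {
    std::vector<MiniVariant> args;
    args.push_back(makeI4(x));
    args.push_back(makeI4(y));
    return args;
}
```

Because the tag set is a closed enumeration, a value whose type has no tag simply cannot be stored, which is the same constraint the fixed VARTYPE list imposes.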
To understand how the VARIANT is used, consider the following simple interface:
Many environments are not capable of taking advantage of the dual interface (for example, Visual Basic 3.0 and Visual Basic for Applications). Clients written in these environments would access the SetCoords method indirectly via Invoke.
Since C++ and Visual Basic 4.0 are relatively strongly typed, these languages have no problem accessing the methods of a dual interface directly without using Invoke. These clients can simply call the SetCoords method directly.
Both code fragments accomplish the same task, but clients accessing the method directly will find a considerable performance gain, especially when the object is on the same thread as the caller.
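The difference between the two call paths can be modeled without any COM machinery. In the sketch below, Point, Arg, and the member id kSetCoordsId are hypothetical names chosen for illustration, and Invoke is a simplified analogue of IDispatch::Invoke rather than its real signature. The late-bound path has to check the member id, the argument count, and each argument's type tag before it can forward to the same code that a direct caller reaches in one step.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tagged argument; a stand-in for a VARIANT, not the real type.
struct Arg {
    enum Kind { I4, R8 } kind;
    std::int32_t lVal;
    double       dblVal;
};

class Point {
public:
    enum { kSetCoordsId = 1 };  // hypothetical member id for SetCoords

    Point() : x_(0), y_(0) {}

    // Direct entry point: parameters arrive on the stack, fully typed.
    void SetCoords(std::int32_t x, std::int32_t y) { x_ = x; y_ = y; }

    // Late-bound entry point: validates and unpacks the argument array,
    // then forwards to the direct method.
    bool Invoke(int memberId, const std::vector<Arg>& args) {
        if (memberId != kSetCoordsId) return false;
        if (args.size() != 2) return false;
        if (args[0].kind != Arg::I4 || args[1].kind != Arg::I4) return false;
        SetCoords(args[0].lVal, args[1].lVal);
        return true;
    }

    std::int32_t x() const { return x_; }
    std::int32_t y() const { return y_; }

private:
    std::int32_t x_, y_;
};
```

Both paths leave the object in the same state; the extra checks and packing in Invoke are where the late-bound overhead comes from.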
The limitation imposed by the VARIANT is obvious for dispatch interfaces, as all logical parameters must be passed as an array of variants. If there isn't an appropriate union member to hold your parameter, there is probably no way to pass it without some sort of manual type coercion. It is less obvious why this limitation applies to dual interfaces, as clients can access the method implementations directly. They can bypass the IDispatch::Invoke mechanism and pass the parameters directly on the stack instead of in a VARIANT array. Given that dual interface clients don't have to use VARIANTs, it seems counterintuitive that the methods of a dual interface are also limited to VARIANT-compatible types.
This limitation exists for two reasons. First, it is assumed that all methods of the dual interface are accessible from Invoke. Since Invoke can only receive parameters that are packed in a VARIANT array, this immediately presents a problem for VARIANT-incompatible types. The second and more important reason stems from the universal marshaler that is used to remote the methods of the dual interface. The universal marshaler uses the type information for the interface to dynamically translate between the call stack and the marshaling packet for use in standard marshaling. The advantage of this approach is that no user-supplied proxy/stub pair is needed to remote the interface. The limitation of this approach is that the universal marshaler is hard-wired to support VARIANT-compatible types only. The type library compiler is aware of this limitation, and will not allow you to create a dual interface that accepts VARIANT-incompatible types.
In spite of the previous explanation, there must be some technique that allows user-defined structures to be passed as method parameters. In fact, there are several. If you must remain in the world of dispatch and dual interfaces to maintain scripting compatibility, there are four viable options for passing structures as parameters:
1. Break the structure apart and pass each member as an individual parameter.
2. Copy the structure into a SAFEARRAY of bytes.
3. Serialize each structure member into a SAFEARRAY of VARIANTs.
4. Wrap the structure in a COM object that exposes the members as properties.
Before critiquing each approach, let's see an example of each applied to a simple case: passing a RECT structure from a client to an object as a method parameter. In case you don't have a copy of Petzold handy, here's the definition of RECT from WINUSER.H:
In Visual Basic, this would be written as:
Both type definitions yield a structure that contains four integers of four bytes each. Unfortunately, neither one is a legal parameter type for a dispatch or dual interface. To pass a rectangle through these interfaces, one of the four techniques listed above must be used.
The first technique requires the least explanation. To allow callers to pass a rectangle, you can simply break apart the structure into individual parameters, as is demonstrated by this interface:
This passes a rectangle from Visual Basic:
To pass a rectangle from C++, use this code:
Reconstituting the RECT in the implementation is simple:
This technique is practical for simple structures that do not contain many elements. It is not so practical for complex structs, and is somewhat inconvenient and error-prone when the structure in question is commonly used as the native data representation by the client.
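As a rough, portable sketch of this first technique, the following code flattens a rectangle into four scalar parameters on the client side and reconstitutes it on the object side. Rect, Shape, SetBounds, and passRect are hypothetical names used only for illustration; a real implementation would be a COM method on a dual interface.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for RECT: four 4-byte integers.
struct Rect {
    std::int32_t left, top, right, bottom;
};

class Shape {
public:
    Shape() : bounds_{0, 0, 0, 0} {}

    // Every parameter is a simple scalar, so the signature would be legal
    // on a dispatch or dual interface.
    void SetBounds(std::int32_t left, std::int32_t top,
                   std::int32_t right, std::int32_t bottom) {
        // Reconstitute the structure inside the implementation.
        bounds_.left = left;
        bounds_.top = top;
        bounds_.right = right;
        bounds_.bottom = bottom;
    }

    const Rect& bounds() const { return bounds_; }

private:
    Rect bounds_;
};

// Client side: break the structure apart at the call site.
inline void passRect(Shape& target, const Rect& r) {
    target.SetBounds(r.left, r.top, r.right, r.bottom);
}
```

With only four members the call site stays readable; the same pattern with twenty members is where the error-prone part begins, since nothing stops two same-typed arguments from being swapped.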
The second technique takes advantage of the fact that VARIANTs can be used to pass primitive types, pointers to single instances of primitive types, and self-describing arrays of primitive types. One of the supported primitive types is unsigned char, which for our purposes translates to byte. Instead of breaking apart the struct as in the previous example, this technique simply copies the structure into an array of bytes and assumes that the method implementation treats the byte array appropriately. Unlike vanilla C and C++, when dual and dispatch interfaces use arrays, they must be accompanied by a description of the array's dimensions. This array descriptor is called a SAFEARRAY and describes the bounds for each array dimension as well as the individual element size.
The SAFEARRAY is simply a description of the array, not the array itself (see Figure 3). The actual array contents are stored in a separate block of memory referred to by the pvData member. The array descriptor also tracks the number of outstanding pointers to the array data, ensuring that the actual memory is not freed while one or more concerned parties are relying on it remaining valid.
SAFEARRAYs can be created and manipulated using API functions or through direct manipulation. The SafeArrayCreate API function creates a new array descriptor and allocates the space for the array contents. To access the array data, call the SafeArrayAccessData function, which returns a pointer to the base of the array after incrementing the lock count. Release the lock by calling SafeArrayUnaccessData.
Figure 3 SAFEARRAYs
The following code allocates a new array of 100 longs and initializes each element:
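On Windows this is done with SafeArrayCreate, SafeArrayAccessData, and SafeArrayUnaccessData. As a portable illustration of the same create, lock, fill, unlock sequence, here is a minimal sketch that uses a stand-in descriptor. MiniLongArray, createArray, accessData, unaccessData, and makeHundredLongs are hypothetical names; the real SAFEARRAY is not typed and carries per-dimension bounds, which this one-dimensional model omits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical one-dimensional stand-in for an array descriptor.
struct MiniLongArray {
    unsigned locks;                   // outstanding accessData calls
    std::vector<std::int32_t> data;   // the array contents
    MiniLongArray() : locks(0) {}
};

// Creates the descriptor and allocates zero-filled contents.
inline MiniLongArray createArray(std::size_t count) {
    MiniLongArray a;
    a.data.assign(count, 0);
    return a;
}

// Returns the base of the contents after incrementing the lock count.
inline std::int32_t* accessData(MiniLongArray& a) {
    ++a.locks;
    return a.data.data();
}

// Releases one lock taken by accessData.
inline void unaccessData(MiniLongArray& a) {
    if (a.locks > 0) --a.locks;
}

// Allocates 100 longs and initializes element i to the value i.
inline MiniLongArray makeHundredLongs() {
    MiniLongArray a = createArray(100);
    std::int32_t* p = accessData(a);
    for (std::size_t i = 0; i < a.data.size(); ++i) {
        p[i] = static_cast<std::int32_t>(i);
    }
    unaccessData(a);
    return a;
}
```

The lock count is what lets an owner refuse to free or resize the contents while someone still holds the raw pointer.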
The SAFEARRAY API functions manage all memory for you. For simpler situations, or when performance is important, you can manage the memory for both the descriptor and the array by hand. To notify the system that the memory is user-managed, the fFeatures field supports the following flags:
The FADF_AUTO, FADF_STATIC, and FADF_EMBEDDED flags indicate that the memory belongs to the user, not the system. The FADF_FIXEDSIZE flag indicates that the array cannot be redimensioned. The performance gains that can be achieved by manually managing the memory used in the array are fairly trivial if the object is out-of-process, but can be quite high (by factors of 10 or more) if the object is in-process.
Armed with the knowledge of SAFEARRAYs, we can now get back to the problem of passing structs through a dual interface. The type library compiler allows you to specify parameter types that are arrays by using the SAFEARRAY keyword.
While the type library compiler accepts this description, both Visual Basic and MFC have severe problems dealing with this type of parameter (in fact, both prohibit it). Fortunately, both environments allow you to pass arrays indirectly as a VARIANT, with the following interface definition:
To pass a RECT through this parameter, the client must first create and initialize the array, then create and initialize a VARIANT that refers to the array. Figure 4 shows the memory layout of this technique. Figure 5 illustrates how to implement this in C++. If the client is Visual Basic, some additional work is required to invoke the method.
Fortunately, finding the RECT in the implementation is straightforward.
Figure 4 Passing a RECT as a SAFEARRAY of Bytes
This technique is practical for structures that contain simple primitive types that you can treat as byte arrays easily. The implementation shown above assumes that both the client and the object are executing on the same platform and does not address issues such as byte ordering, member alignment, and floating point formats. Additionally, for structures that contain more complex types, simply copying the memory that the structure occupies is not enough, as is the case with pointers. Moreover, if the structure contains COM interface pointers, you cannot use this technique at all. On the plus side, this technique performs extremely well, as virtually no processing or copying is required by the client or object implementations.
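The core of the byte-array technique is a raw copy in each direction. The sketch below shows that copy in portable C++, with a std::vector of bytes standing in for the SAFEARRAY contents. Rect, toBytes, and fromBytes are hypothetical names, and the code makes the same assumption the text does: both sides share byte order, alignment, and member layout.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Stand-in for RECT: four 4-byte integers.
struct Rect {
    std::int32_t left, top, right, bottom;
};

// A raw byte copy is only meaningful for types without pointers,
// virtual functions, or other members that need real serialization.
static_assert(std::is_trivially_copyable<Rect>::value,
              "Rect must be safe to copy as raw bytes");

// Client side: copy the structure into a byte array.
inline std::vector<unsigned char> toBytes(const Rect& r) {
    std::vector<unsigned char> bytes(sizeof(Rect));
    std::memcpy(bytes.data(), &r, sizeof(Rect));
    return bytes;
}

// Object side: recover the structure, rejecting arrays of the wrong size.
inline bool fromBytes(const std::vector<unsigned char>& bytes, Rect& out) {
    if (bytes.size() != sizeof(Rect)) return false;
    std::memcpy(&out, bytes.data(), sizeof(Rect));
    return true;
}
```

The size check is the only validation available to the receiver, because the bytes carry no description of the members they came from.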
The third technique for passing structures is to serialize each struct member into an array of VARIANTs. This technique is similar to the previous approach, but it requires the client and object to individually pack each structure member into a variant, which allows the use of somewhat richer member types (interface pointers are legal). To use this technique, the interface description is the same as in the previous example.
To pass a RECT through this parameter, the client must first create and initialize an array of VARIANTs based on the RECT's contents and then create and initialize an additional VARIANT that refers to the array. Figure 6 shows the memory layout of this technique. Figure 7 illustrates the required client C++ code. For Visual Basic clients, all that is needed is some simple linearization of the struct into a variant array:
To reconstruct the RECT, the implementation must extract each member by hand.
This technique is somewhat less convenient than using byte arrays, but it works with structs that contain interface pointers. It also works when communicating with objects that are on machines with different byte-ordering or floating point formats than that of the client, as the interface marshaler will convert between the disparate data formats transparently.
Figure 6 Passing a RECT as a SAFEARRAY of VARIANTs
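The member-by-member packing can be sketched in portable C++ as follows. Cell is a hypothetical tagged value standing in for a VARIANT, and packRect and unpackRect are illustrative names, not part of any real API. The point of the sketch is that each element carries its own type tag, so the receiver can validate every member instead of trusting an opaque block of bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for RECT: four 4-byte integers.
struct Rect {
    std::int32_t left, top, right, bottom;
};

// Hypothetical tagged value; a stand-in for a VARIANT, not the real type.
struct Cell {
    enum Kind { Empty, I4 } kind;
    std::int32_t lVal;   // valid when kind == I4
};

inline Cell makeI4(std::int32_t v) {
    Cell c;
    c.kind = Cell::I4;
    c.lVal = v;
    return c;
}

// Client side: one tagged element per structure member, in a fixed order.
inline std::vector<Cell> packRect(const Rect& r) {
    std::vector<Cell> cells;
    cells.push_back(makeI4(r.left));
    cells.push_back(makeI4(r.top));
    cells.push_back(makeI4(r.right));
    cells.push_back(makeI4(r.bottom));
    return cells;
}

// Object side: extract each member by hand, checking count and tags first.
inline bool unpackRect(const std::vector<Cell>& cells, Rect& out) {
    if (cells.size() != 4) return false;
    for (std::size_t i = 0; i < cells.size(); ++i) {
        if (cells[i].kind != Cell::I4) return false;
    }
    out.left = cells[0].lVal;
    out.top = cells[1].lVal;
    out.right = cells[2].lVal;
    out.bottom = cells[3].lVal;
    return true;
}
```

The client and the object still have to agree on member order, but a mismatch in count or type is now detectable at run time.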
The fourth technique for passing user-defined structs through dual and dispatch interfaces requires the most work from the object, but yields the best integration with Visual Basic. This technique requires that the structure be passed not by value as raw bytes, but instead as a COM object that exposes the structure members as properties that are accessible from either IDispatch or as a dual interface. This approach requires the definition of a second interface that maps structure members onto properties.
Figure 8 shows an example mapping of the RECT struct onto a dual interface. Figure 9 shows CoRect, an implementation of this interface that simply contains a RECT and implements the accessor and mutator functions as one would expect. Given the implementation of this RECT wrapper, our RECT-hungry interface would now look like this:
Visual Basic clients must now create an instance of the wrapper object to pass to the method in question.
C++ clients do essentially the same thing but with a syntax that only a mother could love (see Figure 10).
On the implementation side, you must reconstruct the RECT by extracting each member from the wrapper using the propget functions.
This technique is extremely convenient and elegant for Visual Basic clients, and it's palatable for C++ clients. As with the previous approach (based on VARIANT arrays), the marshaling layer will hide any platform differences transparently.
Figure 11 In-process Struct-Wrapper
One drawback of this approach is that the object designer must implement and support not only the primary objects and interfaces, but also any wrappers that are used to hide user-defined structures. A more serious drawback is that of performance. If the target object that will receive the wrapper is out-of-process, then the wrapper object is, by definition, in the wrong address space. If the wrapper object is implemented as an in-process server, then the client's PROPERTYPUT routines will execute relatively quickly, but the target object's PROPERTYGET routines will require communications back to the client's address space (see Figure 11). If the wrapper object is implemented as an out-of-process server that shares the process of the target object, then the target's PROPERTYGET routines will execute quickly, but the client's PROPERTYPUT routines will each require out-of-process communications (see Figure 12). Either way, using the wrapper object approach results in n additional round-trips where n is the number of exposed properties.
Figure 12 Out-of-process Struct-Wrapper
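The cost model can be made concrete with a small portable sketch. RectWrapper below is a hypothetical plain C++ class, not the CoRect object from the figures, and its counters merely tally accessor calls. Under the assumption that one side of the exchange is in a different address space from the wrapper, each call tallied on that side stands for one round-trip.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for RECT: four 4-byte integers.
struct Rect {
    std::int32_t left, top, right, bottom;
};

// Hypothetical wrapper exposing each member as a put/get property pair.
class RectWrapper {
public:
    RectWrapper() : r_{0, 0, 0, 0}, puts_(0), gets_(0) {}

    void put_Left(std::int32_t v)   { ++puts_; r_.left = v; }
    void put_Top(std::int32_t v)    { ++puts_; r_.top = v; }
    void put_Right(std::int32_t v)  { ++puts_; r_.right = v; }
    void put_Bottom(std::int32_t v) { ++puts_; r_.bottom = v; }

    std::int32_t get_Left() const   { ++gets_; return r_.left; }
    std::int32_t get_Top() const    { ++gets_; return r_.top; }
    std::int32_t get_Right() const  { ++gets_; return r_.right; }
    std::int32_t get_Bottom() const { ++gets_; return r_.bottom; }

    unsigned puts() const { return puts_; }   // calls made by the client
    unsigned gets() const { return gets_; }   // calls made by the target

private:
    Rect r_;
    unsigned puts_;
    mutable unsigned gets_;
};

// Client side: fill the wrapper one property at a time.
inline void fillWrapper(RectWrapper& w, const Rect& r) {
    w.put_Left(r.left);
    w.put_Top(r.top);
    w.put_Right(r.right);
    w.put_Bottom(r.bottom);
}

// Target side: rebuild the structure one property at a time.
inline Rect extractRect(const RectWrapper& w) {
    Rect r;
    r.left = w.get_Left();
    r.top = w.get_Top();
    r.right = w.get_Right();
    r.bottom = w.get_Bottom();
    return r;
}
```

For a four-member structure, one transfer costs four puts and four gets. Whichever of the two groups crosses the process boundary accounts for the n extra round-trips, here with n equal to 4.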
Figure 13 enumerates the relative tradeoffs for each approach described above. It is important to note that the performance penalty of the fourth approach (based on COM wrappers to encapsulate structs) only worsens when using MFC version 4.x or earlier. At the time of this writing, MFC only supports dual interfaces if the developer is willing to implement most of the dual and all of the ODL manually, rendering the Class Wizard virtually useless. For many developers, this means the wrapper object will only export a plain dispatch-based interface. This slows things down considerably when compared to a dual interface.
I intentionally skipped two other alternative techniques. One of these techniques is a variation on the struct-wrapper approach, but instead of exposing the embedded struct via a dispatch/dual interface, you would expose it via an IDataObject interface. This approach suffers all of the limitations of the byte-array technique (it is just passing raw bytes around via GetData) but does not offer the same performance benefits (the GetData method requires an additional round-trip, which severely impacts performance in the out-of-process case). It also cannot be used from Visual Basic, which limits its usefulness. Given that the byte-array technique is actually easier to implement and is available from Visual Basic, there is no reason to favor the IDataObject approach.
Another technique that was popular in the early days of IDispatch was to treat a BSTR as an opaque array of bytes. This approach ceases to work now that BSTRs are Unicode. The marshaler will perform byte-swapping when communicating with some remote hosts, and performs Unicode-to-ASCII conversion when communicating with 16-bit processes.
As you might have concluded by now, structs and IDispatch/dual interfaces are not a good match. All of the techniques described above are workable, but less than ideal. For small structures, breaking out the structure members as individual parameters is definitely the way to go. For large structures, the choice is not so obvious. If Visual Basic compatibility is important, then the struct-wrapper technique is elegant but inefficient. However, it is arguably less work to simply break out 20 structure members as parameters than to perform 20 PROPERTYPUT operations. Unless programmers will use your struct-wrappers as native data types in their Visual Basic-based applications, be prepared to see the performance of your library slow down considerably.
If optimal performance and elegance from C++ are important, perhaps the best approach is to leverage the key feature of COM, QueryInterface. Instead of forcing C++ clients to call through the dual interface, you could add support for a second custom interface that is not subject to the VARIANT-compatible restriction. By supporting both a dual and a custom interface from the same object, Visual Basic clients can still QueryInterface for the more restricted dual interface, while C++ clients can use the more flexible struct-friendly interface. This technique requires more work by the object implementor, but yields the best overall performance and is considerably more convenient for the object's client.
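The shape of this design can be sketched in portable C++. Everything below is a hypothetical analogue: InterfaceId stands in for an IID, the QueryInterface method is a simplified imitation that returns a raw pointer instead of an HRESULT, there is no reference counting, and IScriptingShape and INativeShape are invented names for the dual-style and custom interfaces.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for RECT: four 4-byte integers.
struct Rect {
    std::int32_t left, top, right, bottom;
};

// Hypothetical interface identifiers; stand-ins for IIDs.
enum class InterfaceId { Scripting, Native, Unsupported };

// Scripting-friendly interface: scalar parameters only.
struct IScriptingShape {
    virtual void SetBounds(std::int32_t left, std::int32_t top,
                           std::int32_t right, std::int32_t bottom) = 0;
    virtual ~IScriptingShape() {}
};

// Custom interface: free to take the structure directly.
struct INativeShape {
    virtual void SetBoundsRect(const Rect& r) = 0;
    virtual ~INativeShape() {}
};

class Shape : public IScriptingShape, public INativeShape {
public:
    Shape() : bounds_{0, 0, 0, 0} {}

    // Simplified imitation of QueryInterface: nullptr means unsupported.
    void* QueryInterface(InterfaceId id) {
        switch (id) {
        case InterfaceId::Scripting:
            return static_cast<IScriptingShape*>(this);
        case InterfaceId::Native:
            return static_cast<INativeShape*>(this);
        default:
            return nullptr;
        }
    }

    void SetBounds(std::int32_t left, std::int32_t top,
                   std::int32_t right, std::int32_t bottom) override {
        bounds_.left = left;
        bounds_.top = top;
        bounds_.right = right;
        bounds_.bottom = bottom;
    }

    void SetBoundsRect(const Rect& r) override { bounds_ = r; }

    const Rect& bounds() const { return bounds_; }

private:
    Rect bounds_;
};
```

Both interfaces update the same state, so the object pays for the struct-friendly path only in the extra vtable and the extra method, while scripting clients keep the interface they can actually call.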
Have a question about programming with ActiveX or COM? Send your questions via email to Don Box at dbox@develop.com or http://www.develop.com/dbox/default.asp
From the June 1996 issue of Microsoft Systems Journal.