January 1999
Code for this article: jan99HoC.exe (2KB)
Don Box is a co-founder of DevelopMentor, a COM think tank that educates the software industry in COM, MTS, and ATL. Don wrote Essential COM and coauthored the follow-up Effective COM (Addison-Wesley). Reach Don at http://www.develop.com/dbox.
After reading my October 1998 column on essential books for COM developers, a colleague asked me if I had given up writing technical articles for MSJ and had switched to a more soft-core focus, as would befit a developer of my advanced age (36). Not wanting to be put out to the pasture of technical management just yet, I decided to ditch the "Top Ten English Phrases You Can Put into a GUID" piece I had originally planned for this month and turn my attention to something a bit more grungy. Since I just spent four hours FTPing Windows NT® 4.0 Service Pack 4, I can think of no better topic for this month's column than the infamous type library marshaler, which just got a major boost in SP4: it now supports structures as parameters. (Note: the information given here may or may not apply to Windows 2000. -Ed.)
Most objects don't elect to implement the IMarshal interface. This is because these objects choose to let the COM runtime build a proxy/stub pair to allow references to the object to be passed across context boundaries. (Prior to Windows 2000, proxies were only needed across apartments, but alas, Windows 2000 subdivides apartments even further.) These objects are said to use standard marshaling, and are the objects I'll focus on in this column. The COM library and wire protocol explicitly support standard marshaling for only one COM interface: IUnknown. The proxy manager and stub manager used in standard marshaling only know how to make this one interface work across contexts.
Fortunately, the proxy and stub managers are extensible, which allows other interfaces to be used with standard marshaling. As shown in Figure 1, the proxy manager can load interface proxies to translate method invocations into ORPC requests. The stub manager can load interface stubs, which then turn these request messages into method invocations on the actual object, as shown in Figure 2.
All interfaces beyond IUnknown must have an interface proxy and stub registered on the system to work with standard marshaling. This is true for interfaces defined by Microsoft or by third parties. For example, if the IStorage interface is to be remotable via standard marshaling, its IID must be listed under HKEY_CLASSES_ROOT\Interface, and it must also have a ProxyStubClsid32 subkey pointing to the in-process server that implements the interface proxy and stub. Fortunately, IID_IStorage has such an entry. If you were to look up the GUID stored at IStorage's ProxyStubClsid32 entry ({00000320-0000-0000-C000-000000000046}) under HKEY_CLASSES_ROOT\CLSID, you would find that it points to OLE32.DLL, and the friendly name of the class is either PSFactoryBuffer (under Windows 2000) or oleprx32_PSFactory (under Windows NT 4.0). This identifies the proxy/stub factory for all of the remotable interfaces that make up the COM core.
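The two-step registry lookup just described can be sketched in a few lines of Python. This is only a model: a dictionary stands in for HKEY_CLASSES_ROOT so the chain can be followed on any machine, and the names key, FAKE_HKCR, and find_proxy_stub_server are hypothetical helpers invented for the illustration. The IStorage IID and the lowercase server name are filled in as assumptions; the ProxyStubClsid32 value is the one quoted above.

```python
# Sketch: a dict stands in for HKEY_CLASSES_ROOT (default values only).
IID_ISTORAGE = "{0000000B-0000-0000-C000-000000000046}"      # assumed IID of IStorage
PSFACTORY_CLSID = "{00000320-0000-0000-C000-000000000046}"   # value quoted in the text


def key(*parts):
    """Join registry path components with backslashes."""
    return "\\".join(parts)


# Hypothetical snapshot of the keys involved in the lookup.
FAKE_HKCR = {
    key("Interface", IID_ISTORAGE): "IStorage",
    key("Interface", IID_ISTORAGE, "ProxyStubClsid32"): PSFACTORY_CLSID,
    key("CLSID", PSFACTORY_CLSID): "PSFactoryBuffer",
    key("CLSID", PSFACTORY_CLSID, "InprocServer32"): "ole32.dll",
}


def find_proxy_stub_server(iid, hkcr):
    """Return (clsid, server) for an interface, or None if no marshaler is registered."""
    # Step 1: the interface's ProxyStubClsid32 subkey names the proxy/stub factory class.
    clsid = hkcr.get(key("Interface", iid, "ProxyStubClsid32"))
    if clsid is None:
        return None  # the interface cannot be used with standard marshaling
    # Step 2: that class's InprocServer32 subkey names the DLL to load.
    server = hkcr.get(key("CLSID", clsid, "InprocServer32"))
    return clsid, server


print(find_proxy_stub_server(IID_ISTORAGE, FAKE_HKCR))
```

On a real Windows machine the dictionary lookups would presumably be replaced by registry reads (for example, with Python's winreg module), but that variation is not shown here.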
MIDL and Standard Marshaling
MIDL will emit the C source code for the interface proxies and stubs. This code resides in the _p.c file (in this case, bob_p.c) and takes the form of functions with names like IBob_MethodX_Proxy and IBob_MethodY_Stub. If you step into your proxy in a debugger, you'll see that the _Proxy routines are actually stored in the vtable of the interface proxy. On the server side, if you look at the call stack while stopped inside a method call, you'll notice that your method is called directly by the corresponding _Stub method.
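To make the division of labor concrete, here is a small Python model of a compiled proxy/stub pair. It is a sketch under stated assumptions, not MIDL output: the Bob class, its two methods and their integer parameters, the struct-based wire format, and the zero-based method numbering are all invented for the illustration (real COM method indices also count the three IUnknown methods). The proxy methods play the role of the _Proxy routines, and the stub side keeps its _Stub routines in a list indexed by method number.

```python
import struct


class Bob:
    """Hypothetical server object."""

    def method_x(self, a, b):
        return a + b

    def method_y(self, a, b):
        return a * b


def IBob_MethodX_Stub(obj, request):
    # Unmarshal the arguments, call the real method, marshal the result.
    a, b = struct.unpack("<ii", request)
    return struct.pack("<i", obj.method_x(a, b))


def IBob_MethodY_Stub(obj, request):
    a, b = struct.unpack("<ii", request)
    return struct.pack("<i", obj.method_y(a, b))


# The interface stub's vector of _Stub routines, indexed by method number.
STUB_VECTOR = [IBob_MethodX_Stub, IBob_MethodY_Stub]


def stub_invoke(obj, method_index, request):
    """Model of the stub's Invoke: dispatch through the vector."""
    return STUB_VECTOR[method_index](obj, request)


class BobProxy:
    """Each method plays the role of a _Proxy routine in the proxy's vtable."""

    def __init__(self, channel):
        self._channel = channel  # callable(method_index, request_bytes) -> reply_bytes

    def method_x(self, a, b):
        reply = self._channel(0, struct.pack("<ii", a, b))
        return struct.unpack("<i", reply)[0]

    def method_y(self, a, b):
        reply = self._channel(1, struct.pack("<ii", a, b))
        return struct.unpack("<i", reply)[0]


# Wire the proxy straight to the stub; a real channel would cross a context boundary.
server = Bob()
proxy = BobProxy(lambda index, request: stub_invoke(server, index, request))
print(proxy.method_x(2, 3), proxy.method_y(2, 3))
```

Note that every method needs its own hand-written (or, in real life, generated) marshaling code on both sides, which is where the footprint of a compiled marshaler comes from.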
If you really feel like stepping through the MIDL-generated code, you will need to remove the #pragma code_seg(".orpc") directives from the _p.c file and do a debug build of your proxy/stub DLL. The reason for removing the pragma is that COM-aware debuggers don't step into code in a segment named .orpc, since MIDL-generated code rarely has errors. Figures 5 and 6 show the structure of a MIDL-generated interface proxy and stub. Note that the interface proxy's vtable is populated with the MIDL-generated _Proxy routines. Also note that the interface stub maintains a vector of function pointers to the _Stub routines. This vector is used by the standard implementation of IRpcStubBuffer::Invoke (CStdStubBuffer_Invoke from RPCRT4.DLL) to dispatch the incoming method request.
While perfectly reasonable, the type of interface marshaler shown in Figures 5 and 6 has become passé in modern COM, largely because it contains compiled C code to perform the marshaling. Most COM developers now choose to use interpretive marshalers. As of Windows NT 4.0, the MIDL compiler can emit byte code in lieu of C code when you pass it the /Oicf flag (for example, midl /Oicf bob.idl).
An /Oicf-based marshaler benefits from increased performance due to a considerably smaller memory footprint. Figure 7 shows the memory savings for a variety of IDL files. When building an /Oicf-based marshaler, the _p.c file contains two byte strings in lieu of executable C functions. The most important byte string is called MIDL_PROC_FORMAT_STRING. This string contains byte code describing every method signature contained in the IDL file. Figure 8 shows the format string for the following IDL:
[IDL listing for ISomeInterface, with methods Eat, Sleep, and Drink, not reproduced]
Note that the representation for each method is laid out back-to-back inside the string.
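The following pseudo IDL shows how each method description is laid out
[pseudo-IDL listing of the method description not reproduced]
with each parameter described as follows:
[pseudo-IDL listing of the parameter description (PARAM_INFO) not reproduced]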
Note that the exact format of /Oicf strings is undocumented, completely unsupported, and subject to change at any time. But it is still interesting to look at /Oicf strings because doing so gives you insight into how COM works.
Note that for the simple method Eat, the entire method signature can be contained inside the MIDL_PROC_FORMAT_STRING (offsets 0-27). The other two methods, Sleep and Drink (offsets 28-61 and 62-95, respectively), both of which have a pointer to a struct as a parameter, have PARAM_INFOs that provide offsets into a second string called the MIDL_TYPE_FORMAT_STRING. You can see these "pointers" into the type format string at offsets 48 and 82 in the proc format string (where the struct BOB * appears). The MIDL_TYPE_FORMAT_STRING contains byte code descriptions of any complex (non-basetype) parameters used in the IDL. Figure 9 shows the type format string that goes with the IDL fragment for ISomeInterface shown earlier. Rather than duplicate the potentially complex description of a datatype in the proc format string, MIDL puts type definitions into the type format string once. This means that each occurrence of the type as a parameter only needs to refer to the type format string, which costs a constant two bytes in the proc format string. It is common for the type format string to contain references to itself to reduce the overall size of the format string by eliminating redundancy.
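The space-saving idea behind the two strings can be modeled in Python. The byte values below are made up and bear no relation to the real (undocumented) /Oicf encoding; the FormatStrings class and its method names are hypothetical. The only things carried over from the discussion above are that each type description is stored once in the type string and that every parameter of that type costs a constant two-byte offset in the proc string.

```python
import struct


class FormatStrings:
    """Toy model of a proc format string that points into a type format string."""

    def __init__(self):
        self.proc_string = bytearray()
        self.type_string = bytearray()
        self._offsets = {}  # type description -> offset in type_string

    def intern_type(self, description):
        """Store a type description once and return its offset."""
        if description not in self._offsets:
            self._offsets[description] = len(self.type_string)
            self.type_string += description
        return self._offsets[description]

    def add_complex_param(self, description):
        """Append a two-byte reference to the type description."""
        offset = self.intern_type(description)
        self.proc_string += struct.pack("<H", offset)
        return offset


# Made-up byte code standing in for the descriptions of two struct types.
BOB_DESCRIPTION = bytes([0x15, 0x03, 0x08, 0x00, 0x08, 0x08, 0x5B])
OTHER_DESCRIPTION = bytes([0x15, 0x01, 0x02, 0x00, 0x06, 0x5B])

strings = FormatStrings()
first = strings.add_complex_param(BOB_DESCRIPTION)     # e.g. one method's struct BOB *
second = strings.add_complex_param(BOB_DESCRIPTION)    # another method, same type
third = strings.add_complex_param(OTHER_DESCRIPTION)   # a different type

print(first, second, third, len(strings.proc_string), len(strings.type_string))
```

However large the description of a type gets, the cost of mentioning it again in the proc string stays at two bytes in this model.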
Finding the Format Strings
ObjectStubless simply calls into ObjectStublessClient, passing the method index (from ecx) as a parameter. Finally, ObjectStublessClient teases out the format strings from the vtable and jumps to NdrClientCall2. Like NdrStubCall2, this RPCRT4.DLL routine performs the interpretive marshaling and unmarshaling just as if a compiled proxy and stub were in use.
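The chain of calls on the client side can be imitated in Python as a rough sketch. Everything here is a stand-in: struct format strings play the part of the /Oicf byte code, ndr_client_call merely packs the arguments instead of performing a real ORPC call, and the StublessProxy class, its method names, and their parameter types are hypothetical. What the model does preserve is the shape of the mechanism: every vtable slot is a tiny thunk that knows only its own method index, and a single shared routine uses that index to find the format string and hand it to the interpreter.

```python
import struct


def ndr_client_call(proc_format, args):
    """Stand-in for the interpreter: marshal args as directed by the format string."""
    return struct.pack(proc_format, *args)


def object_stubless_client(proxy, method_index, args):
    """Shared routine: find the format string for this method, then interpret it."""
    proc_format = proxy.format_strings[method_index]
    return ndr_client_call(proc_format, args)


def make_thunk(method_index):
    """Build one vtable entry; all it contributes is its method index."""
    def thunk(proxy, *args):
        return object_stubless_client(proxy, method_index, args)
    return thunk


class StublessProxy:
    def __init__(self, method_names, format_strings):
        self.format_strings = format_strings
        # One thunk per slot; a factory function avoids the late-binding closure pitfall.
        self.vtable = [make_thunk(i) for i in range(len(method_names))]
        self._slots = {name: i for i, name in enumerate(method_names)}

    def call(self, name, *args):
        return self.vtable[self._slots[name]](self, *args)


# Hypothetical interface: MethodX(long, short) and MethodY(double).
proxy = StublessProxy(["MethodX", "MethodY"], ["<ih", "<d"])
print(proxy.call("MethodX", 1, 2).hex())
```

Adding a method to this proxy means adding one more format string, not one more marshaling routine.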
While the previous discussion has glossed over many details (virtually all of which are undocumented and subject to change), the upshot is that the vtables in an /Oicf-based marshaler are annotated with format strings that completely describe the interface. The folks who work on OLE32.DLL and RPCRT4.DLL need to know every detail, but most of us don't. Just use /Oicf and watch your proxy/stub DLL size shrink dramatically.
The Type Library Marshaler
Note that these two functions, CreateProxyFromTypeInfo and CreateStubFromTypeInfo, have signatures that are virtually identical to the IPSFactoryBuffer methods CreateProxy and CreateStub. One notable exception is the presence of an ITypeInfo parameter, which is used by RPCRT4 to reverse-engineer the /Oicf format strings.
These two routines make the type library marshaler's job quite simple. Figure 12 shows the pseudocode for the type library marshaler's implementations of CreateProxy and CreateStub. Note that the two methods simply load the type library and pass the requested interface's ITypeInfo to the corresponding RPCRT4 routine. The two CreateXXXFromTypeInfo routines do a fair amount of caching to amortize the cost of generating vtables and format strings. The type library marshaler also performs some caching to reduce the overhead of looking up the type information at each unmarshal. Once the unmarshal is complete, the resultant interface proxy and stub are no different than the ones generated by the MIDL compiler. The primary performance difference is that /Oicf-based proxy/stub DLLs can initialize the first interface proxy or stub faster since the /Oicf format strings are precompiled in the DLL. The type library marshaler must generate them on the fly at the first unmarshal.
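The first-unmarshal cost and its amortization can be illustrated with a toy model in Python. The TypeLibMarshalerModel class, the dictionary that stands in for a type library, and the string format it "generates" are all hypothetical; the real caching strategy is not documented here. The sketch only shows the general pattern of generating format information from type information once per interface and reusing it afterward.

```python
class TypeLibMarshalerModel:
    """Toy model: derive marshaling info from type info once per interface."""

    def __init__(self, type_library):
        # type_library maps an IID to {method name: [parameter type names]},
        # a stand-in for the ITypeInfo the real marshaler would load.
        self._type_library = type_library
        self._cache = {}
        self.generated = 0  # how many times the expensive step has run

    def _format_from_type_info(self, type_info):
        """Stand-in for reverse-engineering format strings from type information."""
        self.generated += 1
        return tuple((name, ",".join(params)) for name, params in type_info.items())

    def create_proxy(self, iid):
        """Return cached marshaling info, generating it on the first request only."""
        if iid not in self._cache:
            type_info = self._type_library[iid]
            self._cache[iid] = self._format_from_type_info(type_info)
        return self._cache[iid]


marshaler = TypeLibMarshalerModel({"IID_IBob": {"MethodX": ["long", "short"]}})
first = marshaler.create_proxy("IID_IBob")   # pays the generation cost
second = marshaler.create_proxy("IID_IBob")  # served from the cache
print(first, marshaler.generated)
```

A precompiled /Oicf proxy/stub DLL corresponds to having the cache filled in at build time, which is why only the first unmarshal differs.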
Choices, Choices, Choices
Have a question about programming with ActiveX or COM? Send your questions via email to Don Box: dbox@develop.com or http://www.develop.com/dbox.
From the January 1999 issue of Microsoft Systems Journal.