Building a Lightweight COM Interception Framework Part 1: The Universal Delegator MSJ, January 1999



Keith Brown


This article assumes you're familiar with C++, COM

Code for this article: Delegate.exe (82KB)
Keith Brown works at DevelopMentor, developing the COM and Windows NT Security curriculum. He is coauthor of Effective COM (Addison-Wesley, 1998), and is writing a developer's guide to distributed security. Reach Keith at http://www.develop.com/kbrown.

In March of 1997, I fell in love with the idea of smart proxies. But I was unhappy to discover that handler marshaling, while mentioned in the COM spec (two whole paragraphs), didn't work as advertised. The idea behind handler marshaling is that an object can simply implement IStdMarshalInfo and specify the CLSID of an in-process handler (otherwise known as a custom proxy), which will be instantiated in place of the standard proxy. The custom proxy can then aggregate any of the standard proxy's interfaces to get the best of both worlds: hands-off remoting for interfaces the proxy doesn't particularly care about, and full in-process implementations of interfaces the proxy does care about. It sounded like a silver bullet for achieving the ideal of local/remote transparency that Kraig Brockschmidt wrote about long ago.
      I quickly discovered that in Windows NT® 4.0, while CoMarshalInterface queries for IStdMarshalInfo, it does absolutely nothing with it. Even if it did, there is no mechanism to aggregate the standard proxy; handler marshaling is apparently reserved for the special case of OLE rendering handlers, which unfortunately isn't much help to those of us developing distributed systems. While this is fixed in Windows 2000, I want a solution that works now.
      Don Box introduced me to a neat idea with which you can fake handler marshaling. The idea is elegant: simply implement custom marshaling on an object, specify the CLSID of a custom proxy via IMarshal::GetUnmarshalClass, then implement IMarshal::MarshalInterface by asking the standard marshaler to write a STDOBJREF into the stream, which also sets up the stub manager. In the custom proxy's implementation of IMarshal::UnmarshalInterface, simply call CoUnmarshalInterface to unmarshal a standard proxy, which will connect back to the stub. This mechanism allows you to inject a bit of in-process code (a smart proxy) into the client's process, which starts life holding a standard proxy it can use to access the remote object using COM. One flaw of this scheme is that it still offers no way to aggregate the standard proxy, so the custom proxy has to implement all the interfaces of the remote object explicitly, even if it means simply writing code to delegate each method call to the standard proxy.
      To try to solve this problem, I wrote a component that I called the delegator. The delegator was a simple COM object that would wrap any other COM object, without requiring type information. The only interface the delegator actually implemented was IUnknown, but the implementation supported aggregation—the delegator could be used to wrap a standard proxy so that my custom proxy could aggregate it without having to write all the delegation code by hand. I used inline assembler to provide a generic vtbl that would automatically delegate all other method calls directly to the wrapped object with very little overhead.
      From such humble beginnings, I soon discovered a whole host of other applications for the delegator—from transparent tracing of COM method calls across the network (implemented in conjunction with Chris Sells and presented at the Software Development conference in San Francisco earlier this year), to automating the adjustment of security blankets on proxies. The Universal Delegator (UD) described here is the culmination of this work. It's a lightweight interception framework that lets you aggregate any object and, much more importantly, compose generic services onto any object, à la Microsoft® Transaction Server (MTS) and COM+.
Motivation
      COM is all about binary encapsulation. This allows a vendor to provide a component that others can use, without worrying about source-level details like language compatibility, compiler versions, and so on. However, a big piece is missing from the current model; there is no infrastructure for transparently adding out-of-band functionality to existing COM objects.
      As an example, consider the difficulty inherent in adding a generic auditing layer to a set of existing COM objects. If the objects' source code is available, it would be intellectually trivial (although physically laborious) to visit each method in each interface on each class and add code that records each method call in an audit log. However, if the source code is not available, it would be very challenging to add an auditing layer after the fact. The only way to make the auditing layer completely transparent would be to provide a layer of in-process COM objects that looks exactly like the existing objects, whose implementation simply logs an audit record and forwards each method call to the real object (see Figure 1).
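      To make the cost of hand-written mirroring concrete, here is a minimal sketch of one mirrored interface. It is plain C++ rather than real COM: ICalculator, Calculator, and AuditingCalculator are hypothetical names invented for illustration, and the audit log is just a vector of strings. The point is that every method of every interface needs its own forwarding stub.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical interface standing in for one COM interface on an existing object.
struct ICalculator {
    virtual ~ICalculator() = default;
    virtual int Add(int a, int b) = 0;
    virtual int Multiply(int a, int b) = 0;
};

// The "real" object, whose source code we pretend not to have.
class Calculator : public ICalculator {
public:
    int Add(int a, int b) override { return a + b; }
    int Multiply(int a, int b) override { return a * b; }
};

// Hand-written mirror: exposes the same interface, appends an audit record,
// then forwards the call to the real object.
class AuditingCalculator : public ICalculator {
public:
    AuditingCalculator(ICalculator& inner, std::vector<std::string>& log)
        : inner_(inner), log_(log) {}

    int Add(int a, int b) override {
        log_.push_back("ICalculator::Add");
        return inner_.Add(a, b);
    }
    int Multiply(int a, int b) override {
        log_.push_back("ICalculator::Multiply");
        return inner_.Multiply(a, b);
    }

private:
    ICalculator& inner_;
    std::vector<std::string>& log_;
};

// Usage: the client sees the same interface, and each call leaves a record.
inline void AuditingUsageExample() {
    std::vector<std::string> log;
    Calculator real;
    AuditingCalculator audited(real, log);
    assert(audited.Add(2, 3) == 5);
    assert(log.size() == 1 && log[0] == "ICalculator::Add");
}
```

      Two methods cost two stubs here. A real object exposing a dozen interfaces would need one such class per interface, which is the tedium discussed next.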

Figure 1 Mirroring COM Objects


      One obvious problem with this approach is its tediousness. For each class of object, the auditing layer must provide an in-process implementation of all of the interfaces exposed from the underlying object via QueryInterface. Any sane developer faced with this sort of tedium will prefer to develop a code generator that automates this task (having a type library is helpful here). However, even this approach has its limits. First, it adds unnecessary code bloat, as each class must be mirrored individually in the in-process auditing layer. Second, unless the auditing layer is designed carefully, it may introduce new and undesirable semantics as well as significant performance penalties.
      To understand this, imagine that an existing object runs in apartment A, and that some code in apartment B holds a standard proxy to that object. Proxies have an interesting property that allows them to be exported to other apartments without generating an intermediary stub. In other words, using standard marshaling, COM never creates proxies to proxies. The standard proxy accomplishes this magic by implementing IMarshal and effectively marshaling the same OBJREF from which it was originally created. It's as if the proxy marshals itself by value because, upon unmarshaling, COM creates a new proxy that directly references the original stub. Figure 2 shows the effect of this behavior, which is not only more efficient than creating proxies to proxies, but also imperative due to the lifetime issues I'll discuss later.

Figure 2 Multiple Proxies-Single Stub


      Unfortunately, the naïve use of layering can break the standard proxy's marshaling scheme and cause a middleman stub to be created. Figure 3 shows the marshaling behavior when an in-process layer is added on top of a standard proxy. If an interface pointer to the layer is marshaled to another apartment, unless the layer explicitly custom marshals, the standard marshaling architecture will take over and create another proxy/stub pair, causing the original layer object to act as a middleman. Even if apartment B releases the layer object, the stub still holds references to it and will keep it alive to service requests from apartment C. Apartment B isn't aware of this, however, and may shut down prematurely, abruptly tearing down the connection between apartments A and C. The obvious solution is to add custom marshaling to each class in the auditing layer, but there's got to be a better way.

Figure 3 Multiple Proxies-Multiple (Middleman) Stubs


Interception
      The ultimate solution to this problem is better plumbing. In fact, COM+ was designed around the notion of composing generic services onto binary COM components to form more sophisticated systems. The goal is to factor out code that is orthogonal to the logical behavior of an object (for example, auditing, transaction propagation, and security access checks). This factoring reduces the object to its essentials, allowing it to focus on the one job it needs to accomplish in its problem domain. Factoring also leads to code reuse and all the benefits that accompany reuse (the tradeoff is a higher level of complexity in these composed systems). In the auditing example described earlier, factoring out the auditing layer into a composable service lets you reuse the auditing service with any type of object. The underlying objects can be used with or without the auditing layer.
      The solution in COM+ is called interception, and it involves interposing a lightweight wrapper between the object and its clients. This wrapper is a generic layer that sends events to composable services, allowing them to work their magic. These services often have caller-side code that communicates with object-side code, which requires some mechanism for passing out-of-band information between the two. (See the January 1998 ActiveX® column in MSJ for details on how this can be done with the DCOM wire protocol as it stands today.) COM+ interposes services between objects based on a concept known as context. In COM+, each class of object declares a set of attributes that define the type of environment the object wants to live in. These attributes include synchronization, security, and transaction support, among others. Objects that have similar attributes—and that live within the same apartment—are said to share a context. When an interface pointer is unmarshaled into a different context, COM+ injects this generic interception layer between the objects.
      A mainstream example of a simple interception architecture is the Context Wrapper (CW) provided for MTS components. This is effectively a server-side layer that provides a very specific set of services: transactions, concurrency management, JIT activation, and security access checks. The beauty of this solution is that MTS provides this layer transparently. The CW for any given object is constructed dynamically based on the IDL definition for each interface, obtained either via a type library or an undocumented data format known as an "Oicf string" (pronounced "oh-I-see-eff") present in modern proxy/stub DLLs. (The name comes from the MIDL command-line switch /Oicf that produces fully interpreted marshaling code.) The CW implements IUnknown, which allows it to hand out its own interface pointers in response to QueryInterface calls. MTS objects must be careful never to short-circuit this mechanism by handing out their own interface pointers directly, so MTS provides an API known as SafeRef that allows an object to obtain an interface pointer to its associated CW that is safe to pass as a parameter.
Today's Limitations
      While COM+ interception provides both caller and object-side hooks, the MTS CW is purely an object-side phenomenon, which means that clients cannot use MTS to compose services onto arbitrary objects (the obvious case being when a client only has a proxy, since MTS does not wrap proxies). Other than that, the main limitation of both the MTS CW and the COM+ interception architecture (at least version 1.0) is that they are both proprietary, nonextensible mechanisms. It is impossible for a third party to transparently compose a new service onto an application based on MTS or COM+ 1.0. COM+ 2.0 is slated to support extensible interception, hopefully after ferreting out most of the difficult issues that such an extensibility framework presents. I developed the UD to help fill in some of these gaps in the meantime.
      The UD is an extensible interception architecture that allows COM programmers to develop services that can be composed onto binary COM components via layering. The goal of the UD is to provide a solid implementation of the somewhat grungy plumbing required by such a layer, including dealing with the weird issues introduced by marshaling. Just like the CW in MTS, the UD dynamically generates a layer object that sits between the client and the original object. The main difficulty with generating this layer object is the potential lack of type information that, at first glance, seems necessary to correctly generate a delegating implementation of each interface exposed by an object.
      As most COM developers know, type libraries are used when compatibility with high-level languages is required (Visual Basic® and Java are the canonical examples). However, for low-level COM interfaces designed for efficiency on the wire, type libraries are neither necessary nor desirable due to their lossy nature (many imperative IDL attributes such as size_is and iid_is are not preserved in a type library). Without a type library to describe the interfaces on an object, and without documentation for the Oicf string format, it is virtually impossible to generate an implementation of any arbitrary interface automatically.
      The not-so-obvious solution to this problem is to create an implementation of an interface that will suffice no matter what the actual physical interface looks like. The UD usually doesn't care about the parameters being passed in any given method call, since most of the services provided are orthogonal to the particular interface being used. Constructing this generic implementation is impossible even in a low-level language like C or C++. However, the UD is plumbing, and since it only needs to be written once, a little inline assembly can work the required magic.

Figure 4 UD Architecture


      Before diving into the fun-but-grungy ASM, it would be useful to see the big picture. Figure 4 demonstrates the overall UD architecture. All the tricky interception plumbing is hidden away inside the generic UD component, while the interception policy (auditing, for instance) is factored into a separate pluggable COM component called a hook that can preprocess or postprocess each method call. This makes it relatively easy to design interesting services that can be composed with existing objects.
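      The split between plumbing and policy can be sketched in a few lines of portable C++. This is only an analogy under stated assumptions: ICallHook, PreCall, PostCall, and Interceptor are hypothetical names, not the UD's actual hook interface, and a function template stands in for the vtbl-level interception that the real UD performs without type information.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical hook interface; the names are illustrative, not the UD's API.
struct ICallHook {
    virtual ~ICallHook() = default;
    virtual void PreCall(const std::string& method) = 0;
    virtual void PostCall(const std::string& method) = 0;
};

// Policy: an auditing hook that records entry to and exit from each call.
class AuditHook : public ICallHook {
public:
    void PreCall(const std::string& method) override { records.push_back("enter " + method); }
    void PostCall(const std::string& method) override { records.push_back("leave " + method); }
    std::vector<std::string> records;
};

// Plumbing: knows nothing about auditing. It only brackets each call with the
// hook's preprocessing and postprocessing steps. A null hook means no policy.
class Interceptor {
public:
    explicit Interceptor(ICallHook* hook) : hook_(hook) {}

    template <typename Fn>
    auto Invoke(const std::string& method, Fn fn) -> decltype(fn()) {
        if (hook_) hook_->PreCall(method);
        PostCallGuard guard{hook_, method};  // runs PostCall when Invoke returns
        return fn();
    }

private:
    struct PostCallGuard {
        ICallHook* hook;
        const std::string& method;
        ~PostCallGuard() { if (hook) hook->PostCall(method); }
    };
    ICallHook* hook_;
};

// Usage: the same plumbing works with any hook, or with none at all.
inline void InterceptorUsageExample() {
    AuditHook hook;
    Interceptor intercepted(&hook);
    int sum = intercepted.Invoke("Add", [] { return 2 + 3; });
    assert(sum == 5);
    assert(hook.records.size() == 2);
}
```

      Swapping AuditHook for another ICallHook changes the service without touching Interceptor, which is the property that makes hooks pluggable.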
Using the Universal Delegator
      MTS provides a transparent framework for automating the composition of an interception layer, by hooking the class factory for an MTS-assimilated in-process coclass. Whenever a client creates a new instance of the coclass, the MTS-generated class factory creates a new CW and hands an interface pointer from the CW back to the client. The CW then creates the actual object and delegates method calls to it after providing whatever out-of-band services the object requires (for instance, synchronization and security access checks).
      The UD acts much like the CW, but provides no automated mechanism for attaching it to new objects. This allows the UD to be used explicitly in many contexts, by both clients and servers. (It also allows me to see my family at night instead of trying to reengineer MTS.) Consequently, the UD must be applied manually, via the IDelegatorFactory interface on the UD's class object, to each object that requires its services. The policy for when objects are wrapped and which objects are wrapped is left up to the developer using the UD, which allows you to create your own unique frameworks as you see fit.
      To create a new instance of the UD to wrap an object, invoke the CreateDelegator method (shown in Figure 5) and pass an interface pointer from the original object via the pUnkInner parameter. The output of CreateDelegator is an interface pointer to the newly created UD obtained via the iid and ppv parameters. Note that the UD always supports aggregation, even if the original object did not; this is the reason for the pUnkOuter parameter (this feature is useful by itself). Each instance of the UD may or may not have a hook attached; without a hook, the UD may be used simply to simulate aggregation support on an object that doesn't normally support aggregation.
      Providing a simple hook that doesn't require any custom initialization only requires the CLSID of the hook, which the UD will use in a call to CoCreateInstance (CLSCTX_INPROC_SERVER). An example would be an auditing hook that simply dumps a tracing message via OutputDebugString each time a method is invoked. This type of hook does not require any custom data to know how to do its job. However, to direct the hook to a particular log file, perhaps specified by a UNC path, the hook must be created manually (via CoCreateInstance, or some internal mechanism), initialized with the file name via some custom interface implemented on the hook, and then passed as a pointer to the UD. This is the rationale behind the two parameters for specifying a hook: pclsidHook and pHook. Use one or the other (but not both), depending on your needs.
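      The rule for the two hook parameters can be written down as a small decision function. The sketch below is an assumption-laden illustration, not the UD's code: only the parameter names pclsidHook and pHook come from the text, while Guid, IUnknownLike, HookChoice, and ClassifyHookArgs are hypothetical stand-ins so that the example compiles without the Windows SDK. How the real UD reacts when both parameters are supplied is not stated here, so the sketch merely flags that case.

```cpp
#include <cassert>

// Stand-ins so the sketch compiles without the Windows SDK (hypothetical types).
struct Guid { unsigned long data1; };
struct IUnknownLike { virtual ~IUnknownLike() = default; };

// The four situations a caller can put the delegator factory in.
enum class HookChoice {
    None,                 // no hook: the delegator only adds aggregation support
    CreateFromClsid,      // simple hook: created from its CLSID, no custom setup
    UseExistingInstance,  // pre-initialized hook: created and configured by the caller
    Invalid               // both supplied: the text says to use one or the other
};

// Classifies the pclsidHook / pHook pair described in the text.
inline HookChoice ClassifyHookArgs(const Guid* pclsidHook, IUnknownLike* pHook) {
    if (pclsidHook && pHook) return HookChoice::Invalid;
    if (pclsidHook) return HookChoice::CreateFromClsid;
    if (pHook) return HookChoice::UseExistingInstance;
    return HookChoice::None;
}

// Usage: a tracing hook needs no state, so its CLSID alone is enough.
inline void HookArgsUsageExample() {
    Guid tracingHookClsid{0x1234};  // made-up value for illustration
    assert(ClassifyHookArgs(&tracingHookClsid, nullptr) == HookChoice::CreateFromClsid);
}
```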
Using the UD in Client Code
      Unlike MTS, a client can compose the UD on top of an interface pointer obtained from anywhere. This is useful for adjusting the security settings of a proxy. As I discussed in November's Security-Briefs column, in-process objects that are loaded into a client application (for instance, an ActiveX control in a browser) have very little control over their security environment, and often need to call IClientSecurity::SetBlanket explicitly to adjust security settings manually on proxies. Usually the goal is to either turn off authentication altogether or explicitly request that the proxy use an alternate set of credentials.
      The pain comes when the proxy exposes several interfaces because SetBlanket only affects the security settings of an individual interface proxy. This means the developer often has to write code to adjust the settings on each interface pointer before it is used. In fact, many developers get bitten because they don't remember that SetBlanket also needs to be invoked explicitly for IUnknown. These, plus other issues (discussed later), motivate the use of the UD for this purpose.
      Instead of explicitly writing the call to SetBlanket for each and every interface pointer, you could develop a simple hook that does this automatically. Each proxy that requires specific security settings could be wrapped with its own UD, and the associated hook would call SetBlanket on every interface pointer handed out via QueryInterface, saving typing and potential bugs.
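      The per-interface nature of the problem, and the way a hook removes it, can be modeled without any COM at all. In the sketch below, FakeProxy and AnonymousWrapper are hypothetical stand-ins for a standard proxy and a UD with a security hook; interface IDs are plain strings; and assigning AuthnLevel::None stands in for the call to IClientSecurity::SetBlanket. The two enumerator values mirror RPC_C_AUTHN_LEVEL_DEFAULT and RPC_C_AUTHN_LEVEL_NONE.

```cpp
#include <cassert>
#include <map>
#include <string>

// Mirrors two of the RPC_C_AUTHN_LEVEL_* values; not taken from the Windows headers.
enum class AuthnLevel { Default = 0, None = 1 };

// Stand-in for a standard proxy: each interface has its own interface proxy,
// and each interface proxy carries its own security settings.
class FakeProxy {
public:
    // Models QueryInterface: returns the settings of the interface proxy for
    // iid, creating it with default settings on first request.
    AuthnLevel& InterfaceProxy(const std::string& iid) { return levels_[iid]; }

private:
    std::map<std::string, AuthnLevel> levels_;
};

// Stand-in for a UD with a hook that disables authentication on every
// interface pointer the client asks for.
class AnonymousWrapper {
public:
    explicit AnonymousWrapper(FakeProxy& proxy) : proxy_(proxy) {}

    AuthnLevel& QueryInterface(const std::string& iid) {
        AuthnLevel& level = proxy_.InterfaceProxy(iid);
        level = AuthnLevel::None;  // where a real hook would call SetBlanket
        return level;
    }

private:
    FakeProxy& proxy_;
};

// Usage: interfaces obtained through the wrapper are adjusted automatically,
// including IUnknown, which is easy to forget when doing this by hand.
inline void BlanketUsageExample() {
    FakeProxy proxy;
    AnonymousWrapper wrapped(proxy);
    assert(wrapped.QueryInterface("IUnknown") == AuthnLevel::None);
    assert(wrapped.QueryInterface("IAccount") == AuthnLevel::None);
}
```

      An interface obtained directly from the proxy, bypassing the wrapper, keeps its default settings, which is exactly the bug the hook is meant to prevent.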
      A hook (packaged with the UD) known as the Anonymous Hook is designed to work with a proxy, and disables authentication on any interface pointer that the client asks for on that particular proxy. To use this hook, simply call CreateDelegator, passing CLSID_CoAnonymousDelegatorHook for the pclsidHook parameter.
      Another hook that is useful for clients is the Alternate Credentials Hook, which allows a client to explicitly configure the security settings for all the interfaces on a proxy. Typically, the most painful setting is the identity that the proxy will use to make outgoing calls, and the hook simplifies this tremendously. By specifying an alternate user account and password, the hook will annotate each interface proxy (as the client QIs for each interface) with this alternate identity automatically. Each call from the client will be executed using the alternate account. This is useful when the client is a service running under the SYSTEM account, which has no network credentials and thus cannot normally make authenticated calls through a proxy to a remote object. This particular hook exposes an interface that allows individual security settings to be overridden, while leaving others at their default settings (see Figure 6). This hook is also packaged with the UD, and Figure 7 shows some sample code that makes use of it.
      The UD was designed to be easy to use, even in complex scenarios such as multiple-apartment processes. Earlier I discussed a problem regarding layering an in-process object over a standard proxy. When marshaling a proxy from one apartment to another, the proxy normally copies itself to the new apartment. However, when a layer is constructed over a proxy, the proxy's IMarshal implementation is hidden, leading to middlemen and all kinds of nasty lifetime issues. Since the UD exposes implementations of all interfaces supported by the wrapped object, the implementation of IMarshal on a standard proxy seeps through. So when a client wraps a proxy in the delegator, if the delegator is marshaled to another apartment, the underlying proxy will in fact be asked to marshal itself. When unmarshaled in a new apartment, this results in a pointer to a copy of the proxy, which is generally a much better situation than having the UD act as a middleman as in Figure 3.
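      The difference between a layer that hides IMarshal and one that lets it seep through comes down to how the layer answers QueryInterface. The following toy model assumes hypothetical names throughout (IObject, Supports, StandardProxyModel, NaiveLayer, DelegatingLayer, GetsMiddlemanStub) and uses strings for interface IDs. It relies on one real COM behavior: when an object does not expose IMarshal, standard marshaling takes over and builds a new proxy/stub pair.

```cpp
#include <cassert>
#include <string>

// Toy model of QueryInterface: does the object expose the named interface?
struct IObject {
    virtual ~IObject() = default;
    virtual bool Supports(const std::string& iid) const = 0;
};

// A standard proxy exposes the remote object's interfaces plus IMarshal.
class StandardProxyModel : public IObject {
public:
    bool Supports(const std::string& iid) const override {
        return iid == "IMarshal" || iid == "IAccount";
    }
};

// Hand-written layer: answers only for the interface it was coded to mirror,
// so the proxy's IMarshal is hidden.
class NaiveLayer : public IObject {
public:
    explicit NaiveLayer(const IObject& inner) : inner_(inner) {}
    bool Supports(const std::string& iid) const override {
        return iid == "IAccount" && inner_.Supports(iid);
    }
private:
    const IObject& inner_;
};

// Delegating layer: exposes whatever the wrapped object exposes.
class DelegatingLayer : public IObject {
public:
    explicit DelegatingLayer(const IObject& inner) : inner_(inner) {}
    bool Supports(const std::string& iid) const override { return inner_.Supports(iid); }
private:
    const IObject& inner_;
};

// Without IMarshal, standard marshaling creates a new stub in front of the
// layer, turning the layer's apartment into a middleman.
inline bool GetsMiddlemanStub(const IObject& obj) { return !obj.Supports("IMarshal"); }

// Usage: only the naive layer ends up as a middleman.
inline void MarshalingUsageExample() {
    StandardProxyModel proxy;
    NaiveLayer naive(proxy);
    DelegatingLayer delegating(proxy);
    assert(GetsMiddlemanStub(naive));
    assert(!GetsMiddlemanStub(delegating));
}
```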

Figure 8 Using the UD in Multiple Apartments


      The only problem with this picture is that the delegator drops off the top of the proxy in the new apartment (see Figure 8). Imagine using the Alternate Credentials Hook on a proxy, and then tucking the UD away into the Global Interface Table (GIT) so it can be shared with other apartments in the process. If the UD doesn't do something special, each apartment that calls GetInterfaceFromGlobal will retrieve the original proxy, not the UD, which means another UD will need to be wrapped around the proxy in the new apartment. This is tedious at best, and motivates the need for the DO_MBV_XXX flags that can be passed to CreateDelegator. Specifying one or more of these flags causes the delegator to expose its own implementation of IMarshal to marshal itself by value, taking the hook along with it. Figure 9 shows the behavior when using this type of delegator.

Figure 9 Using the UD's IMarshal Interface


      Normally, when importing an interface pointer into an apartment (via the GIT, for example), the interface proxies are not copied when a new proxy manager is created. Instead, the proxy manager regenerates them on demand, so the security settings held by the interface proxies in the original apartment are lost. A UD created with the DO_MBV_INPROC option and the Alternate Credentials Hook prevents this annoying situation, since the proxy can be wrapped a single time, placed into the GIT, and the security settings (stored in the hook) will appear to follow the proxy from apartment to apartment transparently. I will explain how the delegator implements this feature in my next article.
      Careless use of this feature can lead to a security breach, however. Recall that when specifying alternate credentials for a proxy, a cleartext password is required. If a UD created with the DO_MBV_ALL flag (which is just a bitwise OR of all the other DO_MBV_XXX flags) is exported out of the process, potentially to another machine, the initialized hook goes with it. In the case of the Alternate Credentials Hook, which may store a cleartext password as part of its state, this causes the password to be transmitted across the wire in the clear unless the PKT_PRIVACY authn level is used for the call. Even with PKT_PRIVACY, it is not normally desirable to propagate passwords without bound. Anyone obtaining the wrapped proxy can easily call QueryBlanket to obtain a user account along with its corresponding password. This was one of the motivating factors for providing clear boundaries for the DO_MBV_XXX flags. When using these flags, carefully consider how far the delegator should propagate itself. Often, DO_MBV_INPROC is best. Figure 10 shows a summary of the DO_MBV_XXX options and their associated boundaries.
Using the UD in Server Code
      The UD can be used by servers as well as clients. It is often useful for a server to place an interception layer on top of the objects it exposes, similar to the way the CW is applied in MTS applications. One benefit of this mechanism is that it is completely transparent to the client; that is, the client is not required to intervene or provide any infrastructure other than having an OS that supports basic COM. The server can therefore design objects that know how to do one thing well, and the interception layer can deal with orthogonal issues such as auditing, synchronization, and security access checks.
      To use the UD in this fashion safely, you must avoid exposing interface pointers directly to the underlying object. Instead, only hand out pointers to the UD. Throughout its lifetime, the UD will hold various interface pointers to the original object as necessary so that the original object will stay alive as long as the UD is alive. This means that once an object is wrapped, all direct references to the object can be released (if desired), helping to guarantee that a naked pointer to the original object won't slip through to a client. The UD specifically implements IUnknown to ensure that calls to QueryInterface result in interface pointers that are exposed from the UD, not from the original object. However, if the object wants to pass one of its own interface pointers as an argument via a method on a COM interface, it should pass a pointer to the UD instead of a raw pointer to itself. Otherwise, incoming calls through this direct reference will not be intercepted by the UD, and the composed services of the hook will be lost.
      MTS deals with this issue via a well-known function, SafeRef. This function makes use of a dictionary that allows any given object to obtain a pointer to its corresponding CW, which can be safely passed as an argument to a COM method call. This dictionary is not necessary when using the UD because, unlike in MTS, the interception layer is created explicitly rather than transparently. An object that plans on handing out pointers to itself may simply hold a weak reference (no AddRef) to its own UD, which can be obtained when the object is first wrapped. To obtain interface pointers to safely pass via COM method calls, simply call QueryInterface on the UD.
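      The weak-reference arrangement can be sketched with two toy reference-counted classes. Everything here is a hypothetical stand-in: RefCounted replaces IUnknown's lifetime methods (and never deletes anything), Wrapper plays the role of the UD, and AttachWrapper and SafeSelf are invented names. The sketch shows why the back pointer must not be counted: the wrapper already holds the object, so a counted pointer in the other direction would form a cycle.

```cpp
#include <cassert>

// Toy replacement for IUnknown's lifetime methods. A real COM object would
// destroy itself when the count reaches zero; this model only counts.
struct RefCounted {
    virtual ~RefCounted() = default;
    void AddRef() { ++refs; }
    void Release() { --refs; }
    int refs = 0;
};

class Wrapper;  // stands in for the UD

class ServerObject : public RefCounted {
public:
    // Called once, when the object is first wrapped. No AddRef: weak reference.
    void AttachWrapper(Wrapper* wrapper) { wrapper_ = wrapper; }

    // Returns a counted pointer that is safe to pass as a method argument.
    Wrapper* SafeSelf() const;

private:
    Wrapper* wrapper_ = nullptr;
};

class Wrapper : public RefCounted {
public:
    explicit Wrapper(ServerObject& inner) : inner_(inner) {
        inner_.AddRef();             // the wrapper keeps the object alive
        inner_.AttachWrapper(this);  // the object does not keep the wrapper alive
    }
    ~Wrapper() override { inner_.Release(); }

private:
    ServerObject& inner_;
};

inline Wrapper* ServerObject::SafeSelf() const {
    if (wrapper_) wrapper_->AddRef();  // models calling QueryInterface on the UD
    return wrapper_;
}

// Usage: the object hands out its wrapper, never a raw pointer to itself.
inline void SafeSelfUsageExample() {
    ServerObject object;
    Wrapper ud(object);
    Wrapper* safe = object.SafeSelf();
    assert(safe == &ud);
    safe->Release();
}
```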
      Part 1 of this article explained the need for an extensible interception architecture, and provided an overview of the Universal Delegator along with some sample scenarios where it can be put to use. Next month, I will present the second half of the article, which discusses hooks: how they work and how you can implement them. I will also take an in-depth look at how the delegator implements interception (anyone who wondered if COM objects could be implemented in assembly language will get a kick out of this).


For related information see: The Basics of Programming Model Design http://msdn.microsoft.com/library/techart/msdn_basicpmd.htm.

From the January 1999 issue of Microsoft Systems Journal.