Basic Marshaling Architecture (Custom Marshaling)

The most fundamental operation in Local/Remote Transparency is to take an interface pointer to an object in one process and somehow create the structures that allow a client in another process to call member functions through that interface pointer. Marshaling, in its most literal form, is the process for making this interface pointer available. Again, OLE's standard marshaling is simply one way to achieve this. Custom marshaling is essentially the generic mechanism that lets any object specify exactly how it communicates with a proxy in another process, if such communication is even necessary.

To understand this mechanism and how standard marshaling uses it, let's begin with a server process that has an IClassFactory pointer that it has just passed to CoRegisterClassObject. We start here because the class factory is the first object that any server process will create and the class factory's interface pointer is the first pointer that requires marshaling. Inside CoRegisterClassObject (and within CoGetClassObject in the client's process), the sequence of operations that occurs for this first interface pointer is exactly the same as the sequence for the first interface pointer to any other object in the server, for example, the one returned from IClassFactory::CreateInstance.4 In short, the basic marshaling process is used whenever a function in one interface returns the first interface pointer to a new and separate object, which must then be marshaled independently. Because custom marshaling happens on an object-by-object basis, an object that uses one form of marshaling must be able to return an interface pointer to an object that uses a different form. Each object has the right to control whatever sort of marshaling it wants to use. If it wants no control, OLE's standard marshaling is used as a default.

Inside CoRegisterClassObject, COM has an IClassFactory pointer that it must somehow marshal to the other process. This happens through the following steps, which are the same for any interface pointer to any object regardless of the form of marshaling:

1. Inside CoRegisterClassObject, COM asks the object for the CLSID of the proxy it requires in the client's process. If the object does not provide a specific CLSID, COM will use a standard marshaling proxy.

2. COM attempts to ask the object for a marshaling packet, which is a stream of bytes containing whatever information the proxy needs to create the interprocess connection to the object. If the object does not provide a packet, COM creates one appropriate for standard marshaling.

3. COM passes the proxy CLSID and the marshaling packet to the client process, where it shows up inside CoGetClassObject.

4. In the client's process, COM creates an instance of the proxy using the CLSID retrieved in step 1 and passes to it the marshaling packet retrieved in step 2.

5. The proxy uses whatever means necessary to connect with the object and returns an interface pointer to COM. COM then returns that pointer to the client on return from CoGetClassObject. This pointer is the one through which the client can now make calls that the proxy will forward to the object as necessary.

Steps 1 and 2 are wrapped inside the COM API function named CoMarshalInterface. Steps 4 and 5 are wrapped inside CoUnmarshalInterface. These two functions represent the core of marshaling.
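The five steps can be modeled in portable C++ to make the division of labor visible. The following is only a sketch under stated assumptions: it runs in a single process, a CLSID is modeled as a string, the marshaling packet as a byte vector, and every name in it (ICounter, Counter, CounterProxy, Marshal, Unmarshal, Endpoints) is hypothetical rather than part of COM. The Endpoints table stands in for whatever IPC channel a real proxy would open.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins, not COM types.
using ClassId = std::string;                 // models a CLSID
using Packet  = std::vector<std::uint8_t>;   // models the marshaling packet

struct MarshaledInterface {                  // what crosses the boundary in step 3
    ClassId proxyClass;
    Packet  packet;
};

struct ICounter {                            // the interface the client calls through
    virtual ~ICounter() = default;
    virtual int Next() = 0;
};

// Simulated IPC endpoints: id -> live server object.
std::map<std::uint8_t, ICounter*>& Endpoints() {
    static std::map<std::uint8_t, ICounter*> table;
    return table;
}

// Server-side object that chooses its own proxy class and packet contents.
class Counter : public ICounter {
public:
    Counter(std::uint8_t id, int start) : id_(id), value_(start) {
        Endpoints()[id_] = this;
    }
    ~Counter() override { Endpoints().erase(id_); }
    int Next() override { return ++value_; }
    ClassId ProxyClass() const { return "Demo.CounterProxy"; }  // step 1
    Packet  MakePacket() const { return Packet{id_}; }          // step 2
private:
    std::uint8_t id_;
    int value_;
};

// Steps 1 and 2: the role the text assigns to CoMarshalInterface.
MarshaledInterface Marshal(const Counter& obj) {
    return {obj.ProxyClass(), obj.MakePacket()};
}

// Client-side proxy: reads the packet and connects to the endpoint it names.
class CounterProxy : public ICounter {
public:
    explicit CounterProxy(const Packet& p) : target_(Endpoints().at(p.at(0))) {}
    int Next() override { return target_->Next(); }  // forwards the call
private:
    ICounter* target_;
};

using ProxyFactory = std::function<std::unique_ptr<ICounter>(const Packet&)>;

// Steps 4 and 5: the role the text assigns to CoUnmarshalInterface.
std::unique_ptr<ICounter> Unmarshal(const MarshaledInterface& m) {
    static const std::map<ClassId, ProxyFactory> registry = {
        {"Demo.CounterProxy",
         [](const Packet& p) -> std::unique_ptr<ICounter> {
             return std::make_unique<CounterProxy>(p);
         }},
    };
    return registry.at(m.proxyClass)(m.packet);
}
```

In this model, calling Marshal on a Counter and handing the result to Unmarshal yields an ICounter pointer whose Next calls reach the original object. The caller never learns which proxy class was chosen.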

Step 3 is entirely internal to COM. The transfer of the marshaling packet is the responsibility of the service control manager (SCM), which is what launched the server in the first place. In this sense, the SCM knows exactly what sort of barrier lies between the client and the server. It describes the barrier through a marshaling context, a combination of flags taken from the MSHCTX enumeration. This enumeration currently contains MSHCTX_NOSHAREDMEM (shared memory is not available between processes on this machine) and MSHCTX_DIFFERENTMACHINE (a network boundary exists between client and server). Obviously, these flags can restrict the form of marshaling that an object might want to employ. For example, an object that typically uses shared memory as an IPC mechanism could not do so when the context includes MSHCTX_NOSHAREDMEM. It might then use Microsoft Windows messages instead because window handles are shareable on the same machine. If MSHCTX_DIFFERENTMACHINE is also specified, the object would need to use named pipes, RPC, or some other appropriate network IPC.
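The way a marshaling context restricts an object's choice of IPC can be sketched as a simple decision function. This is an illustrative model only: the flag constants below merely mirror the MSHCTX names used in the text, their numeric values are assumptions of the sketch, and the Transport enumeration and ChooseTransport function are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in flags for this sketch. The names echo MSHCTX, but the values
// are assumptions and are not taken from the COM headers.
enum DestContext : std::uint32_t {
    kCtxSameMachineSharedMem = 0x0,  // no restriction flags set
    kCtxNoSharedMem          = 0x1,  // models MSHCTX_NOSHAREDMEM
    kCtxDifferentMachine     = 0x2,  // models MSHCTX_DIFFERENTMACHINE
};

// Hypothetical IPC choices for an object that prefers shared memory.
enum class Transport { SharedMemory, WindowMessages, Network };

Transport ChooseTransport(std::uint32_t ctx) {
    // A network boundary rules out every same-machine mechanism.
    if (ctx & kCtxDifferentMachine) return Transport::Network;
    // Same machine but no shared memory: fall back to window messages,
    // since window handles are shareable on one machine.
    if (ctx & kCtxNoSharedMem) return Transport::WindowMessages;
    return Transport::SharedMemory;
}
```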

Besides the context, COM also needs to know the reason marshaling is happening in the first place. That reason depends on the design of whatever code calls CoMarshalInterface, and it is described by a single flag taken from the MSHLFLAGS enumeration. MSHLFLAGS_NORMAL specifies marshaling that is being carried out to hook up a client and object immediately, as would happen in a client's call to IClassFactory::CreateInstance (to connect to the new object). MSHLFLAGS_TABLESTRONG and MSHLFLAGS_TABLEWEAK, on the other hand, specify that the marshaling packet is only being stored in a global object table and that marshaling isn't happening immediately. Registering a class factory is an example of this: CoRegisterClassObject merely stores the object's marshaling packet and its proxy CLSID in a global table so that CoGetClassObject in the client's process can access it. (No client might ever do so.) This sort of registration makes the object available to other processes because the necessary marshaling information is accessible through the table. Strong means that COM has called AddRef on the object when storing its marshaling packet in the table, whereas weak means that COM has not made the call.5 We'll see more about strong and weak references in the section "Strong and Weak Connections" later in this chapter.
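The difference between strong and weak table registration comes down to whether the table holds a reference count on the object. The following sketch models that rule with a hypothetical ObjectTable; the RefCounted, TableEntry, and TableFlag names are inventions of the example and do not describe COM's actual internal table.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Models MSHLFLAGS_TABLESTRONG and MSHLFLAGS_TABLEWEAK (hypothetical enum).
enum class TableFlag { Strong, Weak };

// Minimal reference-counted object, standing in for IUnknown's counting.
class RefCounted {
public:
    unsigned long AddRef()  { return ++refs_; }
    unsigned long Release() { return --refs_; }
    unsigned long Count() const { return refs_; }
private:
    unsigned long refs_ = 1;  // the creator's own reference
};

struct TableEntry {
    std::string proxyClass;             // proxy CLSID, modeled as a string
    std::vector<unsigned char> packet;  // stored marshaling packet
    RefCounted* object;                 // the object the packet describes
    TableFlag flag;
};

class ObjectTable {
public:
    void Register(const std::string& key, TableEntry e) {
        Revoke(key);  // drop any earlier registration under the same key
        // Strong registration keeps the object alive; weak does not.
        if (e.flag == TableFlag::Strong) e.object->AddRef();
        entries_[key] = std::move(e);
    }
    void Revoke(const std::string& key) {
        auto it = entries_.find(key);
        if (it == entries_.end()) return;
        if (it->second.flag == TableFlag::Strong) it->second.object->Release();
        entries_.erase(it);
    }
    bool Contains(const std::string& key) const {
        return entries_.count(key) != 0;
    }
private:
    std::map<std::string, TableEntry> entries_;
};
```

With a weak entry the count is untouched, so the object can reach zero references while its packet is still in the table. That is why a weakly registered object must revoke its own entry before it goes away.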

The marshaling context and the marshaling flags are known within code such as CoRegisterClassObject, which has, at this point, an IClassFactory pointer to marshal. It passes this pointer to CoMarshalInterface along with the other necessary arguments, described as follows:

Argument	Description
pstm	An IStream pointer to a stream (see Chapter 7) into which the object being marshaled should store its marshaling packet. This information is given to an appropriate proxy in the destination process.
riid	The IID of the interface pointer being marshaled. The interface itself must be derived from IUnknown.
pUnk	The interface pointer (cast to IUnknown) to marshal.
dwDestContext	DWORD flags from the MSHCTX enumeration.
pvDestContext	A void * pointing to additional information based on dwDestContext. (Currently this argument has no defined uses and must be set to NULL.)
mshlflags	DWORD flags from the MSHLFLAGS enumeration.

The implementation of CoMarshalInterface works in conjunction with an interface named IMarshal. Arguments to IMarshal's member functions are named similarly to those in the preceding table and also have the same usage, as shown in the following code. (pvInterface is the same as pUnk.)

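interface IMarshal : IUnknown
    {
    // Returns the CLSID of the proxy to create in the client process.
    HRESULT GetUnmarshalClass(REFIID iid, void *pvInterface
        , DWORD dwDestContext, void *pvDestContext, DWORD mshlflags
        , CLSID *pclsid);
    // Returns an upper bound on the size of the marshaling packet.
    HRESULT GetMarshalSizeMax(REFIID iid, void *pvInterface
        , DWORD dwDestContext, void *pvDestContext, DWORD mshlflags
        , DWORD *pcb);
    // Writes the marshaling packet into the stream.
    HRESULT MarshalInterface(IStream *pstm, REFIID iid, void *pvInterface
        , DWORD dwDestContext, void *pvDestContext, DWORD mshlflags);
    // Called on the proxy: reads the packet and returns the pointer.
    HRESULT UnmarshalInterface(IStream *pstm, REFIID iid, void **ppv);
    // Frees any resources referenced by the packet.
    HRESULT ReleaseMarshalData(IStream *pstm);
    // Tells the object that its connections are being severed.
    HRESULT DisconnectObject(DWORD dwReserved);
    };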

Through this interface, an object can control its own marshaling: GetUnmarshalClass returns the proxy CLSID to use, and GetMarshalSizeMax and MarshalInterface create the marshaling packet. If the object does not choose to implement this interface, it says, in effect, "Use standard marshaling." Here are the steps performed within CoMarshalInterface:

1. Query the object for IMarshal. If this interface is unavailable, standard marshaling will be used, in which case we query the object for IPersist in an attempt to call IPersist::GetClassID. The CLSID returned here still allows the object to specify which object handler to use in the client process, although that handler continues to use a standard marshaling proxy underneath. If IPersist is unavailable, COM defaults to a generic proxy CLSID (specifically, CLSID_StdMarshal, whose server is the COM Library itself).

2. If IMarshal is available, call IMarshal::GetUnmarshalClass to obtain the proxy CLSID. The proxy must understand the object's marshaling packet.

3. Call IMarshal::GetMarshalSizeMax to retrieve the maximum size of the marshaling packet. COM can then preallocate the stream to the appropriate size. (Some streams, as described in Chapter 7, cannot automatically extend themselves.)

4. Call IMarshal::MarshalInterface, in which the object creates its marshaling packet by writing information into the stream (using IStream::Write).

At the end of this sequence, we're left with a proxy CLSID and a marshaling packet in a stream that can be either stored in a global table or passed directly to COM in a client process, depending on the marshaling flags. The former happens in the case of CoRegisterClassObject; the latter in most other cases. Depending on what operation is being performed, COM now picks up the marshaling packet in the client process. This might happen from within CoGetClassObject, which periodically checks the global class factory table for the new factory, or from within the proxy for whatever object created the new object being marshaled.6 One way or the other, code in the client process retrieves the new proxy CLSID and the marshaling packet.
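The decision logic inside CoMarshalInterface, as described in the steps above, can be sketched in portable C++. This is a model under stated assumptions, not COM code: dynamic_cast stands in for QueryInterface, the CustomMarshal and Persist structs model IMarshal and IPersist with simplified signatures, CLSIDs are strings, and the one-byte standard packet is a placeholder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using ClassId = std::string;                 // models a CLSID
using Packet  = std::vector<unsigned char>;  // models the stream contents

struct Object { virtual ~Object() = default; };  // plays the role of IUnknown

struct CustomMarshal {                       // models IMarshal (simplified)
    virtual ~CustomMarshal() = default;
    virtual ClassId GetUnmarshalClass() const = 0;
    virtual std::size_t GetMarshalSizeMax() const = 0;
    virtual void MarshalInterface(Packet& stream) const = 0;
};

struct Persist {                             // models IPersist (simplified)
    virtual ~Persist() = default;
    virtual ClassId GetClassID() const = 0;
};

const ClassId kStdMarshal = "StdMarshal";    // models CLSID_StdMarshal

struct Marshaled { ClassId proxyClass; Packet packet; bool custom; };

Marshaled MarshalSketch(const Object& obj) {
    Marshaled out{kStdMarshal, {}, false};
    // Step 1: ask for the custom marshaling interface.
    if (auto* m = dynamic_cast<const CustomMarshal*>(&obj)) {
        out.custom = true;
        out.proxyClass = m->GetUnmarshalClass();     // step 2
        out.packet.reserve(m->GetMarshalSizeMax());  // step 3: preallocate
        m->MarshalInterface(out.packet);             // step 4: write packet
    } else {
        // Standard marshaling: an object may still name a handler class.
        if (auto* p = dynamic_cast<const Persist*>(&obj)) {
            out.proxyClass = p->GetClassID();
        }
        out.packet = Packet{0x01};  // placeholder for a standard packet
    }
    return out;
}

// Three hypothetical objects, one for each path through the function.
struct Plain : Object {};
struct WithHandler : Object, Persist {
    ClassId GetClassID() const override { return "Demo.Handler"; }
};
struct SharedState : Object, CustomMarshal {
    ClassId GetUnmarshalClass() const override { return "Demo.SharedProxy"; }
    std::size_t GetMarshalSizeMax() const override { return 4; }
    void MarshalInterface(Packet& s) const override {
        s.insert(s.end(), {0xDE, 0xAD, 0xBE, 0xEF});
    }
};
```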

The job of this code is to turn the information into an interface pointer that can be given back to the client. After the client receives the pointer, it can transparently make function calls into the local or remote object, wherever it happens to be. This is the purpose of CoUnmarshalInterface. This function takes the following arguments, two of which imply a call to QueryInterface to return the interface pointer to give to the client:

Argument	Description
pstm	An IStream pointer to the stream containing the marshaling packet
riid	The IID of the interface pointer required
ppv	A void ** in which to return the client-process pointer through which the client can make calls

CoUnmarshalInterface then executes the following steps to create the proxy and have it establish communication with the local or remote object:

1. Call CoCreateInstance with the proxy CLSID, asking for IMarshal in return. A proxy must implement IMarshal in all circumstances. At this point, it doesn't matter what sort of marshaling is being used; the CLSID might be CLSID_StdMarshal as easily as it can be a CLSID for a custom marshaling proxy.

2. Pass the marshaling packet to IMarshal::UnmarshalInterface. The proxy reads the information using IStream::Read and uses it to establish whatever connection the proxy requires to the local or remote object through the interface in question. From this function, the proxy returns in ppv the requested interface pointer specified in riid.

3. Call IMarshal::ReleaseMarshalData to free whatever data might be stored in the marshaling packet (for example, a piece of shared memory whose handle is in the stream or the handle to a file or a named pipe).

4. Return the pointer from IMarshal::UnmarshalInterface to the caller. This pointer is ultimately returned to the client.
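The ordering of these four steps matters most when the packet refers to a resource that must be freed after the proxy has connected. The following sketch models that ordering under simplifying assumptions: the packet's first byte is a handle into a hypothetical PacketResources table, the proxy class is fixed rather than looked up by CLSID, and IGreeting, GreetingProxy, and UnmarshalSketch are illustrative names only.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Packet = std::vector<unsigned char>;

// Simulated resources that packets refer to: handle -> payload.
std::map<unsigned char, std::string>& PacketResources() {
    static std::map<unsigned char, std::string> table;
    return table;
}

struct IGreeting {
    virtual ~IGreeting() = default;
    virtual std::string Text() const = 0;
};

// Client-side proxy; its two helper methods model the IMarshal members
// of the same names with simplified signatures.
class GreetingProxy : public IGreeting {
public:
    bool UnmarshalInterface(const Packet& p) {
        if (p.empty()) return false;
        auto it = PacketResources().find(p[0]);
        if (it == PacketResources().end()) return false;
        text_ = it->second;  // take what the proxy needs from the resource
        return true;
    }
    void ReleaseMarshalData(const Packet& p) {
        if (!p.empty()) PacketResources().erase(p[0]);  // free the resource
    }
    std::string Text() const override { return text_; }
private:
    std::string text_;
};

std::unique_ptr<IGreeting> UnmarshalSketch(const Packet& packet) {
    auto proxy = std::make_unique<GreetingProxy>();  // step 1: create proxy
    bool ok = proxy->UnmarshalInterface(packet);     // step 2: connect
    proxy->ReleaseMarshalData(packet);               // step 3: free packet data
    if (!ok) return nullptr;
    return std::unique_ptr<IGreeting>(std::move(proxy));  // step 4
}
```

Because the release happens after the proxy has read the packet, a second attempt to unmarshal the same packet fails in this model: the resource it named is gone.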

Through the steps in CoMarshalInterface and CoUnmarshalInterface, a local or remote object is able to specify what proxy to create in the client process and to provide that proxy with the information necessary to establish a connection to the object through all of the object's interfaces, whatever those might be. This will often, of course, involve standard marshaling, as usually happens for a server's class factory. As we'll see in the next section, standard marshaling creates a generic proxy in the client process and a generic stub in the server process, where that stub maintains the object's actual interface pointers. Using this setup, we can follow the sequence of operations that occurs when a client calls IClassFactory::CreateInstance for an object that does use custom marshaling, as illustrated in Figure 6-1. From the client's point of view, the proxy for an object that uses custom marshaling is no different from the proxy for one that uses standard marshaling: both are transparent. The basic marshaling architecture gives an object control over its own marshaling for all of its interfaces, allowing it to make whatever optimizations it deems necessary.

Figure 6-1.

The process of establishing custom marshaling between an object and its proxy in the client's process.

Keep in mind that this entire process can work both ways in a client-object relationship. For example, when a client passes a sink interface pointer to IConnectionPoint::Advise, the proxy itself is now in the position of any other remote object in that it has an interface pointer to make available to another process. The proxy calls CoMarshalInterface to create a client-side stub for the sink interface and passes the resulting marshaling packet to the remote process. There the IConnectionPoint stub will itself call CoUnmarshalInterface to create a proxy for the (now remote) sink object, handing an interface pointer of that new proxy to the object. The object can now call members of the remote sink in the client's process as transparently as the client calls the object's own members.

We have yet to mention IMarshal::DisconnectObject. COM uses this function to inform an object that uses custom marshaling that some other code (in that object's server) has called CoDisconnectObject for it. The object must then notify its connected proxy of this disconnection so that the proxy will no longer attempt to call the object itself, returning RPC_E_NOTCONNECTED to the client instead.
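The effect of a disconnection on the proxy can be sketched as follows. This is a single-process model with hypothetical names: the shared Link struct stands in for whatever channel the object uses to notify its proxy, and Result::NotConnected models the error code the proxy hands back to the client.

```cpp
#include <cassert>
#include <memory>
#include <utility>

enum class Result { Ok, NotConnected };  // NotConnected models the error HRESULT

// Connection state visible to both sides; stands in for the notification
// an object would send to its proxy over its own IPC channel.
struct Link { bool connected = true; };

class Account {
public:
    explicit Account(std::shared_ptr<Link> link) : link_(std::move(link)) {}
    void Deposit(int amount) { balance_ += amount; }
    int Balance() const { return balance_; }
    // Models IMarshal::DisconnectObject: tell the proxy to stop calling.
    void DisconnectObject() { link_->connected = false; }
private:
    std::shared_ptr<Link> link_;
    int balance_ = 0;
};

class AccountProxy {
public:
    AccountProxy(Account* target, std::shared_ptr<Link> link)
        : target_(target), link_(std::move(link)) {}
    Result Deposit(int amount) {
        // After a disconnect the proxy must not touch the object again.
        if (!link_->connected) return Result::NotConnected;
        target_->Deposit(amount);
        return Result::Ok;
    }
private:
    Account* target_;
    std::shared_ptr<Link> link_;
};
```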

Four Reasons to Choose Custom Marshaling

If the remote object itself is a proxy to some other object, custom marshaling allows you to short-circuit this middle proxy and let the second client connect directly to the remote object. This procedure can increase performance and improve robustness—it'd be silly to have a proxy to a proxy to a proxy to a proxy, and so on.
Some objects keep their entire state in shared memory or in some other shareable storage medium (such as a disk). In this case, custom marshaling enables the proxy to access that shared storage directly, eliminating the need to call the remote object at all and avoiding context switches. Storage and stream objects in OLE's structured storage implementation (Chapter 7) are great examples of this.
After creation, some objects have an immutable state, which means no changes ever occur to their state data. Monikers (Chapter 9) are an example. With custom marshaling, such objects can make a complete copy of their internal states in both client and server processes; being immutable, the two copies are indistinguishable.
Some designs can cut down interprocess or network traffic by grouping several remote calls into one, thus optimizing performance. An example might be some sort of transactioning system with a commit operation, by which the proxy could cache changes until the commit is made, at which time it passes all the changes to a remote object across the network in a single call.
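
The last of these reasons, grouping several remote calls into one, can be sketched with a proxy that caches changes until a commit. The example is a model under stated assumptions: RemoteStore and BatchingProxy are hypothetical classes, and a call to ApplyBatch stands in for one trip across the process or network boundary.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

using Change = std::pair<std::string, int>;  // key and new value

// Stands in for the remote object; each ApplyBatch call models one
// round trip across the process or network boundary.
class RemoteStore {
public:
    void ApplyBatch(const std::vector<Change>& changes) {
        ++calls_;
        for (const auto& c : changes) values_[c.first] = c.second;
    }
    int Calls() const { return calls_; }
    int Get(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? 0 : it->second;
    }
private:
    std::map<std::string, int> values_;
    int calls_ = 0;
};

// Custom proxy that caches changes locally until Commit.
class BatchingProxy {
public:
    explicit BatchingProxy(RemoteStore& remote) : remote_(remote) {}
    void Set(std::string key, int value) {
        pending_.emplace_back(std::move(key), value);  // no remote call yet
    }
    void Commit() {
        if (pending_.empty()) return;
        remote_.ApplyBatch(pending_);  // a single remote call for all changes
        pending_.clear();
    }
private:
    RemoteStore& remote_;
    std::vector<Change> pending_;
};
```

Three Set calls followed by one Commit cost a single simulated round trip here, where a proxy that forwarded each call would have made three.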

4 Other examples include IProvideClassInfo::GetClassInfo, IStorage::OpenStream (Chapter 7), IMoniker::BindToObject (Chapter 9), IDataObject::EnumFormatEtc (Chapter 10), and many other standard and custom interfaces that you might use or design. All of these return new interface pointers to separate objects.

5 A weak reference means that the object might disappear while its marshaling packet still appears in the table, so the object is responsible for revoking the registration. CoRegisterClassObject always uses MSHLFLAGS_TABLESTRONG, which is why you cannot control server lifetime with a class factory, as discussed in Chapter 5. Only CoRevokeClassObject can remove the extra reference counts. However, other mechanisms, such as the running object table described in Chapter 9, give the object the choice between strong and weak registration.

6 Specifically, a stub that has an out-parameter containing an interface pointer will return a marshaling packet to the proxy for that pointer, not a pointer itself.