House of COM, MSJ May, 1999


May 1999

Don Box is a co-founder of DevelopMentor, a COM think tank that educates the software industry in COM, MTS, and ATL. Don wrote Essential COM and coauthored the follow-up Effective COM (Addison-Wesley). Reach Don at http://www.develop.com/dbox.

The next release of Windows NT®, dubbed Windows® 2000, will bring a slew of new features. My article, "Windows 2000 Brings Significant Refinements to the COM(+) Programming Model," in this issue of MSJ, explains the core refinements planned for the programming model, many of which result from the integration of Microsoft® Transaction Server (MTS) with COM. However, the folks from Redmond who bring you COM have been burning the midnight oil, adding plenty of completely new stuff to the once compact COM.

    This column attempts to catalog the onslaught of new functionality slated for Windows 2000. While many of the topics discussed could easily warrant their own feature article, this column should act as a roadmap to help navigate the new, largely uncharted territory. Note that this information is based on my use of Windows 2000 betas, and some features may be different in the final release software.

Integration of MTS and COM

    Once Windows 2000 ships, MTS will be completely integrated with the OS. The innovations it introduced (such as declarative programming, context, and interception) have been integrated into the core COM libraries. In particular, several of the techniques employed by programmers using MTS that deal with the lack of MTS and COM integration (such as SafeRef and IObjectContext::CreateInstance) are no longer required. The introduction of context as the innermost execution scope for an object is the most radical aspect of this integration, and is the focus of my feature article in this issue of MSJ.

New Apartment Type
    A new utility apartment has been added that allows an object to be accessed by any thread in the process. This apartment is called the thread-neutral apartment (TNA), and while no thread calls this apartment home, any thread can visit the apartment by calling through a same-thread proxy. Classes indicate that they want to run in this TNA by annotating themselves as ThreadingModel=Neutral in the registry.

    Like objects that aggregate the freethreaded marshaler, thread-neutral objects never incur a thread switch. Unlike objects that aggregate the freethreaded marshaler, thread-neutral objects can safely hold object references between method calls. For Windows 2000, this apartment type should be the preferred threading model for most components that do not have a user interface. My article in this issue explains how this apartment type is used and where it fits into the COM architecture.

Separating Concurrency Management from Thread Affinity

    While the single-threaded apartment (STA) is still an essential part of COM—you can't write user interface code without it—it is no longer the primary mechanism for serializing calls to an object. Rather, all COM components can now use activities to control concurrent access without requiring a thread switch and Windows message queue for serialization. The preferred configuration for components that want call serialization is ThreadingModel=Neutral and Synchronization=Required, which yields the mythological rental/worker/hotel threading model. This is also discussed in my accompanying feature article. Don't worry, the remainder of this column doesn't overlap my article, so keep reading!

STA-aware Synchronization

    STAs must run a Windows message pump to allow incoming calls to be serviced. This means that if you mark your class ThreadingModel=Apartment, you are agreeing not to make any blocking system calls that don't service the Windows message pump. Prior to Windows 2000, this made it difficult to use Win32 synchronization primitives (such as critical sections and semaphores) without risking deadlock while waiting for an executive object to be signaled. (Consider the case where the code that will signal the executive object must first issue a call to an object in your STA.)

    Windows 2000 is expected to introduce a new API, CoWaitForMultipleHandles, that alleviates most of these problems. From the 10,000-foot view, CoWaitForMultipleHandles appears to perform the same work as WaitForMultipleObjectsEx. The difference is that CoWaitForMultipleHandles detects the calling thread type. If the calling thread is based on a multithreaded apartment (MTA), there is no need to worry about servicing window messages and the routine simply calls down to WaitForMultipleObjects. If, however, the calling thread is STA-based, the routine uses MsgWaitForMultipleObjectsEx and allows incoming calls to be serviced while waiting for the executive objects to be signaled. In particular, CoWaitForMultipleHandles uses the calling thread's message filter to control how pending calls will be handled.

    CoWaitForMultipleHandles looks very similar to its Win32 counterpart:



 HRESULT CoWaitForMultipleHandles(
     [in] DWORD dwFlags,
     [in] DWORD dwTimeout,
     [in] ULONG cHandles,
     [in, size_is(cHandles)] HANDLE *pHandles,
     [out] DWORD *pdwIndex);

The dwFlags parameter is a bitmask that can be some combination of the following:



 enum COWAIT_FLAGS {
     COWAIT_WAITALL = 0x00000001,
     COWAIT_ALERTABLE = 0x00000002
 };

Note that unlike WaitForMultipleObjects, which returns the index of the signaled handle through the result of the function, CoWaitForMultipleHandles returns the index through the last parameter.
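To make the STA case concrete, here is a minimal portable sketch of why a wait that services pending calls avoids the deadlock described earlier. It is a model only, not the Win32 or COM API: ModelEvent, ModelApartment, NaiveWait, and PumpingWait are hypothetical names, a plain flag stands in for the executive object, and an empty queue stands in for a timeout.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Portable model only: none of these names are COM or Win32 APIs.
struct ModelEvent {
    bool signaled = false;  // stands in for an executive object's state
};

class ModelApartment {
public:
    // Queue an incoming call, as the channel would post it to an STA thread.
    void Post(std::function<void()> call) { pending_.push_back(std::move(call)); }

    // Outcome of a wait that never pumps: queued calls stay queued, so an
    // event that only a queued call can signal is never seen as signaled.
    bool NaiveWait(const ModelEvent& ev) const { return ev.signaled; }

    // Pumping wait: service queued calls until the event is signaled or the
    // queue is empty (the empty queue stands in for a timeout).
    bool PumpingWait(const ModelEvent& ev) {
        while (!ev.signaled && !pending_.empty()) {
            std::function<void()> call = std::move(pending_.front());
            pending_.pop_front();
            call();  // may signal the event or post more calls
        }
        return ev.signaled;
    }

private:
    std::deque<std::function<void()>> pending_;
};
```

If the only code that can signal the event is an incoming call sitting in the queue, NaiveWait never observes the signal, while PumpingWait dispatches the call and returns true.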

In addition to the CoWaitForMultipleHandles API I just discussed, COM now has a family of interfaces that model synchronization primitives, all of which are based on the ISynchronize interface:



 interface ISynchronize : IUnknown {
 // perform an STA-sensitive wait
     HRESULT Wait([in] DWORD dwFlags, 
                  [in] DWORD dwMilliseconds);
 // inform the synch object to signal itself
     HRESULT Signal();
 // reset the synch object to non-signaled
     HRESULT Reset();
 }

COM provides two default (aggregatable) implementations of this interface based on Win32 Event objects, both of which use CoWaitForMultipleHandles to achieve their STA-friendliness. Both of these implementations use the freethreaded marshaler, which allows them to be freely shared across contexts and apartments without using proxies. Additionally, since the ISynchronize family of interfaces is remotable, you can now wait for Win32 events on remote host machines.

Finally, COM now defines an interface called ISynchronizeContainer for modeling multiple waits:



 interface ISynchronizeContainer : IUnknown {
 // add another synch object to the wait group
     HRESULT AddSynchronize([in] ISynchronize *p);
 // perform the wait and return the signaled obj.
     HRESULT WaitMultiple(
         [in] DWORD dwFlags, [in] DWORD dwTimeout,
         [out] ISynchronize **ppSignalled);
 }

Again, COM provides a default implementation that uses CoWaitForMultipleHandles.

Calls as Objects

An invocation of a method now has a COM object associated with it to give the client better control over a particular method call. This extends the idea of call objects introduced by the CoGetCallContext call in Windows NT 4.0, which allowed the object to access the current call as a COM object. Both CoGetCallContext and CoGetObjectContext have grown considerably since Windows NT 4.0.

Non-blocking Invocation

It is now possible to decouple the thread that issues a method call from the thread used to perform the operation. This means client-side threads can now issue a call asynchronously and regain control from the channel immediately. On the server side, objects can free up the remote procedure call (RPC) runtime thread used to invoke the method to allow more concurrent calls to be serviced using an application-controlled thread pool. The client and server can elect to use non-blocking invocation independently. In general, clients will only be able to call in-process servers synchronously.

For this new invocation style to work, the interface has to be annotated in IDL as supporting non-blocking invocation using the [async_uuid] attribute. Interfaces derived from IDispatch cannot be marked as asynchronous.



 [ 
     uuid(9E8857C2-C0C1-11d2-ABFA-0080C7B17AE0),
     async_uuid(9E8857C2-C0C1-11d2-ABFA-0080C7B17AE0)
 ]
 interface IBob : IUnknown {
     HRESULT HiBob([in] long n, 
                   [out, retval] long *pn);
 }

Given this fairly pedestrian-looking interface definition, the IDL compiler will emit the standard blocking version you know and love from today's COM, and also emit a non-blocking version as follows:



 struct AsyncIBob : IUnknown {
     HRESULT _stdcall Begin_HiBob(/*[in]*/ long n);
     HRESULT _stdcall Finish_HiBob(/*[out]*/long *pn);
 };

Note that for each method in the interface there is a Begin version that uses all of the [in] parameters, and a Finish version that uses all of the [out] parameters.

Non-blocking invocation is based on using ICallFactory to create an explicit call object to represent the method call:



 [local] interface ICallFactory : IUnknown {
     HRESULT CreateCall([in] REFIID riidAsync,
         [in] IUnknown *pUnkOuter,
         [in] REFIID riid, 
         [out,iid_is(riid)] IUnknown **ppvCall);
 }

As shown in Figure 1, the standard proxy manager implements this interface and allows the caller to create a call object that implements an asynchronous version of the interface. When the client calls the Begin method, the [in] parameters are marshaled into an ORPC request message and the client's thread gets control back immediately after submitting the request to the channel. This means that the HRESULT returned by the Begin method only indicates marshaling errors or low-resource conditions. Any communication or object errors will be returned by the Finish method, which will also harvest the [out] parameters from the received response message. The call object also implements ISynchronize to allow the client to wait for the call to complete using a timeout.
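The division of labor between Begin and Finish can be sketched with a small portable model. This is not COM code: ModelChannel, ModelAsyncBob, and the status values are hypothetical, the "server" simply returns n + 1, and (as a simplification assumed for this sketch) Finish reports a pending state instead of waiting for the response.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Portable model only: these are hypothetical stand-ins, not COM types.
enum ModelStatus { kOk, kPending, kCommFailure };

class ModelChannel {
public:
    typedef std::function<void(bool delivered)> Request;

    // Accept a request and return immediately; nothing runs yet.
    void Submit(Request r) { queue_.push_back(std::move(r)); }

    // Drain the queue. linkUp == false simulates a communication failure.
    void Pump(bool linkUp) {
        while (!queue_.empty()) {
            Request r = std::move(queue_.front());
            queue_.pop_front();
            r(linkUp);
        }
    }

private:
    std::deque<Request> queue_;
};

class ModelAsyncBob {
public:
    explicit ModelAsyncBob(ModelChannel& ch) : channel_(ch) {}

    // Begin takes only the [in] parameters. Its result says whether the
    // request was submitted, not whether the call succeeded.
    ModelStatus Begin_HiBob(long n) {
        state_ = kPending;
        channel_.Submit([this, n](bool delivered) {
            if (delivered) { result_ = n + 1; state_ = kOk; }  // assumed server logic
            else           { state_ = kCommFailure; }
        });
        return kOk;
    }

    // Finish reports communication errors and yields the [out] parameters.
    ModelStatus Finish_HiBob(long* pn) {
        if (state_ == kOk) *pn = result_;
        return state_;
    }

private:
    ModelChannel& channel_;
    ModelStatus state_ = kPending;
    long result_ = 0;
};
```

In this model a failed link is invisible to Begin_HiBob and only surfaces when Finish_HiBob is called, which mirrors the error-reporting split described above.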

Figure 1 An Asynchronous Call

The code in Figure 2 issues a non-blocking call against the ICallFactory interface shown previously. Remember that the MSJ runtime environment you are holding in your hands handles all HRESULTs automatically. (They print the magazine with a special error-correcting ink.) Actual programs should check HRESULTs religiously.

The code in Figure 2 polls the channel every second looking for the response. An alternative technique is to have the channel call into your code once the response has arrived. To do this, simply provide your own implementation of ISynchronize when you create the call object (which supports aggregation). Your ISynchronize implementation should blind-aggregate the COM-provided call object returned by CreateCall. This causes the channel to call your ISynchronize::Signal method and informs you that the response has arrived (or that the RPC layer has detected a communication failure).

Figure 3 Asynchronous, Server-side


To enable server-side non-blocking invocation, the object itself implements ICallFactory to allow the channel to create an explicit call object. As shown in Figure 3, the channel will aggregate your call object and expect you to inform it of method completion by calling its ISynchronize::Signal. This is especially useful for servers that want to manage their own thread pool rather than simply handle the request on the receiving RPC thread. Note that as long as a proxy is present, the object and client independently determine whether they will use blocking or non-blocking invocation. Also note that in the absence of a proxy, as long as the object supports ICallFactory as described earlier, the client can issue non-blocking calls directly against the object.

Call Cancellation

Both blocking and non-blocking calls can be cancelled using the ICancelMethodCalls interface.



 interface ICancelMethodCalls : IUnknown {
 // client-side function to send cancel message
     HRESULT Cancel([in] ULONG ulSeconds);
 // server-side function to test for cancellation
     HRESULT TestCancel( );
 }

The Cancel method sends a low-level DCE cancel message to the server, which signals to the server's call context that the current call has been cancelled. Unfortunately, there is no way for COM to simply yank back control from a method in progress. Rather, it is up to the object to periodically inspect the state of the call context using the TestCancel method to determine whether to continue a long-lived cancellable operation.

What has yet to be explained is, who exactly implements ICancelMethodCalls? The answer is different for blocking and non-blocking invocations. In the case of non-blocking calls, the call object returned by ICallFactory::CreateCall implements ICancelMethodCalls. On the server side, the COM-provided controlling outer passed to the object's CreateCall implements ICancelMethodCalls. On the client side, the COM-provided call object returned by CreateCall implements ICancelMethodCalls.

In the case of blocking invocation, the server side is easy, as the standard call context object implements ICancelMethodCalls, allowing a long-running (blocking) method to test for cancellation as follows:



 STDMETHODIMP Microsoft::ShipW2K( ) {
     ICancelMethodCalls *pCC = 0;
 // get the call context for the call in progress
     HRESULT hr = CoGetCallContext(__uuidof(pCC), 
         (void**)&pCC);
     if (SUCCEEDED(hr)) {
 // INT_MAX comes from <limits.h>
         for (int i = 0; i < INT_MAX; i++) {
             hr = pCC->TestCancel();
             if (hr == RPC_S_CALLPENDING) // not cancelled
                 this->FixBugAndAddFeature();
             else if (hr == RPC_E_CALL_CANCELED)
                 break;
         }
         pCC->Release();
     }
     return hr;
 }

Supporting cancellation of blocking calls on the client is a bit more interesting. First, blocking calls don't have an explicit call context on the client side. Worse, the thread that has issued the call is blocked in the channel and can't do anything until the call returns. To deal with both problems, COM keeps a table of call cancellation objects for all threads that have pending method calls. You can get the cancellation object for a given thread by using CoGetCancelObject:



 HRESULT CoGetCancelObject([in] DWORD dwThreadID,
     [in] REFIID riid,
     [out,iid_is(riid)] void **ppv);

MTA threads have at most one call cancellation object. Because an STA thread can be reentered, STA threads have a stack of cancellation objects. (CoGetCancelObject returns the innermost nested call.)
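The bookkeeping just described can be pictured with a small portable model. It is only a sketch under assumptions: ModelCancelTable and ModelCancelObject are hypothetical names, thread IDs are plain integers, and the real table is maintained by COM itself rather than by application code.

```cpp
#include <map>
#include <memory>
#include <vector>

// Portable model only: these types are hypothetical, not COM interfaces.
struct ModelCancelObject {
    bool cancelled = false;
    void Cancel() { cancelled = true; }
};

class ModelCancelTable {
public:
    typedef std::shared_ptr<ModelCancelObject> Ptr;

    // A thread issues an outbound call; reentrant (nested) calls stack up.
    Ptr BeginCall(unsigned tid) {
        Ptr obj = std::make_shared<ModelCancelObject>();
        calls_[tid].push_back(obj);
        return obj;
    }

    // The innermost pending call on that thread has returned.
    void EndCall(unsigned tid) {
        std::map<unsigned, std::vector<Ptr> >::iterator it = calls_.find(tid);
        if (it == calls_.end()) return;
        it->second.pop_back();
        if (it->second.empty()) calls_.erase(it);
    }

    // Look up the innermost pending call for a thread, or null if none.
    Ptr GetCancelObject(unsigned tid) const {
        std::map<unsigned, std::vector<Ptr> >::const_iterator it = calls_.find(tid);
        return it == calls_.end() ? Ptr() : it->second.back();
    }

private:
    std::map<unsigned, std::vector<Ptr> > calls_;
};
```

A thread with two nested calls pending yields the inner call's object until that call ends, after which the outer call's object is returned.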

The following code cancels a call that is pending on another thread:



 HRESULT Cancel(DWORD tid) {
 // get cancel object for specified thread
     ICancelMethodCalls *pcmc = 0;
     HRESULT hr = CoGetCancelObject(tid,
                                    __uuidof(pcmc),
                                    (void**)&pcmc);
     if (SUCCEEDED(hr)) {
 // wait 10 seconds for server to ACK the cancel
         hr = pcmc->Cancel(10);
         pcmc->Release();
     }
     return hr;
 }

It is important to note that, by default, call cancellation is not enabled. Processes that want to use call cancellation can use the CoEnableCallCancellation and CoDisableCallCancellation APIs to turn this feature on and off on a threadwide basis.

Support for Handler Marshaling

Prior to Windows 2000, COM supported two types of marshaling: standard marshaling and custom marshaling. Objects that custom marshaled implemented IMarshal to replace the standard proxy/stub infrastructure with an object-specified custom proxy. This made it possible, if somewhat difficult, to implement smart proxies that performed some work on the client side prior to issuing the call to the remote object.

Windows 2000 is expected to support handler marshaling, a hybrid of custom and standard marshaling. Handler marshaling allows the object to specify a custom handler that can aggregate the standard proxy manager, simplifying the creation of smart proxies. This means that you get to inject your own code into the client without having to manually manage a connection back to the object.

To use the handler marshaling functionality, the object must implement IStdMarshalInfo to specify the CLSID of the custom handler.



 interface IStdMarshalInfo : IUnknown {
     HRESULT GetClassForHandler([in] DWORD dwDestCtx,
                                [in] void *pvDestCtx,
                                [out] CLSID *pclsid);
 }

The custom handler class is expected to aggregate the standard proxy manager using the new CoGetStdMarshalEx API. This allows the object-specified handler to use the standard COM remoting infrastructure to access the object. Because CoGetStdMarshalEx allows the proxy manager to be aggregated, the handler is free to expose (or not expose) direct references to the standard proxy without breaking identity. For reasons that are beyond the scope of this column, handlers must themselves be aggregatable, as COM will actually aggregate the handler behind an identity object prior to returning the proxy from CoUnmarshalInterface. The object model of the handler is shown in Figure 4.

Figure 4 Handler Object Model

While handlers will often be used to simply put client-side caching between the caller and the object, they are most interesting when implementing interfaces that make no sense on the wire. (Look at IViewObject and IDataObject from OLE for an example of this.)

Pipes

COM now supports a limited number of DCE-style pipes that are used to facilitate bulk data transfer within a method call. Pipes support two methods, Pull and Push, and are supported for both blocking and non-blocking method calls. What distinguishes pipes from the standard IEnum-style interface is that the remoting layer performs read-ahead buffer management to keep the communication channel full while the receiver is processing information.

Cloaking

COM now provides better control over which credentials will be used when a call is made on a proxy. By default, COM uses the process token to issue the call, ignoring any thread token that may be present due to impersonation (this is consistent with the default behavior under Windows NT 4.0). Calling CoSetProxyBlanket with EOAC_STATIC_CLOAKING causes COM to capture the current token and use it to authenticate when calls are made using the specified interface proxy (this is also consistent with the default behavior under Windows NT 4.0). Calling CoSetProxyBlanket with EOAC_DYNAMIC_CLOAKING causes COM to inspect the caller's token at each method call. This is what many programmers expect COM to do (after all, this is what happens when you open a file), but because it is relatively expensive, it is not recommended for most applications.
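The three behaviors differ only in when the token is chosen, which a small portable model can show. This is a sketch, not the COM security API: tokens are plain strings, and ModelThread, ModelProxy, and the enum values are hypothetical names.

```cpp
#include <string>

// Portable model only: tokens are strings and these names are hypothetical.
enum ModelCloaking { kNoCloaking, kStaticCloaking, kDynamicCloaking };

struct ModelThread {
    std::string processToken;
    std::string threadToken;  // empty when the thread is not impersonating

    std::string EffectiveToken() const {
        return threadToken.empty() ? processToken : threadToken;
    }
};

class ModelProxy {
public:
    // Analogous to choosing a cloaking option when the blanket is set.
    void SetBlanket(ModelCloaking mode, const ModelThread& caller) {
        mode_ = mode;
        captured_ = caller.EffectiveToken();  // consulted only for static cloaking
    }

    // Which identity would authenticate a call issued right now?
    std::string IdentityForCall(const ModelThread& caller) const {
        switch (mode_) {
        case kStaticCloaking:  return captured_;                // fixed at SetBlanket time
        case kDynamicCloaking: return caller.EffectiveToken();  // re-read on every call
        default:               return caller.processToken;      // thread token ignored
        }
    }

private:
    ModelCloaking mode_ = kNoCloaking;
    std::string captured_;
};
```

With static cloaking, changing the impersonation token after SetBlanket has no effect on later calls; with dynamic cloaking, every call picks up whatever token the thread holds at that moment.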

In addition to providing better control over which credentials are used by a proxy, COM now supports delegation-level trust when the Kerberos authentication protocol is used. This allows the client's credentials to propagate more than one network hop away from the originating login. Be aware that for delegation to actually work, the proxy must be configured to use RPC_C_IMP_LEVEL_DELEGATION with either CoSetProxyBlanket or CoInitializeSecurity. Additionally, the server's login account must be marked as "Trusted for Delegation" in the directory, and the client's login account must not be marked "Account is sensitive and cannot be delegated."

Finally, all intermediate host machines must be Kerberos-friendly. Remember that it is against the MTS religion to perform end-to-end access control; rather, each node in a distributed application should only worry about security principals one upstream hop away. This not only simplifies DACL management, but also increases the utility of connection pooling.

Activation-time Load Balancing

Classes can now be configured to support activation-time load balancing. Clients are expected to issue the CoCreateInstance call against a designated router machine, which forwards the activation call to the most lightly loaded host machine in a preconfigured group of server machines (called an application cluster). To support load balancing, the class has to be configured as load-balanced in the catalog (the registry, in Windows NT 4.0 terminology), at which point the router will collect response-time statistics to determine which host machine to forward an incoming activation call to. Once the CoCreateInstance call returns, all subsequent method calls will go to the selected machine. To trigger another load balancing event, the proxy must be released and another CoCreateInstance call has to be made.
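The routing decision can be sketched as follows. This is a portable model under stated assumptions: ModelRouter is a hypothetical name, and choosing the host with the lowest average response time is an assumed policy, since the text above says only that response-time statistics are collected.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Portable model only: ModelRouter is hypothetical, and "lowest average
// response time wins" is an assumed policy.
class ModelRouter {
public:
    void AddHost(const std::string& name) {
        Host h;
        h.name = name;
        h.totalMs = 0.0;
        h.samples = 0;
        hosts_.push_back(h);
    }

    // Record one observed response time for a host.
    void Record(const std::string& name, double ms) {
        for (std::size_t i = 0; i < hosts_.size(); ++i) {
            if (hosts_[i].name == name) {
                hosts_[i].totalMs += ms;
                ++hosts_[i].samples;
            }
        }
    }

    // Activation: pick a host once. The caller stays bound to the returned
    // host until it releases its proxy and activates again.
    std::string Activate() const {
        bool found = false;
        std::string best;
        double bestAvg = 0.0;
        for (std::size_t i = 0; i < hosts_.size(); ++i) {
            const Host& h = hosts_[i];
            double avg = h.samples ? h.totalMs / h.samples : 0.0;
            if (!found || avg < bestAvg) {
                found = true;
                best = h.name;
                bestAvg = avg;
            }
        }
        return best;  // empty string when no hosts are configured
    }

private:
    struct Host {
        std::string name;
        double totalMs;
        unsigned samples;
    };
    std::vector<Host> hosts_;
};
```

A client that activated while host B looked fastest keeps talking to B even if B later slows down; only a fresh Activate call consults the statistics again.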

Object Pooling

A class can now be configured with low and high watermarks to place lower and upper bounds on the number of instances that exist inside a process. Additionally, the infamous IObjectControl::CanBePooled method is now supported to allow an object to be recycled rather than destroyed. For object recycling to be useful, the object must be expensive to initialize, cheap to reinitialize, thread-neutral, client/context-neutral, and aggregatable. If all five of these constraints apply to your class, simply return TRUE from IObjectControl::CanBePooled and configure the class to support object pooling. Even if your class doesn't fit all of the requirements for object recycling, the resource governor aspects of pooling can be quite useful for throttling access to server-side resources.
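A minimal portable model shows how the two bounds and CanBePooled interact. It is a sketch, not the COM implementation: ModelPool and ModelPooledObject are hypothetical names, and returning null when the upper bound is reached is an assumption made to keep the example short.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Portable model only: hypothetical types, not COM interfaces.
struct ModelPooledObject {
    int uses = 0;
    bool CanBePooled() const { return true; }  // agree to be recycled
    void Activate() { ++uses; }                // cheap reinitialization
};

class ModelPool {
public:
    // Pre-create minSize instances; never allow more than maxSize to exist.
    ModelPool(std::size_t minSize, std::size_t maxSize) : max_(maxSize), live_(0) {
        for (std::size_t i = 0; i < minSize && i < maxSize; ++i) {
            idle_.push_back(std::unique_ptr<ModelPooledObject>(new ModelPooledObject));
            ++live_;
        }
    }

    // Hand out an idle instance if there is one, create one if the upper
    // bound allows, otherwise return null (assumed governor behavior).
    std::unique_ptr<ModelPooledObject> Acquire() {
        std::unique_ptr<ModelPooledObject> obj;
        if (!idle_.empty()) {
            obj = std::move(idle_.back());
            idle_.pop_back();
        } else if (live_ < max_) {
            obj.reset(new ModelPooledObject);
            ++live_;
        }
        if (obj) obj->Activate();
        return obj;
    }

    // Recycle the instance if it agrees; otherwise let it be destroyed.
    void Release(std::unique_ptr<ModelPooledObject> obj) {
        if (!obj) return;
        if (obj->CanBePooled()) idle_.push_back(std::move(obj));
        else --live_;
    }

    std::size_t Live() const { return live_; }

private:
    std::size_t max_;
    std::size_t live_;
    std::vector<std::unique_ptr<ModelPooledObject> > idle_;
};
```

With bounds of 1 and 2, a third concurrent Acquire fails until one of the first two instances is released, and the released instance is the one handed out next.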

Bring Your Own Transaction (BYOT)

It is now possible to provide your own transaction when creating an object. The creator can provide either an OLETX ITransaction reference or a Transaction Internet Protocol (TIP) URL to bypass COM's automatic creation of a transaction. COM provides a class (ByotServerEx) and two interfaces (ICreateWithTransactionEx and ICreateWithTipTransactionEx) to expose this new feature. One common application of BYOT is to explicitly create a DTC transaction with nonstandard attributes (such as lower isolation level and shorter or longer timeout) and then enlist one or more configured components on the transaction.

Compensating Resource Managers

COM now provides a supporting infrastructure to make writing transactional resources easier. A Compensating Resource Manager (CRM) performs operations that are protected by a transaction. CRMs consist of two configured components. The CRM worker is a transactional object that performs the protected operations on behalf of its client, logging its progress using a COM-provided clerk object. At the end of the transaction, a CRM-provided compensator object is created to vote on the outcome of the transaction and to perform rollback/recovery in the face of transaction or system failure.
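The worker/clerk/compensator split can be modeled in a few lines of portable C++. This is only a sketch under assumptions: the protected resource is an in-memory map, all type names are hypothetical, and the compensator's vote on the transaction outcome is left out.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Portable model only: hypothetical types, not the CRM interfaces.
typedef std::map<std::string, int> ModelStore;

struct ModelLogRecord {
    std::string key;
    bool existed;  // did the key exist before the write?
    int oldValue;  // meaningful only when existed is true
};

// Clerk: the log the worker records its progress in.
struct ModelClerk {
    std::vector<ModelLogRecord> records;
    void Write(const ModelLogRecord& r) { records.push_back(r); }
};

// Worker: logs how to undo each operation before performing it.
class ModelWorker {
public:
    ModelWorker(ModelStore& store, ModelClerk& clerk) : store_(store), clerk_(clerk) {}

    void Set(const std::string& key, int value) {
        ModelStore::const_iterator it = store_.find(key);
        ModelLogRecord r;
        r.key = key;
        r.existed = (it != store_.end());
        r.oldValue = r.existed ? it->second : 0;
        clerk_.Write(r);      // log first
        store_[key] = value;  // then do the work
    }

private:
    ModelStore& store_;
    ModelClerk& clerk_;
};

// Compensator: runs at the end of the transaction.
struct ModelCompensator {
    void Commit(ModelClerk& clerk) { clerk.records.clear(); }

    // Undo the logged operations in reverse order.
    void Abort(ModelStore& store, ModelClerk& clerk) {
        for (std::size_t i = clerk.records.size(); i > 0; --i) {
            const ModelLogRecord& r = clerk.records[i - 1];
            if (r.existed) store[r.key] = r.oldValue;
            else store.erase(r.key);
        }
        clerk.records.clear();
    }
};
```

After an abort, keys the worker overwrote return to their old values and keys it created disappear.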

Publish/Subscribe Infrastructure

A COM-based event system that supports publish/subscribe-style event notification is planned for Windows 2000. Event publishers register one or more event classes that indicate which interface models the publisher's events. Event subscribers can register to catch events by binding a publisher/event class pair to either an object (for transient subscriptions) or a CLSID/hostname (for persistent subscriptions). Publishers use the event system to synchronously deliver the events to each subscriber in sequence. Unlike most of the planned Windows 2000 features mentioned in this column, this feature will be available under Windows NT 4.0 when Microsoft Internet Explorer 5.0 is installed, but only applies to system notification events such as LogOn/LogOff.
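The delivery model described here (synchronous, one subscriber after another) can be sketched in portable C++. This is a model under assumptions, not the event system's API: ModelEventClass is a hypothetical name, subscribers are plain callbacks, and the single string argument stands in for an event method's parameters.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Portable model only: a hypothetical stand-in for an event class.
class ModelEventClass {
public:
    typedef std::function<void(const std::string&)> Subscriber;

    // Bind a subscriber to this event class.
    void Subscribe(Subscriber s) { subscribers_.push_back(std::move(s)); }

    // Synchronous delivery: each subscriber runs to completion, in
    // subscription order, before Fire returns to the publisher.
    std::size_t Fire(const std::string& user) const {
        for (std::size_t i = 0; i < subscribers_.size(); ++i)
            subscribers_[i](user);
        return subscribers_.size();
    }

private:
    std::vector<Subscriber> subscribers_;
};
```

Because Fire does not return until the last subscriber has finished, a slow subscriber delays both the publisher and every subscriber after it.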

In-memory Database

Windows 2000 is expected to include a new OLE DB provider that has an in-memory, transaction-aware cache. The In-Memory Database (IMDB) can be configured either as a transaction-aware cache on top of an existing OLE DB table (such as SQL Server™ or Oracle), or as a standalone table based on a user-provided ITableDefinition interface. (Ed.: IMDB technology has been discontinued. See the IMDB update page for more information.) Due to the use of shared memory, writes to IMDB-based tables exhibit cache coherency on a per-machine basis only. Multimachine access to read-only tables works as expected, but the Windows 2000 version of IMDB will probably provide no support for cross-machine cache updates.

Transparent Integration with MSMQ

Under Windows NT 4.0, application developers who wanted to take advantage of Microsoft Message Queue Server (MSMQ) needed to deliberately program against the MSMQ API. Under the current Windows 2000 beta, a class can be configured to support transparent queuing, which allows the client to create a queue-aware proxy (called a recorder) that buffers all method calls until the object is released. Once released, the recorder uses MSMQ to send a message to the server application, where a player component creates an instance of the designated class and replays the method calls issued by the client.
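The recorder/player flow can be modeled without MSMQ. This is a portable sketch under assumptions: ModelRecorder, ModelTarget, ModelMessage, and PlayMessage are hypothetical names, a vector stands in for the message queue, and each recorded call is stored as a closure instead of a marshaled message.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Portable model only: hypothetical types; a vector stands in for MSMQ.
struct ModelTarget {
    int balance = 0;
    int calls = 0;
    void Deposit(int amount) { balance += amount; ++calls; }
};

// One message carries every call recorded against one proxy.
typedef std::vector<std::function<void(ModelTarget&)> > ModelMessage;

class ModelRecorder {
public:
    explicit ModelRecorder(std::vector<ModelMessage>& queue) : queue_(queue) {}

    // Looks like a method call to the client, but only buffers it.
    void Deposit(int amount) {
        calls_.push_back([amount](ModelTarget& t) { t.Deposit(amount); });
    }

    // Releasing the recorder ships all buffered calls as a single message.
    void Release() {
        if (calls_.empty()) return;
        queue_.push_back(std::move(calls_));
        calls_.clear();
    }

private:
    std::vector<ModelMessage>& queue_;
    ModelMessage calls_;
};

// Player: create the target and replay the recorded calls in order.
inline ModelTarget PlayMessage(const ModelMessage& msg) {
    ModelTarget target;
    for (ModelMessage::const_iterator it = msg.begin(); it != msg.end(); ++it)
        (*it)(target);
    return target;
}
```

Nothing reaches the queue until Release is called, so the client never waits on the server, and the server-side object sees the calls only when the player replays them.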

What isn't Getting Fixed in Windows 2000?

Given this extensive list of new features, you may be wondering what probably won't make it into Windows 2000. That's easy. One example is the advanced runtime environment described in Mary Kirtland's COM+ articles in the November and December 1997 issues of MSJ. There is no new type information format or MIDL.EXE replacement. IDispatch, ITypeInfo, and friends are still (sadly) the state of the art in COM. Nor is there a new class loader that would obviate the need for the DllGetClassObject and IClassFactory support in each and every DLL.

Additionally, some of the new services may take a more conservative approach that might disappoint a few users. For example, load balancing only happens at activation time, not at JIT-activation boundaries (that is, when an object goes SetComplete). The latter would have required a substantial change to the DCOM wire protocol. (Actually, it's a limitation inherited from the underlying DCE protocol.) The former simply requires the Service Control Manager to be smarter. Also, the eventing service is limited to type-library-friendly interfaces and is highly synchronous, relying on a second service (Queued Components using MSMQ) to gain any asynchronous notification capabilities. Further, using IMDB against read-write tables is great—provided all access to the table goes through a single machine's IMDB cache.

Finally, while Windows 2000 will show that a lot of integration work has been done to mesh the MTS and COM programming models, there still may be a few loose ends. In particular, non-blocking invocation, handler marshaling, and custom marshaling only work for nonconfigured components. You can imagine scenarios where you'd want to use these new COM features with components that also use activity-based synchronization, transactions, or role-based security. Hopefully, a future release will further integrate context-aware components into all of COM.

Despite all of these shortcomings, the Windows 2000 betas I have been living on for the past few months provide a COM that is considerably more mature than the Windows NT 4.0 version. Also, lack of a separate MTS runtime and API has substantially increased the stability of the development and runtime environment. In particular, the integration of context into the core programming model will make life easier for the vast majority of developers using COM.

Have a question about programming with COM? Send your questions via email to Don Box: dbox@develop.com or http://www.develop.com/dbox.

From the May 1999 issue of Microsoft Systems Journal