Non-Blocking Method Calls

Brian Sabino
Microsoft Corporation

August 1999

Summary: Non-blocking method calls are easy to use, have many interesting applications, and will be especially popular on the client side, where making a non-blocking call is only slightly more work than a standard call. Developers are encouraged not to overlook the improved server scalability offered by this powerful new COM feature. (23 printed pages)

Contents

1. Introduction
2. Motivation
3. Architecture
4. Implementation
5. Limitations and Pitfalls
6. Conclusion

1. Introduction

With the introduction of non-blocking method calls, Microsoft has placed a powerful new tool in the hands of Component Object Model (COM) developers. Clients can use non-blocking method calls to exploit parallelism without the pain of multi-threading, and servers can handle calls asynchronously for vastly improved scalability.

Consider the following standard call of ISample::Sum:

int  sum;
pISample->Sum(4, 5, &sum);

A client using non-blocking method calls uses two methods on the asynchronous interface AsyncISample to make the same call:

int  sum;
...
pAsyncISample->Begin_Sum(4, 5);

//Client continues execution.

pAsyncISample->Finish_Sum(&sum);

The Begin method sends the call's [in] parameters to the server. The client is then free to do other work while the server is processing the call. The client uses the Finish method to complete the call and retrieve the [out] parameters.

The server can optionally handle calls asynchronously by providing an implementation of the asynchronous interface. The motivation for doing so is discussed in depth in Section 2 of this document, but in essence handling calls asynchronously can make a server vastly more scalable.

The beauty of the new asynchronous architecture is that the client and the server are totally independent in their synchronous or asynchronous handling of calls. That is, a client can use non-blocking method calls on a server that is fully synchronous. Alternatively, it is possible for a client to make blocking calls to a server that handles those calls asynchronously. The details of this will be covered in "Architecture," Section 3 of this document, but for now it is important to note the independence of the client and server with regard to the synchronicity of calls.

About This Document

This document is designed to be a thorough introduction to non-blocking method calls. It assumes the reader is familiar with COM and Distributed COM (DCOM). If you are looking for an introduction to the topic but aren't a developer, you are encouraged to read Section 2 and the beginning of Section 3.

Terminology

The following terms are used in this document:

client

An application or COM object making a COM call

asynchronous client

A client that uses non-blocking method calls

server

A COM object receiving calls from a client

asynchronous server

A server that handles incoming calls asynchronously

asynchronous calls

In the context of this document, this refers only to non-blocking method calls.

2. Motivation

Because client-side use of asynchronous calls is independent of server-side handling of calls, it's important to consider the motivations for an asynchronous client and an asynchronous server separately. First, this article covers some of the many scenarios that provide a clear motivation for client-side use of non-blocking method calls. Then it covers the more specialized situations where an asynchronous server would be appropriate.

Asynchronous Clients

Non-blocking method calls offer the developer much of the power and flexibility of multi-threading, without thread or synchronization overhead. Asynchronous method calls also have interesting applications for signaling and providing custom timeouts for COM calls. Combine these features with the ease of making and adding support for non-blocking method calls, and you can expect client-side use of non-blocking method calls to become quite widespread.

Slow/multiple servers

Non-blocking method calls are useful in many situations where developers would traditionally have to turn to multi-threading. For example, to make parallel calls to multiple servers using standard calls you would have to spawn multiple threads and have each thread make a call. This can be quite bothersome when you consider parameter passing, synchronization, and signaling. However, non-blocking method calls are ideal in this situation. Using non-blocking method calls, a client can have many calls running in parallel without ever having to worry about multi-threading.

Another clear motivation for using non-blocking method calls is starting long-running calls early. A client using non-blocking method calls can begin a long-running call as soon as possible, and then continue working without ever having to worry about threading issues. Non-blocking method calls can offer significant performance gains in a single-threaded environment by allowing some operations to happen in parallel.

Signaling

Non-blocking method calls can also be used as a signaling mechanism. In this scenario, a client wants to periodically check to see if a particular event has occurred on a server. The server has a Listen() method that will return only when the event has occurred. The client can begin an asynchronous call with Begin_Listen(), and then proceed with other work. When the client wants to know the status of the event, it checks if the asynchronous call has returned. Checking the return status of an asynchronous call is fast and entirely local to the client; that is, nothing goes over "the wire" when a client checks the return status of an asynchronous call. Using this method, the client is able to poll for the status of a server-side event essentially for free.

In a variation of the preceding scenario, a client wants a call back when an event has occurred on the server. Normally, this would require two round-trip calls. First, the client would have to call a server method to register for the event and provide a pointer to the callback interface. Second, the server would have to call that interface when the event occurred. With non-blocking method calls, this can be done in one round trip. Using a technique discussed in Section 3 of this document, the client aggregates the call object in order to receive a callback on method completion. The client then uses the Begin_Listen() method as before. When the event occurs, the server will return from the Listen() method, and the client will receive the call back.

Another useful signaling technique is known as "fire and forget." In this scenario, a client wants to quickly notify multiple servers of an event, and is uninterested in any responses. The client can begin asynchronous calls to the servers and then "forget" finishing these calls by calling Release() on the call object discussed in Section 3 of this document. Because there is no way to check for call errors after the call object is released, "fire and forget" gives no indication if the call fails. However, it's an extremely efficient method for sending a large number of messages. Equivalently, a server can use "fire and forget" to notify multiple clients of an event.

Custom timeouts

Another strong case for using non-blocking method calls is the implementation of custom timeouts. It's now possible for a client to cancel outstanding calls at any time without resource leaks. A client can begin an asynchronous call and then periodically check to see if the call has completed. When the client decides the call has taken too long, it can cancel the call and use the Finish method to clean up all resources. For more information, you are encouraged to examine the timeouts sample included in the Microsoft® Windows® 2000 Platform SDK.

Other uses

There are many other situations where non-blocking method calls could be useful for a client. For example, non-blocking method calls are used in the COM Pipes implementation to provide read-ahead and write-behind data transfer.

Asynchronous Servers

Asynchronous servers are appropriate when scalability is an issue for multi-threaded servers. An asynchronous server can offer thread savings when there are many clients, and it can also give the server fine-grained control over the number of executing threads. Before looking at asynchronous servers in more depth, this article will review the threading situation for a standard multi-threaded server. Good background reading for this section is the "DCOM Architecture" article by Markus Horstmann and Mary Kirtland, specifically Section 6.2.

Standard multi-threaded server

When a Remote Procedure Call (RPC) arrives for a multi-threaded server, the RPC runtime picks a thread from the RPC thread pool. This thread then calls the RPC channel, which calls the server's stub directly. Execution will run through the stub, through the called method, back to the stub, and finally back to the RPC channel. See Figure 1.

Figure 1. Standard multi-threaded server

This architecture scales well unless you have many clients and long method execution times due to complexity or locking. For example, imagine you have 200 clients making calls simultaneously. This means you can have 200 threads that were taken from the RPC thread pool all attempting to execute simultaneously in the server. This situation can be disastrous from a performance perspective if the server uses locking extensively, or if method execution times are long.

Asynchronous multi-threaded server

In an asynchronous multi-threaded server, thread execution begins as before with the RPC thread. This thread calls the stub, which then calls the appropriate Begin method of the asynchronous interface. Continuing the example presented in Section 1, this would be the Begin_Sum method of the AsyncISample interface. Instead of starting work on the call, the Begin method packages the call's parameters into a structure and then places this structure onto a queue. At this point, the Begin method returns, and the RPC thread is returned to the RPC thread pool. Meanwhile, the server has a small number of worker threads that take items off the queue and process them. When a worker thread has finished, it puts its results back in the structure and signals for call completion. Thread execution continues with the appropriate Finish method. The Finish method gets the results of the worker thread and sends them back to the client. See Figure 2.

Figure 2. Asynchronous multi-threaded server

This approach offers two distinct advantages over a standard multi-threaded server. First, there's conservation of RPC threads. Instead of executing the entire method, these threads can quickly run through the Begin method and spend the rest of their time in the free thread pool. The second advantage is the fine-grained control this approach offers over the level of multiprogramming. In cases where the server has a lot of internal synchronization, it is better to have a small number of threads doing steady work than to have a very large number of threads that spend most of their time blocking.
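The queue-and-worker arrangement described above can be modeled in portable C++. The following is only a sketch under stated assumptions: it uses standard library threads in place of the RPC thread pool and the stub, a std::promise/std::future pair in place of the completion signal, and every name in it (SumCall, AsyncSumServer, BeginSum, FinishSum) is hypothetical rather than part of COM.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <future>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical work item: the [in] parameters of Sum plus a slot for the result.
struct SumCall {
    int i;
    int j;
    std::promise<int> result;  // fulfilled by a worker; stands in for "signal"
};

// Hypothetical server model: BeginSum only enqueues, and a fixed number of
// worker threads drain the queue.
class AsyncSumServer {
public:
    explicit AsyncSumServer(unsigned workers) {
        assert(workers > 0);
        for (unsigned n = 0; n < workers; ++n)
            workers_.emplace_back([this] { Run(); });
    }

    ~AsyncSumServer() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            stopping_ = true;
        }
        wake_.notify_all();
        for (auto& t : workers_) t.join();
    }

    // Models Begin_Sum: package the parameters, queue them, return at once.
    std::future<int> BeginSum(int i, int j) {
        SumCall call{i, j, {}};
        std::future<int> pending = call.result.get_future();
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(std::move(call));
        }
        wake_.notify_one();
        return pending;
    }

private:
    // Worker loop: take one item, compute, publish the result.
    void Run() {
        for (;;) {
            SumCall call;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return stopping_ || !queue_.empty(); });
                if (queue_.empty()) return;  // stopping and nothing left to do
                call = std::move(queue_.front());
                queue_.pop_front();
            }
            call.result.set_value(call.i + call.j);
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::deque<SumCall> queue_;
    std::vector<std::thread> workers_;
    bool stopping_ = false;
};

// Models Finish_Sum: wait for completion and collect the result.
int FinishSum(std::future<int>& pending) { return pending.get(); }
```

In this model the thread that calls BeginSum plays the part of the RPC thread: it returns as soon as the item is queued, so the number of threads doing real work is fixed by the constructor argument rather than by the number of callers.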

3. Architecture

Async_uuid and Non-Blocking Method Call-Capable Proxy/Stubs

The key to understanding non-blocking method calls is that both synchronous and asynchronous calls look the same "on the wire." Put another way, a server receiving a call has no way of telling if the client initiated that call synchronously or asynchronously. Similarly, a client can't tell if its calls are being serviced synchronously or asynchronously. To provide this new functionality, COM has given you the opportunity to create proxy/stubs that are capable of non-blocking method calls. These proxy/stubs will provide the infrastructure necessary for the client and server to be totally independent in their synchronous or asynchronous handling of calls. See Figure 3.

Figure 3. Proxy/stubs infrastructure for client and server independence in synchronous or asynchronous handling of calls.

Creating these proxy/stubs is as simple as adding the async_uuid attribute to the server object's IDL file. This is done on a per-interface basis, so an object with multiple interfaces can decide which interfaces, if any, should support non-blocking method calls. Adding async_uuid will cause the MIDL compiler to create an asynchronous version of the indicated interface. For example, if an interface ISample with method

 Sum([in] int i,[in] int j, [out,retval] int * sum) 

has been marked with async_uuid, then MIDL would create a new interface AsyncISample with methods

Begin_Sum([in] int i, [in] int j) 
Finish_Sum([out,retval] int * sum)

If the server wishes to handle calls asynchronously, it provides an implementation of this asynchronous interface for use by the stub. Whether or not the server provides an implementation of the asynchronous interface, it is still usable by the client. As we'll see, the proxy will provide a call object that implements the asynchronous interface. Calling Begin_Sum will cause the call object to send a standard asynchronous RPC call of the method Sum. A call to Finish_Sum on the call object will copy the RPC call's results into the Finish method's parameters.

Client Side

Creating the call object

Non-blocking method calls begin on the client side with a query for the ICallFactory interface. With this query, a client determines whether or not the proxy is capable of non-blocking method calls. If one of the server's interfaces was marked with async_uuid, this call will succeed, and the client will have a pointer to the ICallFactory interface. The client then calls ICallFactory::CreateCall, specifying a particular asynchronous interface to obtain a proxy-supplied call object. The following code illustrates call object creation.

//Get ICallFactory from the Server
hr = pSimpleObj->QueryInterface(IID_ICallFactory, (void **)&pICallFactory);

if(FAILED(hr)){
   //No support for non-blocking method calls.
   ...
}else{
   //Proxy supports non-blocking method calls.

   //Create the call object for AsyncISample
   pICallFactory->CreateCall(IID_AsyncISample,NULL,IID_IUnknown,
                               (IUnknown**)&pIUnknown);
   //pIUnknown points to the call object.
...

The call object is the client's point of reference for beginning, monitoring, and finishing asynchronous calls. Call objects are created for a particular asynchronous interface, and every outstanding call needs its own call object. The first parameter to CreateCall specifies the asynchronous interface for which the call object should be created. The second parameter allows for aggregation of the call object by the client. The third parameter specifies which of the call object's interfaces should be returned in the fourth parameter. Note that call objects cannot be used outside of the apartment in which they were created. See Figure 4.

Figure 4. Call object for beginning, monitoring, and finishing asynchronous calls

The call object supports two new standard interfaces, ICancelMethodCalls and ISynchronize, as well as the asynchronous interface specified by the first parameter to CreateCall. As seen below, ICancelMethodCalls provides the client the opportunity to request cancellation of an outstanding call, and ISynchronize gives the client a way to check the completion status of an outstanding call.

Beginning a call

When the client is ready to begin a non-blocking method call, it calls the appropriate Begin method of the asynchronous interface on the call object. The pointer to this interface is obtained from CreateCall directly or by querying the pointer obtained in CreateCall for the asynchronous interface, as seen below. When the client calls the Begin method, all the [in] parameters are marshaled and sent via asynchronous RPC to the server exactly as if the client had just called the equivalent synchronous method. The difference is that the Begin method returns immediately, and the client is free to continue working. Continuing the code sample from above:

   //Get the asynchronous interface from the call object.
   pIUnknown->QueryInterface(IID_AsyncISample,(void**)&pAsyncISample);

   pAsyncISample->Begin_Sum(2,3); //The proxy will make a standard call
...

Waiting for call completion

Eventually, the COM call generated by the Begin method will return with the method's [out] parameters. The client uses the ISynchronize interface on the call object to check for this event. Specifically, ISynchronize::Wait will return S_OK when an asynchronous call has returned and is ready to be finished. Wait(0,0) returns immediately, which gives the client a chance to poll for the completion of a call. Wait times other than 0 will pause execution for the specified period, or until the asynchronous call has returned. If the caller is in a single-threaded apartment, it will enter its message loop; if the caller is in a multi-threaded apartment, the calling thread will be blocked. Note that calls to the call object, including ISynchronize::Wait, are entirely local to the client. Wait simply checks whether the asynchronous RPC call has returned, and does not transmit anything to the server.
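The polling and timed-wait behavior described above can be illustrated with a small portable model. This is a sketch, not the COM ISynchronize interface: the class name ManualResetEvent and its bool-returning Wait are assumptions made for the example, and a condition variable stands in for whatever mechanism the real call object uses.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the call object's synchronization state.
// It mirrors the Signal/Reset/Wait behavior described in the text only.
class ManualResetEvent {
public:
    // Marks the call as complete and wakes every waiter.
    void Signal() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            signaled_ = true;
        }
        changed_.notify_all();
    }

    // Returns the event to the non-signaled state before a new call begins.
    void Reset() {
        std::lock_guard<std::mutex> lock(mutex_);
        signaled_ = false;
    }

    // Waits up to `timeout` for the signal. A zero timeout polls and returns
    // at once. Returns true if signaled, false if the timeout expired.
    bool Wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return changed_.wait_for(lock, timeout, [this] { return signaled_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    bool signaled_ = false;
};
```

Because the event stays signaled until Reset is called, a client in this model can poll with a zero timeout as often as it likes after completion and will keep seeing the same answer.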

Canceling a call

If a client wants to send a cancellation request for an outstanding asynchronous call, it queries the proxy-supplied call object for the ICancelMethodCalls interface. The client sends the cancellation request by calling ICancelMethodCalls::Cancel and specifying a timeout. If the asynchronous call has already returned, or if the call returns during the cancel timeout interval, Cancel will return RPC_E_CALL_COMPLETE. If the server has received the request, Cancel returns S_OK. Note that S_OK does not mean that the call was cancelled, only that the request was sent to the server. In either case the client must call the Finish method to release all resources allocated for the call. Once the client has called Finish, it is free to make other non-blocking method calls or to Release the call object.

The client-side cancellation request gives no guarantee that the server will actually cancel the call. It simply sends a cancellation request that the server can check for using ICancelMethodCalls::TestCancel. A well-behaved server will occasionally check for call cancellation and respond to this request as seen in Section 4 of this document. The following code continues the preceding example and shows the usage of ISynchronize and ICancelMethodCalls.

   //Get the ISynchronize interface from the call object.
   pIUnknown->QueryInterface(IID_ISynchronize,(void**)&pISynchronize);

   hr = pISynchronize->Wait(0,5000); //Give the call 5 sec to return
   if(FAILED(hr)){
      //Call did not return in 5 sec.  Cancel the call.
      pIUnknown->QueryInterface(IID_ICancelMethodCalls,
                                (void**)&pICancelMethodCalls);
      pICancelMethodCalls->Cancel(0); //Don't wait for acknowledgement
      
      //Call Finish to free all resources. The value of sum is undefined.
      pAsyncISample->Finish_Sum(&sum);
      
      ...
   }else{
      //Call returned in less than 5 sec and is ready to be finished.
      pAsyncISample->Finish_Sum(&sum);
   }
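
The cooperative nature of cancellation, where the client only requests it and the server decides when to honor it, can be modeled without COM. In this sketch an atomic flag stands in for the cancellation request that a server would test with ICancelMethodCalls::TestCancel. The names CallState, CallResult, and SlowSum, as well as the cancelAtStep test hook, are hypothetical.

```cpp
#include <atomic>

// Hypothetical model of a cancellable call. The client side sets the flag
// (as a cancellation request would); the server side polls it.
struct CallState {
    std::atomic<bool> cancelRequested{false};
};

enum class CallResult { Completed, Cancelled };

// Sums 1..n one step at a time, checking for a cancellation request before
// each step. `cancelAtStep` is a test hook that simulates a client requesting
// cancellation while the call is running; pass -1 to disable it.
CallResult SlowSum(CallState& state, int n, int cancelAtStep, long& sum) {
    sum = 0;
    for (int step = 1; step <= n; ++step) {
        if (step == cancelAtStep) state.cancelRequested.store(true);
        if (state.cancelRequested.load()) return CallResult::Cancelled;
        sum += step;
    }
    return CallResult::Completed;
}
```

When the model returns CallResult::Cancelled, the value left in sum is only a partial result, which mirrors the point that a client should not rely on the [out] parameters of a cancelled call.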

Another method of terminating a call on the client side is by calling Release on the call object. When a call object is being destroyed, it will free any resources it allocated, including those allocated for an outstanding call. Note that the server does not receive notification that the call object has been released, and the client is unable to check for errors or receive call results. Still, as mentioned in Section 2, this technique of "firing" a call and then "forgetting" it by releasing the call object can be very useful as a signaling mechanism.

Finishing a call

When a client wants to complete a non-blocking method call, it uses the appropriate Finish method of the asynchronous interface. If the RPC call has returned, the [out] parameters are copied out of the RPC buffer and into the client's address space. Note that Finish will call ISynchronize::Wait, so calling the Finish method before the RPC call has returned is equivalent to calling Wait(0,INFINITE) and then Finish. After a call to Finish has returned, the call object can be released or reused for another asynchronous call. You should reuse call objects for increased performance, but it is important to remember that a call object can only support one call at a time. This means that a client has to create a call object for each overlapping non-blocking method call.

Aggregating the call object

If the client wants a call back when a non-blocking method call has returned and is ready to be finished, it can aggregate the call object. In the course of an asynchronous call, the proxy will query the call object for the ISynchronize interface. When the asynchronous call has returned from the server and is ready to be Finished, the proxy will call ISynchronize::Signal(). If a client wants to receive a call back, it can create an object with a custom implementation of ISynchronize. By passing in the controlling unknown from this object as the second parameter to CreateCall, the client can force the proxy to use its implementation of ISynchronize. ISynchronize::Signal() will then be the client's call back when an asynchronous call on the call object has returned and is ready for a call to Finish.

Providing a custom implementation of ISynchronize is fairly easy. One way is for the client to pass ISynchronize::Wait and ISynchronize::Reset calls straight through to the call object's ISynchronize interface. Calls to ISynchronize::Signal would first be passed through to the call object and then used as a call back. Another option is containment of the system-supplied synchronization object CLSID_ManualResetEvent. Note that because the Finish method calls Wait, a client must complete a call to Signal before calling Finish. This can be seen in the following pseudo code.

// Implementation of my custom ISynchronize::Signal.  This method will be called when the
// COM call returns, or in the event of a timeout
HRESULT CustomSyncObject::Signal(){
   HRESULT hr = m_pISynchronize->Signal(); //Signal on contained sync object first.
   if(FAILED(hr)) return hr;
   return m_pAsyncISample->Finish_Sum(&m_sum); //Finish calls Wait, which now succeeds.
}
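
The pass-through approach to a custom synchronization object can also be sketched in portable C++. The interface below is a simplified, hypothetical model with the three method names used in the text plus an IsSignaled query standing in for Wait(0,0). It is not the COM definition, and it is single-threaded for brevity.

```cpp
#include <functional>
#include <utility>

// Hypothetical minimal interface modeled on the ISynchronize methods named
// in the text. IsSignaled stands in for a zero-timeout Wait.
struct Synchronize {
    virtual ~Synchronize() = default;
    virtual void Signal() = 0;
    virtual void Reset() = 0;
    virtual bool IsSignaled() const = 0;
};

// Stand-in for the inner (contained) synchronization object.
class FlagEvent : public Synchronize {
public:
    void Signal() override { signaled_ = true; }
    void Reset() override { signaled_ = false; }
    bool IsSignaled() const override { return signaled_; }

private:
    bool signaled_ = false;
};

// Wrapper that forwards every call to the inner object and, after
// forwarding Signal, runs a client-supplied completion callback.
class CallbackSync : public Synchronize {
public:
    CallbackSync(Synchronize& inner, std::function<void()> onComplete)
        : inner_(inner), onComplete_(std::move(onComplete)) {}

    void Signal() override {
        inner_.Signal();                 // inner object first, so a later wait succeeds
        if (onComplete_) onComplete_();  // then notify the client
    }
    void Reset() override { inner_.Reset(); }
    bool IsSignaled() const override { return inner_.IsSignaled(); }

private:
    Synchronize& inner_;
    std::function<void()> onComplete_;
};
```

The ordering inside Signal is the design point: the inner object is signaled before the callback runs, so a callback that goes on to finish the call finds the event already set.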

Server Side

A server is free to handle incoming calls either synchronously or asynchronously regardless of how the call was generated. The first time a call arrives, the stub queries the server for the ICallFactory interface. If this query fails, the stub will complete the call—and all future calls—with the server's synchronous interfaces. If the query succeeds, the stub will always attempt to handle calls asynchronously.

Create call

The server indicates that it will handle incoming calls asynchronously by implementing the ICallFactory interface. When an RPC call arrives, the RPC runtime will choose a thread from the RPC thread pool. This thread will eventually call the stub, which will then call the server's ICallFactory::CreateCall method. This method is used to provide a server-side call object for use by the stub. This call object is usually a coclass that implements the asynchronous interface indicated by the first parameter in CreateCall. See Figure 5.

Figure 5. Method used to provide a server-side call object for use by the stub

It is important to note several things about creating a call object on the server side. The first is that CreateCall takes place on the object that also implements the synchronous version of the requested interface. This gives the server the opportunity to synchronize data or pass pointers between objects that implement the synchronous and asynchronous interfaces. Second, the second parameter to CreateCall, (void**) pUnk, is now a controlling unknown. If this parameter is non-null, the server must aggregate the call object with this unknown. Finally, the call object needs to provide an implementation of ISynchronize. This can be done by hand or, more commonly, by aggregating the system-supplied synchronization object CLSID_ManualResetEvent.

Note that if CreateCall fails, the stub will still attempt to complete the call using the synchronous interface.

Begin

If CreateCall finished successfully, the stub then calls the Begin method. The Begin method should first reset the synchronization object by calling ISynchronize::Reset. Begin must also record that a call is in progress by setting a state flag. This flag will be checked by Finish to ensure that the call is being made properly. At this point, Begin can package all the [in] parameters into a data structure and queue them for processing by a worker thread.

Signal

When the server has finished processing the call and wishes to return the results, it calls ISynchronize::Signal on the call object. When Signal is called, the stub will use the calling thread to execute the appropriate Finish method.

Finish

The Finish method gives the server access to all the [out] parameters of the original call. Before filling these parameters, Finish should check the state flag to ensure that there is a call being processed and then call Wait(0,INFINITE). Finish should then simply fill the [out] parameters so they can be shipped back to the client.

Call object reuse

Note that the stub will attempt to minimize the number of calls it makes to CreateCall by reusing call objects. The developer should avoid putting any state in the call object that is not reset between calls.
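
The Begin, Signal, and Finish responsibilities described above, including the state flag and the reset that makes a call object safe to reuse, can be summarized in a portable model. This is a sketch under stated assumptions: SumCallObject and its Begin, Complete, and Finish methods are hypothetical names, bool return values stand in for HRESULTs, and a condition variable stands in for the synchronization object.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical server-side call object for Sum.
class SumCallObject {
public:
    // Models the Begin method. Returns false if a call is already in progress.
    bool Begin(int i, int j) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (inProgress_) return false;
        done_ = false;       // reset the synchronization state
        inProgress_ = true;  // record that a call is outstanding
        i_ = i;
        j_ = j;
        return true;
    }

    // Models the worker finishing the job and signaling completion.
    void Complete() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            result_ = i_ + j_;
            done_ = true;
        }
        completed_.notify_all();
    }

    // Models the Finish method. Returns false if no call was begun; otherwise
    // waits for completion, fills the [out] parameter, and clears the state
    // flag so the object can be reused for another call.
    bool Finish(int* sum) {
        assert(sum != nullptr);
        std::unique_lock<std::mutex> lock(mutex_);
        if (!inProgress_) return false;
        completed_.wait(lock, [this] { return done_; });
        *sum = result_;
        inProgress_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable completed_;
    bool inProgress_ = false;
    bool done_ = false;
    int i_ = 0;
    int j_ = 0;
    int result_ = 0;
};
```

All per-call state in this model is overwritten in Begin or cleared in Finish, which is the property a reused call object needs.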

4. Implementation

Our first stop in the implementation tour will be the creation of a simple asynchronous server object. This object, SimpleSvr, will show the minimal requirements to allow client-side use of non-blocking method calls. Next, we will tour client implementations that make non-blocking method calls in C++, Java, and Microsoft Visual Basic®. Our final stop will be FullServ, a server that handles incoming calls asynchronously. For more complete samples than those shown here, check the AsyncCalls sample in the Windows 2000 Platform SDK.

Simple Async Server

Creating an object that allows asynchronous calling is surprisingly simple; it requires only a few lines of Interface Definition Language (IDL). However, the first step is to create a standard COM object. The example we'll use here is called SimpleSvr. This ATL-generated component has one interface, ISimpleSvr, with one method as follows:

 HRESULT Sum([in] int i,[in] int j, [out,retval] int * sum) 

C++ Support

To allow C++ clients to make asynchronous calls on the ISimpleSvr interface, we need only add the async_uuid attribute to the .idl file as seen below. This will tell MIDL to create proxy/stubs that are capable of non-blocking method calls and the asynchronous version of ISimpleSvr called AsyncISimpleSvr. To clients, SimpleSvr will now appear to have an ICallFactory interface that they can use to create call objects. Clients can then query for AsyncISimpleSvr on these call objects and begin making non-blocking method calls.

   [
      object,
      uuid(5A7D9165-635B-4858-AC86-89F2958B6230),
      async_uuid(8ABD531E-1BFB-4a78-A951-CA5C8FA8999D),
      helpstring("ISimpleSvr Interface"),
      pointer_default(unique)
   ]
   interface ISimpleSvr : IUnknown
   {
      HRESULT Sum([in] int i,[in] int j, [out,retval] int * sum);
   };

Java and Visual Basic support

To support Java and Visual Basic, we need to add information about the supported interfaces to the type library. First, we need to add the ICallFactory interface to the SimpleSvr coclass. Remember, even though SimpleSvr doesn't implement ICallFactory in this case, clients can still use this interface because it is implemented by the proxy. Second, we need to add a coclass for the call objects created by CreateCall. Again, even though SimpleSvr doesn't have a coclass AsyncSimpleSvr, we will use a pointer to AsyncSimpleSvr to point to the proxy-created call object.

   coclass SimpleSvr
   {
      [default] interface ISimpleSvr;
      interface ICallFactory;
   };
   [
      uuid(C2F496A7-E72C-4EAF-864F-050EC89EC64E)
   ]
   coclass AsyncSimpleSvr
   {
      [default] interface AsyncISimpleSvr;
      interface ISynchronize;
      interface ICancelMethodCalls;
   };
};

Client-Side Samples

Please note that these examples are illustrative only and do not represent best practices. Specifically, error checks and comments have been omitted in the interests of readability. Real clients must check the HRESULT of calls and take appropriate action in case of failure.

These samples have been broken down into four steps: Step 1 is creating the server object. This step will be familiar to any COM developer. Step 2 is creating the proxy-supplied call object with CreateCall. In this step, we will also obtain a pointer to the asynchronous interface implemented by the call object. Step 3 will be obtaining a pointer to the call object's ISynchronize interface. Note that Step 3 is entirely optional in the following code. As mentioned previously in this document, calling Finish before a call has completed is equivalent to calling Wait(0,INFINITE) and then Finish. This means we could omit the query for ISynchronize and Wait without any change in program behavior. Finally, Step 4 will be making the asynchronous call including calls to Begin, Wait, and Finish.

C++

//1 Create the server.
CoCreateInstance(CLSID_SimpleSvr,NULL,CLSCTX_LOCAL_SERVER,IID_IUnknown, (void**)&pIUnknown);

//2 Create the call object.
pIUnknown->QueryInterface(IID_ICallFactory,(void **)&pICallFactory);
pICallFactory->CreateCall(IID_AsyncISimpleSvr,NULL,IID_AsyncISimpleSvr,
                          (IUnknown**)&pAsyncISimpleSvr);

//3 Get the ISynchronize interface.
pAsyncISimpleSvr->QueryInterface(IID_ISynchronize,
                                 (void**)&pISynchronize);

//4 Make async call.
pAsyncISimpleSvr->Begin_Sum(2,3);

cout<<"Waiting for async call to finish"<<endl;
pISynchronize->Wait(0,INFINITE);

pAsyncISimpleSvr->Finish_Sum(&i);
cout<<"Sum of 2 and 3 is: "<< i <<endl;

Java

This code is nearly identical to the C++ client. There are only two major differences. The first is the necessity of using arrays to pass interface pointers. The second is the try/catch block in Step 4.

In Java, all calls to ISynchronize::Wait should be made in a try/catch block, as in Step 4. ISynchronize::Wait uses the value of the HRESULT to signal call completion. However, in Java, HRESULTs other than S_OK result in an exception. The developer can take advantage of the required try/catch to get information about the HRESULT with something like:

boolean callComplete;
try{ 
   pISynchronize.Wait(0,TIMEOUT);
   callComplete = true;
}catch(com.ms.com.ComFailException e){
   callComplete = false;
}

Developers uncomfortable with this method of obtaining the result of a Wait can instead make the wait call through a helper DLL as in the Visual Basic section below.

//1 Create the server.
SimpleSvr pSimpleSvr = new SimpleSvr();

//2 Create the call object.
ICallFactory pICallFactory = (ICallFactory)pSimpleSvr;
AsyncISimpleSvr arrayAsyncISimpleSvr[] = new AsyncISimpleSvr[1];
pICallFactory.CreateCall(AsyncISimpleSvr.iid,null,AsyncISimpleSvr.iid,
                         arrayAsyncISimpleSvr);
AsyncISimpleSvr pAsyncISimpleSvr = arrayAsyncISimpleSvr[0];

//3 Get the ISynchronize interface.
ISynchronize pISynchronize = (ISynchronize) pAsyncISimpleSvr;

//4 Make async call.
pAsyncISimpleSvr.Begin_Sum(2,3);

System.out.println("Waiting for async call to finish");
try{
   pISynchronize.Wait(0,-1); // -1 = INFINITE
}catch (com.ms.com.ComFailException e){}

i = pAsyncISimpleSvr.Finish_Sum();
System.out.println("Sum of 2 and 3 is: " + i );

Visual Basic

Because CreateCall is not automation-compliant, Visual Basic cannot make this call directly. In the solution shown here, Visual Basic calls into a DLL, which then calls CreateCall. The relevant portions of this helper DLL, "HelpDLL," can be seen in the following example:

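// extern "C" keeps the exported names free of C++ decoration so that they
// match the _Name@N aliases used in the Visual Basic Declare statements.
extern "C" __declspec(dllexport) HRESULT __stdcall GetCallObject(
                                                ICallFactory* pCf,
                                                REFIID riid1,
                                                IUnknown* pUnkOuter,
                                                REFIID riid2,
                                                IUnknown** ppv ) {
   return  pCf->CreateCall(riid1,pUnkOuter,riid2,ppv);
}

extern "C" __declspec(dllexport) HRESULT __stdcall WaitCallObject(
                                               ISynchronize* pSync,
                                               DWORD dwMilli){
    return pSync->Wait(0, dwMilli);
}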

Visual Basic also has to do some extra work to get the IID of the asynchronous interface. First the Visual Basic client declares a data structure to hold the IID. The client then declares all external functions. In this case, the client declares both functions in the "HelpDll" and the CLSIDFromString function, which will be used to fill the GUID data structure with the IID of AsyncISimpleSvr. Except for these simple steps, the non-blocking method call has the same structure as the C++ and Java examples. Note that Visual Basic is able to get the HRESULT of the calls made through the helper DLL.

'A. Define Data struct to hold GUID
Private Type GUID
    Data1 As Long
    Data2 As Integer
    Data3 As Integer
    Data4(7) As Byte
End Type

'B. Declare external functions

Private Declare Function GetCallObject Lib "HelpDll" Alias "_GetCallObject@20" (ByVal pCf As ICallFactory, riid1 As GUID, ByVal pUnkOuter As Long, riid2 As GUID, ppv As Object) As Long

Private Declare Function WaitCallObject Lib "HelpDll" Alias "_WaitCallObject@8" (ByVal pSync As ISynchronize, ByVal dwMilli As Long) As Long

Private Declare Function CLSIDFromString Lib "OLE32" (ByVal lpszCLSID As Long, pclsid As GUID) As Long

'Declare Variables
Private Const szIID_AsyncISimpleSvr As String = "{8ABD531E-1BFB-4a78-A951-CA5C8FA8999D}"

Private IID_AsyncISimpleSvr As GUID
Private i As Long
Private pICallFactory As ICallFactory
Private pISynchronize As ISynchronize
Private pAsyncISimpleSvr As AsyncISimpleSvr
Private pSimpleSvr As SimpleSvr


Private Sub Form_Load()

'1 Create the server
Set pSimpleSvr = CreateObject("Simple.SimpleSvr")

'Get the IID of the Async interface for use in GetCallObject
Call CLSIDFromString(StrPtr(szIID_AsyncISimpleSvr), _
                      IID_AsyncISimpleSvr)

'2 Create the call object
Set pICallFactory = pSimpleSvr
Call GetCallObject(pICallFactory, IID_AsyncISimpleSvr, 0, _
                   IID_AsyncISimpleSvr, pAsyncISimpleSvr)

'3 Get the ISynchronize interface
Set pISynchronize = pAsyncISimpleSvr

'4 make async call
Call pAsyncISimpleSvr.Begin_Sum(2, 3)

Text1.Text = "Waiting for async call to finish"
Call WaitCallObject(pISynchronize, -1) '-1 = INFINITE

i = pAsyncISimpleSvr.Finish_Sum()
Text1.Text = "Sum of 2 and 3 is: " + Str(i)
End Sub
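
To make the role of the GUID structure more concrete, the following portable C++ sketch shows how a string in the registry format maps onto the Data1 through Data4 fields. It is an illustration under stated assumptions, not the implementation of CLSIDFromString: GuidModel and parseGuid are hypothetical names, and the parser accepts only the braced, hyphenated form used in the Visual Basic example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>

// Mirrors the field layout of the Visual Basic GUID type (hypothetical name).
struct GuidModel {
    std::uint32_t Data1;
    std::uint16_t Data2;
    std::uint16_t Data3;
    std::uint8_t  Data4[8];
};

// Hypothetical helper: parses "{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}".
// Returns false if the string does not have exactly that shape.
inline bool parseGuid(const std::string& s, GuidModel& out) {
    if (s.size() != 38 || s.front() != '{' || s.back() != '}') return false;
    for (std::size_t k = 1; k < 37; ++k) {
        bool dashPos = (k == 9 || k == 14 || k == 19 || k == 24);
        if (dashPos) {
            if (s[k] != '-') return false;
        } else if (!std::isxdigit(static_cast<unsigned char>(s[k]))) {
            return false;
        }
    }
    // Converts len hexadecimal digits starting at pos.
    auto hex = [&s](std::size_t pos, std::size_t len) {
        return std::stoul(s.substr(pos, len), nullptr, 16);
    };
    out.Data1 = static_cast<std::uint32_t>(hex(1, 8));
    out.Data2 = static_cast<std::uint16_t>(hex(10, 4));
    out.Data3 = static_cast<std::uint16_t>(hex(15, 4));
    // The fourth group supplies the first two bytes of Data4 ...
    out.Data4[0] = static_cast<std::uint8_t>(hex(20, 2));
    out.Data4[1] = static_cast<std::uint8_t>(hex(22, 2));
    // ... and the fifth group supplies the remaining six.
    for (std::size_t b = 0; b < 6; ++b) {
        out.Data4[2 + b] = static_cast<std::uint8_t>(hex(25 + 2 * b, 2));
    }
    return true;
}
```

Applied to the IID string from the Visual Basic listing, this sketch would place 8ABD531E in Data1, 1BFB in Data2, 4A78 in Data3, and the bytes A9 51 CA 5C 8F A8 99 9D in Data4.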

Server Side

Implementing ICallFactory

Note that ICallFactory::CreateCall is essentially a class factory for call objects. In this case, the call objects are CAsyncFullServ instances. As you can see below, CAsyncFullServ provides an implementation of AsyncIFullServ for use by the stub. Note that CreateCall passes a pointer from the CFullServ instance to the CAsyncFullServ instances. This allows the objects implementing the synchronous and asynchronous interfaces to share data.

STDMETHODIMP CFullServ::CreateCall(REFIID riid1,
                           IUnknown * pUnk,
                           REFIID riid2,
                           IUnknown ** ppObj){
   HRESULT hr;

   //Check parameters.
   if (ppObj == NULL) return E_POINTER;
   if(riid1 != IID_AsyncIFullServ || (pUnk && riid2 != IID_IUnknown))         
      return E_INVALIDARG;
   
   //Create the call object aggregating if necessary.
   CComPolyObject<CAsyncFullServ>* pPolyFullServ = NULL;
   hr = CComPolyObject<CAsyncFullServ>::CreateInstance(pUnk,&pPolyFullServ);
   if (FAILED(hr)) return hr; //Creation failed; there is nothing to initialize.

   //Pass a pointer from this CFullServ instance to the CAsyncFullServ instance
   pPolyFullServ->m_contained.Init(this);

   //Get requested interface
   return pPolyFullServ->QueryInterface(riid2,(void**)ppObj);
}
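
The parameter check at the top of CreateCall encodes two rules: the first IID must name the asynchronous interface this factory supports, and when an outer unknown is supplied (aggregation), the second IID must be IUnknown. The following portable sketch isolates that decision as a pure function. It is a model only; Iid, Check, and checkCreateCallArgs are hypothetical names and stand in for the real IIDs and HRESULTs.

```cpp
#include <cassert>

// Hypothetical stand-ins for interface IDs and result codes.
enum class Iid { IUnknown, AsyncIFullServ, ISynchronize, Other };
enum class Check { Ok, InvalidArg, NullPointer };

// Models the argument validation performed before a call object is created.
inline Check checkCreateCallArgs(Iid riid1, bool hasOuterUnknown,
                                 Iid riid2, const void* ppObj) {
    if (ppObj == nullptr) return Check::NullPointer;       // like E_POINTER
    if (riid1 != Iid::AsyncIFullServ) return Check::InvalidArg;
    // When aggregating, only IUnknown may be requested.
    if (hasOuterUnknown && riid2 != Iid::IUnknown) return Check::InvalidArg;
    return Check::Ok;
}
```

Under this model a non-aggregated request may ask for any interface on the call object, whereas an aggregated request for anything other than IUnknown is rejected.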

Begin

As you can see, the Begin method first checks a flag to ensure that the call object is only being used for one call at a time. Begin then stores a pointer to the ISynchronize interface for later use and resets the synchronization object. Finally, Begin puts the [in] parameters into member variables and queues the CAsyncFullServ instance created by CreateCall for work by the thread pool. In this case, the thread pool being used is the new Windows 2000 system-supplied thread pool. This Windows 2000 feature is covered in detail in "New Windows 2000 Pooling Functions Greatly Simplify Thread Management," by Jeffrey Richter, in the April 1999 edition of the Microsoft Systems Journal.

Note that the HRESULT returned by the Begin method will not be seen by the client. In case of an error, Begin may wish to store an error code so that the Finish method can return it to the client.

STDMETHODIMP CAsyncFullServ::Begin_Sum(int i, int j){

   HRESULT hr;
   ISynchronize* pISynchronize;
   if(m_callInProgress) return RPC_S_CALLPENDING;
   m_callInProgress = TRUE;
   m_hr = S_OK;
   
   //Get and reset synchronization object.
   hr = ((AsyncIFullServ *)this)->QueryInterface(IID_ISynchronize,
                                      (void**)&pISynchronize);
   ASSERT(SUCCEEDED(hr)); 
   pISynchronize->Reset();

   //Store [in] parameters.
   m_i = i;
   m_j = j;

   //Queue for processing by a worker thread.
   if(! QueueUserWorkItem(WorkerFcn, this, WT_EXECUTEDEFAULT)){
      m_hr = E_FAIL; //Record failure for Finish method.
      pISynchronize->Signal();
   }

   pISynchronize->Release();
   return S_OK;
}//RPC thread returns to the RPC thread pool.

When a worker thread pulls CAsyncFullServ off the queue, it begins execution at WorkerFcn. This function simply calls the doSum method of the CAsyncFullServ instance. The worker thread continues execution in the doSum method and first checks if the call has been canceled. If the call has been canceled, it sets a flag and signals. If the call was not canceled, the worker thread executes the remainder of the function and then calls Signal.

DWORD WINAPI WorkerFcn(LPVOID context){
   CoInitializeEx(NULL,COINIT_MULTITHREADED);
   CAsyncFullServ * pAsyncFullServ = (CAsyncFullServ *) context;
   pAsyncFullServ->doSum();
   CoUninitialize();
   return 0;
}

STDMETHODIMP CAsyncFullServ::doSum(){
   HRESULT hr;
   ISynchronize* pISynchronize;

   //Get a pointer to ISynchronize.
   hr = ((AsyncIFullServ *)this)->QueryInterface(IID_ISynchronize,
                                      (void**)&pISynchronize);
   ASSERT(SUCCEEDED(hr)); 

   //Attempt to get pointer to ICancelMethod Calls.
   ICancelMethodCalls * pICancelMethodCalls = NULL;
   hr = ((AsyncIFullServ *)this)->QueryInterface(IID_ICancelMethodCalls,
                                      (void**)&pICancelMethodCalls);
   //Check for call cancellation.
   if(SUCCEEDED(hr)){
      hr = pICancelMethodCalls->TestCancel();
      if(hr == RPC_E_CALL_CANCELED){
         m_hr = hr; //Record call cancellation for Finish
         pICancelMethodCalls->Release();
         pISynchronize->Signal();
         pISynchronize->Release();
         return S_OK;
      }
   }
   
   //Do the work of the method.
   m_sum = m_i + m_j;

   if(pICancelMethodCalls) pICancelMethodCalls->Release();
   pISynchronize->Signal();
   pISynchronize->Release();
   return S_OK;
}

Finish

The worker thread that called Signal then executes the Finish method. This method checks the m_callInProgress flag set in the Begin method, and then calls Wait to allow the doSum and WorkerFcn functions to return. Finish then checks whether the call failed or was canceled and, if all is well, places the results of the call in the [out] parameter for transfer back to the client.

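STDMETHODIMP CAsyncFullServ::Finish_Sum(int * sum){
   HRESULT hr;
   ISynchronize* pISynchronize;

   //Check the flag set in Begin. E_UNEXPECTED is an assumed error code.
   if(!m_callInProgress) return E_UNEXPECTED;

   //Get a pointer to ISynchronize.
   hr = ((AsyncIFullServ *)this)->QueryInterface(IID_ISynchronize,
                                      (void**)&pISynchronize);
   ASSERT(SUCCEEDED(hr)); 

   //Wait for the worker thread to signal completion.
   pISynchronize->Wait(COWAIT_WAITALL,INFINITE);
   pISynchronize->Release();
   m_callInProgress = FALSE; //Allow the call object to be reused.
   
   if(FAILED(m_hr)){
      return(m_hr); //Call failed or was canceled. Return error code to the client
   }

   *sum = m_sum; //Package out param
   return S_OK;
}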
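
The Begin, worker, and Finish steps described in this section can also be seen end to end in a small portable model. The sketch below is not COM code and makes several assumptions: SumCallModel and Status are hypothetical names, a std::thread stands in for the system thread pool, and a condition variable stands in for the Reset, Signal, and Wait methods of the synchronization object.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical result codes standing in for HRESULTs.
enum class Status { Ok, CallPending, Failed };

// Hypothetical, portable model of a server-side call object.
class SumCallModel {
public:
    ~SumCallModel() {
        if (worker_.joinable()) worker_.join();  // never abandon a running worker
    }

    // Begin: record the inputs, hand the work to another thread, return at once.
    Status Begin_Sum(int i, int j) {
        std::lock_guard<std::mutex> lock(m_);
        if (inProgress_) return Status::CallPending;  // one call at a time
        inProgress_ = true;
        signaled_ = false;                            // plays the role of Reset
        i_ = i;
        j_ = j;
        worker_ = std::thread([this] { doSum(); });
        return Status::Ok;
    }

    // Finish: wait for the worker's signal, then hand back the result.
    Status Finish_Sum(int* sum) {
        {
            std::unique_lock<std::mutex> lock(m_);
            if (!inProgress_) return Status::Failed;       // no matching Begin
            cv_.wait(lock, [this] { return signaled_; });  // plays the role of Wait
        }
        worker_.join();
        std::lock_guard<std::mutex> lock(m_);
        inProgress_ = false;                          // call object may be reused
        if (sum == nullptr) return Status::Failed;
        *sum = sum_;
        return Status::Ok;
    }

private:
    void doSum() {
        {
            std::lock_guard<std::mutex> lock(m_);
            sum_ = i_ + j_;                           // the work of the method
            signaled_ = true;
        }
        cv_.notify_all();                             // plays the role of Signal
    }

    std::mutex m_;
    std::condition_variable cv_;
    std::thread worker_;
    bool inProgress_ = false;
    bool signaled_ = false;
    int i_ = 0, j_ = 0, sum_ = 0;
};
```

In this model a second Begin_Sum issued before Finish_Sum is rejected, and the same object can be used for a new call once Finish_Sum has returned.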

5. Limitations and Pitfalls

Non-blocking method calls are powerful and easy to use. However, there are some limitations and pitfalls that the developer should be aware of before using them.

Limitations

Windows 2000 only

Non-blocking method calls rely on asynchronous RPC, which is available only on the Windows 2000 platform. This means that Windows NT 4.0 clients can't make non-blocking method calls, and Windows NT 4.0 servers can't handle calls asynchronously. The good news is that you can still have asynchronous calls across platforms. That is, a Windows 2000 client can use non-blocking method calls on an object running under Windows NT 4.0, or a Windows 2000 object can asynchronously handle calls made from a Windows NT 4.0 client. To do this, you need two versions of the proxy/stub DLL, one compiled with async_uuid and one compiled without. Install the async_uuid-compiled DLL on Windows 2000 computers, and the other DLL on other platforms. This way the Windows 2000 clients and servers can maintain their functionality but still work with older clients or servers.

Won't work with IDispatch

You can't use asynchronous calls with IDispatch or any interface that inherits from IDispatch. IDispatch doesn't separate in and out parameters, and this is a requirement for generating an asynchronous version of an interface.

Asynchronous servers only in C++

On the server side, CreateCall requires aggregation if the second parameter is non-null. Because a server has to implement CreateCall to handle calls asynchronously, only aggregation-aware languages can be used to create asynchronous servers. At this time, only C++ is aggregation-aware, and thus only C++ can be used to create asynchronous servers.

Needs proxy/stub

All asynchronous calls must go through a proxy/stub. This means you can't use type library marshaling with asynchronous calls. It also means that you can't use asynchronous calls on in-process components in the same apartment. However, if the server implements ICallFactory, you can still use the asynchronous interface in this situation. In this case, the client would be calling CreateCall, Begin, and Finish directly on the in-process component, and all work would be done synchronously.

Pitfalls

Call order dependence

Before marking an interface as supporting asynchronous calls, you should make sure that the server isn't call-order-dependent. Using non-blocking method calls, a single-threaded client can now have two or more outstanding calls. However, the order in which the client began the calls is not guaranteed to be the order in which those calls are executed on the server. For example, a client may call Begin_A and then Begin_B, and due to thread scheduling the server may end up executing method B and then method A. This can be disastrous if method B depends on state set by method A.
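
The hazard can be reproduced in a deterministic, thread-free model. The sketch below is illustrative only: OrderedServerModel and PendingCalls are hypothetical names, and running the pending calls newest-first simply represents one ordering that thread scheduling might produce on a real server.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical server whose method B depends on state set by method A.
struct OrderedServerModel {
    int state = 0;
    void A() { state = 10; }                // sets the state
    int  B() const { return state * 2; }    // depends on that state
};

// Models a server that is free to run outstanding calls in any order.
class PendingCalls {
public:
    void begin(std::function<void()> call) { pending_.push_back(std::move(call)); }

    // Runs the outstanding calls newest-first, one ordering the server may pick.
    void finishAllReversed() {
        for (auto it = pending_.rbegin(); it != pending_.rend(); ++it) (*it)();
        pending_.clear();
    }

private:
    std::vector<std::function<void()>> pending_;
};

// Unsafe: A and B are outstanding together, so B may run before A.
inline int overlappedCalls() {
    OrderedServerModel server;
    PendingCalls calls;
    int result = -1;
    calls.begin([&] { server.A(); });
    calls.begin([&] { result = server.B(); });
    calls.finishAllReversed();
    return result;
}

// Safe: A is finished before B is begun, so the order is fixed.
inline int serializedCalls() {
    OrderedServerModel server;
    PendingCalls calls;
    int result = -1;
    calls.begin([&] { server.A(); });
    calls.finishAllReversed();
    calls.begin([&] { result = server.B(); });
    calls.finishAllReversed();
    return result;
}
```

In this model overlappedCalls returns 0 because B observes the state before A has set it, while serializedCalls returns 20. A client that must preserve order can do so by completing the first call before beginning the second, at the cost of the parallelism.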

Call object reuse

You are encouraged to reuse call objects, but remember that call objects are created for a particular asynchronous interface, and that every outstanding call needs its own call object. Remember also that the stub will reuse server-created call objects.
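
A minimal model of these rules, with no threading and no COM, might look like the following. CallObjectModel and sumTwoInParallel are hypothetical names, and the addition inside Begin_Sum merely stands in for work that a real server would perform.

```cpp
#include <cassert>

// Hypothetical model of a client-side call object for one asynchronous interface.
class CallObjectModel {
public:
    // Returns false if this call object already has a call outstanding.
    bool Begin_Sum(int i, int j) {
        if (outstanding_) return false;
        outstanding_ = true;
        sum_ = i + j;   // stands in for the work done by the server
        return true;
    }

    // Completes the outstanding call and frees the object for reuse.
    bool Finish_Sum(int* sum) {
        assert(sum != nullptr);
        if (!outstanding_) return false;
        outstanding_ = false;
        *sum = sum_;
        return true;
    }

private:
    bool outstanding_ = false;
    int sum_ = 0;
};

// Two calls outstanding at once: each needs its own call object.
inline int sumTwoInParallel(int a, int b, int c, int d) {
    CallObjectModel first, second;
    int x = 0, y = 0;
    first.Begin_Sum(a, b);
    second.Begin_Sum(c, d);
    first.Finish_Sum(&x);
    second.Finish_Sum(&y);
    return x + y;
}
```

The same CallObjectModel can serve any number of calls in sequence, but a second Begin_Sum before the matching Finish_Sum is refused.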

Dynamic interfaces

Registering a proxy/stub DLL that is aware of non-blocking method calls creates several registry entries. Specifically, these include the entries for the server's synchronous and asynchronous interfaces and the following entries:

HKEY_CLASSES_ROOT\Interface\{Synchronous IID}\AsynchronousInterface: <default> = (REG_SZ) {Asynchronous IID}

HKEY_CLASSES_ROOT\Interface\{Asynchronous IID}\SynchronousInterface: <default> = (REG_SZ) {Synchronous IID}

Developers using dynamic interfaces, specifically CoRegisterPSCLSID, need to be aware of the need to create these new registry entries if they want to support asynchronous interfaces.
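
As a sketch of what such code would need to produce, the following helper builds the two key paths and default values shown above from a pair of IID strings. It only formats strings and does not touch the registry; RegistryEntry and asyncInterfaceEntries are hypothetical names, and the IID arguments are assumed to already be in braced string form.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical description of one registry key and its default value.
struct RegistryEntry {
    std::string key;
    std::string defaultValue;  // stored as REG_SZ
};

// Builds the pair of entries that link a synchronous interface to its
// asynchronous counterpart, and the reverse.
inline std::pair<RegistryEntry, RegistryEntry>
asyncInterfaceEntries(const std::string& syncIid, const std::string& asyncIid) {
    const std::string root = "HKEY_CLASSES_ROOT\\Interface\\";
    RegistryEntry syncToAsync{root + syncIid + "\\AsynchronousInterface", asyncIid};
    RegistryEntry asyncToSync{root + asyncIid + "\\SynchronousInterface", syncIid};
    return {syncToAsync, asyncToSync};
}
```

The two entries are mirror images of each other, which is why both IIDs must be known before either entry can be written.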

Call object resource allocation

As soon as a Begin method is called, the call object reserves resources for the call's [out] parameters. These resources will remain allocated until the Finish method is called, or until the call object is Released. Developers should keep this in mind when making non-blocking method calls with large [out] parameters.

Overuse of parallel calls

Because non-blocking method calls make it so easy to make calls in parallel, it is tempting to send a very large number of calls to the server simultaneously. However, remember that if a multi-threaded server handles calls synchronously, then it suffers from the scalability problems discussed in Section 2. Well-designed clients will keep the number of parallel calls at a level appropriate to the scalability of the server.
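
One way a client might bound its parallelism is to keep a fixed-size window of outstanding calls, finishing the oldest call before beginning a new one. The sketch below models this in portable C++ under several assumptions: std::async stands in for the Begin half of a call and future::get for the Finish half, remoteSquare is a hypothetical stand-in for a remote method, and callWithWindow and WindowResult are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <future>
#include <vector>

// Hypothetical stand-in for a remote method; here it just squares its input.
inline int remoteSquare(int x) { return x * x; }

struct WindowResult {
    std::vector<int> results;      // results in the order the calls were begun
    std::size_t peakOutstanding;   // largest number of calls in flight at once
};

// Issues one call per input while never exceeding maxOutstanding calls in flight.
inline WindowResult callWithWindow(const std::vector<int>& inputs,
                                   std::size_t maxOutstanding) {
    assert(maxOutstanding > 0);
    WindowResult out{{}, 0};
    std::deque<std::future<int>> outstanding;
    for (int x : inputs) {
        if (outstanding.size() == maxOutstanding) {
            // Window is full: "Finish" the oldest call before beginning another.
            out.results.push_back(outstanding.front().get());
            outstanding.pop_front();
        }
        // "Begin" a new call.
        outstanding.push_back(std::async(std::launch::async, remoteSquare, x));
        if (outstanding.size() > out.peakOutstanding) {
            out.peakOutstanding = outstanding.size();
        }
    }
    // Drain the calls that are still outstanding.
    while (!outstanding.empty()) {
        out.results.push_back(outstanding.front().get());
        outstanding.pop_front();
    }
    return out;
}
```

The appropriate window size depends on the server; a value chosen for a server that handles calls asynchronously may be far too large for one that handles them synchronously.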

[in,out] parameters between Begin and Finish

With non-blocking method calls, you should pay close attention to parameters that are [in,out]. Consider an interface with one method, Square([in,out] int * i). On the asynchronous interface, this method becomes Begin_Square([in] int * i) and Finish_Square([out] int * i). This is intended to be used as follows:

int i;
...
Begin_Square(&i);
...
Finish_Square(&i);

However, there's nothing stopping a client from calling:

int i,k;
...
Begin_Square( &i );
...
Finish_Square( &k ); //out param copied to a different location.

Or worse:

int i;
int * pInt = &i;
...
Begin_Square( pInt );
...
pInt = NULL;
...
Finish_Square( pInt ); //Access Violation.
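
A client can guard against both mistakes by binding the Begin and Finish halves of the call to a single variable. The following portable sketch shows one such wrapper. It is a model, not COM code: AsyncSquareModel is a hypothetical stand-in for the asynchronous interface (it computes the square locally), and SquareInPlace and squareViaWrapper are hypothetical names.

```cpp
#include <cassert>

// Hypothetical stand-in for the asynchronous version of Square([in,out] int* i).
class AsyncSquareModel {
public:
    void Begin_Square(const int* i) {
        assert(i != nullptr);
        pending_ = (*i) * (*i);   // stands in for the work done by the server
    }
    void Finish_Square(int* i) {
        assert(i != nullptr);
        *i = pending_;
    }
private:
    int pending_ = 0;
};

// Binds both halves of the call to one variable so they cannot diverge.
class SquareInPlace {
public:
    SquareInPlace(AsyncSquareModel& call, int& value)
        : call_(call), value_(value) {
        call_.Begin_Square(&value_);   // [in] half reads from value
    }
    void finish() {
        call_.Finish_Square(&value_);  // [out] half writes to the same place
    }
private:
    AsyncSquareModel& call_;
    int& value_;
};

// Squares x in place through the wrapper.
inline int squareViaWrapper(int x) {
    AsyncSquareModel call;
    SquareInPlace pending(call, x);
    // ... the client is free to do other work here ...
    pending.finish();
    return x;
}
```

Because the wrapper holds a reference rather than a pointer, the address passed to the Finish half cannot be redirected or set to NULL between the two calls. The referenced variable must still outlive the wrapper.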

6. Conclusion

As we've seen, non-blocking method calls are easy to use and have many interesting applications. Non-blocking calls will be especially popular on the client side, where making a non-blocking call is only slightly more work than a standard call. However, developers are encouraged not to overlook the improved server scalability offered by this powerful new COM feature.