House of COM Q&A, MSJ July, 1998


July 1998

Don Box is a co-founder of DevelopMentor where he manages the COM curriculum. Don is currently breathing deep sighs of relief as his new book, Essential COM (Addison-Wesley), is finally complete. Don can be reached at http://www.develop.com/dbox.

COM is five years old. As I write this column in April 1998, developers all over the world are celebrating the fifth anniversary of the OLE2 Professional Developer's Conference. At this 1993 event, Microsoft® shipped the release version of OLE 2.0 and the Component Object Model. To commemorate this anniversary, I am dedicating this month's column to looking at the current state of COM, reflecting on what the COM designers got right, and what needs improvement. Hopefully, COM+ (or whatever Microsoft marketing winds up christening the next generation of COM) will build on the strengths of today's COM and repair the mistakes of the past. It's always best to preface criticism with praise, so I'll begin by looking at aspects of COM that were right on the money and withstood the test of time.
IUnknown
      What can be said about IUnknown that hasn't already been said? IUnknown represents an almost minimal model of what is required to bridge the type system and lifecycle policies of two or more potentially disparate systems.
      Given a reference to an object, most object-oriented languages and systems have a way to coerce the type of the reference to another type based on the actual type of the object. For example, in C++, you can use dynamic_cast:

    bool TryToMeow(Dog &dog) {
        try {
            Cat& cat = dynamic_cast<Cat&>(dog);
            return cat.meow();
        } catch (const std::bad_cast&) {
            return false;
        }
    }
In Java, you can use the C-style cast syntax to achieve the same effect:
    public static boolean TryToMeow(Dog dog) {
        try {
            Cat cat = (Cat)dog;
            return cat.meow();
        } catch (java.lang.ClassCastException ex) {
            return false;
        }
    }
Both of these examples interrogate the object's type information at runtime to determine whether the object is compatible with the type Cat. If the object is a simple Dog, the cast will throw an exception, indicating that the cast makes no sense for this particular Dog. If the object is a Dog that is also a Cat (via multiple inheritance), the cast will succeed and return a valid reference to the "Cat-ness" of the object, allowing the client to access the Cat functionality of the object (in this case, the meow method). In a dynamically composed system, this ability to interrogate an object's type information is critical for allowing clients to access extended functionality from an object.
      The capability for runtime type interrogation was deemed so important that the COM designers mandated that all objects (and object references) support it by making QueryInterface a method in the IUnknown interface. In languages that do not use a runtime layer or virtual machine (VM), QueryInterface can be called directly:
    bool TryToMeow(Dog *pdog) {
        Cat *pcat = 0;
        HRESULT hr;
        hr = pdog->QueryInterface(IID_Cat, (void**)&pcat);
        if (SUCCEEDED(hr)) {
            hr = pcat->meow();
            pcat->Release();
        }
        return SUCCEEDED(hr);
    }
In languages that have a runtime or virtual machine, it is up to the VM implementor to thunk the VM's cast operation to COM's QueryInterface, mapping the native language's intrinsic runtime type identifiers to COM GUIDs.
      Different languages and systems also have different mechanisms for reclaiming resources held by object references. To allow interoperation of such systems, IUnknown acts as an impedance matching layer for the lifecycle management of object references. Rather than simply add a delete method to IUnknown, the COM designers determined that it was none of the client's business when an object should reclaim its resources. Rather, all the client can expect is that the object should remain valid as long as the client's reference is still "live." In COM, clients notify objects when an object reference is no longer live by calling IUnknown's Release method independent of the client's policy for invalidating object references (termination of scope, mark and sweep garbage collection, and so on). In languages that have a runtime or VM, it is up to the VM implementor to notify the object that an object reference is no longer live. In languages that do not have a runtime or VM, it is up to the client programmer to release the reference explicitly (which is no different than calling free to offset a call to malloc).
      The use of a passive Release in lieu of an explicit delete operation also simplifies the case where the client holds multiple types of references to a single object. Since there may not be a one-to-one correspondence between object references and objects, using an explicit delete operation per object would add undue complexity to the client's program or supporting runtime.
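      To make the passive-Release contract concrete, here is a minimal portable sketch. It deliberately avoids the real COM headers: IRefCounted, Tracker, and Ref are hypothetical names invented for illustration, and the counter is not thread-safe. The point is only that several references can share one object and that the object, not the client, decides when to go away.

```cpp
#include <cassert>

// Hypothetical stand-in for the lifetime half of IUnknown (not the real COM header).
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IRefCounted() = default;   // clients never delete; they only Release
};

// An object that decides for itself when to reclaim its resources.
class Tracker final : public IRefCounted {
    unsigned long refs_;
    bool *destroyed_;                   // lets the caller observe destruction
    ~Tracker() override = default;
public:
    explicit Tracker(bool *destroyed) : refs_(1), destroyed_(destroyed) {}
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) { *destroyed_ = true; delete this; }
        return left;
    }
};

// Scope-bound reference: one possible client policy ("termination of scope").
class Ref {
    IRefCounted *p_;
public:
    explicit Ref(IRefCounted *p) : p_(p) { if (p_) p_->AddRef(); }
    Ref(const Ref &o) : p_(o.p_) { if (p_) p_->AddRef(); }
    Ref &operator=(const Ref &) = delete;
    ~Ref() { if (p_) p_->Release(); }
};

// Returns true if the object was destroyed only after the last reference was released.
bool DemoLifetime() {
    bool destroyed = false;
    Tracker *raw = new Tracker(&destroyed);  // count == 1 (creator's reference)
    {
        Ref a(raw);                           // count == 2
        Ref b(a);                             // count == 3
        raw->Release();                       // creator lets go; count == 2
        if (destroyed) return false;          // must still be alive here
    }                                         // b and a leave scope; count == 0
    return destroyed;
}
```

      A garbage-collected runtime would call Release from its collector instead of from a destructor, but the object-side code would be unchanged.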
      Finally, as an optimization, the COM designers added the AddRef method to make duplicating homogeneous object references more efficient. While this is legal COM
    STDMETHODIMP GiveMeYourDog(/*[out]*/ IDog **ppd) {
        extern IDog *g_pMyDog;
        return g_pMyDog->QueryInterface(IID_IDog, (void**)ppd);
    }
the following implementation is likely to be considerably more efficient:
    STDMETHODIMP GiveMeYourDog(/*[out]*/ IDog **ppd) {
        extern IDog *g_pMyDog;
        (*ppd = g_pMyDog)->AddRef();
        return S_OK;
    }
Most implementations of AddRef simply increment an integer. Most implementations of QueryInterface perform multiple GUID comparisons prior to returning an object reference. This is just one example of the compromise between theoretical purity and runtime performance reality that permeates much of COM's design.
      Perhaps one of the most compelling aspects of IUnknown is its simplicity. It is completely minimal and requires no runtime support from the COM library. You can implement this part of COM on any operating system and in virtually any language that allows direct manipulation of memory. In fact, IUnknown was deemed so powerful that the Netscape developers used it in their cross-platform Web browser, Netscape Navigator, even though COM was not (and is still not) available on all of Netscape's target platforms.
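      As a rough illustration of how little is needed, the following sketch implements the three-method contract in standard C++ with no COM library at all. Everything here is an assumption made for the example: Iid is a plain integer standing in for a 128-bit GUID, kOk and kNoInterface stand in for HRESULT values, and IUnknownLike, IDogLike, ICatLike, and DogCat are hypothetical names rather than real COM declarations.

```cpp
#include <cassert>

typedef int Iid;                        // assumption: stand-in for a 128-bit GUID
const Iid IID_Unknown = 0, IID_Dog = 1, IID_Cat = 2, IID_Bird = 3;
const long kOk = 0, kNoInterface = -1;  // simplified stand-ins for HRESULT values

struct IUnknownLike {
    virtual long QueryInterface(Iid iid, void **ppv) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    virtual ~IUnknownLike() = default;
};
struct IDogLike : IUnknownLike { virtual long bark() = 0; };
struct ICatLike : IUnknownLike { virtual long meow() = 0; };

// One object that is both a Dog and a Cat.
class DogCat final : public IDogLike, public ICatLike {
    unsigned long refs_ = 1;
    ~DogCat() = default;                // only Release may destroy the object
public:
    long QueryInterface(Iid iid, void **ppv) override {
        if (iid == IID_Unknown || iid == IID_Dog)
            *ppv = static_cast<IDogLike *>(this);
        else if (iid == IID_Cat)
            *ppv = static_cast<ICatLike *>(this);
        else { *ppv = nullptr; return kNoInterface; }
        AddRef();                       // every reference handed out is counted
        return kOk;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        unsigned long left = --refs_;
        if (left == 0) delete this;
        return left;
    }
    long bark() override { return kOk; }
    long meow() override { return kOk; }
};

IDogLike *CreateDogCat() { return new DogCat; }

// Same shape as the QueryInterface-based TryToMeow shown earlier.
bool TryToMeow(IDogLike *dog) {
    ICatLike *cat = nullptr;
    long hr = dog->QueryInterface(IID_Cat, reinterpret_cast<void **>(&cat));
    if (hr == kOk) { hr = cat->meow(); cat->Release(); }
    return hr == kOk;
}
```

      Nothing in this sketch depends on an operating system service, which is the property that makes the idea portable.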
Interface-based Programming
      Perhaps the most fundamental concept in COM is the separation of interface from implementation. COM distinguishes the "what" from the "how" by promoting interfaces to first-class status. COM interfaces are pure abstraction with no implementation. When designing a COM interface, the developer is simply specifying the requests that can be made of an object, not how the request will actually be carried out (that is the role of a particular class, not the abstract interface). When coupled with a dynamic interface discovery mechanism (like QueryInterface or dynamic_cast), interfaces act as an effective technique for allowing components in dynamically composed systems to evolve independently. The idea of interface-based programming was brought even further into the mainstream when Sun Microsystems incorporated it into the Java programming language. (For another endorsement of interface-based programming from outside the world of Microsoft, read John Lakos's Large-Scale C++ Software Design, Addison-Wesley, 1996.)
      When decoupled from the relative grunginess of GUIDs and reference counting present in IUnknown, the simple concept of interface-based programming is trivial to understand for any developer used to declaring typed variables. Interfaces model functionality abstractly. Classes provide concrete code that implements one or more abstract interfaces. Objects belong to exactly one class. That's it. The rest is just detail, which, if you use a programming tool like Visual Basic®, is hidden from you anyway. As someone who has been working with COM code for over four years, I can't imagine building software without using interfaces.
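      Stripped of GUIDs and reference counts, the idea fits in a few lines of standard C++. This is only a sketch; ISpeaker, Dog, Cat, and SpeakTwice are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <string>

// An interface: only the "what", no "how".
struct ISpeaker {
    virtual std::string speak() const = 0;
    virtual ~ISpeaker() = default;
};

// Two independent classes supply the "how".
class Dog : public ISpeaker {
public:
    std::string speak() const override { return "woof"; }
};
class Cat : public ISpeaker {
public:
    std::string speak() const override { return "meow"; }
};

// Client code depends only on the interface, so new classes can be added
// later without changing it.
std::string SpeakTwice(const ISpeaker &s) { return s.speak() + " " + s.speak(); }
```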
The COM Remoting Architecture
      The COM remoting architecture is amazingly well thought out. It's extensible, fairly efficient, and largely transparent to both clients and object implementors. COM does not require your object to do anything to be accessed remotely. Just implement IUnknown, and the COM library will build a network transceiver (called the stub manager) on top of your object that will intercept remote requests and translate them to local method calls. COM does not create a transceiver for every object. Rather, the transceiver is created on a per-object basis the first time an object reference is exported to an external client. If a particular object is never accessed remotely, there is no additional overhead.
      Because developers tend to define new COM interfaces, the stub manager is extensible. The stub manager only handles calls to the IUnknown interface. For any other interface, COM loads an interface stub that knows how to decode requests for that particular interface. This interface stub is an in-process COM component that is loaded just like any other COM-based DLL, illustrating one way in which the COM remoting architecture is built using COM.
      Clients send remote method invocation requests to stubs by invoking methods on proxies. A COM proxy is an object that resides in the client's context and translates local method invocations into remote request messages. A COM proxy is composed of a proxy manager that implements the IUnknown interface by dynamically loading interface proxies for particular interfaces à la the stub manager's loading of interface stubs. The primary architectural difference between the proxy and the stub is that the proxy manager uses COM aggregation to merge the identities of the interface proxies with its own. This is required since the proxy manager hands out references to interface proxies directly to the client and must maintain the illusion of object identity for both program correctness and efficiency. (The proxy manager caches interface proxies on a per-identity basis to avoid making extra round-trips for redundant QueryInterface or AddRef requests.)
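      The division of labor between an interface proxy and an interface stub can be sketched in portable C++. This is a toy, not the COM plumbing: ICalc, CalcStub, CalcProxy, and RealCalc are hypothetical, the "wire format" is just a vector of integers rather than NDR, and the "channel" is a direct function call instead of RPC.

```cpp
#include <cassert>
#include <vector>

struct ICalc {                        // hypothetical interface used for illustration
    virtual int add(int a, int b) = 0;
    virtual ~ICalc() = default;
};

typedef std::vector<int> Message;     // assumption: a toy wire format, not NDR

// Server side: the interface stub decodes a request and calls the real object.
class CalcStub {
    ICalc &target_;
public:
    explicit CalcStub(ICalc &target) : target_(target) {}
    Message Dispatch(const Message &request) {
        Message reply;
        if (request.size() == 3 && request[0] == 0) {   // method 0 == add
            reply.push_back(0);                         // status: success
            reply.push_back(target_.add(request[1], request[2]));
        } else {
            reply.push_back(-1);                        // status: bad request
        }
        return reply;
    }
};

// Client side: the interface proxy looks like ICalc but only builds messages.
class CalcProxy : public ICalc {
    CalcStub &channel_;               // stands in for the RPC channel to the stub
public:
    int calls_sent = 0;
    explicit CalcProxy(CalcStub &channel) : channel_(channel) {}
    int add(int a, int b) override {
        Message request;
        request.push_back(0);         // method number
        request.push_back(a);
        request.push_back(b);
        ++calls_sent;
        Message reply = channel_.Dispatch(request);
        return reply.at(0) == 0 ? reply.at(1) : 0;
    }
};

class RealCalc : public ICalc {
public:
    int add(int a, int b) override { return a + b; }
};

// The client is written against ICalc and cannot tell a proxy from the object.
int AddViaInterface(ICalc &calc) { return calc.add(2, 3); }
```

      AddViaInterface behaves the same whether it is handed a RealCalc or a CalcProxy, which is the transparency the architecture is after.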
      Another elegant aspect of the COM remoting architecture is its use of lazy protocol registration. The DCOM wire protocol was designed to allow COM processes to postpone the use of a given communications protocol until necessary. When a COM process starts up, it only loads the code for using one (local) communication protocol. As remote clients need to connect to the process, the COM plumbing arranges to load the appropriate transport DLL and acquire an endpoint on the requested protocol. This reduces the buy-in cost for COM when used as a local component infrastructure (since only a minimal amount of resources is consumed at CoInitializeEx time). This also allows you to add new communication protocols (such as HTTP) to the COM runtime system without modifying existing COM code or requiring all COM processes to incur the overhead of multiple protocol use.
      No discussion of the COM remoting architecture would be complete without a discussion of network garbage collection. COM has it. Java RMI has it. CORBA's GIOP/IIOP does not have it. The COM network garbage collector simulates the semantics of IUnknown's in-process reference model. If your object needs to be notified of client failure, you need it. If your object doesn't care about client liveness, it is an unnecessary expense. COM's wire protocol does a better job of managing remote liveness detection than most applications could do themselves by aggregating keep-alive traffic on a per-host basis (not a per-application or per-reference basis). However, if you want to turn off garbage collection, just enable NOPING on your object. This disables network garbage collection and suppresses any ping traffic that would be created for your object.
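      The benefit of per-host aggregation is easier to see in a toy model. The sketch below is not the DCOM wire protocol: PingTable, its methods, and the missed-ping threshold are all hypothetical. It only shows the bookkeeping idea, in which a single ping per host keeps every reference from that host alive, and a host that stays silent for too many periods has all of its references reclaimed at once.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of ping-based liveness; all names and the timeout are hypothetical.
class PingTable {
    struct HostState { int refs; int missed; };
    std::map<std::string, HostState> hosts_;
    int max_missed_;
public:
    explicit PingTable(int max_missed) : max_missed_(max_missed) {}

    // A client on `host` acquired one more remote reference.
    void AddReference(const std::string &host) {
        HostState &h = hosts_[host];   // zero-initialized on first use
        ++h.refs;
    }
    // One ping per host refreshes every reference that host holds.
    void Ping(const std::string &host) {
        auto it = hosts_.find(host);
        if (it != hosts_.end()) it->second.missed = 0;
    }
    // Called once per ping period; returns how many references were reclaimed.
    int Tick() {
        int reclaimed = 0;
        for (auto it = hosts_.begin(); it != hosts_.end();) {
            if (++it->second.missed >= max_missed_) {
                reclaimed += it->second.refs;  // a real server would Release these
                it = hosts_.erase(it);
            } else {
                ++it;
            }
        }
        return reclaimed;
    }
    int LiveReferences() const {
        int n = 0;
        for (auto it = hosts_.begin(); it != hosts_.end(); ++it) n += it->second.refs;
        return n;
    }
};
```

      The cost of keeping references alive grows with the number of client hosts, not with the number of references, which is the property the column credits to COM's design.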
      The architecture just described is called standard marshaling, as it is the standard behavior for objects that do not care about remoting details. Provided an interface proxy and stub are available (which is trivial since the IDL compiler emits the source code for them), developers have to do absolutely nothing other than implement IUnknown. Additionally, since proxies are constructed on the fly by COM, clients do not need to statically know whether they are using a proxy or a real object. However, there is a class of objects for which standard marshaling would yield sub-optimal performance at best (or incorrect semantics at worst).
      COM defines the IMarshal interface to allow object implementors to replace the standard proxy and stub with a user-defined remoting mechanism. Objects that implement IMarshal are said to custom marshal. Custom marshaling allows an object implementor to inject a custom proxy transparently without the client's participation. Custom proxies are used to implement marshal by value, objects in shared memory, transparent multicast, fail-over/reconnect, and a variety of other powerful techniques that are impossible to implement using standard marshaling. While some CORBA-based systems allow replacement of the marshaling code for a given interface (sometimes called smart proxies), COM is unique in that the remoting behavior is polymorphically bound at runtime on an object-by-object basis—two references of identical type may be using custom or standard marshaling independently. This allows object implementors to safely evolve their remoting implementation based on performance needs without rebuilding client applications.
Apartments
      No topic other than security has caused programmers more pain than COM apartments, yet the notion of an apartment is one of the key enabling concepts for building a system of binary components. By looking at the Java component model, which does not have any similar concept, one can easily understand the problem. Consider the following Java method:

    public void barkLikeADog(IDog [] rgdogs) {
        for (int i = 0; i < rgdogs.length; i++) {
            rgdogs[i].bark();
        }
    }

What happens if two threads execute this function simultaneously? Since the barkLikeADog method is not synchronized, the two threads will execute the for loop concurrently. This means that the two threads may issue calls to the same object simultaneously. Since the interface IDog is abstract, each element in the array may point to different implementations of IDog, some of which were written with thread safety in mind, others which were not. This means that depending on the concrete type of the object, this code may or may not be correct. So, what do Java developers do? One solution is to do nothing, assuming that naïve implementors marked their bark method as synchronized.
      Unfortunately, Java methods are unsynchronized by default, which means that only developers who are somewhat naïve will remember to add the synchronized keyword. Completely naïve developers tend to forget this kind of thing. This is one reason the Java language allows clients to add synchronization to other people's objects using synchronized blocks. The following modified method illustrates this technique:

    public void barkLikeADog(IDog [] rgdogs) {
        for (int i = 0; i < rgdogs.length; i++) {
            synchronized(rgdogs[i]) { // lock object
                rgdogs[i].bark();
            } // unlock object
        }
    }

The problem with this method is that it assumes that all implementations of IDog are not thread-safe. This means that objects that could tolerate concurrent bark requests will never be given the opportunity, since the client must program to the lowest common denominator.
      The problem with the Java approach to components and threads is that it requires the client to guess what level of thread-awareness the object implementor may have had and program to the worst-case scenario. This limits opportunities for concurrency and can introduce bottlenecks in multithreaded code. The COM designers decided that thread-awareness was yet another implementation detail that the client had no business worrying about. This is the motivation for apartments.
      A COM apartment is a group of one or more threads in a process that can execute method calls. Threads that use COM must first enter an apartment by calling CoInitializeEx. All threads that call CoInitializeEx with the COINIT_MULTITHREADED flag enter the lone multithreaded apartment (MTA) of the process. Each thread that uses the COINIT_APARTMENTTHREADED flag (or calls CoInitialize or OleInitialize) enters a new single-threaded apartment (STA) that no other thread will ever enter.
      By default, a COM object belongs to exactly one apartment, and only threads within that apartment can execute methods on the object. This means that objects that live in the MTA must be robust in the face of concurrent access. Conversely, objects that live in an STA do not need to worry about concurrent access, since only one thread will ever enter the apartment.
      In-process component implementors annotate which types of apartments they are compatible with by using the ThreadingModel registry entry. At activation time, COM ensures that your component is created and accessed in the correct type of apartment. If the client thread is in the wrong apartment type, COM will create your object on a COM-managed thread of the correct apartment type. When this happens, the client will get a proxy to the object that is appropriate for use in their apartment.
      This simple model allows non-thread-safe components to be accessed in a multithreaded manner, since COM will return a thread-safe proxy to MTA-based clients. This means that multithreaded clients never need to serialize method invocations on an object reference. If the reference points to a proxy, the proxy serializes the calls to the actual object by forwarding the request to the object's STA thread. If the reference points to an actual object, it must have marked itself as MTA-safe in the registry and therefore has taken deliberate steps to allow clients to access the object concurrently.
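      The forwarding idea can be sketched without any COM or threading code. In the toy below, Apartment, NaiveDog, and DogProxy are hypothetical, and two simplifications are assumptions of the example: the queue is drained by an explicit Pump call rather than by the STA's own thread, and the proxy returns immediately instead of blocking the caller until the call completes as a real COM proxy would.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Toy single-threaded apartment: calls are queued and run one at a time.
class Apartment {
    std::queue<std::function<void()>> calls_;
public:
    void Post(std::function<void()> call) { calls_.push(std::move(call)); }
    // Runs every queued call in arrival order; returns how many were executed.
    int Pump() {
        int executed = 0;
        while (!calls_.empty()) {
            std::function<void()> call = std::move(calls_.front());
            calls_.pop();
            call();
            ++executed;
        }
        return executed;
    }
};

struct IDog {
    virtual void bark() = 0;
    virtual ~IDog() = default;
};

// Written with no thought for thread safety: it just appends to a log.
class NaiveDog : public IDog {
public:
    std::vector<int> log;
    void bark() override { log.push_back(static_cast<int>(log.size())); }
};

// Proxy handed to callers outside the apartment. It never touches the object
// directly; it only forwards requests to the object's apartment.
class DogProxy : public IDog {
    Apartment &home_;
    IDog &target_;
public:
    DogProxy(Apartment &home, IDog &target) : home_(home), target_(target) {}
    void bark() override {
        IDog *t = &target_;
        home_.Post([t] { t->bark(); });
    }
};
```

      However many callers hold a DogProxy, NaiveDog::bark only ever runs from Pump, one call at a time, so the object never sees concurrent access.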
      While the notion of an apartment is fundamental to the COM programming model, COM provides facilities for bypassing the semantics of an apartment (such as the freethreaded marshaler and the global interface table) for objects that have special threading requirements. These facilities are optional and only to be used by developers with a strong understanding of threads and apartments. Nonetheless, even objects that use advanced facilities, such as the freethreaded marshaler, can be accessed safely by clients with disparate threading awareness.

Type Information is a Mess
      Now that I've gone over what's right about COM, let's explore some of the problems that plague the COM programming community.
      Today there are two somewhat imperfect ways to describe COM components: IDL files and type libraries. IDL is a reasonable format for use with C and C++, but it does force programmers working with Java, Object Pascal, and Visual Basic to think in terms of pointers and C-style syntax. Given the fact that most C-isms are expressible in COM interfaces (and therefore must be expressible in IDL), COM IDL is arguably better than inventing yet another syntax for describing structures, unions, and pointers. Undoubtedly, the primary problem with type information today is type libraries, not IDL.
      The idea behind type libraries is simple. A type library is an efficiently parsed binary file that contains one or more COM type definitions. While the type library file format is undocumented, COM provides sufficient infrastructure for creating and parsing type library files. Sounds great so far. Here is the problem: COM has two formats for type information—IDL and type libraries—and they are not interchangeable.
      COM IDL is based on OSF DCE IDL with COM-specific extensions. This is why IDL looks like C (blame OSF, not Microsoft). IDL is parsed by the MIDL.EXE compiler, which was originally used to generate RPC stubs for use by MSRPC (since COM uses MSRPC extensively, this wasn't such a radical idea). IDL files are hard to parse at runtime for a variety of reasons: they are ASCII text with a grammar that only a mother could love, and they often use the C preprocessor, which means they are extremely sensitive to include path and compiler switch dependencies. The first version of COM IDL (as it was parsed by MIDL 2.x) was designed primarily for describing COM interfaces in order to build proxy/stub DLLs.
      Type libraries, on the other hand, were created by the developers of Visual Basic to meet the needs of Visual Basic. The team designing Visual Basic took the basic syntax of IDL (which at the time did not support COM) and added new keywords, constructs, and attributes that were appropriate for their needs. The root of the problem is that the Visual Basic team removed core constructs from their private version of IDL (then called ODL) that left two incompatible languages for describing COM data types. IDL had constructs necessary for remoting that were absent in ODL (and in the resultant type library). ODL had constructs for describing implementations that were absent in the original IDL. While the current version of the MIDL compiler supports the union of the two feature sets, the underlying type library format was never extended to support the remoting characteristics in the original IDL (such as size_is and iid_is). When coupled with the fact that many Microsoft and non-Microsoft IDL files use hacks that do not make it into the type library (cpp_quote, for example), the type library cannot act as the sole type description for a large class of components due to its lack of fidelity.
      So why should you care? You should care because the lack of consistent type information holds back innovation both inside and outside of Microsoft. Look at Java's reflection package (java.lang.reflect) for an example of what can be done with full-fidelity type information. Java's reflection facility allows tool and plumbing developers to get at the complete description of a Java interface or class, down to the field level. Hacking up your own version of the JavaBean Juggler demo is trivial for even a novice Java programmer. This makes it possible for tool vendors to do a good job integrating JavaBean components into an IDE or a debugger. (Imagine how much more useful the #import facility in Visual C++ would be if type libraries were more reliable.)
      Full-fidelity type information also gives plumbing vendors hooks for working with method calls as if they were objects, which is useful for building infrastructure like MTS. In contrast to Java's reflection package, look at the definition of ITypeInfo, which is the closest analogue in COM. Not only is the ITypeInfo interface grotesque and arcane, it doesn't give you access to important pieces of information, such as the capacity of a C-style array. Without this full-fidelity type information, it is hard for both Microsoft and third parties to extend the state of the art.

IDispatch
      Can you possibly stand to hear another word about IDispatch? It feels arbitrary. It's slow. Many important data types aren't supported and probably never will be, even when Visual Basic finally gets around to redefining the VARIANT to support simple structures. It doesn't allow developers to take full advantage of interface-based programming, due to the one-dispinterface-per-object limitation. Its name resolution technique makes supporting overloaded method names extremely difficult. It forces object implementors to carry around boilerplate code to support something that ultimately 99 percent of objects never care about or do differently. Any questions?

Figure 2 Server-side Dynamic Invocation


      For a great example of how dynamic invocation should have been handled, look at either Java's reflection package or CORBA's DII/DSI infrastructure. The former makes it trivial for a client to build a method call on the fly solely on a text-based description of what the call should look like (this is what scripting engines and interpreters need to do). The object simply implements the interfaces that it needs to expose independent of dynamic invocation, and the reflection plumbing deals with forming the correct type of stack frame based on type information buried in the Java .class file. As far as the object is concerned, the call came from an early-bound compiled client.
      CORBA's DII/DSI does everything reflection does for dynamic invocation, in addition to providing server-side hooks for allowing object implementors to actually participate in method/type resolution. Both CORBA and Java reflect the assumption that dynamic method invocation is a client-side decision that most object implementors really don't care about. At the time of this writing, there was no easy way for an object to munge around with the server-side of dynamic invocation in Java (à la CORBA's DSI), but there's always version 1.3 of the JDK to solve this.
      The saddest part of this discussion is that one potential solution to the problem has been with us since 1993. Microsoft's type library parser knows how to do what the Java reflection package does; that is, create a method call based on some uniform generic API. Figure 1 shows the ATL implementation of IDispatch::GetIDsOfNames and IDispatch::Invoke. Like most modern implementations, ATL simply forwards these calls to the type library parser, where the low-level thunking of VARIANTs and DISPIDs down to stack frames and vtable offsets occurs. This technique is illustrated in Figure 2. There is no reason why clients that now use IDispatch (like the scripting engines) couldn't do the same thing on the client side against non-IDispatch-based interfaces, provided that sufficient type information is available.

Figure 3 Client-side Dynamic Invocation


       Figure 3 shows what this technique would look like if it were to be adopted by today's IDispatch clients. Determining the type information of an object at runtime is trivial given today's simple IProvideClassInfo interface. Arguably, all that is really needed for most objects is a CLSID and LIBID from which the client could then load the type library directly. Of course, this is not the way today's scripting engines behave, so we're still implementing IDispatch in every object known to man. The Java and CORBA architects were correct in pushing this responsibility onto the client, not the object. One could argue that IDispatch should be akin to IMarshal—that is, a completely optional interface implemented only by the one percent of objects that have special needs, not by 100 percent of the objects that want to interoperate with brain-dead client environments.
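      Here is a rough portable sketch of what pushing dynamic invocation onto the client could look like. It is not the type library API: MethodInfo, TypeInfo, DescribeIDog, and InvokeByName are hypothetical names, the table is built by hand rather than loaded from a type library, and arguments are limited to integers to keep the example short. The object implements only its ordinary early-bound interface.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

// An ordinary early-bound interface; the object knows nothing about late binding.
struct IDog {
    virtual int bark(int times) = 0;
    virtual int sit() = 0;
    virtual ~IDog() = default;
};

class Dog : public IDog {
public:
    int barks = 0;
    bool sitting = false;
    int bark(int times) override { barks += times; return barks; }
    int sit() override { sitting = true; return 0; }
};

// Hypothetical stand-in for type information: for each method name, the
// argument count and a thunk that makes the typed call.
struct MethodInfo {
    std::size_t argc;
    std::function<int(IDog &, const std::vector<int> &)> thunk;
};
typedef std::map<std::string, MethodInfo> TypeInfo;

TypeInfo DescribeIDog() {
    TypeInfo ti;
    ti["bark"] = MethodInfo{1, [](IDog &d, const std::vector<int> &a) { return d.bark(a[0]); }};
    ti["sit"]  = MethodInfo{0, [](IDog &d, const std::vector<int> &) { return d.sit(); }};
    return ti;
}

// Client-side dynamic invocation: resolve the name and check the arguments
// here, then call through the object's regular interface.
bool InvokeByName(const TypeInfo &ti, IDog &target, const std::string &name,
                  const std::vector<int> &args, int *result) {
    TypeInfo::const_iterator it = ti.find(name);
    if (it == ti.end() || it->second.argc != args.size()) return false;
    *result = it->second.thunk(target, args);
    return true;
}
```

      As far as Dog is concerned, every call arrives through IDog exactly as it would from a compiled client; all of the name lookup lives on the caller's side.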

C++ Language Mapping Needs an Overhaul
      When COM was released in 1993, C++ was a very different language. Templates and exceptions were in their infancy (or unavailable if you used the Microsoft C++ compiler). Commercial compilers that supported runtime type identification or namespaces were fairly rare. Given the state of the art in C++ compilers, the original language mapping was not that unreasonable. However, C++ has evolved considerably over the past five years, while the C++ language mapping for COM has remained fairly static.
      One reason that C++ mapping in COM hasn't kept up with the programming language is that the COM team has been consumed with providing new functionality, not making it easier to use the existing functionality from low-level languages like C++. While the Visual C++ team has been quite busy building COM-awareness into their IDE, frameworks, and compiler, this is like taking an aspirin to soothe a brain tumor. It's great that Visual C++ makes COM development easier (may it always be this way). However, COM itself should have gone the extra mile to make it easier for people to program COM from any C++ environment, not just Microsoft's.
      Consider the Visual C++ __uuidof extension. The idea of associating the COM typename with the C++ typename is fantastic. The traditional IID_ and CLSID_ prefixes used to name GUIDs make it difficult to extract the COM typename of a reference without using token pasting or other text-based munging. With __uuidof, one can define very type-aware code such as the following template:

    template <typename PItf>
    struct _com_cast {
        PItf p;
        _com_cast(IUnknown *pUnk) {
            if (FAILED(pUnk->QueryInterface(__uuidof(p), reinterpret_cast<void**>(&p))))
                p = 0;
        }
        _com_cast(PItf pItf) { (p = pItf)->AddRef(); }
        operator PItf(void) const { return p; }
    };

Given this template, one can write the following code:

    bool TryToMeow(IDog *pDog) {
        HRESULT hr = E_NOINTERFACE;
        ICat *pCat = _com_cast<ICat*>(pDog);
        if (pCat) {
            hr = pCat->meow();
            pCat->Release();
        }
        return SUCCEEDED(hr);
    }

Note that this code does not suffer from the common type mismatches that QueryInterface's void ** parameter would normally allow.
      So what's the problem with __uuidof? For one, it requires header files compiled with MIDL version 3.01.75 or later. If you only have the C++-based interface definitions but not the original IDL files, you must explicitly add the __declspec(uuid) attributes by hand (see comdef.h for an example of how this is done). Perhaps more problematic, __uuidof is a Visual C++ language extension, not part of COM. If you write code that uses __uuidof, you are married to using Visual C++ 5.0 or greater from now until the end of time. No amount of macro magic will dig you out of this dependency. Had the COM team elected to provide a standard way to associate COM and C++ typenames, the Visual C++ team would never have been compelled to add the useful but proprietary __uuidof. Ideally, the system headers would contain the following generic template definitions:

    template <typename Itf> class comtype {};
    template <typename Itf> const IID& uuidof(Itf *) { return comtype<Itf>::uuidof(); }
    template <typename Itf> const IID& uuidof(Itf **) { return comtype<Itf>::uuidof(); }

Given these templates, the MIDL compiler could then emit the following template specialization along with the standard C++ interface definition:

    extern const IID IID_ICat;
    struct ICat : public IAnimal {
        virtual HRESULT STDMETHODCALLTYPE meow() = 0;
    };
    template <> class comtype<ICat> {
    public:
        static const IID& uuidof() { return IID_ICat; }
        typedef IAnimal _BaseInterface;
    };

Assuming this standard type infrastructure was in place, one could easily rewrite the com_cast template as follows:

    template <typename PItf>
    struct com_cast {
        PItf p;
        com_cast(IUnknown *pUnk) {
            if (FAILED(pUnk->QueryInterface(uuidof(p), reinterpret_cast<void**>(&p))))
                p = 0;
        }
        com_cast(PItf pSame) { (p = pSame)->AddRef(); }
        operator PItf(void) const { return p; }
    };

Note that the only change was the use of the uuidof template function as opposed to the Visual C++-specific __uuidof.
      The hypothetical C++ interface definition shown previously also provides a mechanism for determining the base type of a COM interface. Given such a facility, it would be trivial to construct a template class that implements IUnknown using fairly simple syntax:

    class DCB : public com_implements<IDog,ICat,IBird> {
        STDMETHODIMP bark() { return S_OK; }
        STDMETHODIMP meow() { return S_OK; }
        STDMETHODIMP chirp() { return S_OK; }
    };

      Because the entire type hierarchy is available to the compiler, it is possible to build a template that traverses the inheritance hierarchy in its implementation of QueryInterface. Note also that because the IIDs are intrinsically linked to the C++ typenames, no additional MFC/ATL-style interface map is needed. (Maintaining these maps is a common source of errors among novice COM developers.) Assuming a similar mapping of COM class names to CLSIDs, constructing an in-process server could become as simple as this:

    com_export<Dog, Cat, Bird, DogCat, DCB> classes;
    DLL_ROUTINES(classes)

      Of course, to get the self-registration right, you'd need higher fidelity type information to capture things like ProgIDs, component categories, and threading models. You could hack in this additional information via custom IDL attributes, but it would be nice if COM could provide the corresponding registration code as part of the infrastructure.

Deployment
      COM relies on the registry. Because every developer and tool vendor must implement self-registration using low-level Win32® registry APIs, it is difficult for COM to guarantee the integrity of the registry. Today, the integrity of the registry decays as users install, uninstall, and upgrade applications and components until finally the user is forced to reinstall the operating system and all applications. In an ideal world, the COM library would provide component registration based on type information. Putting registration under the control of the system would improve configuration hygiene considerably.
      Additionally, because COM offers no configuration infrastructure beyond the low-level registry API, it is difficult to install or configure COM components on remote machines. While remote configuration management of MTS components is possible using MTS Explorer or the MTS catalog interfaces, this functionality is not available to all components today. Hopefully, a future version of COM will remedy this situation.
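      As a rough illustration of registration driven by type information, the sketch below derives the conventional in-process server registry entries from a small class description instead of hand-written registry calls. The `ClassInfo` and `RegEntry` structures and the `inproc_entries` function are hypothetical names invented for this example. Key paths are relative to HKEY_CLASSES_ROOT, and the code only computes the entries; actually writing them is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

// Hypothetical description of a COM class, as richer type information
// (for example, custom IDL attributes) might carry it.
struct ClassInfo {
    std::string clsid;           // string form of the CLSID, braces included
    std::string progid;          // e.g. "Zoo.Cat"
    std::string threadingModel;  // "Apartment", "Free", or "Both"
};

// One registry value: an empty valueName means the key's default value.
struct RegEntry {
    std::string key;
    std::string valueName;
    std::string data;
};

// Computes the entries an in-process server conventionally registers.
inline std::vector<RegEntry> inproc_entries(const ClassInfo& ci,
                                            const std::string& dllPath) {
    const std::string clsidKey = "CLSID\\" + ci.clsid;
    return {
        {clsidKey + "\\InprocServer32", "", dllPath},
        {clsidKey + "\\InprocServer32", "ThreadingModel", ci.threadingModel},
        {clsidKey + "\\ProgID", "", ci.progid},
        {ci.progid + "\\CLSID", "", ci.clsid},
    };
}

}  // namespace sketch
```

      If the system generated and applied a list like this itself, uninstalling would be a matter of removing exactly the same entries, which is the kind of configuration hygiene that hand-rolled self-registration code cannot guarantee.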

Security
      The COM security API isn't that bad. Really. Once you understand it, calling the right API at the right time is actually fairly easy. Unfortunately, there are two factors that make COM security problematic. One factor is the common bias against security programming in general. This one is not COM's fault. The other factor is the somewhat arbitrary behavior of COM security due to random system configuration problems. Again, this one may not be entirely COM's fault. Because of the standard configuration of a computer running Windows NT®, remote access to COM components requires a rock-solid security configuration, something many installations of Windows NT lack.
      Today's COM makes it necessary for the developer to get all of the security bits correct through manual labor. This means ensuring that accounts are valid on all machines, configuring server processes to run as valid users, managing DACLs, and so on. None of this is especially hard, but much of it must be done at deployment time (by someone other than the developer) or the application will not work properly. Hopefully, as the deployment story of COM gets better, this problem will fade into a distant memory.
      Not all of these COM security woes are due to configuration problems. There is one particularly problematic issue with today's security model that is intrinsic to the current architecture: process-level granularity of access control and authentication thresholds. This coarse granularity makes callbacks very difficult in distributed applications, since the client must be configured properly to allow incoming calls from the server process (which is often running as a different security principal that may or may not be valid on the client machine). Often, the client is a Visual Basic-based application that cannot call CoInitializeSecurity, making callbacks virtually impossible. While there are standard workarounds (such as creating a middleman process that calls CoInitializeSecurity), the situation is far from ideal.

Oversimplification in the Tools
      Programmers using COM need to know the difference between an interface and a class. Period. No amount of dumbing down of the development tools will make this requirement go away. All three Microsoft development tools have tried—to a greater or lesser degree—to undo the advances made by COM by treating interfaces as second-class citizens. For example, the current beta release of Visual J++™ 6.0 assumes that a Java-implemented COM class will implement only one massive interface based on the public methods of the class. While there is a COM Classes property page for choosing which Java classes will make it into the generated type library, there is no corresponding COM Interfaces property page for exporting Java-defined COM interfaces. This is representative of the one-interface-per-object philosophy that permeates other tools as well.
      While the Visual C++ Class View does a good job of showing the relationship between interfaces and classes, the ATL Object Wizard assumes that every COM class will implement exactly one interface that is specific to that class. Granted, it is trivial for an experienced COM developer to modify the generated IDL and C++ after the fact, but it would be nice if Visual C++ at least gave you the option not to generate an interface definition for each COM class.
      Not surprisingly, Visual Basic is the worst offender in terms of oversimplification, bar none. Visual Basic pretends that interfaces don't exist by calling them "Public Non-Creatable Classes." On a similar note, Visual Basic assumes that the default interface of imported COM classes should be hidden from the developer in the IDE's object browser and IntelliSense®-based lists, instead of allowing the component implementor to make the decision herself using the [hidden] attribute. Visual Basic also has a problem implementing interfaces that don't derive directly from IUnknown or IDispatch. Perhaps worst of all, the management of GUIDs in Visual Basic-based servers is enough to make even the most dyed-in-the-wool Visual Basic jockey embrace IDL.
      While Visual C++ and Visual J++ each have minor issues that make certain common-case scenarios inconvenient, at least it is possible to achieve virtually anything the underlying programming languages are capable of by editing some IDL or source code. However, because so much of the COM support in Visual Basic is buried in the IDE and VM, it is considerably more difficult to do hardcore COM development using Visual Basic.

The Next Five Years
      The basic interface-based programming model used by COM has managed to emerge as the predominant programming paradigm for building distributed component-based systems. By all accounts, COM has been a smashing success when measured by market penetration, capabilities, and utility. Most of the major problems with COM today are related to type information and deployment, both of which are completely fixable. I am hopeful that Microsoft will ensure the future success of COM by addressing these critical issues that affect the COM programming community at large.

The views expressed in this article are the personal opinions of Don Box and should not be construed as the opinions of Microsoft Corporation.

Have a question about programming with ActiveX or COM? Send your questions via email to Don Box: dbox@develop.com or http://www.develop.com/dbox.
From the July 1998 issue of Microsoft Systems Journal.