How Microsoft Transaction Server Changes the COM Programming Model --MSJ, January 1998


David Chappell

Microsoft Transaction Server (MTS) isn't magic, but it does let you write simple, COM-based servers that are still powerful and scalable. MTS provides a range of services to those apps, including thread management, efficient object activation policies, and support for transactions.

This article assumes you're familiar with COM.

David Chappell (david@chappellassoc.com) is principal of Chappell and Associates, an education and consulting firm in Minneapolis, MN. He is the author of Understanding ActiveX and OLE (Microsoft Press, 1996). His next book, due out in 1998, will describe Windows NT distributed services, including MTS.

The Component Object Model has changed everything. COM's benefits—well-defined interfaces, local and remote transparency, and much more—allow developers to use it in all kinds of software. But while it solves many of the problems developers have faced, it's arguable whether COM makes creating software a whole lot easier. In particular, building powerful COM servers can be daunting.
Of course, some COM servers are very simple. It's pretty easy, for example, to write a server that never has more than one client at a time, doesn't share information with other servers, and doesn't make changes to persistent data, such as records in a database. This server can be single-threaded and doesn't have to worry about how long it holds database locks. Since it doesn't change anything except its own internal state, this simple server can ignore the possibility of leaving corrupted data behind after an unexpected failure.
It would be great if all COM-based servers could be as simple as this. But servers this simple often don't fit the bill. Solving real problems frequently requires building servers that handle many simultaneous clients, effectively share data with other servers, and atomically make changes to one or more databases. Writing servers that meet these requirements can get really hard.
But suppose developers had a way to write simple servers—servers that support only one client at a time, don't need to worry about committing multiple database changes, and so on—yet somehow could magically use them to handle many clients, atomically change multiple databases, and more.
This is exactly what the Microsoft® Transaction Server (MTS) allows. MTS isn't magic, but it does let you write simple, COM-based servers that are still powerful and scalable. MTS provides a range of services to those applications, including thread management, efficient object activation policies, and support for transactions. By providing all these things automatically, MTS shoulders much of the burden in developing powerful, scalable COM servers. For a general introduction to MTS, see "Microsoft Transaction Server Helps You Write Scalable, Distributed Internet Apps" in the August 1997 issue of MSJ.
MTS is built entirely on COM. In many ways, it's just an extension to the COM you already know and love. But MTS also brings some changes to the traditional COM programming model. Using MTS effectively requires understanding what those changes are and how you can take advantage of them. The goal of this article is to explain how MTS changes the familiar COM programming model and how you can take advantage of those changes to create powerful applications more easily.
I know that some of you are thinking, "I don't need no stinkin' MTS—I'll just write my COM servers like I always have." There are a couple of reasons why you should shake this bad attitude. First, MTS will make your life simpler—really—so using its services makes a lot of sense. It's a standard part of Windows NT®, so there are no extra checks to write. Plus, MTS is actually just an extension of COM (and like COM, it's becoming part of the operating system), so the integration between the two will only get stronger over time. In other words, the changes that MTS brings to the traditional COM programming model are probably unavoidable. The time to start understanding them is now.
The Five Rules of MTS Programming
There are five key changes that MTS makes to the traditional COM programming model, most of them affecting the server. Each one represents a rule you'll need to follow to get the most out of MTS. I'll describe all of them in detail, but let me list them up front. The five rules are:
1. Servers should call SetComplete as often as possible, indicating that they no longer have state to maintain. Doing this helps servers scale well.
2. Clients should acquire and hold onto interface pointers, even for COM objects that they won't use until much later. Getting a reference to an object can be costly, but if the object is stateless, holding it is free with MTS.
3. Servers should get resources such as database connections as late as possible, then give them up as soon as possible. MTS makes acquiring server resources cheap.
4. Servers should use roles and declarative security. The traditional approach—impersonation and ACLs with an entry for each user—just doesn't scale well.
5. Servers should use transactions wherever appropriate. Transactions automatically ensure that either all changes made by an application succeed or none of them do.
Calling SetComplete
What is SetComplete, and why would I want to call it? An MTS application consists of one or more MTS objects, each of which is really an in-process COM object. As shown in Figure 1, these MTS objects are loaded into a process with the MTS Executive, implemented in mtxex.dll. This DLL and the DLLs that provide the code for the application's MTS objects can be loaded into the same process as the client, but this isn't common. Instead, the MTS Executive and the application DLLs (commonly referred to as MTS components) are typically loaded into the simple MTS-provided container mtx.exe.
A client accesses the MTS objects like any other COM object—MTS is invisible to clients, and there's no MTS-specific code on the client—but the client's calls are actually intercepted by MTS before being passed on to their destination object. Because of this, I'll typically show the MTS Executive as sitting between the client and the MTS objects that client is using. And while it's possible for an MTS server and its client to run on the same machine, it's more common for them to be on different machines, with the client accessing MTS objects via DCOM.
To create an MTS object, a client can call CoCreateInstance (from C++) or CreateObject (from Visual Basic®) or some other COM creation function, just as with any COM object. MTS catches this request, however, and does a few special things with it. The most important of these is that, for every MTS object it creates, MTS also creates an associated context object. This context object is (of course) just another COM object, one that supports the IObjectContext interface. The methods in this interface are the primary means that an MTS object has to communicate with the MTS Executive. To call those methods, an MTS object must acquire a pointer to the IObjectContext interface of its context object. Doing this is easy: the MTS object just calls the MTS-supplied API GetObjectContext, as shown in Figure 2. This function returns a pointer to the IObjectContext interface of the appropriate context object.
IObjectContext contains several methods, the most important of which are SetComplete and SetAbort. When an MTS object calls either of these, it's telling the MTS Executive a couple of different things. First, if the object is part of a transaction, this object's work is ready to be committed (if SetComplete was called) or rolled back (if SetAbort was called). I'll say more about transactions later, but whether or not transactions are in use, by calling either of these methods the MTS object is telling the MTS Executive that the object no longer has any state to maintain.
Think about this for a minute. Having no state to maintain means that the object really doesn't need to exist any more—you can just conjure up a new instance when it's needed again. Getting rid of objects that aren't needed right now means that the process those objects are in can take up less memory. For servers that must scale to handle hundreds or thousands of clients, this is a good thing. But at the same time, you don't want each object's client to know about its demise—the client should remain blissfully unaware that its object has gone away. Doing this allows you to build scalable servers while still preserving the appearance of the classic COM programming model for the client.
To make this possible, the MTS Executive wraps every MTS object as shown in Figure 3. When an MTS object calls SetComplete or SetAbort, MTS releases every interface pointer it holds on this object. Since only MTS actually holds direct pointers to this object's interfaces, this causes the object to self-destruct, just as with any COM object. As shown in Figure 4, the client still holds what it thinks is an interface pointer to the object. In fact, this pointer references the MTS-supplied wrapper. In the jargon of MTS, the object has been deactivated. MTS can now reuse any resources the object maintained, including the memory that was taken up by its data.
When the client invokes another method on the object (remember, the client doesn't know that it's gone), MTS just uses the appropriate class factory to create another instance of the object, then passes the client's call on to this instance. Known as Just-In-Time (JIT) activation, this causes the situation to return to that shown in Figure 3. Since the object wasn't maintaining any state, this newly created instance is indistinguishable from the one that was destroyed earlier—the client can't tell the difference. Calling SetComplete as often as possible allows MTS to get rid of objects that are no longer needed. Doing this means that your application will scale much better, but it doesn't really affect your clients—MTS shields them from what's going on.
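The deactivate-then-reactivate cycle described above can be modeled in a few lines. What follows is a language-neutral sketch in Python, not MTS code: Wrapper, ObjectContext, and Counter are hypothetical names invented for illustration, and the real MTS Executive does far more than this. The sketch only shows the core idea, which is that the client keeps pointing at a wrapper while the real object comes and goes behind it.

```python
class ObjectContext:
    """Stand-in for the per-object context; only a 'done' flag is modeled."""
    def __init__(self):
        self.done = False

    def set_complete(self):
        self.done = True

    def set_abort(self):
        self.done = True


class Wrapper:
    """Stand-in for the wrapper that the client's interface pointer really references."""
    def __init__(self, factory):
        self._factory = factory      # plays the role of the class factory
        self._instance = None        # the real object, or None while deactivated
        self.activations = 0         # how many times a real object was created

    def call(self, method, *args):
        if self._instance is None:   # just-in-time activation
            self._instance = self._factory()
            self.activations += 1
        context = ObjectContext()
        result = getattr(self._instance, method)(context, *args)
        if context.done:             # the object said it has no state to keep
            self._instance = None    # deactivate: drop the only reference
        return result


class Counter:
    """Hypothetical component: bump keeps state, finish gives it up."""
    def __init__(self):
        self.count = 0

    def bump(self, context):
        self.count += 1
        return self.count

    def finish(self, context):
        total = self.count
        context.set_complete()
        return total


client_ref = Wrapper(Counter)
results = [client_ref.call("bump"), client_ref.call("bump"),
           client_ref.call("finish"), client_ref.call("bump")]
print(results, client_ref.activations)   # [1, 2, 2, 1] 2
```

The last call returns 1 rather than 3 because the object that counted to 2 is gone and a fresh instance answered. The client's reference never changed, but any state the object held at the moment it signaled completion was discarded with it.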
Some observers have criticized MTS for requiring the creation of stateless objects. This criticism is mistaken: MTS doesn't require stateless objects. Instead, it allows developers to manage object state in an intelligent way. The simplest way to understand this is to look at an example. Imagine, for instance, an MTS object that implements the interface IOrderEntry. This interface's three methods (Add, Remove, and Submit) allow the object's client to create, build, and submit orders for products. Here's the pseudocode for this interface's (very simple) Add method:
Add(Item)
{
    If (NoOrderExists)
        CreateOrder()
    Lookup(Item)
    If (Available)
        AddItemToOrder(Item)
        DecrementInventory(Item)
    Else
        Error()
}
If no current order exists, the first call to this method creates one. Next, the database is checked to make sure that the requested item exists. If it does, the item is added to the current order and the current inventory count for this item is decremented in the database. Since this method does not end with a call to either SetComplete or SetAbort, the object isn't deactivated, so its state—the in-memory order being built—is available for subsequent calls.
Here's the Remove method:
Remove(Item)
{
    If (ItemInOrder)
        RemoveItemFromOrder(Item)
        IncrementInventory(Item)
    Else
        Error()
}
It also is very simple—the object just checks that this item is actually in the current order, then deletes it and increments the inventory count in the database. Once again, the method doesn't end with a call to SetComplete or SetAbort, so the object and its state are maintained.
When the client calls Submit, things get more interesting:
Submit()
{
    Context = GetObjectContext()
    If (EverythingIsOK())
        SubmitOrder()
        Context.SetComplete()
    Else
        Context.SetAbort()
}
This method begins with a call to the MTS API function GetObjectContext, which returns a pointer to the appropriate context object. The object then submits the completed order—exactly how doesn't matter here. What does matter is that if all went well, this method's final task is to call SetComplete. If anything failed, the method calls SetAbort. I'll talk later about how these calls affect any transaction this object is part of. What's relevant here is that they cause MTS to call Release on every interface pointer it holds on the object. The object and any state that it is maintaining go away. The next time its client calls Add, MTS will transparently create a new instance of the object.
Implementing IOrderEntry in this way is entirely plausible, and an application built like this would scale reasonably well. A client could use what appeared to be the same object to create and submit many orders—it wouldn't need to create a new one each time—but the server would maintain an object only when an order was in progress. An even more scalable design, though, would be for the object to call SetComplete at the end of every method, not just when the order was submitted. For this to work, the Add and Remove methods would need to read the current order (if there was one) from disk, perform their functions, then save the order back to disk at the end of each call. Once the order was safely on persistent storage, each method could end with a call to SetComplete.
Building the object in this way would allow MTS to deactivate it after every call, making it even more scalable. The trade-off, of course, is the extra disk accesses required in Add and Remove. There are cases, though, where the increase in scalability matters more than the performance hit. It may even be desirable to save the order persistently after each change. Microsoft Merchant Server, for example, does exactly this, allowing a Web-based client to begin creating an order, lose and reestablish its network connection, then pick up where it left off.
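Here's a rough sketch of what the more scalable variant might look like, again as a Python model rather than real MTS code. Several things are assumptions made for illustration: the storage dictionary stands in for disk or a database, the order_id parameter is a hypothetical way of finding the saved order again (the IOrderEntry methods described above take no such argument), and the inventory updates are left out to keep the example short.

```python
import json

storage = {}   # stand-in for persistent storage, keyed by a hypothetical order id


class OrderEntry:
    """Stateless variant: nothing is kept in the object between calls."""

    def _load(self, order_id):
        # Read the current order, or start an empty one if none exists yet.
        return json.loads(storage.get(order_id, "[]"))

    def _save(self, order_id, items):
        storage[order_id] = json.dumps(items)

    def add(self, order_id, item):
        items = self._load(order_id)
        items.append(item)
        self._save(order_id, items)
        # A real MTS object would call SetComplete at this point.

    def remove(self, order_id, item):
        items = self._load(order_id)
        if item not in items:
            raise ValueError(f"{item} is not in order {order_id}")
        items.remove(item)
        self._save(order_id, items)
        # A real MTS object would call SetComplete at this point.

    def submit(self, order_id):
        # Hand back the finished order and clear it from storage.
        return json.loads(storage.pop(order_id))


# Each call uses a brand-new instance, as it would after deactivation.
OrderEntry().add("order-1", "widget")
OrderEntry().add("order-1", "gadget")
OrderEntry().remove("order-1", "widget")
submitted = OrderEntry().submit("order-1")
print(submitted)   # ['gadget']
```

Every method pays for a read and a write, which is the trade-off mentioned above. In exchange, no instance has to stay alive between calls, and a partly built order survives even if the process holding the object goes away.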
The important point is that MTS allows both options. When designing MTS-based servers (something you will do—trust me), you must determine how to manage your state. The rule that MTS encourages is simple: call SetComplete as often as possible.
Acquiring and Holding Interface Pointers
This rule is a reasonably obvious corollary to the one just described. In a bare COM or DCOM application, it may not be a great idea for a client to create a COM object and hold onto the object's interface pointers if it has no intention of using the object for a while. Doing this requires a traditional COM server to maintain that object in memory even though it's not being used. But with MTS, holding onto references to a stateless object is free—MTS deletes the object anyway as soon as it has no state to maintain. While acquiring interface pointers to objects is still a relatively expensive operation, holding onto them is not.
Although most of the changes MTS brings to COM programming are on the server side, this is a good place to think about how MTS affects clients. I said earlier that MTS is invisible to clients, which is for the most part true. There is a subtle way, though, in which the semantics of MTS objects leak through to their clients. Imagine a Visual Basic client that's setting various properties in an MTS object. The client may think it's working with an ordinary COM object—all the language syntax is exactly the same. But suppose that after setting several properties in the object the client invokes a method in the MTS object that calls SetComplete. If this happens, the object will be deactivated, as described above, losing all of its state. If the client isn't aware this has occurred, it may be very surprised. The client might, for example, attempt to get the value of some property it has just set in this object, only to discover that that value has been reset to its default. If the client doesn't know which methods in an object's interfaces call SetComplete, it might be confused by the object's behavior. This in turn implies something else: the client must know that this is an MTS object, not just a vanilla COM object.
And that's not all. Remember COM's rule that making any change to an interface requires giving it a new interface ID (IID), a new GUID? We all know that, for example, adding a new method to an existing interface requires defining an entirely new interface with its own IID. But with MTS, it's possible to make changes to an interface's semantics that require a new IID without changing anything in the interface definition itself. For instance, in the IOrderEntry example just described, there were two possible ways of implementing the interface's methods: calling SetComplete only at the end of the Submit method or calling it at the end of every method. The definition of IOrderEntry itself would be exactly the same in both, but distinct IIDs would be required for each of these two cases.
Changing only which methods call SetComplete also changes the behavior of that interface as seen by a client. If this is done, that new interface must be assigned a new IID, even though it is syntactically identical to the earlier interface. It might be nice if there were some Interface Definition Language (IDL) keyword to indicate which methods call SetComplete, thus providing an obvious marker in the interface definition of what has changed. There isn't, though, so you just have to understand what's going on and follow the rules.
Servers and Resources
The last rule allowed clients to be lazy—just getting whatever interface pointers are needed, then letting MTS worry about efficiently managing the objects those pointers refer to. By contrast, this rule requires servers to be better citizens than they have been up to now.
Traditionally, acquiring a server resource such as an ODBC connection to a database has been an expensive operation. Accordingly, developers would write COM objects to acquire a resource early, then hold onto it for the life of the object. While this made the application run faster—it didn't need to incur the cost of acquiring the resource over and over—it also didn't scale very well. ODBC connections are a finite resource on a machine, and if every object acquires and holds onto its own for the object's entire lifetime, there might not be enough to go around. But the alternative—acquiring, using, and then freeing a connection every time it was needed—has historically been too slow.
By defining the notion of a resource dispenser, MTS fixes this. A resource dispenser provides an efficient way of acquiring and freeing shared, non-persistent resources on a system. The most important resource dispenser in the MTS world (today, at least) is the one that allocates ODBC connections. Implemented by the ODBC 3.0 driver manager, this resource dispenser maintains a pool of available ODBC connections—and since the newer ADO and OLE DB interfaces typically use ODBC to access relational databases, everything described here applies to their clients, too. When an MTS object requests a connection to the database, the resource dispenser hands it one from this pool. When the object releases the connection, the resource dispenser returns it to the pool. All of this is transparent to the ODBC client—it makes the same calls as always.
Because connections to the database aren't really allocated and freed, acquiring and releasing them is much faster. An MTS object can afford to ask for a connection only when it needs one, use it, then immediately give it back. The performance penalty of doing this is greatly reduced by the resource dispenser's connection caching. An object might, for example, request a database connection at the beginning of each method, use it, and then release it. Writing objects that behave in this way allows those objects to scale much better since they can more effectively share scarce resources.
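The pooling idea can be shown with a toy model. This Python sketch is not the ODBC driver manager's implementation, and Connection, ConnectionPool, and handle_request are hypothetical names. It only demonstrates why acquiring late and releasing early becomes cheap once released connections are kept for reuse.

```python
class Connection:
    """Stand-in for an expensive database connection."""
    opened = 0   # counts how many real connections were ever opened

    def __init__(self):
        Connection.opened += 1


class ConnectionPool:
    """Toy resource dispenser: reuses idle connections before opening new ones."""
    def __init__(self):
        self._idle = []

    def acquire(self):
        return self._idle.pop() if self._idle else Connection()

    def release(self, connection):
        self._idle.append(connection)


def handle_request(pool):
    connection = pool.acquire()       # acquire as late as possible
    try:
        pass                          # ... use the connection here ...
    finally:
        pool.release(connection)      # give it back as soon as possible


pool = ConnectionPool()
for _ in range(100):
    handle_request(pool)

print(Connection.opened)   # 1
```

A hundred sequential requests share a single real connection. With concurrent callers the pool would grow to the peak number of connections in use at once, which is still far fewer than one per object held for the object's whole lifetime.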
Server Roles and Declarative Security
When a client invokes a method in a COM object, that object may need to verify that the client has the right to execute that method. If the method accesses some other resource, such as a file, the object may also need to verify that the client has the right to use that resource in the desired way. COM and DCOM servers can use the methods in IServerSecurity to learn about the client, then do whatever is necessary to determine whether the client is authorized to carry out the requested function. One common technique is to have the server thread that's handling this client request impersonate the client, then try to access the requested resource. Assuming the server operating system is Windows NT, the resource's ACL will be checked automatically, saving the programmer the trouble of doing it herself. If the access works, great. If not, the call can be rejected.
Does anybody really like this approach to authorization? I'm inclined to doubt it. This approach is complicated to get right, and it doesn't scale well. The creators of MTS were among the non-fans of this approach, so they provided an alternative way to let MTS objects make authorization decisions. It's called declarative security, and it's much simpler to use. (And for the hardcore among us, you can still use more detailed mechanisms if desired, an approach that's now known as imperative or programmatic security.)
Declarative security depends on the idea of roles. A role is just a collection of Windows NT users and/or groups assigned a particular character-string name. (Role names are unique only within a defined collection of MTS components known as a package—they aren't globally unique.) In a banking application, for example, some obvious roles might be Teller, Manager, and Loan Officer. The MTS administrator defines which users and groups are associated with each role by using the all-purpose MTS administrative tool, the MTS Explorer.
Once roles have been defined, the administrator can also use the MTS Explorer to specify exactly which roles are allowed to access each MTS component. It's even possible to assign role-based permissions on a per-interface basis. For instance, you can allow access to an interface on some object to Managers but not Tellers. When an MTS object is executing, MTS itself will check every incoming call on every interface, determining what role the caller is in. If the caller is not in a role that has access to this object or interface, the call is rejected. Otherwise, the call is passed through to the object.
Notice what the object itself must do to make this work: nothing. You don't need any code for security in an MTS object that's using declarative security. Once an administrator has set things up, MTS itself blocks unauthorized calls.
Of course, there are some things that you just can't do this way. Suppose my object wants to allow Tellers to make transfers under $10,000, but let Managers make transfers for any amount. If I want the same code to handle both cases, I can't rely only on declarative security. Instead, I'll need to include code to check what role the client is in, then make an appropriate decision. To support this, the IObjectContext interface (the same one that includes the SetComplete and SetAbort methods) provides a method called IsCallerInRole. Using this, I can make MTS check what role the current caller is in. If the caller is a Manager, I can allow a transfer of any size. If the caller is a Teller, however, I can reject transfers over $10,000.
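The Teller and Manager example might be sketched like this. It's a Python model, not MTS code: the ROLES table stands in for what an administrator would set up in the MTS Explorer, the user names are invented, and is_caller_in_role is a hypothetical stand-in for the question an object would put to MTS through IsCallerInRole. Treating a transfer of exactly $10,000 as allowed for Tellers is also an assumption, since the text only says "under" and "over".

```python
ROLES = {                      # role name -> member accounts (all hypothetical)
    "Teller": {"alice"},
    "Manager": {"bob"},
}
TELLER_LIMIT = 10_000          # the $10,000 threshold from the example


def is_caller_in_role(caller, role):
    """Stand-in for asking MTS whether the current caller is in a role."""
    return caller in ROLES.get(role, set())


def transfer(caller, amount):
    # Declarative part: callers in no permitted role never reach the object.
    if not any(is_caller_in_role(caller, r) for r in ("Teller", "Manager")):
        raise PermissionError("caller is in no role with access")
    # Programmatic part: one code path, with a role-dependent limit.
    if is_caller_in_role(caller, "Manager"):
        return
    if amount > TELLER_LIMIT:
        raise PermissionError("tellers cannot transfer more than the limit")


def allowed(caller, amount):
    try:
        transfer(caller, amount)
        return True
    except PermissionError:
        return False


checks = [allowed("bob", 50_000), allowed("alice", 500),
          allowed("alice", 50_000), allowed("mallory", 1)]
print(checks)   # [True, True, False, False]
```

In a real MTS object the first check wouldn't appear in the component at all, since MTS performs it before the call arrives. Only the role-dependent limit needs code.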
Roles are much simpler to use than per-user authorization checks. They're not the only option when using MTS, but whenever feasible, they're the best choice.
Server Transactions
It's probably fair to say that "Microsoft Transaction Server" is a bit of a misnomer, since much of what MTS provides has nothing to do with transactions. Instead, a large part of the MTS functionality focuses on making it easier to build scalable COM servers. But support for transactions is a requirement for many kinds of servers, and it's certainly an important part of what MTS offers.
What exactly does it mean to "use transactions"? In general, a transaction allows you to group together two or more units of work into a single, indivisible unit. For example, in the order entry scenario described earlier, a record in the inventory database must be changed for each item the client requests. For a client ordering several items, several records will be changed. If the order is submitted, all of those changes must be made permanent. If the client cancels the order, all of those changes must be rolled back, leaving things just as if the order had never existed. Allowing some of the order's changes to persist while others do not would leave the data in an inconsistent state. Rather than worrying about this itself, an application can rely on MTS to ensure that either all the changes occur or none of them do.
But wait—can't databases do this themselves? Of course they can. If you're using ODBC to access a database, for example, you can use ODBC calls to start a transaction, make your changes, then commit or roll back those changes. The DBMS will make sure that either all of the changes happen or none of them do. But what if you're making changes to more than one database and you want all of those changes to be part of a single, atomic transaction? Those ODBC calls for transactions are no help here, since they only work on a single database. And even if your object makes changes to only one database, you'll be better off letting MTS take care of transactions for you rather than relying on the database directly (and I'll explain why shortly).
So when should you use transactions? It's easy: if your COM object needs to make two or more changes, such as updates in a database, and you want all of those changes to either succeed or fail—partial success isn't allowed—use transactions. If your object doesn't need to do this, then you'll still probably want to use MTS to write your server (it makes your life easier), but you won't need to use its transaction features.
Using transactions in MTS turns out to be surprisingly easy. When it's installed, an MTS component can have its transaction attribute set appropriately. You (or an administrator) can do this using the MTS Explorer. As shown in Figure 5, the transaction attribute is set using a simple dialog box. Choosing "Requires a transaction" or "Requires a new transaction" will guarantee that all data accesses made by this component will be committed or rolled back as a unit (and in some cases, "Supports transactions" will have the same effect). That's all it takes: you don't have to write any extra code to handle transactions. (To see what really goes on, though, see the "How Transactions Work" sidebar.)
The way an MTS developer deals with transactions is simple. It's also quite novel, since it merges the idea of transactions with existing notions of components. Unlike traditional transaction systems where a client makes explicit calls to begin and end transactions, MTS hides transaction boundaries from clients. This is another example of how MTS strives to preserve the traditional client semantics of COM, even when transactions are being used. Even more atypically, MTS hides transaction boundaries from the MTS objects themselves.
The benefit of designing things this way is that the same component can run in its own transaction or be combined with others into a larger transaction. If each component made its own BeginTransaction and EndTransaction calls, this wouldn't be possible. By allowing MTS to automatically create transactions when required, it's possible to combine components in various ways. This allows you to create transactional applications from objects that were written by different organizations at different times, yet can still work together.
For example, imagine that I have a component that performs order entry functions, like the one described earlier, and another component that knows how to transfer money between two different bank accounts. Each component is useful on its own, and each requires a transaction (otherwise, the changes the component makes might wind up only partially done). But suppose I want to use both components in a single transaction, combining the order entry with actually getting paid for that order. With MTS, doing this is straightforward. Here's how it works.
Assume that both components have been configured (using the MTS Explorer, as shown in Figure 5) to require a transaction. A client creates an instance of the order entry component using CoCreateInstance. Since MTS intercepts this request, it can determine that this component needs a transaction, so it automatically starts one. Any changes the order entry component makes will be part of this transaction.
Now suppose that the order entry component creates an instance of the money transfer component (it can do this through a method called CreateInstance in IObjectContext). When MTS loads this component, it again notices that a transaction is required. But since the creator of this component is already part of a transaction, the new instance of the money transfer component automatically becomes part of this existing transaction. When the money transfer component completes its task, it will call either SetComplete or SetAbort, like any other MTS object. MTS takes note of this, but does not end the transaction. Instead, the transaction ends only when the order entry component, the root of the transaction, calls SetComplete or SetAbort. If both components called SetComplete, all changes made by the components will be committed. If either one called SetAbort, all changes will be rolled back.
The important point here is that the very same component binary can run in its own transaction or can be grouped with one or more other components into a single transaction. Microsoft calls this feature Automatic Transactions. It allows you to combine the traditional idea of transactions with the much newer notion of components, something that's essential for building component-based servers. It also explains why, even if your component only accesses a single database, you don't want to use the ODBC transaction features when MTS is available. Using ODBC transactions means your component will explicitly start and end a transaction, making it impossible to combine with other components into a single transaction. If you rely on MTS for transactions, your component and other MTS components can be combined in arbitrary ways to create transactions. Using MTS allows you to create transactional components that can be used much more flexibly.
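The voting behavior just described can be modeled compactly. This Python sketch is an illustration under stated assumptions, not the way MTS is implemented: Transaction, Context, and the two component classes are hypothetical, the funds_available flag is an invented way to force a failure, and a list of pending changes stands in for real database work. It shows only the rule that each participant's call is recorded while the root's call decides when the transaction ends.

```python
class Transaction:
    def __init__(self):
        self.pending = []       # changes made so far, not yet permanent
        self.aborted = False    # set when any participant calls set_abort
        self.outcome = None     # "committed" or "rolled back" once the root finishes


class Context:
    """Stand-in for a context object; it knows its transaction and whether it is the root."""
    def __init__(self, transaction=None):
        self.is_root = transaction is None
        self.transaction = Transaction() if transaction is None else transaction

    def create_instance(self, component):
        # A component created from inside a transaction joins that transaction.
        return component(Context(self.transaction))

    def set_complete(self):
        self._finish()

    def set_abort(self):
        self.transaction.aborted = True
        self._finish()

    def _finish(self):
        # Only the root's call ends the transaction; other calls just record a vote.
        if self.is_root:
            t = self.transaction
            t.outcome = "rolled back" if t.aborted else "committed"
            if t.aborted:
                t.pending.clear()


class MoneyTransfer:
    def __init__(self, context):
        self.context = context

    def pay(self, amount, funds_available=True):
        if not funds_available:
            self.context.set_abort()
            return
        self.context.transaction.pending.append(("debit", amount))
        self.context.set_complete()


class OrderEntry:
    def __init__(self, context):
        self.context = context

    def submit(self, amount, funds_available=True):
        self.context.transaction.pending.append(("order", amount))
        payment = self.context.create_instance(MoneyTransfer)
        payment.pay(amount, funds_available)
        self.context.set_complete()
        return self.context.transaction


def run(funds_available):
    root = OrderEntry(Context())    # no existing transaction, so one is started
    t = root.submit(25, funds_available)
    return t.outcome, list(t.pending)


print(run(True))    # ('committed', [('order', 25), ('debit', 25)])
print(run(False))   # ('rolled back', [])

solo = MoneyTransfer(Context())     # the same class, now the root of its own transaction
solo.pay(10)
print(solo.context.transaction.outcome)   # committed
```

MoneyTransfer contains no begin or end calls, which is why the same class works both as a participant in OrderEntry's transaction and as the root of its own.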
Using transactions with components also affects how you design those components. To be most useful, each component should encapsulate some discrete chunk of work. Doing this lets you combine that component with others in arbitrary ways. Adding transactions means those useful chunks of work shouldn't span transaction boundaries. For example, think about an application that must submit items of work to a queue, then process those items. It would be entirely possible to create one component that does both tasks, but it might also make sense to handle each of these functions—submitting work and processing it—in separate transactions. If both tasks are implemented in a single component, doing them in separate transactions isn't possible. Think about transaction boundaries before you design your components—the result will be more useful components.
Conclusion
The easiest way to think about MTS is to view it as just an extension to COM. This makes learning to use it easier, since COM has become an ingrained part of development. But MTS also brings a few changes to the traditional way COM programmers have done things. Those changes bring benefits, but change can also be painful. Following the rules described here will help maximize the benefits and minimize the pain.
From the January 1998 issue of Microsoft Systems Journal.