ActiveX Q&A, MSJ March, 1998 - State Management in MTS

March 1998

Don Box is a co-founder of DevelopMentor where he manages the COM curriculum. Don is currently breathing deep sighs of relief as his new book, Essential COM (Addison-Wesley), is finally complete. Don can be reached at http://www.develop.com/dbox/default.asp.

Q Everyone says Microsoft Transaction Server (MTS) is a stateless programming model. Is this true?
Arthur Lane
The Himalayas

      People tell me that MTS scales because all MTS objects are stateless. How can an object be stateless and still be an object?
Shannon Ahern Ikeda
Osaka, Japan

      I hear that MTS-based applications are scalable due to MTS support for object pooling. How is this so?
Anita Lichtenberger
Green Bay, WI

      The scuttlebutt indicates that MTS won't scale without object pooling. Also, I've read that if my objects have state, they cannot scale. I am confused!
Helena Hathaway Hearings
Plano, TX

A AAAHHHHHRRRGGGHH! The madness must stop right here!
      No technology since the dawn of COM has been more misunderstood than MTS, and no one aspect of MTS has been more misunderstood than the issue of state. It would be easy to point fingers at sources both inside and outside Microsoft, but since COM is about love and acceptance, I would rather just use this column to put these misconceptions about MTS and state management to rest.
      First, let's review what an object is. By most people's definition, an object is state + behavior. This is true in most object-oriented languages, and is true in the COM programming model. This is also true in the MTS programming model. Consider the following C++ class:

class Dog {
    int m_nHairs;
public:
    void Shed(void) { m_nHairs -= 100; }
};

The m_nHairs data member represents the state of this object. The Shed method represents the object's behavior.
      Historically, COM has been a state-ignorant programming model. COM knows only about behavior. Period. Your object's behavior is represented by one or more interfaces. Because of its interface orientation, COM is very behavior-aware. In fact, COM is infatuated with your interfaces (and the behavior they represent). COM will wrap its loving arms around them, allow remote access to them, protect them from concurrent access, secure them, and allow them to be accessed from virtually any language under the sun. If this isn't love, I don't know what is.
      However, COM is completely ignorant of the state your object may need to accomplish its behavior. I defy you to find one COM system call or interface that explicitly deals with your object's state. You cannot, because the entire COM API is expressed in terms of interface pointers (behavior). This does not mean COM objects are stateless. It simply means that COM provides no explicit support for managing the state of your object. It is up to you, the object implementor, to decide when, where, and how various parts of your state will be implemented. Therein lies the problem.
      As I tried to outline in my sidebar to David Chappell's excellent article in the January 1998 issue ("How Microsoft Transaction Server Changes the COM Programming Model"), there are many difficult problems you must face when building a COM-based application. Most of these problems stem from the fact that many concurrent processes/threads/users may want to access your application at any given time. In the face of concurrent access, you obviously must protect your state from corruption due to multithreaded updates. To help facilitate concurrency management, COM currently supports two types of apartments, the single-threaded apartment (STA) and the multithreaded apartment (MTA). Unfortunately, neither of these apartment types makes it easy to build an application that can safely scale to large numbers of users.
      When it comes to building scalable applications, many developers naïvely believe that the MTA is the right answer. However, MTA programming is much harder than you might be led to believe. Consider the following interface that models a Dog:

interface IDog : IUnknown {
    HRESULT Chase([in] ICat *pCat);
    HRESULT YelpInFear();
}

Also assume that the following interface models a Cat:

interface ICat : IUnknown {
    HRESULT OnPursue([in] IDog *pDog);
    HRESULT SwallowHair();
    HRESULT CoughUpFurBall();
}

One reasonable implementation of IDog::Chase might be:

STDMETHODIMP Pug::Chase(ICat *pCat) {
    return pCat->OnPursue(this);
}

However, if this code is expected to run in the MTA, it must ensure that it is guarded against concurrent access. The most common technique is to use a Win32® critical section.

STDMETHODIMP Pug::Chase(ICat *pCat) {
    EnterCriticalSection(&m_cs);
    HRESULT hr = pCat->OnPursue(this);  // lock is held across the outbound call
    LeaveCriticalSection(&m_cs);
    return hr;
}

However, if the Cat's implementation of OnPursue calls back to the Dog

STDMETHODIMP Tom::OnPursue(IDog *pDog) {
    EnterCriticalSection(&m_cs);
    HRESULT hr = pDog->YelpInFear();  // calls back into the Dog
    LeaveCriticalSection(&m_cs);
    return hr;
}

the YelpInFear method must be careful not to reacquire the critical section. If it attempts to reenter the critical section

STDMETHODIMP Pug::YelpInFear(void) {
    EnterCriticalSection(&m_cs);  // blocks: Chase still holds m_cs on another thread
    this->m_nHairs -= 1000;
    LeaveCriticalSection(&m_cs);
    return S_OK;
}

this method will cause a deadlock since the callback to YelpInFear will execute on a different thread than the Chase method (which is where the lock is currently held while waiting for ICat::OnPursue to return).
      The deadlock that occurs in the previous example is due to the fact that the locking mechanism (a Win32 critical section) has no COM awareness. That is, the EnterCriticalSection API cannot detect that the second thread is actually part of the original activity that acquired the lock and that no harm would be done by letting the second thread pass. Unfortunately, there is no easy way to build such a lock using currently documented COM and Win32 API functions.
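      The hazard can be reproduced without COM at all. The following is a minimal portable sketch, not the Pug class itself: it assumes std::mutex as a stand-in for the Win32 critical section and a std::thread as a stand-in for the RPC thread that delivers the callback. The names PugSketch, TryYelpInFear, and ChaseWithCallbackOnOtherThread are hypothetical, and try_lock is used so that the example reports the conflict rather than hanging.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical stand-in for Pug: m_cs plays the role of the critical section.
class PugSketch {
    std::mutex m_cs;
    int m_nHairs = 100000;
public:
    // Stand-in for YelpInFear. try_lock is used so the sketch can report
    // the conflict instead of blocking forever as EnterCriticalSection would.
    bool TryYelpInFear() {
        if (!m_cs.try_lock()) return false;  // lock is owned by another thread
        m_nHairs -= 1000;
        m_cs.unlock();
        return true;
    }

    // Stand-in for Chase: holds the lock across an outbound call whose
    // callback arrives on a different thread, as it would in the MTA.
    bool ChaseWithCallbackOnOtherThread() {
        std::lock_guard<std::mutex> hold(m_cs);
        bool callbackGotLock = true;
        std::thread rpcThread([&] { callbackGotLock = TryYelpInFear(); });
        rpcThread.join();  // Chase waits for the nested call to return
        return callbackGotLock;
    }

    int Hairs() {
        std::lock_guard<std::mutex> g(m_cs);
        return m_nHairs;
    }
};
```

The nested call is refused because the lock only knows which thread owns it, not which logical chain of calls that thread is serving. With a blocking acquire in place of try_lock, both threads would wait on each other indefinitely.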
      Given the hazards of the MTA, one might naturally turn to the STA for solace. Since objects that run in an STA have thread affinity, any OS locks they might acquire can be safely reentered during a nested call back into the object. However, the thread affinity of STA objects requires a dedicated thread per apartment. This means that it is impossible to dedicate an STA to each object without severely hampering scalability (dedicated threads are expensive). While it is possible to build a scalable STA-based application, doing so requires a nontrivial amount of explicit multithreaded programming to ensure that the application's objects are spread across multiple STA threads. While a simple thread pool as used in the OLEAPT SDK sample is a good start, it is by no means a general-purpose architecture for building highly concurrent systems.

Figure 1 MTS State Hierarchy

      MTS addresses this concurrency problem by introducing the concept of an activity. An MTS activity is a group of one or more objects that performs work on behalf of a client. MTS ensures that objects within an activity do not execute concurrently. In this respect, an activity is like an STA. However, the activity ID of an object is publicly visible to MTS-aware lock managers, enabling the creation of locks that have activity affinity, not thread affinity. Such a lock would have solved the deadlock situation described above. Today, MTS 2.0 implements activities inside of COM STAs, with the MTS executive managing a pool of STA threads that are used to house the objects from one or more activities. For a server process with a light load, each activity gets its own dedicated thread. As the demand on a server process increases, MTS starts to multiplex new activities onto existing threads from the pool to reduce thread pressure on the system. Future versions of MTS will do a more effective job of thread management when certain tools (OK, I'll say it—Visual Basic®) can generate objects that do not require thread affinity.
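      To make the idea of activity affinity concrete, here is a rough sketch of a lock that is keyed by an activity ID rather than by a thread ID. This is not an MTS API. ActivityLockSketch and its methods are hypothetical, the activity ID is modeled as a plain nonzero long, and standard C++ primitives stand in for whatever a real MTS-aware lock manager would use.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical lock with activity affinity instead of thread affinity.
class ActivityLockSketch {
    std::mutex m_guard;
    std::condition_variable m_released;
    long m_owner = 0;  // 0 means unowned (an assumption of this sketch)
    int  m_depth = 0;  // nesting count for reentrant acquisition
public:
    // Blocks only while a different activity owns the lock.
    void Enter(long activityId) {
        assert(activityId != 0);
        std::unique_lock<std::mutex> g(m_guard);
        m_released.wait(g, [&] { return m_depth == 0 || m_owner == activityId; });
        m_owner = activityId;
        ++m_depth;
    }

    // Non-blocking variant, useful for demonstrating the policy.
    bool TryEnter(long activityId) {
        assert(activityId != 0);
        std::lock_guard<std::mutex> g(m_guard);
        if (m_depth != 0 && m_owner != activityId) return false;
        m_owner = activityId;
        ++m_depth;
        return true;
    }

    // Releases one level of nesting; wakes waiters when fully released.
    void Leave(long activityId) {
        std::lock_guard<std::mutex> g(m_guard);
        assert(m_depth > 0 && m_owner == activityId);
        if (--m_depth == 0) {
            m_owner = 0;
            m_released.notify_all();
        }
    }
};
```

Because ownership is tested against the activity and never against the calling thread, a nested callback that arrives on another thread but carries the same activity ID passes straight through, while a caller from a different activity is held off until the lock is fully released.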
      This brings me back to the discussion of state. The MTS programming model forces the object implementor to be explicit about state management. Rather than having you simply declare an object's state entirely as data members, MTS implies a hierarchy for placing state in the most efficient location possible. Figure 1 illustrates this hierarchy. In Figure 1, level 1 state (client-managed state) represents the state of an object that is cached on the client side of a connection. Level 1 state is often client-specific state and may or may not be seen by an object's methods.
      In classic COM, the extended properties added by an ActiveX® control container are a great example of level 1 state. When Visual Basic creates a visual control on a form, the Visual Basic runtime adds extended properties to the control, allowing programmers to reposition the control, assign tags, or change the tab order of the control. Sometimes level 1 state stays at the client side of the connection and is simply used for client-specific purposes. Other times, level 1 state is propagated to an object's methods via explicit parameters (for example, when drawing a control, its position is passed to IViewObject::Draw).
      Level 2 state (instance state) is what most developers think of as the state of their object. This state is represented by the instance data members declared in the implementation language (Visual Basic, C++, and so on) and is transient—that is, it does not survive a system failure. Although often allocated contiguously with the vptrs of an object, any private per-instance data referred to by an object's data members could be considered instance state as well. Level 2 state is very efficiently accessed from an object's methods. Additionally, MTS ensures level 2 state will never be accessed concurrently, so there is no need to worry about locking when accessing this state.

      The primary downside of level 2 state is that it cannot be shared. Unlike classic COM, the MTS programming model is based on clients calling CoCreateInstance and creating new, private COM instances at activation time. The MTS programming model discourages (but does not prohibit) shared access to a single instance. One reason for this is that if a distributed application requires access to one particular instance, this would create a hot spot since the MTS model guarantees that all calls to a particular instance will be serialized. This means that singleton-style programming is more or less verboten under MTS. For most applications, this is not a limitation because you really need shared access to state, not shared identity.
      Level 3 state (shared, transient state) is where singleton-style state is stored. If two or more instances need to share state, it potentially needs to be protected from concurrent access. If both objects belong to the same activity, there is no problem because MTS guarantees that all calls within an activity are serialized. If the two objects belong to different activities (for example, they were created by two distinct clients), then an MTS-aware lock manager must be used. Today, MTS provides the Shared Property Manager (SPM), which allows reentrant access within an activity, but protects against concurrent access from objects in different activities. The SPM is one example of an MTS resource dispenser, which is a highly stylized allocator of shared resources.
      State stored at levels 2 and 3 is transient—that is, it does not survive system failure. Since most applications (distributed or not) need to be resilient to crashes, part of an object's state often must be saved to persistent storage, hence the need for level 4 state (persistent, shared state). For a distributed application built from components running on a variety of host machines, keeping the persistent store consistent requires considerably more effort than in a traditional Windows®-based application due to the partial failure modes that do not occur in a single-machine, single-process application. In the past, distributed applications have used the concept of a transaction to act as a synchronization point for ensuring that a collection of changes to the persistent store happens (or doesn't happen) atomically. The transaction gives the developer a model to program against that is conceptually fairly simple: make changes to the durable store; any errors will cause all changes within the transaction to be revoked.
      To formalize the concept of transactions, the infamous ACID properties were introduced in the 1980's by Härder and Reuter and are summarized below:
       Atomicity: All state changes within a transaction are atomic; either all changes happen or no changes happen.
       Consistency: A transaction is a correct transformation of state. The aggregate changes represented by a single transaction do not leave the system in a corrupt or inconsistent state.
       Isolation: While two or more transactions may execute concurrently within the system, the net state change is as if all transactions executed sequentially.
       Durability: Once a transaction successfully commits, any state changes will survive a system failure.
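      As a small illustration of the first property only, the sketch below stages changes on a private copy of an in-memory map and publishes them in a single step. It makes no attempt at isolation or durability, and it has nothing to do with how the DTC or a real resource manager works. StoreSketch and TransferAtomically are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical in-memory store: account name -> balance.
using StoreSketch = std::map<std::string, long>;

// Moves amount between two accounts as one all-or-nothing unit.
// Changes are staged on a copy and published only if every step succeeds.
inline bool TransferAtomically(StoreSketch& store, const std::string& from,
                               const std::string& to, long amount) {
    StoreSketch staged = store;           // work on a private copy
    try {
        long& src = staged.at(from);      // throws if the account is missing
        if (amount < 0 || src < amount)
            throw std::runtime_error("insufficient funds");
        src -= amount;                    // first change is staged...
        staged.at(to) += amount;          // ...and discarded if this throws
    } catch (const std::exception&) {
        return false;                     // abort: store is untouched
    }
    store.swap(staged);                   // commit: all changes appear at once
    return true;
}
```

If the credit fails after the debit has been staged, the caller sees the store exactly as it was before the call, which is the "either all changes happen or no changes happen" guarantee in miniature.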
      MTS provides a runtime environment for transaction management. In particular, MTS provides the Distributed Transaction Coordinator (DTC) that makes sure one or more resource managers either commit or roll back changes made within a single transaction. Today, the most popular resource managers are SQL Server™ and Microsoft Message Queue, but more exotic resource managers are certainly possible.
      In an MTS application, level 4 state tends to be the primary focus. One reason for this is that only level 4 state survives a crash. A more important reason is that only level 4 state can have its changes rolled back when a transaction aborts. This does not mean that all state must be stored in a level 4 resource manager. It does mean the MTS programming model does everything it can to ensure that the ACID properties of a transaction are guaranteed. This is where the confusion often sets in.
      The MTS programming model is based on implicit transaction management. There are two types of MTS-based objects: transactional and nontransactional. The developer indicates which type of object his class supports using either MTS Explorer or IDL. Classes that are marked as "Requires a transaction" or "Requires a new transaction" always produce transactional objects. Classes that are marked as "Does not support transactions" always produce nontransactional objects. Classes that are marked as "Supports transactions" produce transactional objects if they are created by a transactional object, but produce nontransactional objects if created by a nontransactional object or base client.
      When a transactional object returns from a method after calling either IObjectContext::EnableCommit or IObjectContext::SetComplete, it is telling the MTS executive that the current transaction can be committed at any time. When a transactional object returns from a method after calling IObjectContext::SetAbort, it is telling the MTS executive that the current transaction must be aborted and any changes to resource manager-managed state must be rolled back.
      When a transactional object returns from a method after calling IObjectContext::DisableCommit, it is telling the MTS executive that the current transaction cannot be committed at this time. When a method returns after a DisableCommit, any attempts to commit the transaction will force the transaction to abort, and any changes to resource manager-managed state must be rolled back.
      Note that there is no explicit Commit method. Instead, the MTS programming model is based on passive consent. If no objects have aborted the transaction by calling SetAbort and no objects have disabled commitment via a call to DisableCommit, MTS will automatically commit the transaction when the client releases its object references. The top-level object can hasten this commitment by calling SetComplete, which, when called from the root of the transaction, causes the transaction to commit without requiring the client to release its references.
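      The passive-consent rule can be written down as a tiny decision function. This is a simplified model under stated assumptions, not the MTS API: it looks only at the most recent vote left by each object at the moment commitment is attempted, and it ignores timing (for example, which object is the root). The Vote and Outcome types and ResolveSketch are hypothetical; only the four vote names come from IObjectContext.

```cpp
#include <cassert>
#include <vector>

// The four votes an object can leave behind when a method returns.
// The enumerator names mirror the IObjectContext methods.
enum class Vote { EnableCommit, DisableCommit, SetComplete, SetAbort };

enum class Outcome { Commit, Abort };

// Decides the outcome when commitment is attempted, given the most
// recent vote of every object in the transaction.
inline Outcome ResolveSketch(const std::vector<Vote>& lastVotes) {
    for (Vote v : lastVotes) {
        if (v == Vote::SetAbort || v == Vote::DisableCommit)
            return Outcome::Abort;  // a single objection is enough
    }
    return Outcome::Commit;         // no objection counts as consent
}
```

Note that there is no vote that means "commit now"; the function can only find a reason to abort or fail to find one.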
      In the MTS programming model, transactional objects are not notified of the success or failure of a transaction. Rather, these objects must simply call one of the four IObjectContext methods described above, trusting the DTC and any participating resource managers to either make the state changes or ignore them. This has a severe (but necessary) impact on the programming model. If a transactional object maintained intermediate results from a transaction, what would happen if these results were used in the next transaction? Consider the following method:

Sub IBroker_BuyShares(ByVal symbol as String, ByVal nCustID as Long, ByVal nAmount as Long)
    ' read current stock value
    ' debit customer account by nAmount * value
    ' debit share count by nAmount
    ' inform MTS that transaction can commit
    GetObjectContext.EnableCommit
End Sub

      What would happen if the object cached either the stock price or the customer's updated balance in level 2 state (as an instance data member) for efficiency? Because the method called EnableCommit prior to returning, it is possible that the transaction would commit prior to the next method invocation. It is equally possible that the transaction would abort due to some other object deciding to call SetAbort. Unfortunately, this object will never be notified one way or another. If the transaction aborted, then the cached customer balance would be inconsistent with the durable store. This would violate the consistency property of the transaction. Even if the transaction were to commit successfully, another object could update the customer's balance in another transaction prior to the next method invocation, rendering the cached balance incorrect. This would violate the isolation property of the transaction.
      To ensure that transactional objects do not maintain state across transaction boundaries, MTS automatically destroys any instance state after a transaction either commits or aborts. Technically, the MTS context wrapper (which represents your object's COM identity to the outside world) releases its reference to the transactional object and creates a new instance at the beginning of the next transaction. This happens whether or not your object calls SetComplete or SetAbort. However, these routines perform aggressive resource reclamation and cause the context wrapper to release its reference immediately upon completion of the current method. Even if your object does not call either of these methods, if it is a transactional object, the context wrapper will release it and create a new instance at the beginning of the next transaction.
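      The effect on instance state can be modeled in a few lines. The sketch below is an analogy in portable C++, not the real context wrapper: ContextWrapperSketch and BrokerSketch are hypothetical, and "the beginning of the next transaction" is approximated as the next time the client touches the object.

```cpp
#include <cassert>
#include <memory>

// Hypothetical transactional object with cached (level 2) state.
struct BrokerSketch {
    long cachedBalance = 0;  // instance state: lost when the transaction ends
};

// Hypothetical stand-in for the context wrapper: the client's reference
// stays valid while the real instance behind it comes and goes.
class ContextWrapperSketch {
    std::unique_ptr<BrokerSketch> m_instance;
    int m_activations = 0;
public:
    // Creates a fresh instance on first use after a transaction boundary.
    BrokerSketch& Object() {
        if (!m_instance) {
            m_instance.reset(new BrokerSketch());
            ++m_activations;
        }
        return *m_instance;
    }

    // Called at commit or abort: release the instance and its state.
    void OnTransactionEnded() { m_instance.reset(); }

    int Activations() const { return m_activations; }
};
```

The client keeps talking to the same wrapper throughout, yet anything the object cached in its data members during one transaction is simply gone in the next.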

      The fact that MTS releases your instance at each transaction boundary has led many developers to assume that MTS objects are stateless. First, this recycling only happens for transactional objects (objects marked "Does not support transactions" do not get recycled automatically, although they can call SetComplete to achieve aggressive resource reclamation). Second, only one level of state is "lost" at transaction boundaries—that is, level 2 or instance state. Any state stored at the client, in a resource dispenser, or in a resource manager is still maintained. Be aware that storing cross-transaction state in the SPM or at the client requires attention to the ACID properties to ensure correctness. However, with sufficient care it is possible to cache certain transaction-invariant state without compromising data consistency or transaction isolation.
      Because instance state can be lost at virtually any time, it is common for transactional objects to expose interfaces that require the client to pass initialization information at every call. For example, this classic COM style interface

interface IBroker : IUnknown {
    HRESULT SetStock([in] BSTR bstrSymbol);
    HRESULT SetCustomer([in] long nCustID);
    HRESULT BuyShares([in] long nAmount);
}

would break if the client used it as follows:

Dim bk as IBroker
Set bk = new MTSBroker
bk.SetStock "MSFT"
bk.SetCustomer ID_LARRY
bk.BuyShares 1000
bk.SetStock "IBM"
bk.BuyShares 2000

If the transaction committed after the first call to BuyShares, the instance state (which would likely include the current customer and stock symbol) would be lost. When the second call to BuyShares is issued by the client, the object would have lost its notion of the current customer and would be unable to carry out the operation as expected. The only way the object could guarantee that its level 2 state would remain valid would be to call DisableCommit. However, this would require the client to call some distinguished method to enable the commitment after all operations are done. Because transactions abort automatically after some timeout (60 seconds by default), the overall series of operations must execute fairly quickly or else all was for naught. Given that lots of roundtrips are being performed, it would be fairly common for a transaction to time out using this approach.
      A more MTS-friendly version of the interface would be:

interface IBroker : IUnknown {
    HRESULT BuyShares([in] BSTR bstrSymbol, [in] long nCustID, [in] long nAmount);
}

This interface requires the object to keep no level 2 state between transactions (or even methods in this simple case). This means the client is now responsible for holding the customer ID and stock symbol (in level 1 state) and propagating this state on every method invocation.

Dim bk as IBroker
Set bk = new MTSBroker
bk.BuyShares "MSFT", ID_LARRY, 1000
bk.BuyShares "IBM", ID_LARRY, 2000

      This is why Figure 1 showed a fifth level of state. Level 0 state (context) represents the implicit and explicit state that is visible during method execution. Part of the context of a method is the explicit parameters of the method. Other aspects of a method call include the security ID of the caller, any possible exceptions, and the current activity. Currently, COM and MTS do not provide any documented facilities for extending the context (although see my January 1998 column for a discussion of an undocumented extensibility point, the channel hook). Future versions of COM—namely, COM+—promise to provide richer facilities for context management.
      The fact that MTS destroys some objects between transactions has somehow been mistaken as the primary aspect of MTS that increases scalability. This is a common misconception that is simply not true in most cases. MTS destroys transactional objects at transaction boundaries to help enforce semantic correctness. Any scalability artifacts that may result in your objects no longer consuming memory are minor in the overall scheme of things. MTS's scalability comes primarily from having a sane concurrency and error-recovery model based on transactional programming. The fact that MTS does a very reasonable job with thread management is also a big win. The fact that MTS provides a standard infrastructure for managing pools of shared resources (the resource dispenser model) is also a plus.
      As for the memory consumed by your instances that gets reclaimed at transaction boundaries or when you call SetComplete, consider the following:

COM consumes roughly 420 bytes per instance for the stub manager and an interface stub for as long as the client holds an extant proxy. This memory is needed until the client releases its references to your object, even when your instance has gone SetComplete.

MTS consumes roughly 100 bytes per activity for as long as the client holds an extant proxy to an object within the activity. This memory is needed until the client releases all references to all objects within the activity, even when all instances are gone. This overhead does not count the actual thread that may be dedicated to the activity, as MTS will multiplex additional activities onto the thread if needed.

MTS consumes roughly 600 bytes per instance for as long as the client holds an extant proxy to an object. This memory is needed until the client releases all references to all objects, even when the instance has gone SetComplete.

      This means that even when your object is released by MTS, roughly 1000 bytes are needed by MTS and COM to keep the proxy connected. Unless your objects are fairly large or coarse-grained, the reclamation of your object's memory will probably not yield the massive performance or scalability improvement you may be expecting. However, because MTS knows that your object is being destroyed, fairly expensive resources that have been reserved for your object (database connections) and its transaction (locks) can be recycled for other objects to use. So while it is a good idea to call SetComplete as often as possible, the primary scalability improvement is the fact that subordinate resources such as locks can be released more aggressively. This can yield massive scalability gains. It is important to focus on the fact that MTS is a rich, transactional programming model and that objects lose instance state primarily due to the semantics of transactional programming.
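      For a feel of the numbers, the arithmetic can be spelled out. The three constants are the approximate figures quoted above. The sketch assumes one object per activity per client, which is why the per-activity cost is simply added in, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Approximate figures quoted in the text.
constexpr std::size_t kComStubBytes     = 420;  // COM, per instance
constexpr std::size_t kMtsActivityBytes = 100;  // MTS, per activity
constexpr std::size_t kMtsInstanceBytes = 600;  // MTS, per instance

// Assumption: each client holds one object living in its own activity.
constexpr std::size_t ResidualBytesPerClient() {
    return kComStubBytes + kMtsActivityBytes + kMtsInstanceBytes;
}

// Memory that stays allocated for connected clients even after their
// instances have been released.
constexpr std::size_t ResidualBytes(std::size_t clients) {
    return clients * ResidualBytesPerClient();
}

static_assert(ResidualBytesPerClient() == 1120, "420 + 100 + 600");
```

Under that assumption, 10,000 connected clients account for about 11.2 million bytes of residual overhead whether or not their instances are alive, so reclaiming a few hundred bytes of instance data per object changes the total far less than the locks and database connections discussed above.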
      Another common misconception is that MTS will be radically enhanced by object pooling. MTS object pooling is based on reattaching a deactivated instance (that is, an object that has been released by the context wrapper) to another client based on its response to IObjectControl::CanBePooled. Today, this feature is disabled due to the thread affinity of many objects and development tools. Since a deactivated object could only be reactivated in an activity on the same thread, this would greatly limit the hit rate that the pool manager would encounter. However, even when pooling is enabled, increased performance will likely come from saving initialization cost, not from reducing resource consumption. In fact, object pooling is based on spending bits to save cycles; it assumes that the memory consumed by an object in the pool is less costly than the cycles required to create a new object from scratch. Again, this reinforces the fact that the memory consumed by instances is not a fundamental barrier to scalability.
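      The "spending bits to save cycles" trade is easy to see in a toy pool. The sketch below is a generic object pool in portable C++, not the MTS pool manager. PoolSketch and PooledThingSketch are hypothetical, the 64 KB buffer is an arbitrary stand-in for expensive initialization, and CanBePooled merely echoes the name of the IObjectControl method.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical pooled object with costly-to-build state worth reusing.
struct PooledThingSketch {
    std::vector<char> expensiveBuffer;
    PooledThingSketch() : expensiveBuffer(64 * 1024) {}
    bool CanBePooled() const { return true; }
};

class PoolSketch {
    std::vector<std::unique_ptr<PooledThingSketch>> m_idle;
    int m_constructed = 0;
public:
    // Reuse an idle instance when one exists; otherwise pay for a new one.
    std::unique_ptr<PooledThingSketch> Acquire() {
        if (!m_idle.empty()) {
            std::unique_ptr<PooledThingSketch> p = std::move(m_idle.back());
            m_idle.pop_back();
            return p;
        }
        ++m_constructed;
        return std::unique_ptr<PooledThingSketch>(new PooledThingSketch());
    }

    // On deactivation, keep the instance only if it agrees to be pooled.
    void Release(std::unique_ptr<PooledThingSketch> p) {
        if (p && p->CanBePooled()) m_idle.push_back(std::move(p));
    }

    int Constructed() const { return m_constructed; }

    // Memory held by idle instances: the "bits" being spent.
    std::size_t IdleBytes() const {
        std::size_t n = 0;
        for (const auto& p : m_idle) n += p->expensiveBuffer.size();
        return n;
    }
};
```

An idle instance keeps its buffer allocated while it waits in the pool, so the pool raises memory consumption in exchange for skipping construction on the next request.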
      So, to wrap up: I, Don Box, hereby certify the following statements to be accurate and current as of the December 1997 release of MTS 2.0:

COM is an interface-based programming model that is a refinement of classic object-orientation.

COM requires developers to be deliberate about behavior.

COM is ignorant of state, leaving state management completely up to the developer.

MTS is a state-conscious programming model that is a refinement of classic COM.

MTS requires developers to be deliberate about state.

MTS provides facilities for managing state, but you have to explicitly program against these facilities.

MTS objects can maintain state.

Transactional MTS objects cannot maintain instance state across transaction boundaries. Period.

Calling SetComplete simply hastens the inevitable for a transactional MTS object.

Nontransactional MTS objects can maintain instance state forever.

The loss of instance state is required primarily to maintain transactional semantics. Any scalability gains that are directly derived from an object losing its instance state are minor.

MTS does not support object pooling today due to the thread affinity of most current (1997) COM objects. This has little impact on the MTS programming model as you now know it.

The MTS programming model discourages shared access to COM objects, preferring a model based on multiple COM objects sharing access to common state.

MTS is the future of COM.

Have a question about programming with ActiveX or COM? Send your questions via email to Don Box at dbox@develop.com or http://www.develop.com/dbox/default.asp.

From the March 1998 issue of Microsoft Systems Journal.