

MIND

Extreme C++
mindplus@microsoft.com          Download the code (55KB)
Steve Zimmerman
Scalability, State, and MTS
As the PC software development paradigm has shifted from the desktop to the Internet client/server model, application scalability is now more important than ever. Rather than designing software to maximize performance (reducing the amount of time it takes to service a single request), Internet developers must code for maximum throughput by increasing the number of requests that can be serviced in a given amount of time.

    At first, it seems as though those two design goals are identical. After all, if I reduce the time it takes to service a single request by half, haven't I also doubled my application throughput? This naïve conclusion may be true for single-user, standalone applications that handle only sequential, nonoverlapped requests, but it is rarely true of applications that support multiple simultaneous users. The reason is a lack of scalability, which is the measure of an application's resistance to performance degradation when it services multiple simultaneous requests.

    A perfectly scalable application would provide constant performance regardless of the number of simultaneous users (see Figure 1), but such applications only exist in theory because they require infinite resources. In the real world, whenever two simultaneous requests vie for a shared resource—such as memory, database access, or CPU time—one request must wait until the other is finished using that resource, resulting in degraded performance. To minimize this problem, you can do two things: increase the supply of available resources (more processors, memory, and database handles), or design the application so that each request uses shared resources as efficiently as possible. To provide the greatest degree of scalability, you must do both.

Figure 1: Scalable App

Figure 2 shows two applications that differ in scalability. When servicing requests from very few simultaneous users, application A is much more responsive than application B, but as the number of simultaneous users increases, A suffers from severe performance degradation. Like many standalone PC applications, A was probably not developed with scalability in mind. On the other hand, application B pays a performance penalty for its efficient use of resources, but is much better equipped to handle multiple simultaneous requests. As the number of concurrent users increases, B is clearly superior to A.
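The trade-off between A and B can be made concrete with a back-of-the-envelope model. The sketch below is only an illustration: it assumes each request spends part of its time holding one exclusive shared resource, and the type, function, and all of the numbers in the usage note are hypothetical rather than measurements of any real application.

```cpp
#include <algorithm>

// Hypothetical cost profile of a single request.
struct RequestProfile {
    double totalMs;   // time to service one request with no contention
    double lockedMs;  // portion of that time spent holding an exclusive resource
};

// Upper bound on throughput (requests per second) with `users` simultaneous
// requests. Either every user is kept busy (users / totalMs), or the shared
// resource is saturated (1 / lockedMs), whichever limit is reached first.
double MaxThroughputPerSec(const RequestProfile& profile, int users) {
    const double allUsersBusy = users * 1000.0 / profile.totalMs;
    const double resourceSaturated = 1000.0 / profile.lockedMs;
    return std::min(allUsersBusy, resourceSaturated);
}
```

Suppose A needs 20 ms per request but holds the resource for 10 ms of it, while B needs 50 ms but holds the resource for only 2 ms. With one user, the bound is 50 requests per second for A and 20 for B. With 50 users, A is capped at 100 requests per second while B is capped at 500, which is the crossover the figure describes.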
Figure 2: Scalability Differences

    If you're developing applications in which performance is much more important than throughput (or if you are under the misguided assumption that your application will always run on systems with an unlimited supply of resources), you may not need to concern yourself with scalability. But as the Microsoft® architecture expands from the desktop to the enterprise, more Windows® developers will face the unfamiliar challenge of having to design for maximum throughput rather than maximum performance.

    The good news is that as the demands on PC software developers have increased, so has the sophistication of the Windows architecture. With the release of Microsoft Transaction Server (part of what is now commonly referred to as Microsoft Component Services), developers can take advantage of system-provided services, layered atop COM, that make it relatively easy to develop sophisticated, highly scalable applications. Among the services provided by Microsoft Transaction Server (MTS) are connection pooling, just-in-time (JIT) activation, role-based security, transaction processing, and deployment packages (see Figure 3). As that list shows, MTS provides a number of scalability services to all COM developers, not just to those who are concerned with transactions. Several articles that discuss these features have already appeared in both Microsoft Internet Developer and Microsoft Systems Journal, so I'll take a hands-on approach here. I'll examine a minimal COM object (a handy-dandy TipOfTheDay component) and walk you through its progression from a simple in-process object to a highly scalable business rules component suitable for a three-tier Windows DNA architecture.

The TipOfTheDay Component

    If you've developed MFC-based applications using Visual C++®, you're probably familiar with the Components and Controls Gallery. As shown in Figure 4, the Gallery allows you to add two different types of pre-built components to your application: source code components and binary components.
Figure 4: Components and Controls Gallery

Strictly speaking, source code components aren't really components at all. They are simply bits of prefabricated MFC code that get added to your Visual C++ project using what amounts to an intelligent copy-and-paste engine. (Clipboard reuse, as a friend of mine would say.) I'm not saying that prefabricated source code is necessarily a bad thing, but the Gallery code is generally only beneficial if you're building by-the-book standalone MFC applications.

    The available binary components are simply the ActiveX® controls registered on your machine. If you tell the Gallery to insert an ActiveX control, Visual C++ will add a wrapper class to your project that can be used to create and manipulate that control as if it were a CWnd-derived C++ object. The benefit of ActiveX controls is obvious: you can reuse discrete chunks of user interface functionality written in any language that supports COM. The downside is that ActiveX controls cannot exist as self-sufficient entities. They must be loaded into the process space of (and hence on the same machine as) the client application that creates them. As a result, sharing the functionality of an ActiveX control with 100 users typically means installing that control on 100 different machines.

    One of my favorite Gallery components is called the Tip of the Day. It adds the necessary C++ code and resources (a dialog box and an icon) to your project so that the user can cycle through a set of tips displayed when the application starts up (see Figure 5). The tip component reads the tips one by one from a text file that typically resides in the same folder as the application. You can provide your own tips by adding entries to the text file.
Figure 5: Tip of the Day

Although this component is handy, it has several limitations. First, the Tip of the Day component can only be used in MFC apps using Visual C++. Developers working in the Java language, Visual Basic®, or ASP are simply out of luck. Second, the Gallery-generated tip code is pasted directly into the application rather than being linked at runtime as a separate binary module, so the tip code cannot be updated or enhanced without rebuilding and redistributing the entire application.

    Converting the component into an ActiveX control would easily solve these problems, but would expose yet another subtle problem. The tip component tightly couples two orthogonal pieces of functionality—how the tips are generated (the business logic) and how they're displayed on the screen (the user interface). Changes to the underlying structure of the tip data or the way the tips are drawn would require the component to be reinstalled on every client machine. Even something as innocuous as adding new tips would require massive redistribution of an updated tip text file.

    For large-scale applications that service the requests of many simultaneous users, it would make more sense to deploy a logic-only Tip of the Day COM component on a remote machine and then let clients on other machines make requests whenever they need to display a tip. Rather than having the component dictate the user interface, each client would be responsible for rendering the tip text in whatever manner was appropriate for that application.

    The simplified IDL for such a component might look like the code shown in Figure 6. As you can see, this version of the TipOfTheDay component simply serves up tips in BSTR format (perhaps dictating the format of the tips to some degree using HTML or another markup language). Each client application is responsible for performing its own rendering. Tips are maintained and warehoused by the server-side component using whatever manner is deemed appropriate: hardcoded into the component, stored in a local file, or even stored in a remote database. Regardless, the storage mechanism is completely transparent to the client.
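To make the division of labor concrete without depending on the COM headers, here is a minimal sketch in portable C++. It is not the IDL from Figure 6. The names ITipSource, InMemoryTipSource, and RenderAsHtml are hypothetical, std::string stands in for BSTR, and a plain long stands in for the position cookie.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical analogue of the tip contract: the client sees only text and
// an opaque position cookie, never the storage mechanism.
class ITipSource {
public:
    virtual ~ITipSource() = default;
    // Returns the tip at `cookie` and advances `cookie` to the next tip.
    virtual std::string GetNextTip(long& cookie) const = 0;
};

// One possible backing store: tips held in memory. A file-backed or
// database-backed class could implement the same interface.
class InMemoryTipSource : public ITipSource {
public:
    explicit InMemoryTipSource(std::vector<std::string> tips)
        : tips_(std::move(tips)) {}

    std::string GetNextTip(long& cookie) const override {
        if (tips_.empty())
            return std::string();
        const long count = static_cast<long>(tips_.size());
        // Normalize the cookie so stale or negative values still map to a tip.
        const long index = ((cookie % count) + count) % count;
        cookie = (index + 1) % count;
        return tips_[static_cast<std::size_t>(index)];
    }

private:
    std::vector<std::string> tips_;
};

// Rendering is the client's decision; this client wraps the tip in HTML.
std::string RenderAsHtml(const ITipSource& source, long& cookie) {
    return "<p>" + source.GetNextTip(cookie) + "</p>";
}
```

Because RenderAsHtml depends only on the interface, swapping the in-memory store for another one would not require any change on the client side, which is the transparency the paragraph above describes.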

Efficient Use of Resources

    Using COM, clients create an instance of the TipOfTheDay component and repeatedly call the ITipOfTheDay::GetNextTip method as necessary to cycle through the tips. An MFC-based client might use the TipOfTheDay component as shown in Figure 7. If you examine the sample client code closely, you'll see that it was written primarily with performance in mind rather than throughput. The CTipDlg code creates a single instance of the TipOfTheDay object that it keeps alive during the entire lifetime of the dialog box. This approach makes sense if the TipOfTheDay component resides in an in-process server on the same machine as the client application. After all, why create and destroy a new TipOfTheDay object every time you want to get the next tip? However, keeping the component alive in between calls to GetNextTip may result in inefficient use of resources in a distributed environment.

    Consider this ATL-based snippet from one possible implementation of the TipOfTheDay coclass:

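 HRESULT CTipOfTheDay::FinalConstruct()
 {
     // Hold open a database connection throughout the
     //  lifetime of this component

     CDataSource connection;
     // "TipOfTheDay" is the ODBC data source name; both strings use
     //  _T() so the call compiles in ANSI and Unicode builds alike
     HRESULT hr = connection.Open(_T("MSDASQL"), _T("TipOfTheDay"));
     if (FAILED(hr))
         return hr;
     // Open a session on the connection and cache it in a member
     return m_session.Open(connection);
 }

 void CTipOfTheDay::FinalRelease()
 {
      // Release the cached session when the component is destroyed
      m_session.Close();
 }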

Rather than storing the tip data in memory or in a flat file, this implementation uses an ODBC data source as the storage medium and the OLE DB consumer templates (a feature of Visual C++ 6.0) to retrieve the data. The result is a simple incarnation of the proverbial three-tier architecture, with separate logical boundaries for user interface, business logic, and data.

    The TipOfTheDay component lives in the business logic tier and serves simply as an iterator, determining the order in which the tips will be delivered to the client. The component knows nothing about the underlying structure of the tip data. This architecture is scalable and robust because it makes use of the synchronization and scalability features provided by database management systems (DBMSs) such as SQL Server™, allowing administrators to add, modify, or delete tips in the database without disrupting concurrent access from clients. To keep things simple, the code for this component (available from the link at the top of this article) uses a Microsoft Access database (see Figure 8 for configuration information), but the same data source name could easily be reconfigured to use a SQL Server or Oracle database without requiring component recompilation.

Figure 8: Configuring TipOfTheDay

As with the MFC-based client code described earlier, the TipOfTheDay implementation shown previously prefers performance to throughput. As part of its construction sequence, the component opens and caches a connection to the database so that it doesn't have to suffer the overhead of reestablishing the database connection each time its GetNextTip method is called. As mentioned previously, this type of optimization makes perfect sense in the standalone application model, but is inefficient when the component must service tip requests from many simultaneous users. Although the round-trip time required to service the GetNextTip request may be considerable, it is likely insignificant compared to the time it takes the user to actually digest the information contained in the tip. Thus, while the user leisurely peruses one tip after another, the TipOfTheDay component holds a largely unused—but very expensive—database connection. Not a very efficient use of resources at all.

    To help overcome these inefficiencies, you can alter the implementation of both the client and server so they use resources more effectively. The client code—knowing nothing about how the server manages its database connection—could create an instance of the TipOfTheDay component immediately before calling GetNextTip and call Release immediately thereafter. The server—knowing nothing about how long the client will hold it in memory—could open and close the database connection inside the body of the GetNextTip method rather than caching the connection at create time.
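The two strategies can be compared side by side in a small sketch. This is an illustration only, written in portable C++ rather than against ATL or OLE DB: FakeConnection is a hypothetical stand-in that merely counts open connections, and the server and client names are invented for the example.

```cpp
#include <string>

// Hypothetical stand-in for an expensive database connection. It only counts
// connections so the effect of each strategy can be observed.
class FakeConnection {
public:
    FakeConnection() { ++OpenCount(); ++TotalOpened(); }
    ~FakeConnection() { --OpenCount(); }
    FakeConnection(const FakeConnection&) = delete;
    FakeConnection& operator=(const FakeConnection&) = delete;

    std::string FetchTip(long index) const {
        return "Tip #" + std::to_string(index);
    }

    static int& OpenCount()   { static int n = 0; return n; }  // open right now
    static int& TotalOpened() { static int n = 0; return n; }  // opened so far
};

// Performance first: the connection lives as long as the component does.
class CachingTipServer {
public:
    std::string GetNextTip(long& cookie) { return connection_.FetchTip(cookie++); }
private:
    FakeConnection connection_;
};

// Throughput first: the connection exists only for the duration of the call.
class FrugalTipServer {
public:
    std::string GetNextTip(long& cookie) {
        FakeConnection connection;             // opened on entry
        return connection.FetchTip(cookie++);
    }                                          // closed on return
};

// Client counterpart: create the component just before the call and let it
// go right after, keeping only the cookie between requests.
std::string FetchOneTip(long& cookie) {
    FrugalTipServer server;
    return server.GetNextTip(cookie);
}
```

While a user reads a tip, the frugal version holds no connection at all, whereas every live CachingTipServer pins one. The price shows up in TotalOpened, which grows by one with every request.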

    With these changes, the application would use resources more effectively, but would suffer the penalty of having to regenerate the component and its associated database connection each time a tip was requested. Throughput would improve at the expense of performance. To minimize the impact of repeatedly opening and closing database connections—especially in large-scale, large-budget applications—you might be inclined to devise some sort of homegrown best-of-both-worlds scheme whereby the component maintains a small pool of database connections to satisfy client requests. The pooled connections would need to be reinitialized between uses, but not reopened, thereby reducing the impact of the hurry-in, hurry-out resource usage.
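A homegrown pool of the kind described above might look roughly like the following. It is a minimal sketch under stated assumptions, not the pooling that MTS or ODBC performs: PooledConnection and ConnectionPool are hypothetical names, and the connection is a placeholder whose only state is a counter.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive connection.
struct PooledConnection {
    int queriesSinceReset = 0;
    void Reset() { queriesSinceReset = 0; }  // reinitialize without reopening
};

class ConnectionPool {
public:
    explicit ConnectionPool(std::size_t maxIdle) : maxIdle_(maxIdle) {}

    // Hands out an idle connection if there is one, otherwise opens a new one.
    std::unique_ptr<PooledConnection> Acquire() {
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.empty()) {
            ++opened_;
            return std::unique_ptr<PooledConnection>(new PooledConnection());
        }
        std::unique_ptr<PooledConnection> conn = std::move(idle_.back());
        idle_.pop_back();
        return conn;
    }

    // Takes a connection back. It is reset and kept unless the pool is full,
    // in which case it is destroyed (closed) when this call returns.
    void Release(std::unique_ptr<PooledConnection> conn) {
        if (!conn)
            return;
        conn->Reset();
        std::lock_guard<std::mutex> lock(mutex_);
        if (idle_.size() < maxIdle_)
            idle_.push_back(std::move(conn));
    }

    // Number of connections that had to be opened from scratch.
    int OpenedCount() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return opened_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<PooledConnection>> idle_;
    std::size_t maxIdle_;
    int opened_ = 0;
};
```

A request would call Acquire at the top of GetNextTip and Release at the bottom, so the expensive open happens only when no idle connection is available. The mutex is there because requests from simultaneous users may reach the pool at the same time.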

To JIT or Not to JIT

    As described in Figure 3, MTS provides two features (JIT activation and connection pooling) that reduce the development effort required to make the throughput optimizations described previously. When an in-process component is added to an MTS package (see Figure 9), its registry entries are modified so that component instances are created in the process space of an mtx.exe surrogate process rather than being loaded directly into the address space of the client application. This allows an in-process component to be activated on a remote machine, of course, but it also allows MTS to seamlessly manage tedious infrastructure details on behalf of the component by placing it in a context wrapper. To enable connection pooling, you simply need to add your component to an MTS package.
Figure 9: Adding an In-process Component

To take advantage of JIT activation, you must do two things: develop your component so that it preserves and restores its state at specific points during execution, and call the SetComplete member of the MTS-provided IObjectContext interface at those points, as shown here:

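 STDMETHODIMP CTipOfTheDay::GetNextTip(
     /* [in, out] */ VARIANT *pvCookie, BSTR *pbstrText)
 {
     // Retrieve tip from database using cached connection

     •••

     // Tell MTS the work is done so it can deactivate the
     //  component and release its resources

     IObjectContext* pObjContext = NULL;
     HRESULT hr = ::GetObjectContext(&pObjContext);
     if (FAILED(hr) || pObjContext == NULL)
         return FAILED(hr) ? hr : E_UNEXPECTED;

     hr = pObjContext->SetComplete();
     pObjContext->Release();  // balance the reference from GetObjectContext
     return hr;
 }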

Calling SetComplete causes the TipOfTheDay component to be released from beneath its context wrapper. The next time GetNextTip is called, MTS creates a new instance of the object, maintaining what appears to the client to be a continuous reference to a single object.

    Since the TipOfTheDay component is essentially a simple iterator, preserving its state is as simple as storing the index of the next tip to be displayed. Users expect robust applications to remember the current tip state from run to run, so it makes sense to have each client store its own tip index (the state of the component, in other words) and pass it to the GetNextTip method as an [in, out] cookie. Since the TipOfTheDay component is stateless—or, more accurately, state-estranged—using JIT activation as the way to release the database connection is probably justifiable. It is important to point out, though, that the primary reason for calling SetComplete is to complete a transaction, not simply to invoke object deactivation.
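The reason a client-held cookie makes deactivation painless can be shown with a toy model. The sketch below only simulates the idea in portable C++ and does not use MTS: StatelessTips and JitWrapper are hypothetical names, the wrapper merely plays the role of the context wrapper, and the tip strings are placeholder data.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stateless tip object: everything it needs arrives with the call.
class StatelessTips {
public:
    StatelessTips() { ++Activations(); }

    std::string GetNextTip(long& cookie) const {
        static const std::vector<std::string> kTips = {
            "Tip A", "Tip B", "Tip C"};        // placeholder data
        const long count = static_cast<long>(kTips.size());
        const long index = ((cookie % count) + count) % count;
        cookie = (index + 1) % count;          // the caller keeps the new state
        return kTips[static_cast<std::size_t>(index)];
    }

    // Counts how many instances have been created so far.
    static int& Activations() { static int n = 0; return n; }
};

// Plays the role of the context wrapper: the client holds one long-lived
// reference to it, while the real object exists only during each call.
class JitWrapper {
public:
    std::string GetNextTip(long& cookie) const {
        StatelessTips instance;                // activated just in time
        return instance.GetNextTip(cookie);
    }                                          // deactivated on return
};
```

Four calls through one JitWrapper create four short-lived StatelessTips instances, yet the client still receives Tip A, Tip B, Tip C, and then Tip A again, because the only state that matters travels in the cookie.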

    If your component does not participate in transactions and is complex enough that preserving its state is a difficult task, you'd be better served by making sure your code uses expensive resources effectively than by trying to make it stateless just so you can call SetComplete. Figure 10 shows an implementation of the TipOfTheDay component that uses the manual approach. Notice the absence of a call to SetComplete.

Tip of the Web

    If you looked closely at the IDL for the TipOfTheDay component in Figure 6, you may have noticed that the ITipOfTheDay interface is marked as [dual], meaning that it's an Automation-compliant interface derived from IDispatch. Because the component exposes IDispatch, it can be used from within Active Server Pages (ASP) and Microsoft Internet Information Server (IIS). Rather than having to create a remote instance of the TipOfTheDay component and deal with the security issues of Distributed COM over the Internet, Web-based users can simply point their browser to an ASP page that invokes the TipOfTheDay component and passes the tip text back as HTML:


 <%@ LANGUAGE="VBSCRIPT" %>
 <%
 If Request.Cookies("TipOfTheDay") = "" Then
     Response.Cookies("TipOfTheDay") = "0"
 End If
 set tipOfDay = Server.CreateObject("TipServer.TipOfTheDay")
 lCookie = Request.Cookies("TipOfTheDay")
 tipText = tipOfDay.GetNextTip(lCookie)
 Response.Cookies("TipOfTheDay") = lCookie
 Response.Cookies("TipOfTheDay").Expires = #December 31, 1999#
 Response.Expires = 0
 %>
 <HTML>
 <BODY>
 <P>The tip for the day is:</P>
 <P><FONT color=blue>
 <% Response.Write tipText & "<br>"%>
 </FONT></P>
 <P>(Click refresh to see the next tip).</P>
 </BODY>
 </HTML>

Using the browser to store a cookie on the client machine, the ASP page ensures that the user will see the next tip in the list each time he or she returns to that page. Now that's a thin client!

Conclusion

    Although system-level services like MTS (and those that will surely follow in the highly anticipated release of COM+) make it easier to distribute, manage, and scale your applications, distributed development is still a challenge. There's no silver bullet that will make poorly designed components scale well, so the responsibility of scalability still rests squarely on your shoulders. The good news is that if you learn to craft your components with care—making sure to separate business logic from the user interface and use resources efficiently, among other things—the future looks mighty bright, indeed.


From the March 1999 issue of Microsoft Internet Developer.