Microsoft Corporation
January 18, 1995
The momentum toward client/server computing is clear: more and more enterprises of all sizes are adopting the client/server model as the basis for their data-processing and data-management solutions. Microsoft® Windows NT® Server has become a critical component for a growing number of these enterprises because its true 32-bit architecture, portability, scalability, and performance provide the best possible platform for client/server solutions. Independent software vendors (ISVs) targeting this emerging client/server environment will want to ensure that their server-based applications perform well when running on Windows NT Server. This paper highlights some of the considerations involved in making this a reality.
A server-based application running on Windows NT is most likely to perform well if it:
The remainder of this document discusses each of these points in detail.
First and foremost, a well-designed server application running on Windows NT Server is a well-designed Win32 application. The Win32 API provides a powerful set of tools that allows an application to take advantage of all the advanced features of Windows NT. Particularly important among these features for an application are:
The following sections describe these features in greater detail.
Server applications require a high degree of reliability. One of the more difficult aspects of making an application reliable is dealing with unexpected errors that divert flow of control within a program. The Win32 API provides two mechanisms for managing such errors: structured exception handling and termination handling. Structured exception handling enables an application to divert the flow of control to appropriate exception handlers. Termination handling enables an application to release memory and perform other cleanup tasks whenever program control, for whatever reason, leaves a guarded block of code.
Because structured exception and termination handling allow programmers greater control over how a program executes when an unexpected error occurs, they significantly improve an application's reliability and help ensure that other applications running on the server are not adversely affected by an application that encounters such errors. In addition, structured exception and termination handling provide a mechanism through which an application can notify users of unforeseen circumstances that require their intervention. In extreme cases, users will have the opportunity to terminate an application or even shut down a server in a controlled way that minimizes possible negative consequences of the error.
Structured exception handling also enhances the reliability of applications by aiding in the debugging process. When an exception occurs during a debugging session, a sophisticated debugger designed to take advantage of structured exception handling can give the programmer access to information about the state of the affected thread and the nature of the exception. The programmer can then manipulate the environment of the process before the exception handler is executed. Alternatively, the debugger can instruct the system to continue the program execution without calling the exception handler.
Although exploiting structured exception and termination handling requires extra effort and adds complexity to an application program, this cost is more than offset by the considerable enhancement to the program's overall reliability through improved debugging and run-time error handling.
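The division of labor between the two mechanisms can be sketched in a few lines. The example below is not Win32 code: it uses Python's try/except/finally as a portable stand-in for the exception-handler and termination-handler constructs that Win32 compilers provide, and the names serve_request, dispatch, and log are hypothetical.

```python
log = []

def serve_request(divisor):
    """Hypothetical request handler whose work sits inside a guarded block."""
    log.append("acquire buffer")
    try:
        # Guarded block: an unexpected error here diverts the flow of control.
        return 100 // divisor
    finally:
        # Termination handler: runs however control leaves the guarded block.
        log.append("release buffer")

def dispatch(divisor):
    """Exception handler: turns an unexpected error into a reported failure."""
    try:
        return serve_request(divisor)
    except ZeroDivisionError as exc:
        log.append(f"reported: {type(exc).__name__}")
        return None
```

The cleanup step runs on both the normal and the failing path, so the hypothetical buffer is released even when the request fails, and the caller decides how the failure is reported.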
As markets expand globally, adapting an application for the variety of national languages that the application must support becomes more and more difficult. The Win32 API makes this task significantly easier by supporting Unicode, a global character-encoding standard that is capable of representing every character in modern computer use, including technical symbols and special characters used in publishing.
Because each character is represented with 16 bits (as compared with the 8 bits of the ANSI standard), Unicode can define up to 65,536 characters. In addition, the Unicode standard defines semantics for each character, standardizes script behavior, provides a standard algorithm for bi-directional text, and defines cross-mappings to other standards.
The Win32 API provides sets of functions that use either Unicode or the ANSI character set. Applications can use functions from both sets, as desired. A series of macros and naming conventions makes it relatively easy to migrate an application to Unicode, or even to compile both non-Unicode and Unicode versions of an application from a single set of sources. Win32 assigns Unicode strings a specific data type, allowing compilers to perform type checking for functions that require Unicode strings as parameters.
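The size difference between the two encodings can be illustrated with a short sketch. It assumes Python's cp1252 codec as a stand-in for an 8-bit "ANSI" code page and utf-16-le as the 16-bit Unicode form; the function name encoded_sizes is hypothetical, and the sample strings contain only characters that fit in a single 16-bit value.

```python
def encoded_sizes(text):
    """Return (ansi_bytes, unicode_bytes) for text.

    ansi_bytes is None when the 8-bit code page cannot represent the text.
    """
    wide = text.encode("utf-16-le")      # one 16-bit unit per character here
    try:
        narrow = text.encode("cp1252")   # one byte per character
    except UnicodeEncodeError:
        return None, len(wide)
    return len(narrow), len(wide)
```

Western European text fits in either encoding, at twice the storage cost for the 16-bit form, whereas text such as Japanese cannot be represented in this 8-bit code page at all.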
In the Win32 API, a service is an executable object about which information is installed in a registry database maintained by the service control manager. Included in this database is information that determines whether each installed service is started on demand or automatically when the system starts. The database can also contain logon and security information for a service, allowing it to run even when no user is logged on. System administrators can customize the security requirements for a particular service. Implementing a Windows NT server application as a service provides a number of benefits that derive from the architecture of Windows NT services.
A server application running as a Windows NT service can do so using a unique service account for remote administration and control. For example, if a database application and a host connectivity application are both running on the same Windows NT Server system, these applications can run as services under separate service accounts. This helps ensure that only designated host connectivity administrators can administer the host connectivity server application, and that only database administrators can administer database server applications. An additional benefit of this configuration is that it makes it possible for an administrator to control a group of services that work under a common user account.
Running as a Windows NT service helps a server application impersonate a client while accessing objects on behalf of the client. This capability ensures that the server application can act on behalf of a client without requiring the server application to run with an inappropriately high privilege level. It also ensures that the server application will not be able to perform actions that would be denied the client directly.
Services can be configured to start automatically, either when the Windows NT Server itself starts, or when the service is started by a dependent service that starts automatically. In either case, the service starts without human intervention, that is, without a user being required to log on and then explicitly start the service. By starting automatically, a server application implemented as a service is guaranteed to be available whenever needed so long as the Windows NT Server system is running. Moreover, related services can be started easily and, where appropriate, automatically.
Finally, implementing a server application as a service allows the application to be installed and controlled using standard user and Win32 programming interfaces. Such a service can be started and stopped both locally and remotely, providing network administrators an easy and consistent way to control the service across the network.
If a server application is implemented as a collection of services, the application should use remote procedure calls (or similar remote mechanisms) between services so that each service can run on a separate computer. Distributing function in this way provides users with the benefits of greater capacity and scalability.
Perhaps no other feature of the Win32 API contributes more to sheer performance than its support for the preemptive multitasking of execution threads. Because threads require a smaller share of system resources than do processes, server applications that are implemented using multiple processes in other environments will benefit from being restructured to use threads instead. Multiple threads are especially beneficial to server applications in the following cases:
A number of threading models can be employed in a server application's design. The particular model used depends on the nature of the application, including its scaling requirements. The following paragraphs discuss the five most common of these models, including their suitability for particular applications.
Single thread, single client at a time. This is the simplest of all threading models and is therefore the easiest to implement. In this model, the server application has a single loop in which it accepts a single incoming client, which it immediately services. Because the server cannot accept a new client until the previous client is serviced and released, this model is inappropriate for all but the most basic services. Moreover, server applications based on this model cannot take advantage of multiprocessor systems.
Single thread, multiple clients. This model is only slightly more complex to implement than the single thread/single client model. The server application must be able to maintain and select among multiple connections within a single thread. For example, the Windows Sockets API supplies a set of functions (centered on the select function) that provide this capability. Although this model supports a powerful service, performance can suffer because every network I/O call must pass through the select function, making the service CPU-intensive. As with the previous model, server applications based on this model cannot take advantage of multiprocessor systems.
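A minimal sketch of this model follows. It uses Python's select module, which wraps the sockets select call, and simulates two connected clients with in-process socket pairs rather than real network connections; the function names serve and demo are hypothetical.

```python
import select
import socket

def serve(connections, expected, timeout=1.0):
    """Service ready connections from one thread until `expected` requests are handled."""
    active = list(connections)
    handled = 0
    while active and handled < expected:
        # Every wait for network input passes through this single select call.
        readable, _, _ = select.select(active, [], [], timeout)
        if not readable:
            break  # nothing arrived within the timeout
        for conn in readable:
            data = conn.recv(1024)
            if not data:
                active.remove(conn)  # peer closed the connection
                continue
            conn.sendall(data.upper())
            handled += 1
    return handled

def demo():
    """Simulate two connected clients with in-process socket pairs."""
    client_a, server_a = socket.socketpair()
    client_b, server_b = socket.socketpair()
    try:
        client_a.sendall(b"hello")
        client_b.sendall(b"world")
        handled = serve([server_a, server_b], expected=2)
        return handled, client_a.recv(1024), client_b.recv(1024)
    finally:
        for sock in (client_a, server_a, client_b, server_b):
            sock.close()
```

Both clients are serviced by the one thread, but each is handled in turn, which is why this model gains nothing from additional processors.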
One thread per client. Probably the most commonly used, this model is also the fastest one for server applications that service fewer than 16 clients. In this model, the server application executes a single loop that accepts incoming connections and then creates a thread for each connection. This model is relatively easy to implement, especially when porting a server application previously implemented as a UNIX® daemon. A drawback of the model is that it does not scale well to large numbers of clients: larger numbers of threads impose greater demands on system services, and the context switching required for the operating system to execute each thread can impose a significant penalty on CPU usage.
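The accept loop at the heart of this model can be sketched as follows. The example is a portable Python illustration rather than Win32 code; it listens on the loopback address, services a fixed number of clients so that it terminates, and uses the hypothetical names handle_client, accept_loop, and demo.

```python
import socket
import threading

def handle_client(conn):
    """Service one client on its own thread, then close the connection."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data[::-1])  # trivial "service": reverse the bytes

def accept_loop(listener, max_clients):
    """Accept connections and create one thread per connection."""
    threads = []
    for _ in range(max_clients):
        conn, _addr = listener.accept()
        thread = threading.Thread(target=handle_client, args=(conn,))
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()

def demo():
    """Run the server on a loopback port and connect two clients to it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the system pick a free port
    listener.listen()
    port = listener.getsockname()[1]
    server = threading.Thread(target=accept_loop, args=(listener, 2))
    server.start()
    replies = []
    for message in (b"abc", b"xyz"):
        with socket.create_connection(("127.0.0.1", port)) as client:
            client.sendall(message)
            replies.append(client.recv(1024))
    server.join()
    listener.close()
    return replies
```

The simplicity is visible in the code: the only shared structure is the listening socket, and each handler is written as if it had the server to itself.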
Worker threads with synchronous I/O. Although this model is more complex and less efficient than the one-thread-per-client model when handling a small number of clients, it is more efficient when managing a large number of connections. In this model, one thread executes a loop that accepts and monitors connections. When this thread determines that a connection needs to be serviced, it dispatches the task to a worker thread that actually performs the requested service.
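One way to picture the dispatch step is the sketch below, in which a monitoring loop hands ready connections to a small pool of workers through a queue. This is an illustration under stated assumptions, not a prescribed design: the queue-based hand-off, the pool size, and the names worker, run_server, and demo are all hypothetical, and the clients are simulated with in-process socket pairs.

```python
import queue
import select
import socket
import threading

def worker(tasks):
    """Worker thread: take a ready connection and service it synchronously."""
    while True:
        conn = tasks.get()
        if conn is None:  # sentinel: no more work
            break
        data = conn.recv(1024)  # blocking call, but data is already waiting
        conn.sendall(data.upper())

def run_server(connections, worker_count=2):
    """Monitor connections on one thread; dispatch ready ones to workers."""
    tasks = queue.Queue()
    workers = [threading.Thread(target=worker, args=(tasks,))
               for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    pending = list(connections)
    while pending:
        readable, _, _ = select.select(pending, [], [], 1.0)
        if not readable:
            break
        for conn in readable:
            pending.remove(conn)  # stop monitoring while a worker owns it
            tasks.put(conn)
    for _ in workers:
        tasks.put(None)
    for thread in workers:
        thread.join()

def demo(messages):
    """Simulate one client per message and return the replies in order."""
    pairs = [socket.socketpair() for _ in messages]
    try:
        for (client, _server), message in zip(pairs, messages):
            client.sendall(message)
        run_server([server for _client, server in pairs])
        return [client.recv(1024) for client, _server in pairs]
    finally:
        for client, server in pairs:
            client.close()
            server.close()
```

The number of threads is fixed by the pool size rather than by the number of clients, which is what allows this model to handle many connections without the per-thread overhead of the previous one.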
Worker threads with asynchronous I/O. This model is the most powerful and, not surprisingly, the most complex. The key element of this model is that socket handles are native Windows NT file handles. As a result, the server application can use the Win32 functions ReadFile and WriteFile to receive and send data asynchronously on the socket, as though it were a disk file being accessed in overlapped mode. In overlapped mode, a server application can initiate multiple I/O requests without waiting for previous requests to complete, thereby enabling it to service multiple clients asynchronously using a single thread. This capability makes this model the best for supporting a very large number of connected clients without sacrificing performance when only a few clients are connected. This efficiency is possible because a server application can increase or decrease the number of threads servicing clients as needed. Using an I/O completion port (created by the CreateIoCompletionPort function) as a synchronizing mechanism can simplify the process of creating and allocating connections among threads.
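The idea of initiating several I/O requests before any of them completes can be illustrated, loosely, with Python's asyncio library. This is an analogy rather than overlapped ReadFile and WriteFile calls: the event loop hides the completion mechanism, the clients are simulated with in-process socket pairs, and the names service, main, and serve_all are hypothetical.

```python
import asyncio
import socket

async def service(conn):
    """Start a receive, yield the thread until it completes, then reply."""
    loop = asyncio.get_running_loop()
    data = await loop.sock_recv(conn, 1024)
    await loop.sock_sendall(conn, data.upper())

async def main(messages):
    loop = asyncio.get_running_loop()
    pairs = [socket.socketpair() for _ in messages]
    for client, server in pairs:
        client.setblocking(False)
        server.setblocking(False)
    try:
        # All receives are outstanding at once on a single thread.
        tasks = [asyncio.create_task(service(server))
                 for _client, server in pairs]
        for (client, _server), message in zip(pairs, messages):
            await loop.sock_sendall(client, message)
        await asyncio.gather(*tasks)
        return [await loop.sock_recv(client, 1024)
                for client, _server in pairs]
    finally:
        for client, server in pairs:
            client.close()
            server.close()

def serve_all(messages):
    """Run the event loop until every simulated client has its reply."""
    return asyncio.run(main(messages))
```

For example, serve_all([b"one", b"two", b"three"]) services three simulated clients without dedicating a thread to any of them.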
In summary, the actual model to be followed when developing an application for Windows NT Server depends on the nature of the service being provided, the anticipated number of connections, and the expected number of processors on the computer running the service.
Most server applications are judged primarily by one criterion: how well they perform. A rule of thumb is that a server application's benchmark results when running on Windows NT should be within five percent of its results when running on another operating system, using comparable hardware.
Of course, how that performance is measured depends on the particular services provided by the server application, as well as the anticipated environment in which both server and client applications will run. For this reason, it is impossible to give definitive guidelines for improving the performance of a server application. However, Windows NT supports a variety of tools that can be used to monitor the performance of a server application to determine where improvements can be made. This section describes some of these tools and suggests how they can be used to tune a server application.
A critical factor in determining a server application's performance is its ability to run well on a multiprocessor system. The ability of Windows NT to run on symmetric multiprocessor (SMP) systems provides a major performance benefit to server applications that are properly designed to take advantage of this feature.
An obvious prerequisite for exploiting a multiprocessor system is support for multiple threads, which Windows NT can run on separate processors. Successful scalability, however, requires more than supporting multiple threads. An efficiently scalable application must support at least one thread for each processor in the system on which it is running. Beyond that, however, the application must also control the number of threads it creates to avoid burdening the system with the overhead required to maintain each thread.
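A sizing policy of this kind might look like the sketch below. The multiplier and the upper bound are arbitrary assumptions chosen for illustration, not recommendations from this paper, and the names worker_count and process_all are hypothetical. The example shows only how the thread count is derived from the processor count; it makes no claim about how a particular Python run time schedules the threads.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def worker_count(threads_per_cpu=2, cap=64):
    """At least one thread per processor, bounded above to limit overhead."""
    cpus = os.cpu_count() or 1
    return max(cpus, min(cpus * threads_per_cpu, cap))

def process_all(requests):
    """Service requests on a pool sized from the processor count."""
    with ThreadPoolExecutor(max_workers=worker_count()) as pool:
        return list(pool.map(lambda n: n * n, requests))
```

Deriving the pool size at run time lets the same binary use a single-processor system and a large SMP system sensibly without reconfiguration.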
An additional factor that determines how effectively a server application scales to SMP systems is the method used to synchronize threads. The Win32 API provides a wide assortment of synchronization objects. Mutexes, events, and semaphores provided by the Windows NT kernel, for example, provide a very powerful and flexible method for synchronization, but because accessing these objects requires a kernel call, there is some overhead involved. When the synchronization period is very short and very frequent, the overhead (and any resulting context switches) can be much greater than the synchronization period itself. When speed is more important than flexibility, a server application can utilize critical sections to synchronize threads within a single process.
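The kind of short, frequent synchronization described here is typified by a shared counter. The sketch below uses Python's threading.Lock in the role that a critical section plays within a single process; Python does not expose the distinction between kernel objects and critical sections, so the example shows only the pattern, and the names HitCounter and hammer are hypothetical.

```python
import threading

class HitCounter:
    """Counter shared by several threads within one process."""

    def __init__(self):
        self._lock = threading.Lock()
        self.value = 0

    def increment(self):
        with self._lock:  # held only for the duration of one update
            self.value += 1

def hammer(thread_count=4, increments=10_000):
    """Increment one counter from several threads and return the total."""
    counter = HitCounter()

    def run():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=run) for _ in range(thread_count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value
```

Because the protected region is a single update, the cost of entering and leaving the lock dominates, which is exactly the situation in which a lightweight in-process mechanism pays off.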
Improving the performance of any application requires the programmer to observe its behavior in situations like those in which its customers will actually use it. Microsoft provides a wide variety of tools to help in this process. These tools include the following:
Perhaps the most important tuning tool, Performance Monitor provides a starting point for a programmer seeking to improve the performance of an application. Performance Monitor provides a graphical view of a broad range of performance metrics, including such items as processor usage and network I/O. More importantly, an application can add and maintain its own Performance Monitor counters that reveal how it is performing certain specified tasks. These counters will make it easier to tune the application, both while it is under development and when it is in actual productive use.
Using Performance Monitor, the application programmer can locate problem areas and then use other, more specific tools to locate and correct the cause of the problem.
The Call Attributed Profiler (CAP) supplied with the Win32 Software Development Kit (SDK) for Windows NT version 3.51 shows how internal function calls occur within an application. Unlike previous profilers, CAP is not a sampling profiler. Instead, CAP records how long a function takes to execute and how it spends that time. If a function calls other functions, CAP presents its results both excluding and including time spent in those functions. CAP creates a separate call tree for each thread, making it easy to see the flow of control within a particular thread. A significant drawback of CAP is that it requires the application to be recompiled with the -Gh and -Zd options and linked with the CAP.LIB library to place a special call before each function.
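The difference between time that includes called functions and time that excludes them can be shown with a toy profiler. The sketch below is not CAP and does not reproduce its output; it is a hypothetical Python decorator driven by a simulated clock so that the figures are deterministic.

```python
class CallProfiler:
    """Toy call-attributed profiler: records inclusive and exclusive time."""

    def __init__(self, clock):
        self._clock = clock
        self._child_time = []  # one accumulator per active call
        self.inclusive = {}
        self.exclusive = {}

    def profile(self, func):
        def wrapper(*args, **kwargs):
            start = self._clock()
            self._child_time.append(0)
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = self._clock() - start
                in_children = self._child_time.pop()
                if self._child_time:
                    # Charge this call's full time to its caller's children.
                    self._child_time[-1] += elapsed
                name = func.__name__
                self.inclusive[name] = self.inclusive.get(name, 0) + elapsed
                self.exclusive[name] = (self.exclusive.get(name, 0)
                                        + elapsed - in_children)
        return wrapper

def demo():
    """Profile root(), which spends 3 ticks itself and 3 ticks in leaf()."""
    now = [0]  # simulated clock, advanced explicitly
    profiler = CallProfiler(lambda: now[0])

    @profiler.profile
    def leaf():
        now[0] += 3

    @profiler.profile
    def root():
        now[0] += 2
        leaf()
        now[0] += 1

    root()
    return profiler.inclusive, profiler.exclusive
```

In this run, root is charged 6 ticks including its callee and 3 ticks excluding it, which is the distinction that lets a programmer tell whether a slow function is slow itself or merely calls something slow.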
The Win32 API Profiler (WAP) supplied with the Win32 SDK for Windows NT version 3.51 is similar to CAP, but it profiles calls to functions in the six main Win32 dynamic-link libraries (DLLs). WAP creates separate text files for each DLL. These files contain the following information for each API that WAP profiles:
The Win32 API Logger provided as part of the Win32 SDK for Windows NT version 3.51 records each Win32 API call, its parameters, and its return value. Because the data is recorded as the API calls occur, using the Win32 API Logger is very disk-intensive.
The File I/O and Synchronization Profiler (FIOSAP) supplied with the Windows NT Resource Kit profiles calls to file I/O and synchronization APIs in KERNEL32.DLL. It times API calls and collects statistics for file activities and for event, mutex, and semaphore activities.
The Windows NT symbolic debugger supplied with the Win32 SDK for Windows NT version 3.51 provides the wt command to trace calls in a program and to show the number of instructions between them. The information presented is similar to that provided by CAP, but is provided much more slowly. However, the wt command does not require you to recompile your application as CAP does.
Pmon, an easy-to-use console utility provided with the Windows NT Resource Kit, displays memory information on each running process, plus the percentage of elapsed CPU time and percentage of CPU time used by each process. The display updates every five seconds.
The Working Set Tuner (WST) included in the Win32 SDK for Windows NT version 3.51 helps the linker rearrange the order of functions in a program, placing together those functions that are called frequently, which reduces the number of code pages that have to be kept in memory. Using WST can reduce an application's code space by 25 to 50 percent. Like CAP, it requires the application to be recompiled and linked to a special library.
The Virtual Address Dump (vadump) utility supplied with the Windows NT Resource Kit provides a view of the working set of a process, determining the nature of each page. This utility is available only for x86 systems.
When an enterprise moves its critical data and services from centrally managed mainframe computers to distributed application servers, security becomes at once increasingly important and more difficult to maintain. Unlike other network operating systems, Windows NT was designed from the beginning to provide a high degree of security.
For server applications, security has three main purposes:
A server application benefits most from Windows NT security when the application is implemented as a service. Running as a service ensures that only authorized users (such as users logged on as members of the Administrators group) can stop the service, for example.
Windows NT security provides facilities that a server application can use to control client access to the services and objects the application provides. For example, the server application can assign owners and access permissions to rows in a database owned by the server.
Finally, when the server application is acting on behalf of a client, Windows NT security allows the server application to impersonate the client when accessing objects. Windows NT then grants the server application the access appropriate for the client.
Because user identification and authentication are key to overall security, a server application should not maintain its own database of authorized users, except in very unusual circumstances. Instead, it should utilize the user identification and authentication services provided by Windows NT security. Doing so allows the server application to operate within, and take advantage of, the network-wide single-logon capability provided by Windows NT Server.
The easiest way to make use of the user identification and authentication service of Windows NT is for the server application to implement the remote procedure call (RPC) interface. The RPC run-time library allows the application to specify that the application will accept only those connections that have been authenticated by a particular authority (for example, the Windows NT authentication package), and allows the application to specify packet-level authentication, integrity, and privacy.
Some server applications cannot rely on RPC to provide user authentication. For example, a port of a UNIX server application can be required to authenticate users with clear-text credentials. In such cases, the server application can use the new Win32 API function LogonUser to authenticate a client user and obtain an access token for the user. The access token identifies the user and specifies the user's group memberships, access privileges, and other security-relevant information. The Windows NT security subsystem uses the information in this token to determine whether to grant access to a particular service or object.
By their very nature, server applications perform services on behalf of clients. To allow server applications to access system services and objects securely, Windows NT allows a thread to impersonate a user. In effect, when the thread impersonates a user, the thread begins using the access token of the user being impersonated, rather than using the token of the process that created it. When the thread attempts to access an object—a file, for example—Windows NT compares information in the user's access token to the security descriptor of the object (which contains, among other items, access-control information for the object) to determine whether the access should be granted.
Using impersonation provides two critical benefits to a server application.
First, impersonation ensures that the server application is able to provide a client the appropriate access to protected objects. For example, an application that provides file-transfer services will not allow a client to read the contents of a file if the security descriptor of the file does not permit the client to do so.
Second, impersonation allows a server application to operate according to the principle of least privilege. That is, the server application itself does not have to invoke a high level of privilege to be able to provide services for a client user with a high privilege level. For example, if a particular file can only be accessed by a user whose account belongs to the Administrators group, the server application itself need not be logged on with such an account (or be run by a user logged on with such an account) in order to provide access to the file for a user with an account in the Administrators group. Instead, the server application can use one of its threads to impersonate the user for only as long as necessary to access the file. When the thread no longer needs to provide access to the protected file, the thread can then cease impersonation, thereby replacing the user's impersonated access token with the thread's default token. While impersonation is taking place, only the access token of the impersonating thread is affected; the access token of the process that created the thread as well as the access tokens of other threads created by that process are not affected.
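The per-thread nature of impersonation can be modeled in a few lines. The sketch below is a toy simulation only: it involves no Windows NT security calls, and every name in it (PROCESS_TOKEN, ACCESS_LIST, impersonate, read_object, serve) is hypothetical. It shows a thread adopting a client's identity for the duration of one access and then reverting, while other threads keep the identity of the process.

```python
import threading
from contextlib import contextmanager

PROCESS_TOKEN = "service-account"         # hypothetical identity of the server
ACCESS_LIST = {"payroll.dat": {"alice"}}  # hypothetical object -> allowed users

_thread_state = threading.local()

def current_token():
    """Token in effect for the calling thread (defaults to the process token)."""
    return getattr(_thread_state, "token", PROCESS_TOKEN)

@contextmanager
def impersonate(user):
    """Adopt the client's token on this thread only, then revert."""
    _thread_state.token = user
    try:
        yield
    finally:
        del _thread_state.token  # back to the default token

def read_object(name):
    """Access check made against whichever token the thread is using."""
    if current_token() not in ACCESS_LIST[name]:
        raise PermissionError(name)
    return f"contents of {name}"

def serve(client, name):
    """Access an object on behalf of a client, for only as long as needed."""
    try:
        with impersonate(client):
            return read_object(name)
    except PermissionError:
        return "access denied"

def token_seen_by_other_thread():
    """Show that impersonation on one thread does not affect another."""
    seen = []
    with impersonate("alice"):
        other = threading.Thread(target=lambda: seen.append(current_token()))
        other.start()
        other.join()
    return seen[0]
```

The server's own identity never appears in the access list, yet it can serve the permitted client and is refused on behalf of any other, which is the principle of least privilege in miniature.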
The use of impersonation also allows system administrators to audit object access more completely. When a thread impersonates a user, information about this impersonation is included in audit records in the security log, making it possible for system administrators to identify the client users responsible for accessing particular objects.
Because of the central role that impersonation plays in the Windows NT security model, it is essential that server applications that access system services and objects on behalf of clients do so by impersonating those client users.
To reduce the complexity of dealing with multiple security providers, Windows NT (and all Microsoft operating systems that support RPC) supports the Security Support Provider Interface (SSPI), an API that allows applications to treat security opaquely. SSPI is modeled on the Generic Security Service API originally defined by John Linn of Digital Equipment Corporation.
SSPI provides an interface that allows an application to use a variety of security models available on a computer or network without changing the interface to each individual security system. SSPI is based on credentials (information that authenticates a user) and contexts (security-relevant data associated with a particular connection).
SSPI itself does not provide a mechanism for establishing the credentials of (that is, logging on) a user because that is generally a privileged operation handled by the underlying operating system. However, SSPI can provide an application with a handle to those credentials; the application can then use that handle to create a security context and then impersonate the client in much the same fashion as provided by the Win32 API.
A well-designed client/server application does not rely on a single transport protocol; instead, it is designed to take advantage of standard, transport-independent interprocess communications (IPC) interfaces. This helps ensure that the client and server application programs will be able to run on a wide variety of network platforms with little or no modification.
Windows NT supports two sets of networking APIs that provide this transport independence: the Windows Sockets interface and the remote procedure call (RPC) interface. This section discusses the benefits of using these APIs and looks to the future when OLE will provide an object-oriented method for client/server communication. Finally, this section discusses transport-specific considerations that still need to be addressed even when a client/server application is built on transport-independent network APIs.
A growing number of independent software vendors (ISVs) are choosing to build their client/server applications using the Windows Sockets (WinSock) interface. In many cases, these ISVs are porting to Windows NT Server existing services that already rely on a sockets interface, while others are seeking to support a variety of client platforms that do not share a common higher-level protocol.
WinSock is an open standard designed by a number of network providers that allows consolidation of a variety of protocols within a single, binary-compatible API. WinSock and other sockets implementations do not introduce additional protocols on top of the network transport protocol. Use of such a "direct" transport interface ensures that an application written for one vendor's sockets interface will be able to communicate with applications written for other vendors' sockets implementations.
WinSock is more flexible than RPC because it allows the application greater control over the data that is transmitted on the network. Unlike RPC, WinSock allows an application to use protocol-specific features and to operate with a variety of existing services. With appropriate modifications, WinSock can be adapted to use a variety of data sizes and transaction models.
An important benefit of writing a server application using the WinSock interface is the ability to use standard Win32 I/O APIs to access a socket. Because a socket handle is a native Windows NT file handle that is overlapped by default, an application can use such functions as ReadFile, WriteFile, and DuplicateHandle to manage the socket and its data, which makes it possible to communicate asynchronously over the socket and share sockets between threads and processes. If synchronous communication using C run-time library functions is desired, the socket can be opened as a synchronous handle instead, although doing so is recommended only for the sake of compatibility.
The remote procedure call (RPC) interface is another transport-independent interprocess communications interface supported by Windows NT. As its name suggests, RPC provides a mechanism for calling functions in other processes, even processes running on different computers on a network.
The RPC provided by the Win32 API complies with the Open Software Foundation (OSF) Distributed Computing Environment (DCE) specification. As a result, applications written using the Win32 RPC API can communicate with other RPC applications running on other operating systems that support DCE, regardless of the underlying protocol. Unlike WinSock, RPC provides a threading model, endpoint (socket/pipe/port) mapping service, and hooks to a name service. Supporting software on each system performs the necessary data conversions (such as byte order and floating-point format) required by the system architecture, eliminating the need for the application to do so.
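To make concrete the kind of conversion that RPC performs on the application's behalf, the sketch below shows what an application that manages its own wire data would otherwise have to write by hand. It uses Python's struct module with an explicit byte order; it does not reproduce the wire format that DCE RPC uses, and the names marshal and unmarshal and the two-field record are hypothetical.

```python
import struct

# "!" selects network (big-endian) byte order with no padding:
# a 32-bit unsigned integer followed by a 64-bit floating-point value.
RECORD_FORMAT = "!Id"

def marshal(order_id, price):
    """Pack the record in one agreed byte order, whatever the local CPU uses."""
    return struct.pack(RECORD_FORMAT, order_id, price)

def unmarshal(data):
    """Unpack the record back into native values."""
    return struct.unpack(RECORD_FORMAT, data)
```

Every field of every message needs this treatment when the application handles its own data representation, which is the burden that the RPC supporting software removes.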
RPC is especially desirable when a tight coupling between server and client code is desired and when network I/O occurs in small (single-packet) transactions.
In future versions of Windows NT, OLE will be extended to provide distributed services on the network using RPC. This will make it possible for applications to be more easily distributed while hiding the details of the underlying network from the OLE server and client process.
Although the ideal client/server application is completely network independent, this goal is rarely fully achievable. One factor that makes achieving this goal difficult is the different methods used by various protocols to register services in a name space. For example, in UNIX environments services often register their addresses using the Domain Name System (DNS) and the /etc/services file, while server applications that support NetWare® clients must use Service Advertisement Protocol (SAP) and possibly the NetWare Directory Service (NDS) to make their services known to potential clients.
Previously, these different name spaces would require the inclusion of transport-specific code in the client/server application to allow clients to locate and connect to the server application. Now, however, a new set of Registration and Resolution (RnR) APIs has been added to the Win32 SDK for Windows NT version 3.51; these APIs eventually will become the name-resolution standard for WinSock version 2.0. Currently, these APIs include functions that allow an application to enumerate available protocols, obtain the address of a service by name, obtain the name of a service based on its type (and vice versa), and specify or retrieve the properties of a particular service.
Because TCP/IP and IPX/SPX are the dominant protocols in use on large networks, it is important that client/server applications run well on these protocols. ISVs designing such applications need to balance the need for network independence with the requirement that the applications use transport-specific code to enhance their performance on the prevailing types of networks.
Another important factor to consider is the possibility that client/server transactions might take place over a slow link. Client/server applications should not be designed to rely on a particular transfer rate. Where necessary, they should implement their own flow control to ensure that transactions can occur even through extremely slow and possibly unreliable channels.
Although the model of client/server computing tends to distribute applications throughout the network, most enterprises still require the ability to manage those applications from a central location. For this reason, an effective server application must be capable of being administered remotely.
The simplest way to support remote administration is to implement a server application as a Windows NT service. This allows remote administrators to stop, start, and set startup options for the service through Server Manager.
In addition, a server application that supports remote administration should make use of the additional remote administration facilities of Windows NT by doing the following:
In addition to supporting remote administration through built-in Windows NT facilities, a server application can improve the ability of network administrators to manage the application by supporting Microsoft Systems Management Server (SMS). SMS gives network administrators the ability to remotely install, configure, and uninstall applications. A server application that supports SMS, therefore, will provide a method for installing, upgrading, and removing itself that does not require human intervention at each computer. Ideally, the installation program will be capable of being controlled through command-line switches and parameters and will report detailed status information to SMS using a management information file (MIF).