How Many Users Can I Put On A Single Exchange Server?

A whitepaper on Microsoft Exchange Server performance and capacity planning.

Eric Lockard
Program Manager, Microsoft Exchange Server Development
Matt Durasoff
Software Design Engineer, Microsoft Exchange Server Development
Eric Savoldi
Sr. Consultant, Microsoft Consulting Services

I. The Impossible Question

How many users can you host on a single Microsoft Exchange Server?

This simple question is without a doubt the single most frequently asked performance-related question about Microsoft Exchange Server. We hate it. Why, you ask? Because it's such an easy question to ask, so simply put, so all-encompassing, important and pertinent. And, of course, it's impossible to answer, at least without writing a ten-page essay on the subject, which is what we've finally decided to do.

Individuals ask the question for different reasons. Customers ask the question because they may already have server hardware and want to know how many users it will support or they may want information in order to make hardware buying decisions in planning their Microsoft Exchange Server rollout. Others may use the answer as a basis for comparing Microsoft Exchange Server to other messaging or workgroup products for evaluation purposes. A few just want to ask something intelligent sounding, the same way they might ask how many pistons their lawnmower has.

This paper attempts to discuss, in fairly simple terms, the performance-related issues involved in addressing the users per server question, with an eye towards planning an Exchange Server rollout. It is also intended to be used as raw material for a chapter on server performance in the Microsoft Exchange Server Administrator's Guide.

II. Definitions and Terms

A. User

The problem with asking the users per server question of course lies in the definitions of user and server. No two individuals have the same thing in mind when they ask the question.

Users vary widely in their Exchange messaging, scheduling and workgroup usage levels. Different user usage patterns, schedules and activity levels depend upon, among other things, the degree to which email, group and personal scheduling and public folder applications are utilized for business and personal activities within the organization, as well as the organization's corporate culture and geographical dispersion. A "user" may receive hundreds of messages each day or only one or two. They may originate upwards of 50 messages a day or only one per week. Some users may interact with the server for 12 hours at a time, or they may only connect once a week for a few minutes. They may read hundreds of messages in dozens of public folders every day or none at all.

One thing is for sure: no two users are exactly the same. In a given user community, a small percentage of the users may generate a disproportionate percentage of the total load on a server because they are the heavier, more aggressive users of the system.

B. Server

Definitions of "server" vary widely as well. One organization with large, centrally located or well-connected sites may wish to use a small number of very expensive, high-end, multiprocessor server machines with gigabytes of RAM and dozens of disk drives and host as many users per server machine as possible. Another may be interested in using many hundreds of inexpensive, lower-end machines to tie together their many small, geographically disperse locations.

Many organizations consist of locations of differing sizes and levels of connectivity and have several different requirements with respect to servers and the users they host. All customers want the most cost-effective solution not only in terms of hardware, but also with respect to maintenance costs, administration costs, bandwidth costs, floor space, etc.

It is this wide variation in server and user definitions which makes the users per server question so difficult to answer.

C. Actions

In general, the main factor in determining the number of users a given server will support is the load each user places on the server. The load a specific user places on a server generally falls into two distinct categories: user-initiated actions and background actions.

1. User-initiated actions

When a user interacts with the server directly, the actions they perform place some immediate incremental load on the server for some duration of time. User-initiated actions are those operations the server performs as a direct result of a user action and are generally synchronous from the user's point of view. For example, opening a standard unread message in a private folder in the server information store entails processing time on the server to receive and interpret the open request, evaluate any access restrictions, retrieve the message from the database, mark the message as unread, update the unread count for the folder, marshal and return the requested message properties to the client and generate a folder notification to the client (to notify the client the message has now been read). All of this happens in the time it takes for the Remote Procedure Call (RPC) issued by the client application to return control to the client. The actual time the user perceives the operation taking will be this time plus any additional processing time needed by the client application to draw the window, unmarshal and display the message properties, etc.

User-initiated actions are the single most significant contributing load factor on Microsoft Exchange Servers which directly support users (as opposed to servers acting primarily as a backbone or gateway server) and are directly proportional to the number of users actively interacting with the server per unit time and the actions they perform.

2. Background actions

In addition to the work a server may do to satisfy synchronous user-initiated actions, a Microsoft Exchange Server performs asynchronous or background actions on behalf of users. Accepting, transferring and delivering messages, making routing decisions, expanding distribution lists, replicating changes to public folders and directory service information, executing rules, monitoring storage quotas and performing background maintenance such as tombstone garbage collection and view index expiration are all examples of the work a server may do asynchronously on behalf of users whether they are currently connected or not. We term this work asynchronous because the time it takes for these actions to complete does not directly influence the user's perception of the speed of the system, at least not as long as the actions are completed within some reasonable amount of time.

In general, the load due to background actions, like user-initiated actions, is proportional to the number of users on the server. However, other factors such as whether the server acts as an inter-site connector for messages for users on other servers or hosts a messaging gateway can have a large impact. On Microsoft Exchange Servers which act purely as a gateway or backbone machine and do not directly host any users at all, the load due to user-initiated actions is essentially non-existent and the load on the machine can be considered to be entirely due to background actions.

3. Inequality of actions

Although it is often convenient for modeling purposes to think of every user-initiated or background action as equivalent to every other in terms of the load each places upon the server, this is certainly not the case. For example, a user copying a 500K message to their personal information store places more load on a server than that same user copying a 1K message. Similarly, sending a message to a distribution list containing 100 members creates more background actions than sending the message to only a single recipient.

We can, however, make good predictions about actions impacting server load if we deal in aggregates over time. If we examine the set of actions a particular user performs during a certain time period, say an 8-hour work day, we can sum up the user-initiated actions during that time period - all the interactions they have with the server to send or read messages, look up recipients in the address book, etc. - and all the background actions which occur on the server on their behalf, and make some classification of that user's activity level relative to other users. We can quantify some definitions of users by characterizing how many actions they perform per unit time, the set of actions and each action's load characteristics (e.g. what size messages are used), and come up with sample user definitions which can be used for making rough performance predictions of actual user communities.
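
As a rough illustration of this kind of aggregation, the sketch below (our own illustration in Python, not part of Exchange or LoadSim; the action weights and classification thresholds are invented for the example) totals a user's daily actions into a single relative load score and assigns a usage class.

    # Hypothetical sketch: classify a user by summing daily actions into one
    # relative load score. Weights and thresholds are illustrative only.
    ACTION_WEIGHTS = {
        "send_message": 1.0,          # originate a message
        "read_message": 0.5,          # open a message in the server store
        "browse_public_folder": 0.8,  # read items in a public folder
        "address_book_lookup": 0.3,   # resolve a recipient name
    }

    def daily_load_score(actions_per_day):
        """actions_per_day maps an action name to its daily count for one user."""
        return sum(ACTION_WEIGHTS.get(name, 1.0) * count
                   for name, count in actions_per_day.items())

    def classify(score, low_cutoff=25.0, medium_cutoff=75.0):
        if score < low_cutoff:
            return "low"
        return "medium" if score < medium_cutoff else "heavy"

    user = {"send_message": 7, "read_message": 20, "address_book_lookup": 10}
    print(classify(daily_load_score(user)))   # -> "low" with these invented weights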

For example, in the testing the Exchange Performance Team did in the labs, we classified our users as low, medium, and heavy usage. A low usage user to us was a user that, on a daily average:

Our definition for a medium usage user was, again based on a daily average:

We defined a heavy usage user as a user who, again based on a daily average:

Each organization will need to determine what their users' patterns are when trying to correctly answer the users per server question.

D. Server Load

1. A server under load

When we say a server is "under load", what is it exactly that we mean?

A particular server machine consists of a set of hardware including: one or more CPUs of a given architecture and processing speed, some amount of primary memory (RAM) and one or more disk drives of certain speeds and sizes and their controller(s). It is this hardware, and specifically these three hardware elements - the CPU(s), RAM and I/O subsystem - which make up the critical hardware resources of the server machine.

When the Exchange Server software services a user-initiated action or performs background actions, it utilizes each of these three resources to some degree over some period of time to perform its operations. For example, responding to an open message request from a client may require several milliseconds of CPU processing time, one or more disk accesses, and enough memory to hold the code and data necessary to perform the operation. In the case where actions occur well spaced out in time, 100% of the server hardware is dedicated to each action. Each action will complete as fast as possible and will not need to wait for hardware resources to become available. The server machine is essentially idle between each action and actions are spaced out sufficiently in time so as not to overlap. The server is essentially unloaded.

When many users are initiating actions close together in time or a large number of background events are occurring, we get competition for the server hardware resources. Bottlenecks occur, where the code servicing a particular action must sometimes wait for hardware to become available in order to complete its tasks. When this happens, we say a server is under load.

2. Load and response time

When a server is under load, actions may take longer to complete than they would if the server were unloaded. For user-initiated actions, this can exhibit itself as increased response time. If the server is excessively loaded, users may perceive the server as slow or unresponsive. Imagine a given server with some specific hardware and a homogenous user community where each user performs the same set of actions randomly but, on average, evenly spaced out over a given time interval. With only a single user connected to the server, each user-initiated action completes before the next is initiated. The response times that user enjoys will be near the theoretical minimums allowed by the particular hardware in use on the user and server machines and the underlying network.

If we begin allowing more users to connect to the server, actions will begin to overlap in time. The Exchange Server software will begin to be bottlenecked for short durations while one portion of the software performing an action waits for another to relinquish needed hardware resources. Eventually, this bottlenecking will begin delaying actions to the point that user-initiated actions will take noticeably longer to complete.

It is this relationship between load and response time which defines the number of users a server can support. At some point, as the load on a server increases, the response times a given user experiences move from being acceptable to being unacceptable. This crossover point defines the number of these theoretical, homogenous users the particular server hardware can support.
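
One classical way to picture that relationship is a single-queue approximation in which response time climbs sharply as utilization approaches 100%. The sketch below is our own illustration of the shape of the curve, with invented numbers; it is not a formula used by Exchange Server, and a real curve has to come from measurement.

    # Illustrative only: a single-queue (M/M/1-style) view of how response time
    # grows as server utilization rises. Numbers are invented, not measurements.
    def approx_response_ms(service_ms, utilization):
        """R = S / (1 - U), valid for 0 <= U < 1."""
        if not 0 <= utilization < 1:
            raise ValueError("utilization must be in [0, 1)")
        return service_ms / (1.0 - utilization)

    for users in (50, 100, 200, 400):
        utilization = min(users * 0.002, 0.99)   # pretend each user adds 0.2% load
        print(users, round(approx_response_ms(100, utilization)), "ms")

As the loop shows, response time stays close to the unloaded service time until utilization gets high, then climbs steeply; the acceptable-response crossover sits somewhere on that steep part of the curve.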

3. Quantum effect

Of course, actions which cause load are not evenly distributed over time. The morning hours of the workday may exhibit the highest load as users arrive at work and spend time catching up on mail or public folder information which has arrived since the previous day. Conversely, lunch time may represent a lull in the day's activity as do evenings and weekends.

What's more, even if we ignore these macro-level load variations and assume fairly uniform usage intervals, load will vary over time due to individual differences in usage level and schedule. One way to think of user-initiated actions is as quantum events, where the load a given user places on the server can be thought of as evenly distributed in time only in the aggregate, when the number of users on the server is high enough to "average out" the quantum effects of individual user usage patterns. For example, there will be bursts of activity due to coincidence, where a larger than usual group of users initiate actions at or about the same time, or due to scheduled events which impact groups of users on the same server, such as when a large meeting adjourns. Additionally, the close-to-real-time notification mechanisms inherent in Microsoft Exchange may contribute to the quantum effect to a certain degree. For example, users on the same server are notified a new message has arrived at essentially the same time. If the message was sent to a large number of users, a burst of activity may occur as many of those users read the message in a short time period.

Fortunately, the quantum effect becomes less and less of an issue as the number of users hosted on a given server increases. Given the number of users Microsoft Exchange Server can typically host even on smaller server machines, the impulse loads due to most user schedule and usage variations should not represent a large percentage of the total user population on the server except in extreme situations.

III. Factors Affecting Performance

A. Hardware

1. CPU

The type and number of CPU's in a server will dictate the performance potential for the Exchange Server environment. For example, computers based on the Pentium processor will offer better performance than computers based on the 486 chip. Also, a 133MHz Pentium will perform better than a 100MHz Pentium. Computers with multiple CPU's will offer performance potential that exceeds that of computers with a single CPU, but performance gains may not be linear. That means that a computer with two 133MHz Pentium CPU's will not typically provide twice the performance of a computer with only one 133MHz Pentium CPU.
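
One common way to reason about why two processors do not simply double throughput is Amdahl's law, sketched below. The 20% serial fraction is invented for illustration; the actual scaling behavior of Exchange Server on a given machine has to be measured.

    # Amdahl's law sketch: speedup is limited by the fraction of the work that
    # cannot be parallelized. The 20% serial fraction here is illustrative only.
    def amdahl_speedup(cpus, serial_fraction=0.2):
        return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)

    for cpus in (1, 2, 4, 8):
        print(cpus, "CPUs ->", round(amdahl_speedup(cpus), 2), "x")
    # With a 20% serial fraction, 2 CPUs give ~1.67x and 8 CPUs only ~3.33x.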

2. Memory

Paging can be viewed as contention for memory. As with most resources, some contention is tolerable. However, as the contention increases, the system will eventually reach a point where system resources (CPU time, bus bandwidth, disk time, etc.) spend more of their effort passing pages back and forth between the various processes contending for memory than doing real work. If we were to graph memory contention vs. average response times, we would see a fairly smooth line from zero contention up to a point where response times start to increase dramatically. This is often called thrashing. As memory contention increases past the point of thrashing, response times typically increase exponentially. Short periods of thrashing may be acceptable for some environments, but one should try to prevent systems from thrashing, especially during active, mission critical activities.

3. I/O Subsystem

When examining the I/O subsystem as it relates to performance, there are many factors to take into consideration. System administrators should take into account the type and number of disk controllers, the type of drives installed, and the choices required for fault tolerance and RAID configurations. Overall system performance for an Exchange Server can be dramatically affected by these variables.

It is important to use all SCSI channels that are available and to add more channels, if necessary, to improve performance. Furthermore, the addition of more disk drives will help performance, especially if, like the Exchange Server Public and Private Stores, disk I/O is random in nature. All drives have mechanical limitations that limit their performance, and by adding more drives, the workload can be distributed more efficiently.

4. Network Hardware

For optimal network performance, one should consider adapter types and the type of network medium (such as twisted pair wire, optical fiber, coaxial cable, etc.). Some general guidelines for optimizing performance are to install a high-performance network adapter card in the server, use only the protocols that are necessary and keep them to a minimum, and use multiple network cards and segment the LAN, if appropriate. Network adapters can provide widely varying levels of performance. The characteristics of an adapter that can most affect performance involve the bus type, bus width, and the amount of onboard memory on the adapter.

B. Exchange Related Factors

1. Connected users vs. total users

When trying to answer the number of users per Exchange Server question, one has to consider how many users will be connected to the Exchange Server simultaneously. In other words, you may be able to put more user accounts on one server if you know that not all of them will be connected at the same time. For example, an organization that has two shifts of people, where both sets of users won't be connected to the Exchange Server at the same time, may be able to host more user accounts on the same server.

There is a background cost for having more users on a server than are actually connected, but it is nowhere near the cost of having more simultaneously connected users; this is the background vs. user-initiated distinction discussed earlier. Directory updates will still occur, rules will still get triggered, messages will still be delivered, and so on, even when the user is not connected.
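
A back-of-the-envelope way to account for shift-based usage is sketched below. It is purely illustrative: the overlap fraction is invented, and it ignores the background cost of the extra mailboxes, which would have to be measured.

    # Illustrative only: if a server comfortably handles a given number of
    # simultaneously connected users, non-overlapping shifts let it host more
    # total accounts. The overlap fraction (users connected outside their own
    # shift) is an invented assumption.
    def hostable_accounts(max_connected, shifts=2, overlap_fraction=0.1):
        effective_shifts = shifts / (1.0 + overlap_fraction * (shifts - 1))
        return int(max_connected * effective_shifts)

    print(hostable_accounts(500))   # ~909 accounts for 500 concurrent users, 2 shifts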

2. Location and use of stores—PST vs. IS

Exchange allows users to store their messages in the server-based store or in Personal Folders (a PST file) on their local machine or on a network drive. If a user has personal folders, they will still have a server-based store. In fact, it is not possible for a user to only have personal folders. The server-based store must exist and will still be used to receive messages, process rules, etc. However, if users make their default delivery location their Personal Folders and have most of their messages stored there, it will offload a lot of work from the server. Messages will still be delivered to the user's server store and rules will still get fired on behalf of that user, but a rule may be deferred until the user logs in if it involves working with the user's PST file.

There are a couple of reasons why using personal folders offloads work from the server. First, whenever a user requests data from their personal folders, the server doesn't have to get involved to read the attributes and data into the buffer cache, mark the data as read if it wasn't already, update the unread message count, and send the data to the client. Second, if a user is using the option to save a copy of everything they send in the Sent Mail folder, it will be done locally and the server won't have to get involved. The same is true of the deleted items folder. When a user deletes a message, a copy is saved in the local Deleted Items folder and again, the server doesn't get involved. Finally, the server has less data to worry about on a continuing basis which will help during backups among other things.

3. Types of users

As explained previously, users' traffic and interaction patterns will vary greatly between organizations. Some companies, like Microsoft, will basically run their entire company through the messaging system, which creates very heavy user interaction. Other organizations may not rely so heavily on the messaging system and have low user usage levels. How heavily your users work with the Exchange Server will have a direct impact on the performance of an Exchange Server and the number of users that can be hosted on one box. The exact same hardware, performing the exact same functions, may be able to host 500 light users in one organization but only 150 heavy users in another organization.

4. Public folder usage and replication

The use of Public Folders on an Exchange Server can have a dramatic effect on an Exchange Server's performance. The things that govern how much of a performance impact a Public Folder causes on a server are its size, the frequency of access by users, the different views on that folder, the number of replicas, the replication schedule, and how often its contents change. If a public folder is large, then it will consume a lot of the store and cause more disk I/O to read its contents. If many users frequently access a Public Folder, then the server is going to be kept busy satisfying those requests. Also, it is the Public Folder that keeps track of the expansion state of each folder and the read/unread state of each message on a per-user basis.

Although Public Folder replication is very fast (initial testing has shown that it takes only 61 seconds to replicate 1,000 1 KB messages between two servers), it is very bursty in nature. Changes to a public folder get batched up and then get replicated out according to the replication schedule set up by the administrator. Even if the replication schedule is set to always, it only happens every 15 minutes. This can have an impact on a user because if replication kicks in when a user is perusing a public folder or the public folder hierarchy, the user will notice a sudden slowdown that will only get better when the replication is finished. If there are a number of public folder replicas to update, then this activity can put even more stress on the server and the burst of replication activity becomes extended.

Public folders can contain messages and free-standing documents. Messages are like any other mail message as far as the resources they consume on a server. Free-standing documents raise another issue because they are typically large and, in some cases, can be very large. If I were to put a 1 MB file in a public folder (which is very easy now since I can simply drag and drop a file into a folder), then any user that accesses that document brings 1 MB of data down across the wire. This obviously will affect the performance of the Exchange Server for every user. Also, since this data goes through the buffer cache on the way in and out and consumes lots of buffers, it will also displace roughly 1 MB of cached data belonging to everything except this message, which means more data will have to be brought back into the buffer cache when other users initiate an action.

Another important performance issue to consider, as seen by the user, is if a site spans a slow link and there is a replica of a public folder on both sides of the link. The algorithm used by Exchange to determine which public folder replica a user connects to is as follows. First, the user's home server (i.e. where the user's server store resides) is checked. If there is no replica of the public folder there or no pointer to a specific Public Folder server for all users on that Exchange Server to use, then any server in the site where a replica of that public folder exists, including a server across the slow link, is chosen at random. Once that server is chosen for that user, they will continue to get connected to that server every time they access that public folder. The reason the same server is chosen is because that server is responsible for keeping up with the read/unread state of each message and the expanded state of each folder in the hierarchy for each individual user. Since that information is stored in the folder at a specific server, the user must continue to connect to that public folder replica to keep this information in sync. In situations where this is the case, it is most often a good choice to put these servers in separate sites.

5. Rules and Views

Rules are user-defined actions for a server to perform on behalf of the user. Examples of rules are to pop up a notification when a message is received from a specific person or to automatically move messages into a specific folder based on the contents of the message. Initial testing has shown that this rule processing has a negligible impact on performance in all but extreme cases, such as when every user has tens of rules set up.

Views are a little bit different in that the server must store and keep track of the indexes that make up a view. A cache is used to keep the most recently used indexes, but a user may notice a small performance hit when using a seldom used view. In the overall scheme of things, however, this performance hit is basically so minimal as to be in the noise range.

6. Electronic Forms

Electronic Forms are basically like all other messages for the most part. The difference comes when a form is executed for the first time. In this case, the client must first download the form into the local forms cache and then execute it. This is a one-time hit on the client and will only happen again if the form changes. It can cause a burst of activity, however, in the case where a new electronic form is sent to a lot of users and they all try to execute it at about the same time.

7. Schedule+

Schedule+ uses a hidden public folder to replicate the free and busy information. Within a site, this replication just happens. Between multiple sites, this replication must be setup by the administrator. This replication of data turns out to cause very minimal impact on Exchange Server.

The real performance concern when working with Schedule+ is whether the user is working with a local schedule file or not. If the user is working with a local file, which is the default, Schedule+ is very responsive and only has to update the schedule information on the server in a periodic fashion. This is designed to be very efficient. However, if a user chooses not to work with a local schedule file, then each change to a person's schedule causes interaction with the server. Since Schedule+ was designed to work in a standalone manner and not just against a server, it was optimized to work with a local file first and then update the server rather than to just interoperate with the server. Initial testing has shown that not working with a local schedule file can cause a significant impact on that user and, if enough users are working in this manner, then it affects all users.

8. MAPI Applications

It is hard to quantify the effect MAPI applications will have on an Exchange Server because each one will be different. However, it is important to note that it is very easy to write poorly performing MAPI applications. Most of this performance issue is related to how efficient the MAPI application is at using RPC's to communicate with the server. Many MAPI functions can be batched together to go to the server in one RPC, rather than each causing its own RPC. An example of this was seen in the Exchange Development team where, initially, the default Exchange read note took 14 RPC's to retrieve and display a message and now, the same action takes 2 RPC's. This may not seem like much if you are just talking about one user. However, if all 250 users on a single Exchange Server were causing 14 RPC calls to retrieve and display a message, the impact on the server would be tremendous. Much of the work done by the Exchange Server Performance Team was to help each area of the Exchange Client and Server code be more efficient in using RPC's.

Any time a developer is writing a client/server application, it is vital to write the code with performance in mind. If just one poorly performing piece of code is present in the application, its cost gets multiplied by the number of users running that code and can really affect performance.

9. Connectors and gateways

Both the connectors and gateways are processes that run on the server. As such, they will contend with the other Exchange processes for resources on the server. These connectors and gateways communicate directly with the store, also causing contention for this resource with the other Exchange Server processes. Although this contention does degrade the performance of the server, they operate in a very bursty fashion, so this degradation may be only temporarily noticed by the users.

The beauty of Exchange Server is that if a specific connector or gateway is causing a lot of performance problems for users, one can do one of two things: dedicate a machine to that connector or gateway, or add a second connector or gateway instance somewhere else in the site. Since sites are treated as one in Exchange Server, both of these options are transparent to the users and are fairly simple to implement.

10. Directories, Replication, and the Bridgehead Server

A bridgehead server is a server that acts as the door in and out of the site's directory. In other words, when the administrator configures a directory replication connector between two sites, the administrator chooses which server in each site will act as the bridgehead server. It is the responsibility of this bridgehead server to exchange all directory updates with the partner bridgehead server. Within a site, directory replication just happens amongst all servers. Since this processing is proportional to the number of directory changes that occur, it is hard to state how much of an effect it will have on the overall performance of a server. During the migration period to Microsoft Exchange Server, when the directory is changing often, or in organizations where hundreds of directory updates happen every day, this may have some impact on performance. Generally, though, the directory replication process would be in the noise range as far as performance is concerned.

11. MTA

The main responsibility of the Message Transfer Agent is to route messages between multiple servers. However, even in the single server scenario, it also has tasks to perform. These tasks are the expansion of distribution lists and the routing of any outgoing gateway or connector message. The point here is that the MTA will have its greatest impact on performance when multiple Exchange Servers are deployed, but it will also cause some processing to occur, albeit less than in the case with multiple servers, even if it is working in a single Exchange Server environment.

C. Non-Exchange Related Factors

1. Other Server Activities

All of the tests that the Performance Team performed in our labs were done on a pure Exchange Server. In other words, Exchange Server was installed on a Windows NT Server 3.51 machine that was a member server and had nothing else running on it. We realize that, in real world implementations, this won't always be the case. In fact, one of the beauties of Windows NT Server lies in the fact that you can run multiple server applications on one box, as many of our customers are doing today. Obviously, running multiple server applications simultaneously on the same server will have an effect, perhaps even a dramatic effect, on performance for all applications on that server.

It is outside the scope of this document to delve into the performance intricacies of all of the other Windows NT Server applications and how they might affect performance of Exchange Server. However, some general statements can be made. For example, simultaneously running server applications that aren't very resource intensive, like having the server act as a Windows NT Domain Controller, WINS Server, or DHCP Server where there aren't a lot of client activities requesting those services, will not have that much of an impact on Exchange Server performance. Likewise, simultaneously running applications like Microsoft Systems Management Server, SQL Server, or SNA Server, which are much more resource intensive, will have a large impact on Exchange Server performance.

2. Network Quality and Connectivity

One item that is often not considered when discussing performance issues for a client/server application is the quality of the network that it is being run on. For example, putting the Exchange Server on an FDDI ring with a capacity of 100 Mb/s will provide better connectivity and performance than a server that is attached to a Token Ring network. Also, an overloaded Ethernet segment where lots of collisions occur will reduce the performance of Exchange Server. WAN connectivity and quality should also be considered, especially when trying to determine site boundaries or whether or not to have clients access their Exchange Server across the WAN.

IV. Finding the Limits

Microsoft Exchange Server has no explicit limit on the number of users you can configure to reside on a single server. There may, however, be practical limits not directly related to server performance which may ultimately prove to be the limiting factor on the number of users a given server can support.

At present, there is a 16 GB limitation on the size of the public and private information store databases (32 GB total). This could become the limiting factor on larger servers hosting many users depending upon the amount of server storage utilized by each user, whether user quota limitations are used and their values, the degree to which single-instancing of messages and attachments increases the logical storage capacity of the server, the extent to which personal folder stores are used, and to a lesser extent, the number of rules, views and finders that are defined by users on the server. There may also be practical limitations which are dictated by the time it takes to backup a very large server database.
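
A simple arithmetic sketch shows when the 16 GB store limit, rather than performance, becomes the cap. The numbers below are illustrative only; the single-instance ratio in particular varies widely between organizations.

    # Illustrative only: estimate how many mailboxes fit under the 16 GB private
    # information store limit for a given per-user quota and an assumed
    # single-instance ratio (shared copies of messages and attachments).
    STORE_LIMIT_MB = 16 * 1024

    def max_mailboxes(quota_mb_per_user, single_instance_ratio=1.5):
        """single_instance_ratio > 1 means each logical MB costs less than 1 MB on disk."""
        physical_mb_per_user = quota_mb_per_user / single_instance_ratio
        return int(STORE_LIMIT_MB / physical_mb_per_user)

    print(max_mailboxes(30))   # 30 MB quotas, 1.5x single-instancing -> ~819 mailboxes

If the performance testing described later says the hardware could handle more users than this, the store size, not the hardware, is the limiting factor.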

Disregarding these limitations, how does an organization determine how many of its users it can host on its servers? The purpose of this section is to explain just how an organization goes about answering this question.

A. Testing Methodology

So, what are the steps an organization should take to answer the question? Generally, the process has four major steps. First, you optimize your server. Optimization includes optimizing the hardware and then optimizing the Exchange Server setup. Second, you classify your users and set expectations as to what acceptable response times might be. Third, you input your user data and utilize the LoadSim simulation tool over a range of user counts. Finally, you analyze your data to get your answer.

B. Optimizing Your Server

Say you have purchased some hardware on which to run Microsoft Exchange Server, but you are not sure how to properly configure that hardware in the most efficient manner for use by Microsoft Exchange. Should you stripe all your drives together or configure a bunch of separate partitions for the different server components to use? How much memory should you dedicate to the Exchange Server buffer cache? Read On!

1. Optimizing the server hardware

Since there are three different areas of hardware that affect Exchange Server, let's look at each one individually.

1.1 CPU's

There is not a lot of optimization you can do with the CPU(s). They are either in your server or not. The key is how many and of what type. In our testing in the Exchange Performance Lab, we determined that Exchange Server does not scale well past four CPUs. We are still working on the reason why this is the case, but for now, we must just take it as a fact of life. Given this fact, for a Windows NT Server dedicated to Exchange Server, having more than four CPUs does not provide you any boost in performance and those CPUs could be put to better use elsewhere. As far as the type of CPU, the faster the better. Therefore, using a Pentium 133 chip will provide you with much more performance than a 486/66 and a little more performance than a Pentium 100.

There is one optimization you can make with Windows NT Server that affects how your server uses its CPUs. This parameter is found in the system applet in the Control Panel under the Tasking button. You should set this parameter to Foreground and Background Applications Equally Responsive. This boosts the system priority Windows NT Server places on background tasks, which is what all of the Exchange Server services are.

1.2 Memory

As with the CPUs, there isn't a lot of optimization you can make with memory. However, unlike CPUs, Exchange Server will use all the memory you can give it, up to the total size of your Information Store. At that point, you have your whole database in memory. Now, buying enough memory to put your whole Exchange Server database into memory is a little unreasonable, but Exchange Server will use most of the memory you give it for a buffer cache, so that writes to the Information Store will get cached in memory for off-peak processing. The minimum amount of RAM for an Exchange Server is 24 MB, but our testing showed that having 64-128 MB of RAM provided much better performance, much more so than an upgraded CPU type.

There are some optimizations that you can make with Windows NT Server that will improve its use of memory and be better for Exchange Server. For example, in the network applet in the control panel, you should configure the Server service to Maximize Throughput for Network Applications if it is not already set this way. Also, in the system applet in the control panel, you should configure the Virtual Memory settings to have a large page file. It is very difficult to determine exactly how large the page file should be, but a good rule of thumb is to choose 125 MB + (amount of physical RAM). For example, if I had 64 MB of RAM in my server, I would set my page file size to be (125+64) MB, or 189 MB.
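
The page file rule of thumb above is easy to capture in a couple of lines. The 125 MB constant comes straight from the text; the rest is just arithmetic.

    # Page file size rule of thumb from the text: 125 MB plus physical RAM.
    def recommended_pagefile_mb(physical_ram_mb):
        return 125 + physical_ram_mb

    print(recommended_pagefile_mb(64))    # -> 189, matching the example above
    print(recommended_pagefile_mb(128))   # -> 253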

1.3 Disk I/O subsystem

The Disk I/O subsystem is one area where lots of optimization can be done. However, before explaining how to optimize the disk I/O subsystem in your machine, some background information should first be understood.

1.3.1 The Server Transaction Logs - Sequential Access is your Friend

Microsoft Exchange Server issues I/Os to the disk subsystem on the server to read data from disk into memory or to write data out to permanent storage. For example, when a user opens their inbox, the set of properties in the default folder view must be accessed for each of the first 20 or so messages in the user's inbox folder and returned to the user. If this information is not already cached in memory on the server from a recent, previous access, it must be read from the server information store database on disk before the action completes. The disk read I/O is synchronous to the user-initiated action.

Similarly, if a message is transferred from another server, the message must be secured to disk before the receipt of the message can be acknowledged to prevent message loss in the case of power outages. In this case, the disk write I/O is synchronous to the background action of transferring and accepting the message.

The disk I/Os issued by Microsoft Exchange Server are either reads or writes and are either synchronous or asynchronous. Additionally, while all read I/Os and asynchronous write I/Os can be considered random, many of the synchronous writes issued by Exchange Server are sequential. That is, a special method of writing changes to disk known as a sequential, write ahead transaction log is used in order to speed up actions which require synchronous write I/O.

The beauty of the transaction log architecture lies in the design of today's modern disk drives. The time necessary to physically move the disk heads from some random location over the magnetic media to the particular track on which the required data on the disk resides (or in the case of writes, the location where the data is to be written) is the primary determinant of disk access speed in today's disk drives. Many disk drive manufacturers publish an "average seek time" (e.g. 10 ms) for their drives which gives an indication of the time it takes for a disk head movement to occur on average from one random disk location to another. This seek time makes up the majority of the time needed for most random disk access I/Os. The reciprocal of the average seek time for a given drive generally gives an upper bound on the number of random disk I/Os per second the drive can support (e.g. 100 random access I/Os per sec.). In reality, like miles-per-gallon ratings for automobiles, your mileage may vary and will generally be lower due to disk rotation time and other issues relating to performing the disk I/O operation.

If, however, the I/O to the drive is completely sequential in nature, the disk head will generally not need to move at all, only occasionally needing to shift over a single track. The average seek will drop to close to zero, dramatically increasing the number of disk I/Os per second the drive can support.
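
A quick worked version of that arithmetic, using a hypothetical 10 ms average seek, is shown below; the rotational latency and transfer time figures are invented, typical-looking values added only to show why real throughput lands below the seek-time bound.

    # Illustrative arithmetic: the reciprocal of the average seek time gives a
    # rough upper bound on random I/Os per second; adding rotational latency and
    # transfer time pulls the realistic figure lower.
    def random_iops_upper_bound(avg_seek_ms):
        return 1000.0 / avg_seek_ms

    def rough_random_iops(avg_seek_ms, rotational_latency_ms=4.2, transfer_ms=0.5):
        return 1000.0 / (avg_seek_ms + rotational_latency_ms + transfer_ms)

    print(round(random_iops_upper_bound(10)))   # ~100 random I/Os per second
    print(round(rough_random_iops(10)))         # ~68 once the other costs are added

For sequential writes to a dedicated transaction log drive, the seek term effectively drops toward zero, which is exactly why the log gets its own disk in the next section.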

For this reason, hosting the Exchange Server database transaction logs, and in particular the Exchange Server Information Store logs, on their own physical disk drive is critical to ensuring good disk write I/O performance on all but the smallest, single drive Exchange Server configurations. Placing the Information Store transaction logs on their own physical disk with no other sources of disk I/O on the drive is the single most important aspect of Exchange Server performance. Also, it is best to utilize the FAT file system for this drive since it performs the best with sequential activity. (Please note that the NTFS file system must be used once this log exceeds 2 GB.) It is important that there are no other sources of disk I/O to the drive because, although the transaction log may be written to sequentially, other I/O to the drive will cause the disk head to move away from the log file, thus causing longer seek times. It is also important to note that even many "read" user-initiated actions involve writing to the server information store transaction log. For example, when a user reads an unread message, the message must be marked as no longer unread and the number of unread items in the folder updated accordingly.

While the Exchange Server Directory Service also utilizes a write ahead transaction log architecture to speed up synchronous writes, changes to the directory service which would cause disk write I/Os occur seldom enough that dedicating a separate physical disk drive to the Directory Service transaction log is probably not worth the extra hardware expense. The possible exception is servers on which a large number of directory service modifications are made, such as a server on which large directory imports occur.

1.3.2 Random disk access I/O—black and white and read all over

Other than the transaction logs, the remaining sources of disk I/O on an Exchange Server are generally random in nature. This includes the NT pagefile, server databases, message tracking logs, etc. However, the number of disk I/Os issued by the different parts of the system will vary over time as different server components do their work in turn. For example, when a message is received from another server and delivered to a user, the MTA first secures the message to disk in its transient database, causing a single random write to the MTADATA directory. It then makes a call to the System Attendant, which writes a log entry into the message tracking log. The MTA then notifies the Information Store that a message is available, at which time the IS receives the message from the MTA and writes the message into its own permanent database, generating synchronous write I/O to the IS transaction log as well as asynchronous read and write I/O to the Information Store database in the MDBDATA directory. During this time, page faults may also have occurred if the system is under memory pressure, adding additional I/Os to the NT pagefile or executable files.

Due to the random access nature of the non-transaction log server disk I/O, utilizing the remaining disk drives in the system to enhance the number of I/Os per second of the partitions on which these random I/Os occur is generally the best way to increase the random disk I/O performance of the server as a whole. Because the source of these random server disk I/Os varies over time, combining the remaining disk drives into a software or hardware stripe set allows the combined capacity of all of the remaining disk drives in the system to be available to whichever server component is performing I/O at a given moment.

1.3.3 What about the pagefile?

Well, let's first look at when you are likely to page. Remember, paging is just Windows NT doing its virtual memory thing. You page when all the processes running on the Windows NT machine (including Windows NT itself) need more code and/or data pages over a period of time than there is physical memory in the machine.

So, if you have a lower memory Exchange Server machine, with say 24 MB and 50-100 users, you're going to page some just doing the normal set of Exchange Server things - handling user requests, moving mail off/on the server, etc. Said another way, in such a configuration, paging will make up a significant percentage of your total disk I/O. This is because the working sets of all the Exchange Server processes (and Windows NT Server) don't all completely fit in physical memory at once. If the server is the only one in the site and so off-server traffic is nil, then the MTA won't be doing much (except expanding the occasional Distribution List) and it will swap out, leaving more room for the rest of the server processes, and you will page less. Conversely, if you run additional processes beyond the core Exchange Server services on that same box, such as the Internet Mail Connector, or do anything non-typical like importing directory objects, generating the Offline Address Book, starting up the server services, etc., you will in general page more.

If you have "enough" memory in your Exchange Server machine so that the majority of the pages needed by all the server processes fit into physical memory at the same time, you won't page very much during normal operation. Remember, however, that the working set of the Information Store and the Directory Store includes the buffer caches. Even on big monster boxes with 128 MB of RAM or more, you can page with very little activity if the buffer caches are set too high for the amount of RAM in the machine. The Exchange Performance Optimizer tries to set these correctly, but sometimes, as with beta 2, we guess wrong. (In case you haven't heard yet, you should manually drop the MDB buffer cache 30 to 40% below what the Optimizer sets it to on beta 2 machines for optimal performance - otherwise - surprise! you will page excessively.)

When paging does occur, the I/O is fairly randomly distributed over the page file. Additionally, it can be bursty - like when a really big message flows through the system and memory pages have to be paged out to make room for it and then paged back in after it's gone. For these reasons, it is almost always best to host the pagefile on the stripe set. On lower memory machines, paging will be a significant percentage of your I/O and you will want all that I/O capacity at its disposal. Speeding up disk I/Os due to paging in this configuration will speed up your machine. Low memory machines will often be bound by the pagefile (but to fix this, you need to add memory, not disk!).

Again, on larger memory machines, paging will be less than on lower memory machines. You could entertain the notion of hosting the pagefile just about anywhere in this case because most of the time, paging will be about zero. That is, except for bursts, startup, etc. You could even host the page file on the transaction log drive if you really wanted to, because it won't get accessed very often and won't interfere with the sequential activity of the transaction log (very often). In general, however, to handle the bursts and be sane about buffer settings, you should almost always just host it on your big, random I/O partition - the same one the databases should be on.

It is not recommended to dedicate a disk just to the pagefile because a) if you are paging, you would get better I/O performance by combining that drive into the stripe set and putting the pagefile there, and b) if you aren't paging, it's kind of a waste of a drive - you might as well add it to the stripe set and boost the size and random I/O capacity there by some percentage.

You are going to want to stripe drives together to handle big databases since they are big, single files. Thus, you probably won't have that many separate partitions over which to spread the pagefile. Putting it on the stripe set achieves the same thing, however, and as Martha Stewart would say, it's a "Good Thing". It lets the majority of the disk I/O subsystem answer the call of whatever portion of the system - including the pagefile - needs random access disk I/O capacity at the time.

2. Running the Exchange Performance Optimizer

As mentioned earlier, a server under load which exhibits poor response time will generally be bottlenecked on one or more of the three critical server hardware resources: CPU processing capacity, RAM or the I/O subsystem. But which one?

Wait. Before you go racing out to your local computer shop to pick up more RAM, the first step is to make sure your Microsoft Exchange Server software is configured properly for your specific server hardware.

Server machines vary widely from low-end single processor, single disk servers with small amounts of RAM to giant multiprocessor monsters which rival the power of mainframes. Microsoft Exchange Server will run on all of them, but there are configuration settings which need to be adjusted to ensure Exchange Server is properly optimized for your specific server hardware.

The Microsoft Exchange Performance Optimizer will automatically detect your server hardware and adjust those configuration settings which are hardware dependent.

If you have not run the Optimizer after installing Microsoft Exchange Server on your server hardware, all bets are off. Exchange Server should still run, but it will almost certainly perform poorly. Any time you add hardware to or remove hardware from your Microsoft Exchange Server machine, you should re-run the Exchange Performance Optimizer to ensure Microsoft Exchange Server is properly tuned for your hardware.

C. Using LoadSim

1. User definitions

When an organization is trying to find its own answer to the users per server question, it will be important to represent its users as accurately as possible. Some organizations will have no idea how their user base will use Exchange because they may never have had a messaging product before on which to base their numbers. Other organizations will have some idea from previous experience how their users will utilize a messaging platform; at the same time, Exchange will provide features and functionality never before available to their user community, so the organization may not fully understand what impact these new features will have on their users. This goes to show that the process is not an exact science and should be treated as an intelligent estimate of the number of users per server. Using LoadSim to estimate the number of users a server may host may therefore be somewhat iterative, in that an organization may want to run LoadSim a second time after its users have had some experience with Exchange and as the organization plans for future growth.

Before giving the data we used, it is important to make a few notes for the reader. All of this data is presented so that the reader can have some basis for setting their own user classifications and for understanding the numbers we obtained through our tests. It should by no means suggest that the Exchange Performance Team has all the answers for classifying user load levels or that this is the final word on the number of users per server. This data is just a sampling of numbers from our labs, in our environment, with our classifications, and there are no warranties, implied or otherwise, that these are the correct numbers for any other organization.

As stated previously, in the testing the Exchange Performance Team did in the labs, we classified our users as low, medium, and heavy usage. Initially, the users' states looked like the following table:

Parameter                                    Low Usage User   Medium Usage User   Heavy Usage User
Number of Non-Default Folders                      20                40                  60
Number of Messages per Folder                       5                 5                   5
Number of Messages in Inbox                         1                 4                   9
Number of Messages in Deleted Items Folder          1                 1                   1


A low usage user to us was a user that daily:

Our definition for a medium usage user was a user who daily:

We defined a heavy usage user as a user who daily:

The above data gave the following computed averages:

                            Low Usage User   Medium Usage User   Heavy Usage User
Messages Sent per Day             6.7               19.8                38.6
Messages Received per Day        20.4               55.9               118.9


The message mix for these users was also varied. We used four different message types: 1K, 2K, and 4K messages, plus a message with a 10K attachment. A mix of these messages was used for each user type and weighted as follows:

Message Type     Light    Medium   Heavy
1K message         9         7       6
2K message       (none)      2       2
4K message       (none)   (none)     1
10K Attachment     1         1       1
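
From the mix above, a weighted average message size can be computed for each user class. The sketch below does that arithmetic; treating the 10K attachment entry as roughly 10 KB of message data is a simplification for illustration.

    # Weighted average message size per user class, from the mix table above.
    # Sizes are in KB; the 10K attachment entry is treated as ~10 KB of data.
    MIX = {
        "light":  {1: 9, 10: 1},
        "medium": {1: 7, 2: 2, 10: 1},
        "heavy":  {1: 6, 2: 2, 4: 1, 10: 1},
    }

    def average_message_kb(mix):
        total_weight = sum(mix.values())
        return sum(size_kb * weight for size_kb, weight in mix.items()) / total_weight

    for name, mix in MIX.items():
        print(name, round(average_message_kb(mix), 1))
    # -> light 1.9, medium 2.1, heavy 2.4 (KB per message, on average)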


2. Server Definitions

The following table shows the definitions the Performance Team used to define the low-end, middle, and high-end servers:

Server Type   Manufacturer      Processor        RAM      Disk Config                              Network Card

Low-End
  Server A    Gateway 2000      1 - 486/66       32 MB    1 - 515 MB, 1 - 1 GB                     Intel EtherExpress Pro
  Server B    Compaq Proliant   1 - 486/66       32 MB    1 - 1 GB, 1 - 2 GB                       Compaq Netflex II

Middle
  Server C    Compaq Proliant   2 - 486/66       64 MB    1 - 2 GB                                 Compaq Netflex II
  Server D    Compaq Proliant   1 - Pentium 90   64 MB    1 - 2 GB, 1 - 8 GB Stripe                Compaq Netflex II

High-End
  Server E    AT&T 3555         8 - Pentium 90   512 MB   2 - 2 GB, 1 - 24 GB Stripe, 1 - 16 GB    3Com Etherlink III


3. Running LoadSim

Without going into great detail (that's another ten-page essay) on how to run LoadSim, there are a few basic steps.

  1. Make sure you have classified your users and set up the servers you want to test.
  2. Install the actual Exchange Client on the Windows NT Workstation(s) or Server(s) you plan to use as the LoadSim client(s). LoadSim only runs under Windows NT.
  3. Determine what the acceptable response time should be for your users. Usually, 1 second (1000 milliseconds) is a good number to use, but you may decide that 1.5 seconds (1500 milliseconds) is adequate for your company.
  4. Using the LoadSim tool itself, define your test topology and generate your user import files.
  5. Using the Exchange Server Administration program, import those user definitions into your Exchange Server Directory. You will have to do this on all of the servers you plan to test.
  6. Using the LoadSim tool, define the initial state of both your users and public folders.
  7. Run the User Initialization and Public Folder Initialization tests against your server to populate the Exchange Server information store.
  8. Define some tests using the user classifications you have previously defined and run them against your different server platforms. For example, you may decide you want to test 250 light users against your medium-defined server.
Each pass of the LoadSim tool will run for several hours and produce one number which is the 95% weighted average response time in milliseconds that each of the LoadSim users experienced during that test. You will use several different user counts to generate several different data points and then graph the results. Some example results are shown in the next section.

4. Sample LoadSim results

The following tables show some sample LoadSim results that were derived from the testing performed by the Exchange Server Performance Team. Disclaimer: All of this information is Microsoft confidential and subject to change without notice at any time before Exchange 4.0 ships. All performance results should be considered preliminary. Microsoft is not responsible for the results of any changes, policies or decisions made in reaction to, or on the basis of, this data or any conclusions drawn from it.

The way to interpret the above graphs is to find out where the line crosses your desired response time. That will tell you how many users of that type can be hosted on that specific hardware platform. For example, if we take the first graph, which shows light users running on a 486/66 class machine with 24 MB RAM and one 1 GB hard drive, you will see that the 95% weighted average response time crosses our desired response time of 1000 milliseconds somewhere between 100 and 125 users. That means that this specific server can host around 112 light users as we have defined them.
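
If you collect several (user count, response time) data points from LoadSim runs, the crossover can be estimated by linear interpolation between the two points that straddle the target. The data points below are invented to mirror the 100-to-125-user example above; this is just a convenience for reading the curve, not part of LoadSim itself.

    # Estimate the user count at which the 95% weighted average response time
    # crosses the acceptable threshold. The sample data points are invented.
    def crossover_users(points, threshold_ms=1000):
        """points: list of (user_count, response_ms), sorted by user_count."""
        for (u1, r1), (u2, r2) in zip(points, points[1:]):
            if r1 <= threshold_ms <= r2:
                return u1 + (threshold_ms - r1) * (u2 - u1) / (r2 - r1)
        return None   # threshold never crossed in the measured range

    samples = [(50, 400), (75, 550), (100, 800), (125, 1250), (150, 2100)]
    print(round(crossover_users(samples)))   # ~111 users for a 1000 ms target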

V. Zen and the Art of Performance Analysis—Detecting Performance Bottlenecks

Say you have an Exchange Server whose users complain the server is too slow. Do you need to buy more hardware? What do you buy: more RAM, more disks, another processor board perhaps? Or perhaps you have not yet purchased hardware on which to run Microsoft Exchange Server and need more information on which hardware components, in what ratio, to purchase. If so, then this section is for you.

First off, it is important to realize that detecting performance bottlenecks is almost an art and it gets better with experience. For example, you may think that memory is your bottleneck and go out and buy more memory for your server only to find out later that it isn't performing any better than before because the CPU is the bottleneck. The goal is to determine which of the three major parts of the server - memory, disk I/O subsystem, or CPU - is the bottleneck. The Windows NT Performance Monitor is a great tool to aid in making the right decision.

A. Memory

To determine if the system requires more memory, you need to investigate how much your system is paging. Use the Performance Monitor to track Paging File: % Usage and Memory: % Available. If the paging file is over 50% used and if the percent memory available is less than 25%, then it would be a good time to add more RAM.

B. Disk I/O

To determine whether the disk I/O subsystem is the bottleneck, what you are looking for is whether the server is I/O bound on asynchronous I/Os to the Exchange Server database. Some good Performance Monitor counters to look at for this are PhysicalDisk: Disk Queue Length and PhysicalDisk: % Disk Time.

The Disk Queue Length parameter shows how many outstanding disk requests there are per physical disk. It includes requests in service at that time. Multi-spindle disk devices (like a RAID stripe for example) can have multiple disk requests active at one time. Therefore, what you want to look for is the disk queue length minus the number of spindles on this disk device. You are disk I/O bound when the difference between the length of this queue and the number of spindles on this disk device is consistently high. According to the performance monitor, this difference should average less than 2 for good performance.

Disk time is the percentage of elapsed time that the selected disk drive is busy servicing requests. If this is a high percentage, then it shows that your system is spending most of its time servicing disk requests and needs either faster disk drives or more disk drives over which to spread these requests.

C. CPU

The CPU(s) may have a high utilization even if it is the disk and/or memory that is the actual bottleneck. Therefore, it is best to check those two areas first and then to check the CPU. To determine if the system requires more processors, track System: % Total Processor Time and Process: % Processor Time for each process. If the total system processor usage is averaging greater than 75%, then it usually means you need to add another processor. It is interesting to review the usage of all processors to understand the symmetrical multiprocessing capabilities of the Exchange functions on a Windows NT Server. Remember, for the first release of Exchange Server, testing has shown that it does not scale well past four CPU's. Therefore, if you already have four CPU's and are still CPU bound, then you either need to upgrade the processor type or add a second server.
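
The thresholds from the three checks above can be collected into a simple triage sketch. The counter values are supplied by hand here; reading them live from Performance Monitor is outside the scope of this illustration, and the rules are only the rough guidelines from this section.

    # Triage sketch using the rules of thumb described above. Counter values
    # would come from Performance Monitor; here they are passed in by hand.
    def diagnose(pagefile_pct_used, memory_pct_available,
                 disk_queue_length, spindles, cpu_pct_total, cpu_count):
        findings = []
        if pagefile_pct_used > 50 and memory_pct_available < 25:
            findings.append("memory: add RAM")
        if disk_queue_length - spindles >= 2:
            findings.append("disk: add drives or controllers to spread the I/O")
        if cpu_pct_total > 75:
            if cpu_count >= 4:
                findings.append("cpu: upgrade the processor type or add a second server")
            else:
                findings.append("cpu: add another processor")
        return findings or ["no obvious bottleneck by these rules of thumb"]

    print(diagnose(pagefile_pct_used=60, memory_pct_available=15,
                   disk_queue_length=9, spindles=4, cpu_pct_total=60, cpu_count=2))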

VI. Conclusions

So, how many users can you host on a single Microsoft Exchange Server?

Hopefully, through this whitepaper, you have seen that this question, so simple to ask, is not so simple to answer. Even the question itself must be defined before one can proceed, as we saw when trying to pin down what a user or a server actually means. The other thing to remember is that every organization that tries to answer this question will get a different result, specific to its own environment.

Now, it's not that this question can't be answered, for indeed it can be answered. However, it is up to every organization to figure this answer out for themselves in order for the answer to be accurate. At least now, an organization knows what factors affect the performance of a Microsoft Exchange Server, how to make optimizations in their configuration to get the most out of an Exchange Server and what to look for when trying to detect bottlenecks.