For a client application, the choice of threading model typically has little effect on performance. Services that handle tens or hundreds of remote users, however, must choose their threading model carefully, because a poor choice can lead to poor performance.
The simplest threading model is a single process with a single thread that handles all connected clients, typically using the select() API to multiplex among them. A single loop calls select() repeatedly and calls send(), recv(), or accept() whenever select() indicates that an action can be performed on one of the sockets. While this model is simple to implement, performance can suffer because every network I/O call passes through select(), which incurs significant CPU overhead for each I/O. This is acceptable when CPU use is not an issue, but it presents a problem when the service requires high performance.
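The single-threaded select() loop described above can be sketched as follows. This is a minimal illustration, not a Winsock implementation: it uses Python's select and socket modules, whose calls mirror select(), accept(), recv(), and send(). The echo behavior, the function name run_select_echo, and the should_stop callback are assumptions made for the example.

```python
import select
import socket


def run_select_echo(listener, should_stop, timeout=0.1):
    """Echo server: one thread, one loop, select() multiplexes every socket.

    `should_stop` is a hypothetical zero-argument callable used only so the
    loop can be shut down cleanly in this example.
    """
    pending = {}  # client socket -> bytes received but not yet echoed back

    def drop(sock):
        pending.pop(sock, None)
        sock.close()

    while not should_stop():
        clients = list(pending)
        writers = [s for s in clients if pending[s]]
        # Every I/O decision passes through this one select() call.
        readable, writable, _ = select.select(
            [listener] + clients, writers, [], timeout)

        for sock in readable:
            if sock is listener:
                conn, _addr = listener.accept()   # a new client is waiting
                conn.setblocking(False)
                pending[conn] = b""
                continue
            try:
                data = sock.recv(4096)            # select() reported data
            except BlockingIOError:
                continue                          # spurious wakeup; retry later
            except ConnectionError:
                data = b""
            if data:
                pending[sock] += data
            else:
                drop(sock)                        # peer closed the connection

        for sock in writable:
            if not pending.get(sock):
                continue                          # dropped during the read pass
            try:
                sent = sock.send(pending[sock])   # select() reported send room
                pending[sock] = pending[sock][sent:]
            except BlockingIOError:
                pass
            except ConnectionError:
                drop(sock)

    for sock in list(pending):
        drop(sock)
```

Because every client shares the one loop, no handler may block; a slow operation for one client would delay all the others.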
The next model in order of complexity is one thread per client. Every time a client connects, the service calls CreateThread() to create a thread that handles the client for the duration of the connection. This model can achieve very high performance when the number of connected clients is small, but at around 40 clients the thread context switching and resource overhead begin to place a significant burden on the system.
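A minimal sketch of the thread-per-client model follows, again in Python rather than Win32: threading.Thread stands in for CreateThread(). The echo behavior, the names handle_client and serve_thread_per_client, and the should_stop callback are assumptions for illustration only.

```python
import socket
import threading


def handle_client(conn):
    """Runs on its own thread for the lifetime of one connection."""
    conn.settimeout(None)             # plain blocking I/O on this socket
    with conn:
        try:
            while True:
                data = conn.recv(4096)    # blocking here stalls only this client
                if not data:
                    break                 # peer closed the connection
                conn.sendall(data)
        except ConnectionError:
            pass                          # client went away mid-transfer


def serve_thread_per_client(listener, should_stop, poll=0.1):
    """Accept loop: start one dedicated thread for each connecting client.

    `should_stop` is a hypothetical zero-argument callable; the accept
    timeout exists only so the loop can notice a shutdown request.
    """
    listener.settimeout(poll)
    while not should_stop():
        try:
            conn, _addr = listener.accept()
        except socket.timeout:
            continue                  # no client this interval; check stop flag
        worker = threading.Thread(
            target=handle_client, args=(conn,), daemon=True)
        worker.start()                # thread count grows with client count
```

The handler code is simple because each thread may block freely, but the number of threads equals the number of connected clients, which is the source of the overhead noted above.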
A model similar to thread per client is process per client, in which the service calls CreateProcess() to create an entire new process for each client that connects. This is the model typically used by UNIX daemons. It is discouraged on Windows NT and Windows 95 because processes are much more expensive than threads, both in the resources they use and in the CPU overhead of context switching.
The most efficient threading models use a worker thread pool, in which a fixed pool of threads services all the connected clients. In worker thread models, the service usually uses overlapped I/O with the Win32® WriteFile() and ReadFile() APIs to facilitate multiplexing and to minimize the number of threads the service requires. In Windows NT 3.5, I/O completion ports may be used for additional efficiency. While this threading model is the most efficient for large numbers of clients, it is also the most complex. It should therefore be used only where high performance with large numbers of connected clients is required.
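The sketch below illustrates the general shape of a worker thread pool: a small, fixed number of threads services any number of clients. It is a portable stand-in only. It does not use overlapped I/O or I/O completion ports; instead, a dispatcher uses select() to find ready sockets and hands each one to a pool thread. The echo behavior and the names serve_with_pool, service, rearm, and should_stop are assumptions for the example.

```python
import queue
import select
import socket
from concurrent.futures import ThreadPoolExecutor


def serve_with_pool(listener, should_stop, workers=4, timeout=0.05):
    """A fixed pool of `workers` threads services every connected client."""
    watching = set()          # idle clients the dispatcher is waiting on
    rearm = queue.Queue()     # clients that a worker has finished servicing

    def service(conn):
        """Runs on a pool thread: handle one ready request, then hand back."""
        try:
            data = conn.recv(4096)
            if data:
                conn.sendall(data)
                rearm.put(conn)       # still open: ask dispatcher to watch it
                return
        except OSError:
            pass
        conn.close()                  # peer closed, or the socket failed

    with ThreadPoolExecutor(max_workers=workers) as pool:
        while not should_stop():
            while not rearm.empty():              # only this thread consumes
                watching.add(rearm.get_nowait())
            ready, _, _ = select.select(
                [listener, *watching], [], [], timeout)
            for sock in ready:
                if sock is listener:
                    conn, _addr = listener.accept()
                    watching.add(conn)
                else:
                    # Stop watching until the worker is done, so that only
                    # one pool thread touches a given client at a time.
                    watching.discard(sock)
                    pool.submit(service, sock)
    # Leaving the `with` block waits for in-flight work to finish.
    while not rearm.empty():
        watching.add(rearm.get_nowait())
    for sock in watching:
        sock.close()
```

The thread count here is set by the workers parameter and stays constant as clients connect, which is the property that lets this model scale; the cost is the extra bookkeeping needed to pass each connection between the dispatcher and the pool.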