Power Outlets in Action: Windows Sockets

Ruediger R. Asche
Microsoft Developer Network Technology Group

Created: October 21, 1994
Revised: June 1, 1995 (redesigned class definitions; incorporated information on MFC sockets)


Abstract

Following up on the first two articles in my networking series, "Communication with Class" and "Garden Hoses at Work," this article describes how Microsoft® Windows® Sockets can be used to establish and maintain communications between computers. As in the case of named pipes, I've provided a C++ class that encapsulates the specifics of Windows Sockets.

This article also discusses addressing schemes for TCP/IP and IPX, which are the most important network protocols for which sockets are currently available. Socket support for a wide range of other protocols is available in Windows NT™ version 3.5.

Road Map

This article is third in a series of technical articles that explore network programming with Visual C++™ and the Microsoft® Foundation Class Library (MFC). The series consists of the following articles:

"Communication with Class" (introduction and description of the CommChat sample)

"Garden Hoses at Work" (named pipes)

"Power Outlets in Action: Windows Sockets" (Windows® Sockets)

"Aristocratic Communication: NetBIOS" (NetBIOS)

"Plugs and Jacks: Network Interfaces Compared" (summary)

The CommChat sample application illustrates the concepts and implements the C++ objects discussed in these articles.

Introduction

The first version of the CommChat sample application (in the October 1994 edition of the MSDN Library) gave you only one choice—named pipes—for the communication type. All other choices were greyed in the Select Communication menu. To run CommChat, you needed two Microsoft® Windows NT™ machines that had the Server service running (and thus could create named pipes).

The new version of CommChat that I've provided with this article has the sockets option enabled, so you can now select sockets as your communication type. However, to create and communicate with sockets successfully, you will need to install the appropriate software on the machines that will use sockets. Before we discuss the software requirements, we will need to examine the nature of sockets.

Beginning with Visual C++ version 2.1, MFC supports a built-in socket class hierarchy, CAsyncSocket, and its derivative, CSocket. My class implementation uses Windows Sockets directly; that is, it accesses the socket application programming interface (API) provided by the operating system without the MFC encapsulation. In the "MFC Sockets" section, I will discuss the relationship between my class hierarchy and the class hierarchy provided by MFC.

What Are Sockets?

There are two ways to look at sockets: (1) as a mechanism for transferring data between remote or local processes (similar to named pipes); or (2) as a mechanism for making the transmission control protocol/Internet protocol (TCP/IP) suite available to user applications. These two views do not contradict each other if you view them within a historical context: Initially, sockets were designed as local interprocess communications (IPC) mechanisms. Later, they turned out to be useful for providing applications with access to TCP/IP-based communications. Eventually, the sockets application programming interface (API) proved itself to be both abstract enough to provide communication objects without explicitly addressing TCP/IP and flexible enough to be implemented on non-TCP/IP protocols. Let's look at TCP/IP first.

Another Diagram with Squares and Arrows!

TCP/IP consists of two things: TCP (transmission control protocol) and IP (Internet protocol). In this section, I will sort out the two protocols in a nutshell; see the bibliography at the end of this article for a list of books that discuss the ins and outs of TCP/IP in detail.

IP is a low-level protocol that roughly defines two things about packets to be transmitted over a network: (1) what the packet should look like, and (2) how the packet knows which machine it is going to. The destination machine is defined via an "address" (the Internet protocol expects an Internet address) encoded in the header that IP prefixes to the packet. We will look at what constitutes an Internet address later on.

IP resides on the network layer (that is, the third layer from the "bottom") of the ISO/OSI hierarchy. (For an illustration of the hierarchy, see "OSI Reference Model" in Chapter 15 of the Windows NT Resource Kit, Resource Guide [MSDN Library, Product Documentation, SDKs, Resource Kits, Windows Resource Kits].) The only layers below IP are the data-link layer and the physical layer, which are represented by the network card driver and the network card, respectively.

Although it is theoretically possible for an application to address IP directly—the application would have to know the IP format, manually wrap the transmission data into an IP packet, and take care of all other aspects of the communication—IP is normally addressed only by TCP or a slightly simpler protocol called user datagram protocol (UDP). Both TCP and UDP deal with complete transmissions, that is, they allow a user-mode application to pass a chunk of data to the respective protocol and work with IP and the network card to send the data over to another machine. Applications and services written for TCP are more common than applications coded for UDP.

The main difference between TCP and UDP is that TCP is connection-oriented, reliable, and byte-stream–oriented, whereas UDP is not. These terms are explained below.

Connection-oriented: Under TCP, two machines that wish to enter into a conversation must first establish a connection with each other. A connection is defined by four components: the IP addresses of the two machines and the ports over which they communicate. (We will look at ports later on.) Note that this makes it possible for a machine to use the same port for different communications, because two connections that share a local port still differ in the remote address or the remote port. The TCP header includes all four pieces of information: the receiver port and address to route the packet correctly, and the sender port and address so the receiver can assign the packet to a specific connection.
Basically, the only difference between IP and UDP is that UDP incorporates port numbers into the communication and optionally computes the checksum of the packet so that a certain degree of reliability can be achieved.
Reliable: TCP sorts out the data to be transmitted into appropriate-size datagrams, sends the datagrams over the network, and waits for acknowledgements. If it receives no positive acknowledgement, TCP times out and retransmits the data. The receiver of a TCP communication waits until all datagrams of a transmission have arrived, sorts the datagrams in the right order, and reassembles them into a complete stream. All UDP does is send the datagrams over the network; UDP does not guarantee the arrival of the datagrams or ensure that they are delivered in the order in which they were sent.
Byte-stream–oriented: Under TCP, it is up to the protocol to disassemble a transmission into datagrams of appropriate sizes and put them back together on the other side; in other words, a user-mode application knows only that it sends x bytes to the TCP layer, and that the other side gets x bytes back. TCP may (and probably will) chop up the data into appropriately sized datagrams, send those over, receive the datagrams on the other side, unwrap the data, and put it back together. UDP, on the other hand, takes the data to be transmitted as one big chunk, and sends the chunk over the network. The lower levels (IP, network card, network card driver) may disassemble the chunk, but the protocol sees one packet as only one message.
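The division of labor described in the "Reliable" and "Byte-stream–oriented" items can be illustrated with a toy model. The sketch below is not how TCP is implemented (real TCP numbers bytes rather than datagrams and adds acknowledgements and retransmission); the Datagram type and the Segment and Reassemble functions are hypothetical names invented for this illustration. It shows only the idea that the sender chops a stream into pieces and the receiver sorts the pieces and rebuilds the stream.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical datagram: a sequence number plus a slice of the payload.
struct Datagram {
    std::size_t sequence;
    std::string payload;
};

// Sender side: chop a byte stream into datagrams of at most maxSize bytes.
std::vector<Datagram> Segment(const std::string& stream, std::size_t maxSize) {
    std::vector<Datagram> out;
    if (maxSize == 0) return out;  // nothing sensible to do with zero-size pieces
    for (std::size_t offset = 0, seq = 0; offset < stream.size();
         offset += maxSize, ++seq) {
        out.push_back({seq, stream.substr(offset, maxSize)});
    }
    return out;
}

// Receiver side: sort the datagrams by sequence number and rebuild the stream.
std::string Reassemble(std::vector<Datagram> datagrams) {
    std::sort(datagrams.begin(), datagrams.end(),
              [](const Datagram& a, const Datagram& b) {
                  return a.sequence < b.sequence;
              });
    std::string stream;
    for (const Datagram& d : datagrams) stream += d.payload;
    return stream;
}
```

Even if the datagrams arrive in reverse order, Reassemble returns the original string; an application sitting on top of TCP never sees the pieces. With UDP, by contrast, the application would receive each datagram as a separate message and would have to do this bookkeeping itself.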

Not a place where ships pull in . . .

Now, what exactly is a "port"? If the only way you could transfer data between two machines was by using the machine names or addresses, there could be only one active communication by any machine at any time because a machine would be the smallest granularity to which a network packet could be sent. We want to be able to do better than that, though—it should be possible for a machine to communicate simultaneously with multiple machines, inasmuch as most operating systems are capable of running more than one process at the same time. Would it be feasible for a process that wants to establish a network communication to be refused the connection because another process claimed the single network resource?

The concept of a "port" was defined to solve this problem in the IP world. A port is basically a refinement of an IP address: A computer that receives a packet from the network can further refine the destination of the packet through a unique port number that is determined when the connection is established.
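To make the refinement concrete, here is a minimal sketch of how a receiving machine could tell conversations apart by the four components of a connection. The ConnectionKey structure, the ConnectionTable type, and the Lookup function are hypothetical names for this illustration and are not part of the sockets API; the addresses and ports are made-up values.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// The four components that identify one TCP connection.
struct ConnectionKey {
    std::uint32_t localAddress;
    std::uint16_t localPort;
    std::uint32_t remoteAddress;
    std::uint16_t remotePort;

    bool operator<(const ConnectionKey& other) const {
        return std::tie(localAddress, localPort, remoteAddress, remotePort) <
               std::tie(other.localAddress, other.localPort,
                        other.remoteAddress, other.remotePort);
    }
};

// Hypothetical demultiplexing table: which conversation owns a packet?
using ConnectionTable = std::map<ConnectionKey, std::string>;

std::string Lookup(const ConnectionTable& table, const ConnectionKey& key) {
    auto it = table.find(key);
    return it == table.end() ? std::string("no such connection") : it->second;
}
```

Two entries that share the same local address and local port but differ in the remote address are distinct keys in the table, which is why one server port can serve several clients at once.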

A port is similar to the name part of a named pipe: If a named pipe server supported only a single pipe (for example, \\<server>\pipe), all processes wanting to address a pipe on any machine would have to share the same pipe. The named pipe architecture allows the creation of several pipes with unique names, thus enabling multiple processes to concurrently access different pipes on the same machine.

A number of so-called "well-known" ports have reserved numbers that correspond to predetermined functionalities. An application program can use any port that hasn't been reserved to establish and maintain a connection. To transmit packets across the network, both UDP and TCP require that ports on both the transmitting and the receiving sides be known.

Using a port and a destination address, it would now be possible for an application program to access TCP or UDP directly to establish and maintain a network communication. However, an application normally interacts with some kind of abstraction that hides the gory details of the protocol from the application programmer. Several such abstractions are currently available, for example, transport layer interface (TLI), streams, proprietary interfaces, and sockets. By far the most common of these are sockets, which we will examine from an application programmer's point of view in the next section.

The figure below shows how the different protocols interact.

A Socket as a Communication Abstraction

A socket is similar to a named pipe—it is an abstraction that allows applications to view a network communication almost as they would an I/O stream. The "core" set of API calls that are used to access sockets appears quite similar to the named pipe API (these calls comply with the Berkeley Software Distribution [BSD] socket specification), and a set of additional calls allows sockets to work with the message-driven Microsoft Windows® API. These two API sets constitute what is known as Windows Sockets.

This article does not discuss the Windows-specific extensions of the sockets API because in a multithreaded world it is very easy to accomplish the same effect as asynchronous or message-driven I/O by merely relocating the I/O into secondary threads. The Windows extensions do not have much to do with how sockets work and how they accomplish data transfers over the network (which is my main interest in this article). You can find a good discussion of the sockets extensions in the article "Plug into Serious Network Programming with the Windows Sockets API" by J. Allard, Keith Moore, and David Treadwell in the Microsoft Systems Journal (MSDN Library, Books and Periodicals, Microsoft Systems Journal, 1993 Volume 8, July 1993 Number 7).

Addressing Sockets

One of the most important advantages of sockets is that they provide a network-independent, yet network-configurable interprocess communication mechanism. This means that you don't have to redesign an application when you port it, say, from a TCP/IP-based socket implementation to an IPX-based implementation, but you can still take into account the different addressing schemes that TCP/IP and IPX employ.

To make this a little bit clearer, let us look at named pipes. To address a named pipe on a remote machine, an application uses a server name and the pipe name, for example, \\BEAKER\PIPE\THEPIPE. This naming convention, however, requires that the network software and the operating system understand and can parse this syntax, and know how to associate the pipe name with a network connection.

It would be difficult to imagine how, say, a VAX® machine that follows totally different conventions for filename syntax and network addressing would react if you passed the string \\BEAKER\PIPE\THEPIPE to it, even if that machine had some notion of a named pipe. Conversely, a machine that runs Internet networking software has no concept of machine names—it identifies computers by numbers instead, as we will see later on. How could such a machine translate a pipe name into a numeric address?

Sockets address this problem by encapsulating the address of a machine into a data structure that is defined by the naming convention of the underlying network software. The sockets API includes functions that (1) retrieve a machine identifier in whatever form the underlying network expects it, and (2) store the identifier within an opaque data structure. The application does not generally need to know the format of a machine address, although, as we will see later on, a machine may have to be configured in a special way so it can address other machines.
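The following sketch shows the idea with the TCP/IP address structure. It assumes the BSD-style headers of a POSIX system so that it can be compiled stand-alone; under Windows Sockets the same structures come from the Windows Sockets header. MakeInternetAddress and AddressFamilyOf are hypothetical helper names for this illustration.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <cstdint>
#include <cstring>

// Fill in a protocol-specific address (here: TCP/IP).
sockaddr_in MakeInternetAddress(std::uint32_t hostOrderAddress,
                                std::uint16_t hostOrderPort) {
    sockaddr_in address;
    std::memset(&address, 0, sizeof(address));
    address.sin_family = AF_INET;
    address.sin_addr.s_addr = htonl(hostOrderAddress);
    address.sin_port = htons(hostOrderPort);
    return address;
}

// Code that merely passes addresses around sees only the generic sockaddr
// type; the family tag tells the network software how to read the rest.
int AddressFamilyOf(const sockaddr* genericAddress) {
    return genericAddress->sa_family;
}
```

Calls such as connect and bind take the generic pointer plus a length, so the same application code can carry a TCP/IP address today and an address for another protocol family tomorrow.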

Sockets vs. Named Pipes

I mentioned earlier that sockets and named pipes are fairly similar in that both provide a way to view a network connection through open/close/read/write functions, like any other I/O stream. Thus, it makes sense to look at the similarities and differences between the two approaches.

Let us first look at the similarities between named pipes and sockets:

Both named pipes and sockets can be used to transfer data transparently between two processes on the same machine, or between processes on remote machines.
Both named pipes and sockets operate on the open/read/write/close paradigm. In Windows NT, both pipes and sockets are internally implemented as file-type objects; that is, you can transfer data over pipes or sockets using the ReadFile and WriteFile functions, and you can use a pipe or a socket as the destination or source for redirecting input from and output to console applications.
Both named pipes and sockets hide the underlying network architecture and protocol from the communication (although sockets allow a higher degree of control over the network protocol employed for a particular communication, as we will see later on).
Connections with both named pipes and sockets define a "server" end and a "client" end. In either case, the same server can service multiple clients.

The differences between named pipes and sockets are as follows:

By definition, sockets are bidirectional, whereas named pipes can be opened either bidirectionally or unidirectionally.
Sockets give you a much greater degree of control over details of the communication. In particular, named pipes don't have the flexibility to dynamically select a particular transport protocol; the operating system does this automatically through an arbitration phase at connect time.
Sockets were originally designed for use with the TCP/IP protocol, which addresses remote machines across network boundaries. Thus, it is possible to use sockets on top of TCP/IP to establish and maintain communications between machines that are not hooked up to the same physical LAN but can communicate via an internet. (I use the phrase "an internet" as opposed to "the Internet" to mean any assembly of networks connected with one another.)
However, you can also use sockets with non-TCP/IP network protocols that may not be able to address machines across network boundaries.
When a named pipe is created or opened, it is automatically bound to a location on the network, whereas a socket must be explicitly bound to an endpoint. (Don't worry, we will clarify these terms later on.)
Both named pipe and socket handles are shareable between processes. However, some socket properties are kept on a per-process basis so that different processes can open the same handle in different modes, whereas a named pipe will always behave the same for all processes that decide to share it.
A socket server application requires two calls to accept connection attempts from a client: one to indicate that the server is ready to accept a connection (listen), and one to establish the connection (accept). Named pipes require a single call (ConnectNamedPipe).
Named pipes come with built-in security under Windows NT.

In the sections below, we will look into each of these characteristics. Please see the last article in this series, "Plugs and Jacks: Network Interfaces Compared," for a comprehensive discussion of similarities and differences between named pipes and sockets.

Addressing Under TCP/IP

Addressing a remote machine using named pipes is easy, because a named pipe is identified by a server name and the name of the pipe on the server. To connect to a socket on a remote machine using TCP/IP, you must also be able to specify the server as well as the socket modifier (the "port"); however, TCP/IP addresses machines by an Internet address. The important thing to know about Internet addresses is that they are unique and assigned worldwide; that is, you can theoretically address any machine in the world by using the appropriate Internet address.

An Internet address is a 32-bit integer that has three parts: a variable-length header, the net identifier, and the machine identifier. The 32 bits of an Internet address may be distributed differently among the three parts on different machines because the Internet address can target several categories of networks. Keep in mind that a computer may potentially have more than one address because it can be hooked up to several networks at the same time, and that a physical move of the machine from one location to another may change its address. The references in the bibliography provide excellent introductions to the Internet addressing scheme.
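As an illustration of the three-part layout, the sketch below splits a 32-bit address according to the classic class A, B, and C categories, in which the leading bits (the variable-length header) decide how the remaining bits are divided between the net identifier and the machine identifier. AddressParts and SplitClassfulAddress are hypothetical names; the function expects the address in host byte order and does not attempt to split the other address classes.

```cpp
#include <cstdint>

struct AddressParts {
    char networkClass;      // 'A', 'B', 'C', or '?' for any other class
    std::uint32_t network;  // net identifier
    std::uint32_t host;     // machine identifier
};

AddressParts SplitClassfulAddress(std::uint32_t address) {
    if ((address & 0x80000000u) == 0)            // header 0: class A
        return {'A', (address >> 24) & 0x7Fu, address & 0x00FFFFFFu};
    if ((address & 0xC0000000u) == 0x80000000u)  // header 10: class B
        return {'B', (address >> 16) & 0x3FFFu, address & 0x0000FFFFu};
    if ((address & 0xE0000000u) == 0xC0000000u)  // header 110: class C
        return {'C', (address >> 8) & 0x1FFFFFu, address & 0x000000FFu};
    return {'?', 0, 0};                          // not split in this sketch
}
```

For example, the address 10.1.2.3 (0x0A010203) falls into class A with net identifier 10, whereas 192.168.1.7 (0xC0A80107) falls into class C and leaves only the last 8 bits for the machine identifier.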

To make matters worse, a machine may change its Internet address dynamically. This is part of the Dynamic Host Configuration Protocol (DHCP) specification that is incorporated in Windows NT 3.5 and will be available in future versions of the Microsoft Windows operating systems. This capability affects you as follows: If you know only the name of the machine that you wish to establish a connection with, you will need to convert the name to an Internet address. The sockets interface provides the gethostbyname function for this purpose. gethostbyname retrieves the address or addresses that correspond to a given computer name either by looking into a file that contains the name-to-address mappings or by requesting the address dynamically. (The mapping file used in the first method is generally called /etc/hosts on UNIX® systems. In Windows NT, the file resides in the %SYSTEMROOT%\system32\drivers\etc directory.) I will not describe the process here; for details on the address resolution procedure, please see Internetworking with TCP/IP (Comer 1991).
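The file-based half of the lookup is simple enough to sketch. The function below is only an illustration of the mapping-file idea and is not how gethostbyname is implemented; LookupInHostsText is a hypothetical name, the comparison is case-sensitive for brevity, and the sketch assumes the usual layout of one address per line followed by the machine name and optional aliases, with # starting a comment.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns the address mapped to hostName, or an empty string if none is found.
std::string LookupInHostsText(std::istream& hostsText,
                              const std::string& hostName) {
    std::string line;
    while (std::getline(hostsText, line)) {
        line = line.substr(0, line.find('#'));   // strip a trailing comment
        std::istringstream fields(line);
        std::string address, name;
        if (!(fields >> address)) continue;      // blank or comment-only line
        while (fields >> name)                   // official name, then aliases
            if (name == hostName) return address;
    }
    return std::string();
}
```

An application would normally leave all of this to gethostbyname, which may also consult a name service when the file has no matching entry.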

Creating C++ Encapsulations for Sockets

This section discusses the implementation of the CClientSocket/CServerSocket object class hierarchy as a derivative of CClientCommunication/CServerCommunication. Recall that CClientCommunication is derived from CFile and therefore implements the standard CFile member functions Open, Close, Duplicate, Read, and Write. In this section, I will describe how each of these member functions is implemented.

Read and Write can be implemented using the socket API functions recv and send—the syntax and semantics of these function pairs are almost identical. Note that it is possible to use ReadFile and WriteFile with sockets under Windows NT. Let us look at the implementation of Read and Write:

void CClientSocket::Write(const void FAR* pBuf, UINT iCount)
{
 if (m_sComm->Send((const char *)pBuf,iCount) == SOCKET_ERROR)
     WSAGetLastError();
};

UINT CClientSocket::Read(void FAR* lpBuf, UINT nCount)
{ 
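 int iResult;     // the receive call returns a signed count, or SOCKET_ERROR
 iResult = m_sComm->Receive((char *)lpBuf,nCount);
 if (iResult == 0 || iResult == SOCKET_ERROR)   // 0 means the other side closed the connection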
  {
   WSAGetLastError();
   return 0;
  };
 return iResult;
};

The most complicated part of the socket class is probably the code that opens a socket. The code below shows the steps required to open a socket as a client (an "active open").

BOOL CClientSocket::Open(const char* pszFileName, UINT nOpenFlags,
      CFileException* pError)
// currently we only support the READ and WRITE open flags...
{
 BOOL bReturn;
 if (!bAreWeInitialized) 
    return FALSE;
 m_theSocket = socket( AF_INET, SOCK_STREAM, 0);
 if (m_theSocket == INVALID_SOCKET)
 {
  WSAGetLastError();     // convenience for the debugger
  return FALSE;
 };
SOCKADDR_IN saTemp;
saTemp.sin_family = AF_INET;
switch (nOpenFlags)
{
 case modeRead:
  saTemp.sin_port = htons(WRITE_PORT);
  break;
 case modeWrite:
  saTemp.sin_port = htons(READ_PORT);
  break;
 default:     // assume bidirectional
  saTemp.sin_port = htons(BI_PORT);
};
if (!pszFileName)                      // NULL means we want to be the server
{
 closesocket(m_theSocket);             // don't leak the socket allocated above
 return FALSE;
}
// m_bWeAreServer = FALSE;
 m_phe = gethostbyname(pszFileName);
 if (m_phe == NULL)
 { 
  WSAGetLastError();
  bReturn = FALSE;
 }
 else
 {
  memcpy((char FAR *)&(saTemp.sin_addr), m_phe->h_addr,
      m_phe->h_length); 
  if (connect(m_theSocket,(PSOCKADDR)&saTemp,sizeof(saTemp)) == SOCKET_ERROR)
  { 
   WSAGetLastError();
   bReturn = FALSE;
  }
  else
  {
   m_bIsCommunicationEstablished = TRUE;
   m_CommSocket = m_theSocket;          
   bReturn = TRUE;
   };
  };
 if (!bReturn)
 { 
  closesocket(m_theSocket);
  return FALSE;
 }
  else  
  return TRUE; 
};

bAreWeInitialized is a global variable in the SOCKETS.CPP file that indicates whether the WSOCK32.DLL sockets library is initialized correctly. The constructor for the first socket object to be created will also call the WSAStartup function to initialize WSOCK32.DLL, and the last socket object to be destroyed will call WSACleanup to clean up the library. Note that there is a slight degree of sloppiness in the code caused by the application architecture of CommChat: In CommChat, we guarantee that all calls to the constructor and destructor of the CSocket class are submitted from the same thread; if that were not the case, we would need to find a way to shield the constructor calls against multithreaded access problems. (In particular, we would need to shield the iObjectCount variable from being incremented and decremented incorrectly.) WSAStartup should be called once per process. An alternative to calling WSAStartup in the constructor/destructor would be to call it in the initialization of the CWinApp derivative of your application. (This is the solution that MFC adopts for applications that support CSockets.)
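If the constructor and destructor could be called from several threads, the object count would need a lock around it. The following sketch shows one way to do that with standard C++ facilities. It is not the CommChat code: SocketLibraryUser is a hypothetical class, and PlatformStartup and PlatformCleanup are hypothetical stand-ins for WSAStartup and WSACleanup that merely count their calls so the sketch can run anywhere.

```cpp
#include <mutex>

// Hypothetical stand-ins for WSAStartup / WSACleanup.
static int g_startupCalls = 0;
static int g_cleanupCalls = 0;
static bool PlatformStartup() { ++g_startupCalls; return true; }
static void PlatformCleanup() { ++g_cleanupCalls; }

class SocketLibraryUser {
public:
    SocketLibraryUser() {
        std::lock_guard<std::mutex> lock(s_mutex);
        if (s_objectCount++ == 0)              // first object: initialize
            s_initialized = PlatformStartup();
    }
    ~SocketLibraryUser() {
        std::lock_guard<std::mutex> lock(s_mutex);
        if (--s_objectCount == 0 && s_initialized) {  // last object: clean up
            PlatformCleanup();
            s_initialized = false;
        }
    }
    SocketLibraryUser(const SocketLibraryUser&) = delete;
    SocketLibraryUser& operator=(const SocketLibraryUser&) = delete;

    static bool IsInitialized() {
        std::lock_guard<std::mutex> lock(s_mutex);
        return s_initialized;
    }

private:
    static std::mutex s_mutex;     // guards the two variables below
    static int s_objectCount;
    static bool s_initialized;
};

std::mutex SocketLibraryUser::s_mutex;
int SocketLibraryUser::s_objectCount = 0;
bool SocketLibraryUser::s_initialized = false;
```

However many objects are alive at once, the startup routine runs when the first one is constructed and the cleanup routine runs when the last one is destroyed.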

A socket is generated by a call to the socket function. By specifying the first parameter as AF_INET, we ask for a socket that accesses the TCP/IP protocol; other parameters would request sockets over other protocols. The best way to check which protocols are supported is to first check the sockets header files to see which constants are provided and then check the appropriate protocol and the SOCKADDR_xxx structures (where xxx stands for the supported protocol) to see what a target address should look like.

However, a socket that is allocated with the socket call is like a power outlet with neither power nor a consuming device. To establish a communication, a socket must be associated with an address and a port—that is, it must be "bound." On the client side, a communication is established via the connect call. Remember that the active open is identified by a non-null pszFileName parameter; the code allocates a variable of type SOCKADDR_IN (the IN stands for Internet), whose Internet address member is filled with the address portion returned by gethostbyname on the target machine name, and the port is filled with the hardcoded port number READ_PORT or WRITE_PORT, depending on whether the socket object was opened for read or write access.

Note the use of the htons ("host to network short") function, which converts the 16-bit port number from the byte order of the local processor to network byte order, in which the most significant byte comes first. Different processors store numeric values differently in memory, and htons ensures that the port number has the same representation on the network no matter which machine sent it.
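What the conversion amounts to can be shown without any socket library at all. The two helpers below are hypothetical names for this illustration; they place the most significant byte of a 16-bit port number first, which is the order in which the value travels over the network, regardless of how the local processor stores integers.

```cpp
#include <cstdint>

// Writes a 16-bit value with the most significant byte first.
void StorePortInNetworkOrder(std::uint16_t port, unsigned char out[2]) {
    out[0] = static_cast<unsigned char>(port >> 8);
    out[1] = static_cast<unsigned char>(port & 0xFF);
}

// Reads the value back, again independent of the local processor.
std::uint16_t LoadPortFromNetworkOrder(const unsigned char in[2]) {
    return static_cast<std::uint16_t>((in[0] << 8) | in[1]);
}
```

For the port number 0x1234, the first byte stored is 0x12 and the second is 0x34 on every machine.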

Some comments are in order here. READ_PORT and WRITE_PORT are hardcoded on the server side. We could allow the system to assign unique port numbers for new connections so we would not have to hardcode the ports, but at least one known port must exist between the client and server. If no port exists, the client and server cannot tell each other the numbers of the dynamically allocated ports—we need one "seed" connection at the least. (The FTP connection works exactly like that: The client and server use one well-known FTP port to inform each other of dynamically allocated ports.)

In the future, protocols may be available that allow an application to dynamically query ports on a given host machine, similar to the way DHCP implements dynamic assignments and queries of IP addresses.

If you do not wish to hardcode the port numbers into the application, your code can add the port numbers it uses to the SERVICES file (under Windows NT, that file typically resides in the %SYSTEMROOT%\system32\drivers\etc directory) and use the getservbyname function to retrieve the port number at run time.

Also, we said earlier that a connection under TCP/IP consists of four parts: The Internet addresses of the two communicating machines and the respective ports on both sides. However, the client submits the connect call after the socket call, and only three known parts exist at that point: the Internet address of the client, the Internet address of the server (as obtained by the gethostbyname call), and the hardcoded port on the server side. What about the client-side port? We never assigned a port to the connection explicitly, did we?

No, we didn't, and the reason is very simple: The connect call implicitly assigns a port on the client side. Note that the client can explicitly assign itself a port number using the bind function, as we will discuss later on. If the client, after having successfully connected to a server, needs to learn the port number it was assigned by the system, it can call getsockname to retrieve this information.
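The implicit assignment is easy to observe. The sketch below is written against the BSD-style calls of a POSIX system so that it can be run stand-alone; under Windows Sockets the same sequence applies after WSAStartup, with closesocket in place of close. ImplicitClientPort is a hypothetical function name, and a throwaway listener on the loopback address (with a system-chosen port, unlike CommChat's hardcoded ports) stands in for the server.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Returns the local port the system assigned to the client during connect,
// or 0 if any step fails.
unsigned short ImplicitClientPort() {
    unsigned short result = 0;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);

    sockaddr_in server;
    std::memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = 0;                  // let the system pick the server port
    socklen_t serverLength = sizeof(server);

    if (listener >= 0 && client >= 0 &&
        bind(listener, reinterpret_cast<sockaddr*>(&server), sizeof(server)) == 0 &&
        listen(listener, 1) == 0 &&
        // Learn which port the listener actually got.
        getsockname(listener, reinterpret_cast<sockaddr*>(&server), &serverLength) == 0 &&
        // The client never calls bind; connect assigns its port implicitly.
        connect(client, reinterpret_cast<sockaddr*>(&server), sizeof(server)) == 0) {
        sockaddr_in local;
        socklen_t localLength = sizeof(local);
        if (getsockname(client, reinterpret_cast<sockaddr*>(&local), &localLength) == 0)
            result = ntohs(local.sin_port);
    }
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return result;
}
```

The value returned differs from run to run, which is exactly the point: the client does not care which port it uses as long as the server's port is known.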

The code to open a socket as a server (also called a "passive open") is shown below.

BOOL CServerSocket::Open(const char* pszFileName, UINT nOpenFlags,
      CFileException* pError)
// currently we only support the READ and WRITE open flags...
{
 BOOL bReturn;
 if (!bAreWeInitialized) 
    return FALSE;
 m_theSocket = socket( AF_INET, SOCK_STREAM, 0);
 if (m_theSocket == INVALID_SOCKET)
 {
  WSAGetLastError();     // convenience for the debugger
  return FALSE;
 };
SOCKADDR_IN saTemp;
saTemp.sin_family = AF_INET;
switch (nOpenFlags)
{
 case modeRead:
  saTemp.sin_port = htons(READ_PORT);
  break;
 case modeWrite:
  saTemp.sin_port = htons(WRITE_PORT);
  break;
 default:     // assume bidirectional
  saTemp.sin_port = htons(BI_PORT);
};


if (pszFileName)         // NULL means we want to be the server
{
 closesocket(m_theSocket);   // don't leak the socket allocated above
 return FALSE;
}
// m_bWeAreServer = TRUE;
 saTemp.sin_addr.s_addr= INADDR_ANY;     // bind to any address, for multihomed machines
 if (bind(m_theSocket,(PSOCKADDR)&saTemp, sizeof(saTemp)) == SOCKET_ERROR)
 {
  WSAGetLastError();
  bReturn = FALSE;
 }
 else
 if (listen(m_theSocket, MAX_PENDING_CONNECTS ) == SOCKET_ERROR)
 {
  WSAGetLastError();
  bReturn = FALSE;
 }
 else
 {
  m_bIsCommunicationEstablished = FALSE;
  bReturn = TRUE;
 };
 if (!bReturn)
 { 
  closesocket(m_theSocket);
  return FALSE;
 }
  else  
  return TRUE; 
};

After a successful socket call, the code does a bind. This call establishes what is called "half" a connection: After bind, the socket knows the Internet address and the port on the server side, so a client's call to connect can now establish the other half of the connection, as we discussed earlier. Think of the socket as a power outlet that now has power supply from the inside; the only remaining task is to plug in a consuming device. (Ever wonder why the icon for the socket help file shows a power outlet? Now you know . . .)

To bind the socket to the local machine, the code binds to the "wildcard" address INADDR_ANY, which means that any of a machine's Internet addresses can be used for incoming connections. Recall that on the Internet, it is possible for a machine to have more than one address. If a socket needs to be bound to a specific address, it can first submit a gethostname call (which returns the name of the local machine) and then pass the returned name to gethostbyname to obtain the Internet address, exactly as the active open code does for the server name.

Note that when bind is called, the port number is defined as READ_PORT for the port that the server will read from, and WRITE_PORT for the port that the server will write to. Once a client connects, it will flip the port numbers based on whether the client sockets are opened with the read or write option. (This is analogous to the named pipe code in which a client that wants to read from a pipe must attach to the server's WRITE pipe.)
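The flip can be summarized in a pair of small functions that mirror the switch statements in the two Open members. The port values below are hypothetical (this article does not state the actual numbers), as are the function names and the modeReadWrite enumerator; only the mapping itself is taken from the code above.

```cpp
// Open modes, named after the CFile flags used in the Open members.
enum OpenMode { modeRead, modeWrite, modeReadWrite };

// Hypothetical values chosen for this sketch only.
const unsigned short READ_PORT  = 5001;
const unsigned short WRITE_PORT = 5002;
const unsigned short BI_PORT    = 5003;

// Server side: the mode names the direction as the server sees it.
unsigned short ServerPortFor(OpenMode mode) {
    switch (mode) {
        case modeRead:  return READ_PORT;
        case modeWrite: return WRITE_PORT;
        default:        return BI_PORT;   // assume bidirectional
    }
}

// Client side: a client that reads attaches to the port the server writes to.
unsigned short ClientPortFor(OpenMode mode) {
    switch (mode) {
        case modeRead:  return WRITE_PORT;
        case modeWrite: return READ_PORT;
        default:        return BI_PORT;   // assume bidirectional
    }
}
```

A reading client and a writing server therefore always meet on the same port, and vice versa.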

The one thing that looks odd here is that the server must also submit a call to listen before it can start to accept network connections. listen establishes a queue that is associated with the server end of a socket; whenever a connection is accepted on the socket, an entry is removed from the queue. This way, a client can try to connect to the server before the server is ready to accept a connection: A client can try to connect as soon as the server calls listen, but only an accept call on the server's side will establish the connection.

This approach is similar to the multiple-instance concept of named pipes: Several clients can connect to the same server end of a pipe, and the server will create a new instance for each client that connects to the server.

The last piece of the puzzle is the accept call, which is the socket's implementation of AwaitCommunicationAttempt:

BOOL CServerSocket::AwaitCommunicationAttempt(void)
{ 
 int iAddrLen = sizeof(m_sa);   // accept fills in m_sa only if it is told the buffer size
 m_CommSocket = accept(m_theSocket,&m_sa,&iAddrLen);  // Block until a client connects.
 if (m_CommSocket == INVALID_SOCKET)
 {
  WSAGetLastError();
  return FALSE;
 };
 m_bIsCommunicationEstablished = TRUE;
 return TRUE;
};

The accept call blocks until a client connects to the server (if a client has previously submitted a connect call after the server submitted its listen call, accept returns immediately, servicing the pending connect), at which time it will fill the m_sa address with the client data of the connection (that is, the client's port and Internet address). This is where a connection is established, and the send and recv calls can be used to transmit data between the two ends.
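The interplay of listen, connect, and accept can be tried out on a single machine. The sketch below uses the BSD-style calls of a POSIX system so that it runs stand-alone; under Windows Sockets the same sequence applies after WSAStartup, with closesocket in place of close. PendingConnectIsAccepted is a hypothetical function name, and the loopback address with a system-chosen port is an assumption made so that the sketch needs no second machine.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cstring>

// Returns true if a client that connected before accept was called is picked
// up by accept, and accept reports that client's port.
bool PendingConnectIsAccepted() {
    bool ok = false;
    int accepted = -1;
    int listener = socket(AF_INET, SOCK_STREAM, 0);
    int client = socket(AF_INET, SOCK_STREAM, 0);

    sockaddr_in server;
    std::memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = 0;                  // system-chosen port for the sketch
    socklen_t serverLength = sizeof(server);

    if (listener >= 0 && client >= 0 &&
        bind(listener, reinterpret_cast<sockaddr*>(&server), sizeof(server)) == 0 &&
        listen(listener, 5) == 0 &&       // from here on, connects are queued
        getsockname(listener, reinterpret_cast<sockaddr*>(&server), &serverLength) == 0 &&
        connect(client, reinterpret_cast<sockaddr*>(&server), sizeof(server)) == 0) {
        // The connect is pending in the queue; accept services it at once.
        sockaddr_in peer;
        socklen_t peerLength = sizeof(peer);  // must hold the buffer size on entry
        accepted = accept(listener, reinterpret_cast<sockaddr*>(&peer), &peerLength);

        sockaddr_in local;
        socklen_t localLength = sizeof(local);
        if (accepted >= 0 &&
            getsockname(client, reinterpret_cast<sockaddr*>(&local), &localLength) == 0)
            ok = (peer.sin_port == local.sin_port);
    }
    if (accepted >= 0) close(accepted);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return ok;
}
```

Note that the length argument passed to accept must contain the size of the address buffer on entry; without it, the call cannot report the client's address.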

Interestingly enough, the original socket specification does not contain a call to cancel a pending socket command; thus, to implement the CancelCommunicationAttempt member, I had to revert to the Windows-specific extensions. CancelCommunicationAttempt is implemented by means of the WSACancelBlockingCall function. Note that WSACancelBlockingCall does not expect any parameters, so it is not possible to tell the function which operation to cancel. (This is different from NetBIOS and named pipe communications, where a cancel call always references a particular pending operation or object.) Thus, if your application has more than one active socket with outstanding operations, you might be in for a surprise if you submit a CancelCommunicationAttempt call on one of those objects. Note that the MFC-provided socket class solves this problem by managing the blocking/unblocking logic itself.

Modularity Is Both a Blessing and a Curse

Initially, I was very proud of the fact that CommChat required only a few changes to support both named pipes and sockets: I added the SOCKETS.CPP file to the project and made a few changes in COMMCHAT.CPP to allow for differences between named pipes and sockets. This meant that both the functionality of CommChat and the interface between CCommunication objects and other application modules were generic enough to work under multiple network communication strategies and protocols.

However, it turns out that sockets can be much more powerful than named pipes, simply because they were designed to work on heterogeneous networks that may offer powerful built-in services. A stand-alone sockets version of CommChat would have been easier to develop because it could have taken advantage of the socket API functions and of built-in operating system services that use sockets.

To understand the conveniences of the sockets API set, consider CommChat's chat functionality. A client must establish a connection with a server machine to determine whether that server is available in the first place. If a connection has been successfully established, the client sends its name to the server and waits for an acknowledgment. The server then displays the name of the client in a message box and asks the user if a connection is to be established. Depending on whether the user clicks OK or CANCEL, a positive or negative acknowledgment is returned to the client. A negative acknowledgment terminates the connection, whereas a positive one initiates the "real" conversation.

A version of CommChat tailored specifically to sockets would not require a client to send its name to the server because the server can use the getpeername function to retrieve the address of the machine on the other end of the connection and, if a name is needed, the gethostbyaddr function to map that address to a host name.
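As a rough illustration of the getpeername approach, here is a minimal sketch that is not part of CommChat. It assumes TCP over the loopback interface; PeerAddress, PeerOfLoopbackClient, and SockLen are hypothetical names made up for the example, and the #ifdef block is only there so that the code builds against both Winsock and BSD sockets.

```cpp
// Compatibility shim (an assumption of this sketch, not part of CommChat).
#ifdef _WIN32
#define _WINSOCK_DEPRECATED_NO_WARNINGS   // allow inet_ntoa with newer SDKs
#include <winsock2.h>                     // link with ws2_32.lib
typedef int SockLen;
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;
typedef socklen_t SockLen;
#define INVALID_SOCKET (-1)
#define closesocket close
#endif
#include <cassert>
#include <cstring>
#include <string>

// Returns the Internet address of the machine on the other end of a
// connected socket in dotted notation, or an empty string on failure.
std::string PeerAddress(SOCKET s)
{
    assert(s != INVALID_SOCKET);          // caller must pass a connected socket
    sockaddr_in peer;
    std::memset(&peer, 0, sizeof(peer));
    SockLen len = sizeof(peer);
    if (getpeername(s, (sockaddr*)&peer, &len) != 0) return "";
    return inet_ntoa(peer.sin_addr);
}

// Hypothetical demo: connects a client to a listener on the loopback
// interface and asks the server end who is on the other side.
std::string PeerOfLoopbackClient()
{
#ifdef _WIN32
    WSADATA wsa;                          // a real program pairs this with WSACleanup
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return "";
#endif
    SOCKET listener = socket(AF_INET, SOCK_STREAM, 0);
    SOCKET client = socket(AF_INET, SOCK_STREAM, 0);
    std::string result;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                    // 0 = let the system pick a port
    SockLen len = sizeof(addr);

    if (listener != INVALID_SOCKET && client != INVALID_SOCKET &&
        bind(listener, (sockaddr*)&addr, sizeof(addr)) == 0 &&
        listen(listener, 1) == 0 &&
        getsockname(listener, (sockaddr*)&addr, &len) == 0 &&  // learn the port
        connect(client, (sockaddr*)&addr, sizeof(addr)) == 0)
    {
        SOCKET accepted = accept(listener, 0, 0);
        if (accepted != INVALID_SOCKET)
        {
            result = PeerAddress(accepted);   // no name exchange needed
            closesocket(accepted);
        }
    }
    if (client != INVALID_SOCKET) closesocket(client);
    if (listener != INVALID_SOCKET) closesocket(listener);
    return result;
}
```

In this setup the client is on the same machine, so PeerOfLoopbackClient() yields the loopback address 127.0.0.1. A server that wants a machine name instead of an address would pass the returned address on to gethostbyaddr.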

Moreover, the file-transfer work would have been minimal in a sockets-only version of CommChat because machines that have TCP/IP installed generally also run an FTP (File Transfer Protocol) server, which is a service that is dedicated to, well, transferring files. To understand how that works, let us go back to the earlier discussion on ports.

A port roughly corresponds to the pipe name of a named pipe and is one part of the information that a machine needs to know to establish a connection with another machine. Well, that is the short story. A port is actually much more and can mean different things, depending on the underlying protocol (see Internetworking with TCP/IP by Douglas E. Comer for details). A number of well-known ports provide predefined functionalities; for example, the TELNET service (TCP port 23) provides a complete terminal emulation facility, and the FTP service (TCP port 21) contains a full-blown file transfer system. (So much for all the work I put into the file transfer option of CommChat.)
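Well-known ports do not have to be hard-coded; the sockets API can look them up by service name through getservbyname. The sketch below is only an illustration under stated assumptions: WellKnownTcpPort is a hypothetical helper, and the small fallback table for "ftp" and "telnet" is an addition of this sketch for machines whose services database is missing.

```cpp
#ifdef _WIN32
#include <winsock2.h>                 // link with ws2_32.lib
#else
#include <arpa/inet.h>                // ntohs
#include <netdb.h>                    // getservbyname
#endif
#include <cassert>
#include <cstring>

// Returns the TCP port (host byte order) registered for a service name,
// or 0 if the service is unknown.
unsigned short WellKnownTcpPort(const char* service)
{
    assert(service != 0);             // caller must pass a service name
#ifdef _WIN32
    WSADATA wsa;                      // a real program pairs this with WSACleanup
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return 0;
#endif
    servent* entry = getservbyname(service, "tcp");
    if (entry != 0)
        return ntohs((unsigned short)entry->s_port);  // s_port is in network order

    // Fallback (assumption of this sketch): the standard assignments.
    if (std::strcmp(service, "ftp") == 0) return 21;
    if (std::strcmp(service, "telnet") == 0) return 23;
    return 0;
}
```

A client that wants to talk to the FTP service on another machine would place the result of WellKnownTcpPort("ftp"), converted back with htons, into the sin_port field of the address that it passes to connect.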

Socket Options

As with named pipes, you can set a number of options to influence the behavior of sockets. The main difference between named pipes and sockets is that most of the named pipe options must be specified when the pipe is created (although several options can be set with the SetNamedPipeHandleState function), whereas socket options are assigned dynamically with the setsockopt function and can be queried with the getsockopt function.

Another important difference is that options can be set on different levels for sockets: Some options affect the behavior of sockets regardless of the protocol, and other options are specific to the underlying protocol.
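To make the notion of option levels concrete, here is a minimal sketch (not taken from CommChat) that sets one option at the socket level and one at the TCP level on the same socket. EnableOption, SetOptionsOnBothLevels, and SockLen are hypothetical names, and the choice of SO_KEEPALIVE and TCP_NODELAY as examples is an assumption of this sketch.

```cpp
// Compatibility shim (an assumption of this sketch, not part of CommChat).
#ifdef _WIN32
#include <winsock2.h>                 // link with ws2_32.lib
typedef int SockLen;
#else
#include <netinet/in.h>
#include <netinet/tcp.h>              // TCP_NODELAY
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;
typedef socklen_t SockLen;
#define INVALID_SOCKET (-1)
#define closesocket close
#endif
#include <cassert>

// Turns a Boolean option on at the given level and reads it back.
// Returns true if the option is reported as enabled afterward.
bool EnableOption(SOCKET s, int level, int option)
{
    assert(s != INVALID_SOCKET);      // caller must pass an open socket
    int on = 1;
    if (setsockopt(s, level, option, (const char*)&on, sizeof(on)) != 0)
        return false;

    int value = 0;
    SockLen len = sizeof(value);
    if (getsockopt(s, level, option, (char*)&value, &len) != 0)
        return false;
    return value != 0;
}

// Sets one protocol-independent option and one TCP-specific option.
bool SetOptionsOnBothLevels()
{
#ifdef _WIN32
    WSADATA wsa;                      // a real program pairs this with WSACleanup
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return false;
#endif
    SOCKET s = socket(AF_INET, SOCK_STREAM, 0);
    if (s == INVALID_SOCKET) return false;

    bool keepAlive = EnableOption(s, SOL_SOCKET, SO_KEEPALIVE);   // socket level
    bool noDelay   = EnableOption(s, IPPROTO_TCP, TCP_NODELAY);   // protocol level
    closesocket(s);
    return keepAlive && noDelay;
}
```

Note that both options are set after the socket has been created, which is the contrast to named pipes drawn above: nothing had to be decided at creation time.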

See the Windows Sockets 1.1 specification (MSDN Library, Specifications) for documentation on available socket options.

MFC Sockets

MFC provides built-in support for asynchronous and synchronous sockets through the CAsyncSocket/CSocket class hierarchy. Before I go into details about the sockets class I provide, I would like to introduce MFC's class hierarchy and explain how it relates to what I am doing.

The class hierarchy I provide (CClientCommunication and its derivatives) is an abstraction of communication objects in terms of CFile; that is, I provide objects that hide the details of the network communication from the application. All the application sees is a pluggable communication object that can be used like a file.

On the other hand, the CAsyncSocket class is, in essence, a thin wrapper around the Win32® Windows sockets API. The set of member functions that the CAsyncSocket class supports corresponds to the set of functions that the Windows sockets API defines on sockets: GetSockOpt, SetSockOpt, Accept, Bind, Connect, Listen, Receive, Send, and so on. The documentation for CAsyncSocket states that this class is "based on the assumption that you understand network communications"; thus, to work with CAsyncSocket, you will need to understand the concepts covered in this article.

To work with the Windows sockets class hierarchy, you will probably derive your own classes from either CAsyncSocket or CSocket (we will discuss these classes in a moment). You might want to study the CHATTER and CHATSRVR sample applications that accompany MFC for an example of how to use the classes. Another way to use the classes is to define your own class hierarchy and use private member variables of type CAsyncSocket or CSocket to provide access to communication objects.

In this article, I will not work with the MFC classes (for reasons that I will explain later), but you will notice a number of similarities between what I discuss and the CAsyncSocket/CSocket class hierarchy.

The CSocket class provided by MFC is similar to CAsyncSocket, with two exceptions: First, objects derived from CAsyncSocket execute asynchronously, whereas objects derived from CSocket execute synchronously. Second, objects derived from CSocket can be associated with an archive object, which simplifies I/O: Instead of calling the Send and Receive members on a synchronous socket, an application can use the << and >> operators on the associated archive object.

Let me try to clarify the distinction between synchronous and asynchronous operations: When performed on derivatives of CSocket, operations that pertain to communications (such as Accept, Receive, and Send) do not return until the operation is completed; when submitted on derivatives of CAsyncSocket, those operations return immediately. In the latter case, the completion of the pending I/O operation is communicated to the calling thread via a Windows message that can be trapped in the socket object by overriding one or more of the OnAccept, OnClose, OnConnect, OnOutOfBandData, OnReceive, and OnSend notification members.
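The "returns immediately" half of that distinction can be observed without MFC. The following minimal sketch is an analogy only: it uses the plain non-blocking mode of the sockets API rather than MFC's message-based notifications, and the names SetNonBlocking, LastCallWouldBlock, and AcceptReturnsImmediately are hypothetical. On a blocking socket, the same accept call would simply not return until a client connected.

```cpp
// Compatibility shim (an assumption of this sketch, not part of CommChat).
#ifdef _WIN32
#include <winsock2.h>                 // link with ws2_32.lib
#else
#include <arpa/inet.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <cerrno>
typedef int SOCKET;
#define INVALID_SOCKET (-1)
#define closesocket close
#endif
#include <cassert>
#include <cstring>

// Switches a socket to non-blocking mode.
bool SetNonBlocking(SOCKET s)
{
    assert(s != INVALID_SOCKET);      // caller must pass an open socket
#ifdef _WIN32
    u_long on = 1;
    return ioctlsocket(s, FIONBIO, &on) == 0;
#else
    int flags = fcntl(s, F_GETFL, 0);
    return flags != -1 && fcntl(s, F_SETFL, flags | O_NONBLOCK) == 0;
#endif
}

// True if the most recent socket call failed only because it would block.
static bool LastCallWouldBlock()
{
#ifdef _WIN32
    return WSAGetLastError() == WSAEWOULDBLOCK;
#else
    return errno == EWOULDBLOCK || errno == EAGAIN;
#endif
}

// Returns true if accept on an idle, non-blocking listener came back at
// once with "would block" instead of waiting for a client.
bool AcceptReturnsImmediately()
{
#ifdef _WIN32
    WSADATA wsa;                      // a real program pairs this with WSACleanup
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return false;
#endif
    SOCKET listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == INVALID_SOCKET) return false;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                // 0 = let the system pick a port

    bool ok = false;
    if (bind(listener, (sockaddr*)&addr, sizeof(addr)) == 0 &&
        listen(listener, 1) == 0 &&
        SetNonBlocking(listener))
    {
        SOCKET s = accept(listener, 0, 0);      // nobody is connecting
        ok = (s == INVALID_SOCKET) && LastCallWouldBlock();
        if (s != INVALID_SOCKET) closesocket(s);
    }
    closesocket(listener);
    return ok;
}
```

With plain sockets the application has to find out by itself when the operation can complete; the notification members listed above are what spare an MFC application that work.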

Note that although the Windows sockets specification (upon which the MFC socket library hierarchy is built) defines both synchronous and asynchronous sockets, the CAsyncSocket and CSocket classes are implemented using the asynchronous socket extensions. Thus, even if you derive a class from CSocket (which provides synchronous communications), the socket object that is used internally will still be an asynchronous socket. You will find the details of the implementation in the SOCKCORE.CPP file that comes with the MFC sources. Basically, after submitting the asynchronous request, the CSocket object enters a polling loop that terminates as soon as the asynchronous request has completed. The advantage of this particular design is that it gives the MFC library more control over when a pending operation can be terminated (for example, a cancel operation is rather easy to implement this way). However, on the downside, synchronous and asynchronous communications now have to worry about issues that are fairly trivial with "native" synchronous sockets, as we will see in the following section.
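The general idea of building a blocking call from a polling loop, and why that makes canceling easy, can be sketched with plain sockets. This is not the code in SOCKCORE.CPP: the sketch below swaps in a select call with a short time-out where MFC works with Windows messages, and AcceptWithCancel, DemoCancelableAccept, SockLen, and the 50-millisecond slice are assumptions made for the illustration.

```cpp
// Compatibility shim (an assumption of this sketch, not part of CommChat).
#ifdef _WIN32
#include <winsock2.h>                 // link with ws2_32.lib
typedef int SockLen;
#else
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>
typedef int SOCKET;
typedef socklen_t SockLen;
#define INVALID_SOCKET (-1)
#define closesocket close
#endif
#include <atomic>
#include <cassert>
#include <cstring>

// Looks like a blocking accept to the caller, but polls internally so that
// setting the flag cancels the wait on this one socket only.
SOCKET AcceptWithCancel(SOCKET listener, const std::atomic<bool>& cancel)
{
    assert(listener != INVALID_SOCKET);         // caller must pass a listener
    while (!cancel.load())
    {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(listener, &readable);
        timeval slice = { 0, 50000 };           // poll in 50 ms slices
        int ready = select((int)listener + 1, &readable, 0, 0, &slice);
        if (ready < 0) return INVALID_SOCKET;   // select failed
        if (ready > 0) return accept(listener, 0, 0);
    }
    return INVALID_SOCKET;                      // canceled
}

// Hypothetical demo: a canceled wait returns at once; a wait with a
// pending client returns the connection.
bool DemoCancelableAccept()
{
#ifdef _WIN32
    WSADATA wsa;                                // a real program pairs this with WSACleanup
    if (WSAStartup(MAKEWORD(1, 1), &wsa) != 0) return false;
#endif
    SOCKET listener = socket(AF_INET, SOCK_STREAM, 0);
    SOCKET client = socket(AF_INET, SOCK_STREAM, 0);
    bool ok = false;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                          // 0 = let the system pick a port
    SockLen len = sizeof(addr);

    if (listener != INVALID_SOCKET && client != INVALID_SOCKET &&
        bind(listener, (sockaddr*)&addr, sizeof(addr)) == 0 &&
        listen(listener, 1) == 0 &&
        getsockname(listener, (sockaddr*)&addr, &len) == 0)
    {
        std::atomic<bool> cancel(true);         // cancel requested up front
        bool canceled = (AcceptWithCancel(listener, cancel) == INVALID_SOCKET);

        cancel = false;
        SOCKET accepted = INVALID_SOCKET;
        if (connect(client, (sockaddr*)&addr, sizeof(addr)) == 0)
            accepted = AcceptWithCancel(listener, cancel);

        ok = canceled && accepted != INVALID_SOCKET;
        if (accepted != INVALID_SOCKET) closesocket(accepted);
    }
    if (client != INVALID_SOCKET) closesocket(client);
    if (listener != INVALID_SOCKET) closesocket(listener);
    return ok;
}
```

Because every waiting call checks its own flag, a cancel request always refers to one particular operation, which is exactly what the parameterless WSACancelBlockingCall cannot express.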

CSocket Objects and Multithreading

I worked for a long time on an implementation of my socket class hierarchy that uses CAsyncSocket and its derivatives. Given that the member functions of the CAsyncSocket class closely resemble the operations that are defined on sockets, converting the class from "bare" sockets to CSocket-based objects was rather easy. However, debugging the class turned out to be a major nightmare, for one simple reason: CommChat (the application I wrote to test the classes) is a multithreaded application, and it is not easy to make CAsyncSocket-based objects work with multithreaded applications. Here's why:

First of all, as explained in MFC Technical Note 37, "Multithreaded MFC 2.1 Applications" (see Technical Articles, Visual C++ 2.0 [32-bit] Articles, MFC 3.1 Technical Notes in the MSDN Library), an MFC object that will cross thread boundaries must be used with MFC threads. In CommChat, I simply used ::CreateThread to dispatch my background threads. Because those threads only used non-MFC objects (that is, sockets with no attachments to the MFC library), I did not have to worry about MFC compatibility. However, an MFC object is rather closely linked to the thread in which the object is supposed to be used. Thus, the first change I had to make was to use AfxBeginThread instead of CreateThread, because AfxBeginThread sets up some of the thread-specific data structures that CSocket objects use. This is a one-line rewrite, so it's no big deal.

The next thing I stumbled into was much tougher. In CommChat, communication objects are passed back and forth between threads rather liberally, and several threads take turns using the objects. There is no concurrent access to the communication objects in CommChat. As soon as a thread is dispatched, it is clear which objects are accessed by which thread until the thread terminates. However, in spite of this rather straightforward design, I had to do a major rewrite of CommChat to enable the CSocket-based class hierarchy to work, for the following reason: Before thread A can use a CSocket object that was previously in use by thread B, the object must be detached from B (a task that must be performed by B itself) and attached to A, after A has begun executing. In other words, the application must explicitly keep track of which thread uses which objects at any given time, and must detach and attach the object appropriately. Alternatively, one socket could be "shared" by several threads. To implement this strategy, you could allocate a new object of your CAsyncSocket-derived class for every thread that shares the socket, connect the socket from one thread, and pass the m_hSocket member of the object that connected the socket to the Attach member of the objects that belong to the other threads.

This awareness of multithreading makes the CSocket object a little less generic than non-MFC sockets, named pipes, or NetBIOS communications. If you make a decision in favor of MFC socket objects, I recommend that you use the asynchronous CAsyncSocket object type.

Summary

Both named pipes and sockets provide communication abstractions that can be easily incorporated into the generic CCommunication data type because they both follow the open/read/write/close paradigm.

In "Plugs and Jacks: Network Interfaces Compared," I will elaborate on the differences between the interfaces, but before you read that article, you should read "Aristocratic Communication: NetBIOS," which is the next article in this series on network programming interfaces.

Bibliography

Allard, J., Keith Moore, and David Treadwell. "Plug into Serious Network Programming with the Windows Sockets API." Microsoft Systems Journal 8 (July 1993): 35-50. (MSDN Library Archive Edition, Books and Periodicals)

Comer, Douglas E. Internetworking with TCP/IP. Vols. 1 and 2, 2d ed. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1991.

Stevens, W. Richard. TCP/IP Illustrated. Vol. 1, The Protocols. Reading, Mass.: Addison-Wesley, 1994.

Stevens, W. Richard. Unix Network Programming. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1990.