Architecture

How the Web Proxy Service Works
How the WinSock Proxy Service Works

How the Web Proxy Service Works

The CERN-Proxy Protocol
The Microsoft Proxy Server Web Proxy service
Caching Mechanisms

The CERN-Proxy Protocol

What is CERN-Proxy?
How HTTP Works
How the GET Method Works
Examples of GET Usage with Proxy Service

The CERN-proxy protocol is widely recognized throughout the World Wide Web (WWW) community as the standard for implementing proxy services in TCP/IP-based networks. It has its origins within the standards for Hypertext Transfer Protocol (HTTP) as a UNIX-based service first developed by members of Switzerland’s Conseil Européen pour la Recherche Nucléaire (European Laboratory for Particle Physics, or CERN) in the early 1990s.

As the CERN staff added application-aware proxy support for Hypertext Transfer Protocol daemon (HTTPd) servers, commonly known as Web servers, to their libraries, the WWW community built on these additions. Since it was first introduced, the CERN-proxy protocol has become an accepted industry standard for implementing HTTP proxy service. The Microsoft Proxy Server Web Proxy service is fully compatible with the CERN-proxy standard.

To better understand how CERN-proxy works, it is important to understand the differences between how most Internet applications, such as File Transfer Protocol (FTP) and Gopher, work and how HTTP applications work.

Standard TCP/IP applications such as FTP use TCP (or in some cases, User Datagram Protocol, or UDP) as the transport-level protocol for supporting client/server communications. For HTTP-based applications, a set of commands (called methods) is defined for use in Web-based client/server communications. While CERN-compatible proxy services support WWW (HTTP), FTP, and Gopher requests, it is important to keep in mind that a CERN-based Web proxy server uses HTTP for all communications with its Web Proxy clients.

How HTTP Works

HTTP defines a set of commands (called methods) that a client can send to a server. The two most common methods are:

GET GET is used to forward a Uniform Resource Locator (URL) to a server requesting the resource to which the URL refers.
POST POST is used to forward a request that contains a URL and data; typically, a user provides this data by completing a Hypertext Markup Language (HTML) form.

How the GET Method Works

In a simple HTTP request, a browser that is not configured for proxy service sends an HTTP URL directly to a Web server. The browser sends the server a GET method, which includes the path and resource name requested. In the process, the browser will remove the protocol and site name from the URL (http://domainname) before forwarding the GET method request on to the site named in the URL.

For example, if the following URL is typed on the command line of a browser not configured for proxy service

http://host.com/sales/report.htm

the browser parses the URL and sends the following command to host.com:

GET /sales/report.htm

This type of processing simplifies communications because HTTP is the protocol for all requests that browser clients send to the server, but it limits browser requests to sending and receiving by use of HTTP only. Browser requests for FTP URLs cannot be handled directly in this manner, although proxy service can support these types of requests for browser clients.

Examples of GET Usage with Proxy Service

When a browser is configured for use with a proxy server, GET methods issued by the browser are created with more detail about the resource included. The browser client will issue the full URL without parsing the named protocol first. The named protocol indicates the type of service being requested and whether the GET request is for a WWW, FTP, or Gopher resource. The fully detailed GET method is then forwarded to the proxy server.
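The difference between the two request forms can be sketched in a few lines of Python. This is an illustration only, not part of Microsoft Proxy Server: the helper name build_get_line is hypothetical, and the HTTP version and request headers are left out to match the simplified GET lines used in this section.

```python
from urllib.parse import urlsplit


def build_get_line(url: str, use_proxy: bool) -> str:
    """Return the GET line a browser would send for url (hypothetical helper)."""
    if use_proxy:
        # Proxy-configured browser: forward the full URL, protocol included.
        return f"GET {url}"
    parts = urlsplit(url)
    # Direct request: drop the protocol and site name, keep the path and query.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    return f"GET {path}"


url = "http://host.com/sales/report.htm"
print(build_get_line(url, use_proxy=False))  # GET /sales/report.htm
print(build_get_line(url, use_proxy=True))   # GET http://host.com/sales/report.htm
```

Because the proxy form keeps the named protocol, the same GET line format can carry WWW, FTP, and Gopher URLs.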

The following is an example of a WWW (HTTP) request that shows proxy-based service for a document entitled Doc.htm in the Sales directory on the server Host.com:

The process by which a proxy request is serviced is as follows:

The Web browser sends the full URL request as a GET statement to the proxy server.
The proxy server:
- Receives the request.
- Parses the URL.
  
  If the URL is in the cache, the request is serviced to the Web browser from the cache.
- Identifies the type of resource the request is for (such as HTTP or FTP).
- Resolves the domain name to an IP address.
- Requests Doc.htm from the Internet server by using the appropriate protocol, such as HTTP or FTP.
  
  In this case, because the URL specified HTTP, HTTP is the appropriate protocol. Note that a request sent to a Web server is formatted by the proxy service as a standard request, that is, a request not forwarded by the proxy service.
The Web server:
- Receives the request.
- Responds by sending Doc.htm to the proxy by using HTTP.
The proxy server:
- Receives Doc.htm.
- Sends Doc.htm to the browser by using HTTP.
The Web browser receives Doc.htm and displays it on-screen, completing the process.
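The steps above can be sketched as a single function. This is a minimal sketch under stated assumptions, not the product's implementation: the function name service_proxy_request, the dictionary-based cache, and the per-protocol fetcher callables are all hypothetical, name resolution is left to the fetcher, and every retrieved document is cached for simplicity.

```python
from urllib.parse import urlsplit


def service_proxy_request(url, cache, fetchers):
    """Service one proxy GET (hypothetical sketch).

    cache:    dict mapping URL -> document bytes
    fetchers: dict mapping protocol name -> callable(host, path) -> bytes
    """
    # If the URL is in the cache, service the request from the cache.
    if url in cache:
        return cache[url]
    parts = urlsplit(url)
    # Identify the type of resource from the protocol named in the URL.
    fetch = fetchers.get(parts.scheme)
    if fetch is None:
        raise ValueError(f"unsupported protocol: {parts.scheme!r}")
    # The Internet server receives a standard request: path only.
    document = fetch(parts.hostname, parts.path or "/")
    cache[url] = document  # simplification: cache everything
    return document


# Stand-in fetcher so the sketch runs without a network connection.
calls = []


def fake_http_fetch(host, path):
    calls.append((host, path))
    return b"<html>Doc</html>"


cache = {}
fetchers = {"http": fake_http_fetch}
first = service_proxy_request("http://host.com/sales/Doc.htm", cache, fetchers)
second = service_proxy_request("http://host.com/sales/Doc.htm", cache, fetchers)
print(first == second, calls)
```

The second call is answered from the cache, so the stand-in fetcher is invoked only once.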

Another example of this process will demonstrate how an FTP resource is requested and handled under proxy service.

The following diagram shows how an FTP request is forwarded using proxy service. The browser sends a request to the proxy service for an FTP-published document named Q296.doc. The GET method that is used contains the full FTP URL entered at the browser command line, ftp://host.com/sales/Q296.doc. (Note that the HTTP protocol is used by the browser to send the GET method to the proxy.)

In this case, the proxy service identifies the request as the FTP protocol, and requests Q296.doc from host.com by using the FTP protocol. Host.com then returns Q296.doc to the proxy by using the FTP protocol, and the proxy uses the HTTP protocol to send the document to the client.

Browsers that are not configured for the proxy service issue FTP and Gopher requests by using the FTP and Gopher protocols, respectively. A browser that is not configured for the proxy service cannot issue FTP or Gopher requests by using HTTP.

Configuring a browser to communicate with a proxy server actually simplifies the work that the browser needs to do, because all requests are processed with HTTP and have complete URLs.

The Microsoft Proxy Server Web Proxy Service

About the Web Proxy Service
Proxy ISAPI Filter
Proxy ISAPI Application
Advantages of the Web Proxy Service

About the Web Proxy Service

Microsoft Proxy Server fully supports the CERN standards for proxy service, and adds support for other application-level proxy services as well.

How does the Web Proxy service work? Certain application services have knowledge of the underlying protocols used by the applications they support. This knowledge allows Microsoft Proxy Server to offer additional features, such as user authentication, protocol conversions, and local caching of retrieved content. These features add security, improve response time and access control, and decrease network usage.

A Web proxy performs functions associated with both clients and servers. As a server, it receives WWW requests from private network clients; as a client, it responds to private network clients’ requests by issuing the appropriate requests to a WWW server on the Internet. The interface between the client and server components of the Web Proxy Service provides opportunities to add value to the connections it services. By increasing security and functionality for client connections, the Web Proxy Service does much more than simply relay data between servers and clients.

The Web Proxy Service runs as an extension to Microsoft Internet Information Server (IIS) version 2.0. It is implemented as a dynamic-link library (DLL) that uses the Internet Server Application Programming Interface (ISAPI), and therefore runs within the process of the IIS WWW service.

The WWW service must be installed and running in order for proxy requests to be processed. Because all proxy requests for Web, Gopher, or FTP resources are sent from the client to the proxy by using the HTTP protocol, it is both convenient and efficient for the IIS WWW service to receive these requests and pass them on to the Web Proxy Service DLL by means of the ISAPI interface.

Note In order to install Microsoft Proxy Server, you must have both Windows NT Server 4.0 and Internet Information Server 2.0 installed. For more information on installing either of these products, see the product documentation.

Functionally, the Web Proxy Service consists of two components, the Proxy ISAPI Filter and the Proxy ISAPI application.

Proxy ISAPI Filter

The ISAPI filter interface allows for registration of an extension that the Web server calls whenever it receives an HTTP request from a client. An ISAPI filter is called for every request, regardless of such details as the identity of the resource requested in the URL. Thus, an ISAPI filter can monitor, log, modify, redirect, or authenticate requests sent to the Web server.

The process of calling an ISAPI filter in an HTTP request can be described as follows:

The HTTP client browser forms and passes a URL request on to the network for forwarding to a server running IIS.
The IIS server receives the request and calls an ISAPI filter DLL.
The ISAPI filter DLL loads and registers with the WWW service.
The request is filtered for notification points (such as monitor, log, modify, redirect, or authenticate requests).
The request is then forwarded back to the WWW service for response action.
A response is issued by the WWW service and the ISAPI filter DLL is called again.
The response is filtered for notification points (monitoring, logging, and so on).
The response is then returned to the client.

The WWW service can call the ISAPI filter DLL’s entry point at various times during a request-and-response sequence. When the ISAPI filter is loaded, it programmatically registers the notification points in which it is interested. The WWW service then starts calling the ISAPI filter DLL’s entry point at each requested notification point for each HTTP request.

Note For more information about:

The ISAPI filter interface, see the ActiveX Development Kit, available at http://www.microsoft.com/intdev/.
The ISAPI interface and the IIS WWW service, see the Installation and Administration Guide For Microsoft Internet Information Server, an online guide installed with the product.

The Proxy ISAPI filter is contained within an additional DLL (W3proxy.dll) that examines each request to determine if the request is either a proxy request or a standard HTTP request.

It registers itself to use the SF_NOTIFY_PREPROC_HEADERS notification point, because when called at this notification point for a request, the Proxy ISAPI filter can see the URL sent by the client (and can modify the URL), before the Web server processes the request.

Once the Proxy ISAPI filter examines each request, it can modify the header in one of the following ways:

If the request is a proxy request—that is, if the request contains a URL complete with protocol and domain name as described in The CERN-Proxy Protocol, earlier in this chapter—the Proxy ISAPI filter adds the name of the Proxy ISAPI application (W3proxy.dll) to the URL. This causes the WWW service to forward the request to the Proxy ISAPI application for processing.
If the request is not a proxy request—that is, if the request does not contain a protocol and a domain name—then the request is assumed to be for a Web resource on a Web server. The Proxy ISAPI filter does not modify the request, allowing ordinary Web publishing to occur between the Web server and client.
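The filter's decision can be sketched as follows. This is an illustration in Python rather than ISAPI code, and the rewritten URL format is an assumption: the text states only that the name of the Proxy ISAPI application is added to the URL, not how. The function name preprocess_url is hypothetical.

```python
from urllib.parse import urlsplit

PROXY_APPLICATION = "W3proxy.dll"


def preprocess_url(request_url: str) -> str:
    """Mimic the filter's choice when it sees the URL sent by the client."""
    parts = urlsplit(request_url)
    if parts.scheme and parts.netloc:
        # Proxy request: the URL carries a protocol and a domain name, so
        # route it to the proxy application. The "/name?url" layout used
        # here is an assumed format, chosen only for illustration.
        return f"/{PROXY_APPLICATION}?{request_url}"
    # Not a proxy request: leave it for ordinary Web publishing.
    return request_url


print(preprocess_url("http://host.com/sales/Doc.htm"))
print(preprocess_url("/sales/Doc.htm"))
```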

Proxy ISAPI Application

The ISAPI application interface uses an in-process mechanism to extend Web server functionality. Unlike Common Gateway Interface (CGI), another mechanism for extending Web server functionality, an ISAPI application does not initiate a new process for every request. ISAPI applications can create dynamic HTML and integrate the Web with other service applications, such as databases.

An ISAPI application DLL loads once; thereafter, the Web server calls the DLL whenever it receives a client request for that application. The Proxy ISAPI application is contained within W3proxy.dll, which also contains the Proxy ISAPI filter. Because both application and filter reside in W3proxy.dll, all necessary initialization for both is done when the server is started.

Every time it receives a request, the Proxy ISAPI application does the following:

Authenticates the client.
Applies the domain filter.
Looks for objects in the cache and returns objects from there, first checking that the cached object is current to its source.
Gets the objects from the Internet, sends them to the client, and adds them to the cache if appropriate.

If a request is valid, and it is necessary to issue the request to an Internet site, the ISAPI application parses the URL to extract the protocol (typically HTTP, FTP, or Gopher), and the domain name. For HTTP requests, the ISAPI application calls the appropriate Windows Sockets APIs directly to process file requests.

For HTTP requests, all input/output (I/O) is done asynchronously after the domain name has been resolved. If possible, the Domain Name System (DNS) is used to resolve the domain name. (DNS is used to resolve Internet or UNIX system names. Microsoft TCP/IP includes DNS support.)

Issuing the request to the Internet site includes the following steps:

Resolve the domain name to an IP address (if possible, the DNS cache is used).
Connect to the remote site.
Send the request to the remote site.
Receive the response header from the remote site.
Receive the data (send to the client and save in the cache).
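These steps can be sketched with ordinary blocking sockets. The sketch is deliberately simpler than the service described here: it uses synchronous I/O rather than asynchronous I/O, it does no caching, and the function names fetch and split_response are hypothetical. It sends an HTTP/1.0 request so that the remote site closes the connection when the data is complete.

```python
import socket


def split_response(raw: bytes):
    """Split a raw HTTP response into (header_text, body_bytes)."""
    header, _, body = raw.partition(b"\r\n\r\n")
    return header.decode("iso-8859-1"), body


def fetch(host: str, path: str, port: int = 80):
    # Resolve the domain name to an IP address.
    address = socket.gethostbyname(host)
    # Connect to the remote site.
    with socket.create_connection((address, port), timeout=10) as conn:
        # Send the request in standard (non-proxy) form: path only.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        # Receive the response header and the data until the site closes.
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return split_response(b"".join(chunks))
```

A call such as fetch("host.com", "/sales/Doc.htm") would return the header text and the document bytes; host.com is the placeholder name used throughout this chapter.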

The Proxy ISAPI application uses a small set of reusable worker threads and asynchronous I/O to achieve very high performance. An Asynchronous Thread Queue (ATQ) and the TransmitFile API further enhance thread efficiency. The Proxy ISAPI application benefits from the high performance and scalability built into IIS, as well as its own architecture.

Advantages of the Web Proxy Service

By running as an ISAPI application, the Microsoft Proxy Server benefits from many other functional and performance features of IIS. An example is the Web server’s support for HTTP Keep-Alives.

Keep-Alives is a feature that, when supported by both browser and server, allows TCP connections to remain intact after a request and response are completed. This significantly improves performance if another request is made from the same client to the same server within the time limit for connections. Support for Keep-Alives requires that the Web server be able to return the byte size of responses to clients, and that time-outs be used by client and server to manage Windows Sockets connections efficiently.

Because individual Web pages can often contain numerous links to separate graphic image files on the Web server, Keep-Alives can often improve performance for non-proxy environments. The speed with which Web page contents can be accessed is vastly improved, even when another HTML page is not requested.

For a Microsoft Proxy Server, the Keep-Alives mechanism is much more valuable. In a typical scenario, a company has a small number of computers running Microsoft Proxy Server for Internet access. Every attempt within the company to access Internet Web, Gopher, and FTP sites requires a connection from a browser on the internal network to the Microsoft Proxy Server. The probability of reusing a connection between the same client and the same Microsoft Proxy Server is very high. Internet Explorer version 3.0 supports Keep-Alives to a proxy server.

Caching Mechanisms

About Caching
Passive Caching
Active Caching

About Caching

The Web Proxy Service uses caching to maintain a local copy of HTTP objects. This allows subsequent requests for these objects to be serviced from a local disk copy, rather than issuing the request over the Internet, thereby improving user-perceived performance and reducing bandwidth consumption on the site’s Internet connection.

Not all objects that pass through the Web Proxy Service can or should be cached. Some objects are dynamic and change frequently; some change every time they are accessed. Other objects require authentication of the requesting client and cannot be cached for security reasons.

A Web object must satisfy the following criteria in order to be cached:

The request must be a GET.
The URL must not contain any ? keywords, which are typically used for basic logons.
The file must be served by the HTTP protocol (objects associated with other protocols are not cached).
The HTTP response header must not include WWW-Authenticate, Pragma: no-cache, Cache-control: Private, Cache-control: no-cache, or Set-Cookie.
The date in the Expires: header field must be later than the one in the Date: header. The Expires: header indicates the date and time after which the response should be considered stale. The Date: header indicates the date and time at which the Web server generated the response. Both fields are returned in almost all HTTP responses. Some Web servers indicate to downstream caches that a page should not be cached by setting the Expires: header equal to the Date: header, indicating that the page expires immediately. Setting this field to “Expires: 0” also prevents caching.
The HTTP Result code must be 200 (success).
The object must not be encrypted or protected by Secure Sockets Layer (SSL).
There cannot be an Authorization header in the HTTP request header, or Vary in the response header.
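The criteria above can be collected into a single check. This is a sketch in Python, not the product's code: the function name is_cacheable and its parameters are hypothetical, and treating a response with no Expires: header as cacheable is an assumption, since the list does not cover that case.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime


def _parse_http_date(value):
    """Return an aware datetime, or None when the value is not a valid date."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # covers values such as "0"
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def is_cacheable(method, url, status, request_headers, response_headers,
                 uses_ssl=False):
    request = {k.lower(): v for k, v in request_headers.items()}
    response = {k.lower(): v for k, v in response_headers.items()}
    # GET only, result code 200 only, no SSL.
    if method != "GET" or status != 200 or uses_ssl:
        return False
    # Served by HTTP, and no "?" in the URL.
    if not url.lower().startswith("http://") or "?" in url:
        return False
    # Headers that rule out caching.
    if "authorization" in request:
        return False
    if {"www-authenticate", "set-cookie", "vary"}.intersection(response):
        return False
    if "no-cache" in response.get("pragma", "").lower():
        return False
    directives = [d.strip().lower()
                  for d in response.get("cache-control", "").split(",")]
    if "private" in directives or "no-cache" in directives:
        return False
    # Expires must be later than Date. A missing Expires header is allowed
    # here, which is an assumption of this sketch.
    if "expires" in response:
        expires = _parse_http_date(response["expires"])
        date = _parse_http_date(response.get("date", ""))
        if expires is None or date is None or expires <= date:
            return False
    return True


headers = {"Date": "Thu, 01 Dec 1994 16:00:00 GMT",
           "Expires": "Fri, 02 Dec 1994 16:00:00 GMT"}
url = "http://host.com/sales/report.htm"
print(is_cacheable("GET", url, 200, {}, headers))
print(is_cacheable("GET", url, 200, {}, {**headers, "Expires": "0"}))
```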

The HTTP header may include “cookies,” which allow a server to customize a response for a particular user. Cookies are increasingly used for custom pages or for informal (that is, not very secure) authentication. Microsoft Proxy Server will treat cookies as another optional HTTP header that will be disregarded, with the exception of the Set-Cookie header. It will be assumed that subsequent transactions after the cookie has been set can be cached, unless any of the subsequent objects requested include headers with cache ineligible exception values based on the criteria in the preceding list.

Microsoft Proxy Server caches data by means of two different types of caching processes: passive caching and active caching. The remainder of this section discusses each of these processes in further detail.

Passive Caching

Passive caching, also referred to as “on-demand” caching, is the most basic mode of caching. The following figure depicts passive caching.

Microsoft Proxy Server interposes itself between the client and the local or remote Web server and intercepts requests (for example, HTTP GET requests) from the client. Before forwarding a request on to the Web, Microsoft Proxy Server first calls into its cache (Urlcache.dll) by using the RetrieveUrlFile API to determine whether the cache can satisfy the request. If the data is in the cache and has not expired, it is returned immediately to the client by using the Windows Sockets TransmitFile API.

If the object is not cached or if the cached copy of the object has expired, Microsoft Proxy Server retrieves the object from the Web, returns it to the user, and inserts it into the cache (by using the CreateUrlFile and CacheUrlFile APIs). If the local disk space reserved for the cache is too full to hold the new data, older objects are removed from the cache using a formula that evaluates age, popularity, and size.
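The passive caching behavior can be sketched as a small class. This is a sketch under stated assumptions and does not use the Urlcache.dll APIs: the class name PassiveCache is hypothetical, and the eviction score is an invented stand-in, because the actual formula that evaluates age, popularity, and size is not given here.

```python
import time


class PassiveCache:
    """On-demand cache with expiry and score-based eviction (sketch)."""

    def __init__(self, capacity_bytes, fetch, clock=time.time):
        self.capacity = capacity_bytes
        self.fetch = fetch    # callable(url) -> (data bytes, ttl in seconds)
        self.clock = clock
        self.entries = {}     # url -> {"data", "expires", "stored", "hits"}

    def get(self, url):
        now = self.clock()
        entry = self.entries.get(url)
        if entry is not None and entry["expires"] > now:
            entry["hits"] += 1
            return entry["data"]          # in the cache and not expired
        data, ttl = self.fetch(url)       # not cached or expired: go to the Web
        self.entries.pop(url, None)
        if len(data) <= self.capacity:
            self._make_room(len(data), now)
            self.entries[url] = {"data": data, "expires": now + ttl,
                                 "stored": now, "hits": 0}
        return data

    def _used(self):
        return sum(len(e["data"]) for e in self.entries.values())

    def _make_room(self, needed, now):
        # Assumed scoring: old, unpopular, large objects are removed first.
        def score(item):
            e = item[1]
            age = now - e["stored"]
            return (e["hits"] + 1) / ((age + 1) * max(len(e["data"]), 1))

        while self.entries and self._used() + needed > self.capacity:
            victim = min(self.entries.items(), key=score)[0]
            del self.entries[victim]


# Demonstration with a fake clock and a stand-in fetcher (no network needed).
now = [0.0]
fetched = []


def fake_fetch(url):
    fetched.append(url)
    return b"x" * 10, 60


cache = PassiveCache(25, fake_fetch, clock=lambda: now[0])
cache.get("http://host.com/a")   # miss: retrieved from the Web
cache.get("http://host.com/a")   # hit: served from the cache
now[0] = 61.0
cache.get("http://host.com/a")   # expired: retrieved again
cache.get("http://host.com/b")
cache.get("http://host.com/c")   # cache full: an older object is removed
print(fetched)
print(sorted(cache.entries))
```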

The Microsoft Proxy Server cache APIs are documented in the ActiveX Development Kit, available at http://www.microsoft.com/intdev/.

Active Caching

Microsoft Proxy Server uses active caching to improve retrieval performance by increasing the likelihood that a requested object will be found in the cache. Active caching works as a superset of passive caching.

Typically, in passive caching, an object is placed in the cache and a Time-To-Live (TTL) expiration value is associated with that object. During this TTL, all requests for the object are serviced from the cache without generating traffic back to the upstream Web server. After the TTL has expired, subsequent client requests for the object will generate traffic to and from the Web server. The response from the server will be stored in the cache and a new TTL will be calculated.

Active caching augments this system by having the server automatically generate requests for a specified subset of objects. Microsoft Proxy Server optimizes the choice of objects for active caching on the basis of the following qualities:

Popularity Ensures that objects requested by Microsoft Proxy Server are likely to be requested by clients as well.
TTL Objects with longer TTLs are more valuable to cache than objects with shorter TTLs; Microsoft Proxy Server will also check objects that are close to expiration.
Server load Microsoft Proxy Server performs more aggressive active caching during periods of low server load than when the server is heavily loaded.

Active caching results in:

Better Client Performance Clients are more likely to find their URL in the cache, resulting in lower latency (the amount of time clients wait for a response) and better throughput.
Even Load Distribution Active caching has the effect of balancing the request load for the server across time, by rescheduling some cache update requests from high-peak periods of server activity to off-peak periods.
More Accurate Data By checking unexpired objects during off-peak periods, the likelihood of returning stale data to clients is reduced.
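The way popularity, TTL, and server load might combine can be sketched as a ranking function. Everything in this sketch is an assumption made for illustration: the function name pick_refresh_candidates, the object fields, and the weighting are hypothetical, and the actual selection logic of Microsoft Proxy Server is not described here.

```python
def pick_refresh_candidates(objects, now, server_load, max_refreshes=10):
    """Choose cached objects to refresh proactively (hypothetical weighting).

    objects:     list of dicts with "url", "hits", "ttl" (> 0), and "expires"
    server_load: 0.0 (idle) to 1.0 (fully loaded)
    """
    # Refresh less aggressively when the server is heavily loaded.
    budget = int(max_refreshes * (1.0 - server_load))

    def priority(obj):
        remaining = max(obj["expires"] - now, 0.0)
        # Approaches 1.0 as the object nears expiration.
        closeness = min(max(1.0 - remaining / obj["ttl"], 0.0), 1.0)
        # Popular objects with long TTLs that are about to expire rank first.
        return obj["hits"] * obj["ttl"] * closeness

    ranked = sorted(objects, key=priority, reverse=True)
    return [obj["url"] for obj in ranked[:budget]]


objects = [
    {"url": "http://host.com/popular", "hits": 50, "ttl": 3600, "expires": 100},
    {"url": "http://host.com/rare", "hits": 1, "ttl": 3600, "expires": 100},
]
print(pick_refresh_candidates(objects, now=0, server_load=0.0, max_refreshes=1))
print(pick_refresh_candidates(objects, now=0, server_load=1.0))
```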

How the WinSock Proxy Service Works

Understanding Windows Sockets and WinSock Proxy
WinSock Proxy Architecture
Windows Sockets APIs
WinSock Proxy Limitations

Understanding Windows Sockets and WinSock Proxy

About Windows Sockets
About WinSock Proxy

About Windows Sockets

Windows Sockets is a mechanism for interprocess communication between network applications running on the same computer, or on different computers connected using a local area network (LAN) or wide area network (WAN). It defines a set of standard APIs that an application uses to communicate with one or more other applications, usually across a network.

The Windows Sockets APIs support:

Initiating an outbound connection for a client application.
Accepting an inbound connection for a server application.
Sending and receiving data on a client/server connection.
Terminating a client/server connection.

The specification includes a standard set of APIs that is supported by all Windows-based TCP/IP protocol stacks and is intended for use by network applications. Windows Sockets also allows for other transport protocols, and some implementations support the IPX/SPX and NetBEUI protocols.

In Windows Sockets, application communications channels are represented by data structures called sockets. A socket is identified by two items:

An IP address
A port number

Windows Sockets can support both point-to-point connected service (also referred to as stream-oriented communications), and multipoint connectionless service (referred to as datagram-oriented communications). When using the TCP/IP protocol suite, stream-oriented communications use TCP and datagram-oriented communications use UDP.

Connected Service using TCP Sockets

Most Internet application protocols, including HTTP, Gopher, and FTP, are connection-oriented client/server protocols. A client typically initiates a connection to a server in order to process a specific client request. A server waits for connections initiated by clients, accepts those connections, and begins communicating with each client following the rules of the specific application protocol. Communications managed expressly between a server and its clients in this manner form what is known as connected service.

For application protocols that use connected service, the Transmission Control Protocol (TCP) has long been established as the transport-level protocol of choice. TCP also supports the use of sockets to form connected communications between computers on a network.

For example, a TCP socket can be formed by first associating an IP address and a TCP port. The IP address is a 32-bit number that uniquely identifies the local IP network interface. The port identifies a virtual channel used for communications at the TCP level. A stream-oriented connection is then formed by associating a local IP address-TCP port pair with a remote IP address-TCP port pair.

A server goes through the following steps to create a TCP socket with a client:

The socket() API is used to establish a socket and associate it with a specific streaming protocol, such as TCP.
The bind() API is used to associate a local IP address and port with the socket. Most servers specify that they want to bind the socket to all local IP addresses, and indicate the well-known port for the application protocol (port 80 for HTTP, port 21 for FTP, and so on).
The listen() API is used to enable inbound connections on the IP/port pair.
When a client connection is received, the server uses the accept() API to complete the connection process, associate a different socket with the connection, and go back to the listening stage on the original socket to handle future client connections.
The server uses the recv() and send() APIs to communicate with the client.
The server can use the getsockname() and getpeername() APIs to query the local and remote IP address-port pairs, respectively.

A client typically initiates a TCP socket connection to a server in order to process a user request. With stream-oriented TCP connections, the client executes the following steps:

The socket() API is used to establish a socket and associate it with a specific streaming protocol, such as TCP.
The bind() API is used to associate a local IP address and port with the socket. Most clients specify that they are willing to use any local IP address and port.
The connect() API is used to initiate a connection to a specified IP address/port pair. The remote IP address specified identifies the server, and the port identifies the service (80 for HTTP, 21 for FTP, and so on).
The client uses the recv() and send() APIs to communicate with the server.
The client can use the getsockname() and getpeername() APIs to query the local and remote IP address-port pairs, respectively.
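The server and client sequences can be sketched together over the loopback interface. Python's socket module exposes the same call names as the sockets API, so the steps map one to one. This is a minimal sketch: the function names are hypothetical, the server binds to the loopback address only (a real server would typically bind to all local addresses and a well-known port), and the client skips an explicit bind() because connect() chooses a local address and port automatically. The remote address-port pair is read with getpeername().

```python
import socket
import threading


def serve_one_client(server_sock):
    """Accept one connection and echo a single message back."""
    conn, _peer = server_sock.accept()   # accept(): a new socket per client
    with conn:
        conn.sendall(conn.recv(1024))    # recv() then send()


def echo_roundtrip(message: bytes):
    # Server: socket(), bind(), listen().
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.settimeout(5)                 # do not wait forever in accept()
    server.bind(("127.0.0.1", 0))        # loopback only; port 0 = any free port
    server.listen(1)
    server_addr = server.getsockname()
    worker = threading.Thread(target=serve_one_client, args=(server,))
    worker.start()

    # Client: socket(), connect(), send(), recv().
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        client.connect(server_addr)
        local, remote = client.getsockname(), client.getpeername()
        client.sendall(message)
        reply = client.recv(1024)
    finally:
        client.close()
        worker.join()
        server.close()
    return reply, local, remote, server_addr


reply, local, remote, server_addr = echo_roundtrip(b"hello")
print(reply, remote == server_addr)
```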

Connectionless Service using UDP Sockets

Some Internet applications can use User Datagram Protocol (UDP), a transport protocol that delivers data with less overhead, and therefore higher throughput, than TCP. UDP is very effective for delivering data from servers to clients at the highest possible speeds because it uses unacknowledged delivery and packages data into small, self-contained packets called datagrams. Communications that consist of UDP datagrams sent from a server to clients on a network in this manner form what is known as connectionless service. This type of service is useful for real-time applications such as streaming audio and video. For example, RealAudio and VDOLive both use UDP to offer connectionless service to clients.

For UDP, the client and server each establish a UDP socket in the following way:

The socket() API is used to establish a socket and associate it with a specific datagram protocol, such as UDP.
The bind() API is used to bind the socket to a local IP address-port pair.
The sendto() and recvfrom() APIs are then used to begin immediately sending and receiving data. These APIs specify the IP address and port to send to, and return the IP address and port that data was received from.

While most UDP implementations consist of a client communicating with a single server at a time, UDP is a connectionless protocol and supports communications between a client application and multiple servers over a single socket.
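The UDP sequence can be sketched over the loopback interface. No connection is set up: each datagram carries its destination, and recvfrom() reports where a datagram came from. The function name udp_exchange is hypothetical, and both ends run in one process only to keep the sketch self-contained.

```python
import socket


def udp_exchange(message: bytes):
    # Both ends: socket() with a datagram protocol, then bind().
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.settimeout(5)
    client.settimeout(5)
    try:
        server.bind(("127.0.0.1", 0))    # port 0 = any free port
        client.bind(("127.0.0.1", 0))
        server_addr = server.getsockname()
        client_addr = client.getsockname()
        # sendto() names the destination; recvfrom() returns the sender.
        client.sendto(message, server_addr)
        data, sender = server.recvfrom(1024)
        server.sendto(data.upper(), sender)
        reply, replier = client.recvfrom(1024)
    finally:
        server.close()
        client.close()
    return reply, sender == client_addr, replier == server_addr


reply, sender_ok, replier_ok = udp_exchange(b"ping")
print(reply, sender_ok, replier_ok)
```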

About WinSock Proxy

WinSock Proxy allows a Windows Sockets application running on a private network client to perform as though it is directly connected to a remote Internet server application, when in actuality, the Microsoft Proxy Server serves as the proxy host for this connection. WinSock Proxy consists of two parts: a service running on a gateway computer, and a DLL installed on each client computer.

The WinSock Proxy service runs on Windows NT Server version 4.0 only. It runs as a stand-alone Windows NT service, and is responsible for creating virtual connections between internal applications and Internet applications. The WinSock Proxy service is also responsible for doing “data pumping” between the two actual communications channels set up for a virtual connection, and acting as a TCP/IP protocol gateway if the internal network runs IPX/SPX.

One of the benefits of this type of design is that all application-level communications are channeled through a single secured computer—the gateway computer running Microsoft Proxy Server. The WinSock Proxy service can also provide application-level event monitoring for redirected Windows Sockets applications on the private network.

With WinSock Proxy, client applications call Windows Sockets APIs to communicate with server applications running on Internet computers, and the WinSock Proxy service on the Microsoft Proxy Server intercepts these calls. The Microsoft Proxy Server then redirects these calls to the remote server on the Internet. This establishes a logical communications path from the internal client application to the Internet server application by way of the computer running Microsoft Proxy Server.

The gateway process is not visible to either the internal network client or the remote server. Both client and server computers appear to share only a single connection to each other. In truth, both maintain separate connections to the computer running Microsoft Proxy Server through separate network hardware interfaces on the computer running Microsoft Proxy Server.

There are some side effects to using the proxy server to access the Internet.

External servers see different users coming from the same address.
A client application that binds a socket to a specific port may fail when the requested port is already assigned to another client. To avoid this, the client software can either request PORT_ANY (0), or try a range of ports when a bind fails with the error WSAEADDRINUSE. (Note that the socket option SO_REUSEADDR will not work in this case.)
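The retry approach can be sketched as follows. This is an illustration in Python, where the address-in-use error surfaces as an OSError whose errno equals errno.EADDRINUSE (the counterpart of WSAEADDRINUSE). The function name bind_with_fallback is hypothetical, and the conflict is simulated locally rather than through a proxy server.

```python
import errno
import socket


def bind_with_fallback(sock, host, ports):
    """Try each port in turn and return the port that was bound."""
    for port in ports:
        try:
            sock.bind((host, port))
            return sock.getsockname()[1]
        except OSError as exc:
            if exc.errno != errno.EADDRINUSE:
                raise                    # a different failure: do not hide it
    raise OSError(errno.EADDRINUSE, "no free port among those requested")


# Simulate a port that is already assigned to another client.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
taken = holder.getsockname()[1]

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Try the occupied port first, then fall back to port 0 (any free port).
bound = bind_with_fallback(sock, "127.0.0.1", [taken, 0])
print(bound != taken)
sock.close()
holder.close()
```

Ending the list with port 0 corresponds to requesting any available port, so the bind succeeds whenever a free port exists.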

The WinSock Proxy service offers client and server support for most standard and custom Internet applications that communicate using Windows Sockets 1.1. Almost all Windows Sockets 1.1 TCP/IP applications can be redirected using WinSock Proxy. However, redirection of Windows Sockets 2.0 APIs or applications is not supported at this time. The WinSock Proxy service works with Windows-based TCP/IP applications on the private network, and any TCP/IP applications platform on the Internet.

WinSock Proxy Architecture

WinSock Proxy Client Components
WinSock Proxy Control Channel
TCP Redirection
UDP Redirection
Using TCP/IP on the Internal Network
Using IPX/SPX on the Internal Network

WinSock Proxy Client Components

On client computers, the WinSock Proxy uses a specially designed DLL to redirect Windows Sockets API calls from the client to remote servers. When this DLL is installed, it renames the standard Windows Sockets DLLs, and the WinSock Proxy DLL is given the name of the corresponding Windows Sockets DLL (Winsock.dll for 16-bit; Wsock32.dll for 32-bit). This results in all Windows Sockets API calls being forwarded to the WinSock Proxy DLL first, and then the WinSock Proxy DLL redirecting calls to the renamed Windows Sockets DLL as needed, or processing the call itself.

Once the WinSock Proxy DLL is actively installed on the client, it intercepts all Windows Sockets API calls made by applications on the client computer. Depending on the API, and the current socket status, the client WinSock Proxy DLL may:

Completely process the client’s request.
Pass the request to the (renamed) actual Windows Sockets DLL on the local computer (after possibly making changes to the request).

–Or–

Need to pass control information (by use of the WinSock Proxy Control Channel) to the WinSock Proxy service on the computer running Microsoft Proxy Server.

For network communication between local applications on the internal network, the WinSock Proxy client DLL passes Windows Sockets API calls to the previously installed (and renamed) Windows Sockets DLL. This allows Windows Sockets communications to continue to work normally. Also, this is true regardless of whether the previous Windows Sockets DLL was obtained from a third-party TCP/IP stack or directly from Microsoft. In all cases, the Windows Sockets DLL that was installed prior to WinSock Proxy client setup is maintained for forwarding local network calls.

There are two versions of the WinSock Proxy client DLL: a 16-bit version and a 32-bit version. The 16-bit version is installed on Windows 3.1 and Windows for Workgroups 3.11. The 32-bit version is installed on Windows NT. Both versions are installed on Windows 95.

WinSock Proxy Control Channel

The WinSock Proxy service and client DLLs communicate by using a control channel that is set up when the client DLL is first loaded. The control channel uses the connectionless UDP protocol. UDP allows a single socket on the gateway computer to be used for communications with all WinSock Proxy clients, and is faster than TCP. A simple acknowledgment protocol is used between WinSock Proxy client and service to add reliability to the control channel.
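The documentation does not specify the acknowledgment protocol itself. As a rough illustration of how reliability can be layered over a connectionless transport, the following sketch shows a stop-and-wait scheme: a message is retransmitted until a matching acknowledgment arrives. The function name, the sequence numbers, and the two callables are hypothetical and do not describe the actual WinSock Proxy wire format.

```python
def send_with_ack(transmit, wait_for_ack, seq, payload, max_tries=3):
    """Retransmit one message until the matching acknowledgment arrives.

    transmit(seq, payload) sends a single datagram. wait_for_ack() returns
    the acknowledged sequence number, or None if the wait timed out. Both
    callables are hypothetical and supplied by the caller.
    """
    for attempt in range(1, max_tries + 1):
        transmit(seq, payload)
        if wait_for_ack() == seq:
            return attempt  # number of transmissions that were needed
    raise TimeoutError(f"no acknowledgment for message {seq}")
```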

The goal is to use the control channel as infrequently as possible, and to have as few Windows Sockets APIs that require special processing on the client computer as possible. For example, for TCP connection requests, the control channel is used to set up the virtual connection, but once the connection is set up, sending and receiving data (send() and recv() APIs) requires no special processing on the client: the WinSock Proxy DLL simply forwards these requests to the (renamed) Windows Sockets DLL. This also means that the Win32 APIs ReadFile and WriteFile, which bypass Windows Sockets, will work with redirected connections.

The WinSock Proxy control channel is used for the following purposes:

To set up TCP connections for WinSock Proxy clients to remote servers
When a connection with a remote application is being established, the control channel is used in establishing the virtual connection. Once the connection is established, sending and receiving data will not require use of the control channel.
To maintain UDP communications between WinSock Proxy clients and WinSock Proxy servers
The control channel is used by WinSock Proxy clients to contact the WinSock Proxy server when the UDP socket is bound. Additionally, in order to support multiple remote applications communicating with the internal application, port-mapping information is sent to the client DLL each time a new remote peer sends data. Sending and receiving data to and from known peers does not require the control channel.
To manage database requests between WinSock Proxy clients and WinSock Proxy servers
Redirection of the Windows Sockets database requests, such as DNS name resolution (gethostbyname(), and so on) is handled by passing the client request to the WinSock Proxy service by using the control channel, and the response is forwarded to the client DLL by using the control channel.

When the first application on a client attempts to make its first Windows Sockets connection, the WinSock Proxy DLL is loaded and initialized. At this time, the WinSock Proxy DLL does the following:

It establishes its own WinSock Proxy control channel with the WinSock Proxy service, and notifies the service, by using the control channel, that it is active.
The WinSock Proxy service then downloads the Local Address Table (LAT). The LAT is a routing table that consists of a list of IP address pairs, each pair indicating a range of addresses located on the internal (private) network. The LAT is used by the client to determine which requests need to be redirected.
Note Routing information is updated by copying files over the network using Windows NT file sharing protocols (server message block, or SMB). The control channel is not used for this purpose.

Note LAT information is stored in the Msplat.txt file, located by default at C:\Msp\clients. It is initially configured by the Microsoft Proxy Server Setup program. It can be modified or changed later by using Internet Service Manager.

Once the WinSock Proxy DLL has initialized service, for future connection attempts by applications, the WinSock Proxy DLL attempts to determine if the application is trying to communicate with a local computer (private network) or remote computer (Internet).

For connection attempts and Windows Sockets APIs destined for a local computer, the WinSock Proxy DLL simply forwards the API calls to the (renamed) Windows Sockets DLL, for normal processing. If a Windows Sockets API call contains no information about the destination (and therefore no indication as to whether it should be redirected), the WinSock Proxy component assumes it is a local request, and forwards the request to the standard Windows Sockets DLL.
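The local-or-remote decision can be sketched as a range lookup. The example below assumes the LAT has already been parsed into (first, last) address pairs; the sample ranges and the function name is_local are illustrative only, and the sketch does not reproduce the format of the Msplat.txt file.

```python
import ipaddress

# Hypothetical LAT contents: (first, last) address pairs on the private network.
LAT = [
    ("10.0.0.0", "10.255.255.255"),
    ("192.168.0.0", "192.168.255.255"),
]

def is_local(address, lat=LAT):
    """Return True when the destination falls inside any LAT range.

    A call that carries no destination (address is None) is treated as
    local, matching the default described above.
    """
    if address is None:
        return True
    ip = ipaddress.IPv4Address(address)
    return any(
        ipaddress.IPv4Address(first) <= ip <= ipaddress.IPv4Address(last)
        for first, last in lat
    )
```

A destination for which is_local() returns False is one the client DLL would redirect through the WinSock Proxy service.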

When a Windows Sockets database API is called by an application (gethostbyname(), and so on) to resolve an Internet name or address, the WinSock Proxy components work together, using the control channel, to redirect the request to the gateway computer, and have the request processed on the Internet.

The architecture of WinSock Proxy requires special processing by the client’s WinSock Proxy DLL when establishing a connection with an Internet site, but once a communication channel is established, standard Windows Sockets and Win32 APIs for reading and writing a socket or file can be used with no special processing on the client. The application performs as if it is reading and writing to the Internet site, while it is actually communicating with the WinSock Proxy service, which forwards the requests.

The WinSock Proxy control channel uses UDP port number 1745 on the WinSock Proxy server and client computers.

The following illustration depicts the WinSock Proxy components on an IPX/SPX private network.

TCP Redirection

TCP handles point-to-point, connection-oriented communications. For each TCP connection requested by an internal application, two actual connections are set up by WinSock Proxy service:

A connection between the client application and the WinSock Proxy service, using Microsoft Proxy Server’s internal network interface.
A connection between the WinSock Proxy service and the Internet application, using Microsoft Proxy Server’s external (Internet) interface.

Data received from either connection is forwarded to the other connection, and both applications perform as though they are communicating directly with each other.
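The forwarding between the two connections can be sketched as a pair of copy loops, one per direction. This is a simplified Python illustration, not the service's implementation; the names pump and relay are assumptions made for the example, and a production relay would also need error handling and timeouts.

```python
import socket
import threading

def pump(src, dst):
    """Copy bytes from src to dst until src reports end of stream."""
    while True:
        data = src.recv(4096)
        if not data:
            break
        dst.sendall(data)
    # Pass the end-of-stream indication on to the other side.
    try:
        dst.shutdown(socket.SHUT_WR)
    except OSError:
        pass

def relay(internal_conn, internet_conn):
    """Forward data in both directions between the two connections."""
    threads = [
        threading.Thread(target=pump, args=(internal_conn, internet_conn), daemon=True),
        threading.Thread(target=pump, args=(internet_conn, internal_conn), daemon=True),
    ]
    for thread in threads:
        thread.start()
    return threads
```

Neither endpoint needs to know about the relay: each one reads and writes its own connection as if the peer were at the other end.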

A TCP redirected connection is managed in the following way:

The WinSock Proxy control channel is used to set up a TCP redirected connection. Once the TCP redirected connection is set up, the control channel is not used for data transfer.
The internal application then initiates an outbound TCP connection to an Internet site.
The send() and recv() APIs are then called on the client and server. The WinSock Proxy client DLL forwards these calls to the real Windows Sockets DLL (these APIs do not contain addresses; they simply refer to a socket), and all data is identical to data sent over a normal (non-redirected) socket connection. (The ReadFile() and WriteFile() Win32 APIs work on TCP redirected connections as well. On Windows NT, these APIs are not handled by Windows Sockets and therefore are not intercepted by the WinSock Proxy DLL.)

Once an internal application’s socket has been remotely bound, WinSock Proxy makes it appear that the socket is bound to the proxy computer’s Internet interface. If the internal application calls the getsockname() API, the data returned will indicate that the socket’s local IP address is that of the proxy computer. Thus, it appears to the application that it is on the Internet. This is necessary for protocols such as FTP, in which the client sends its local IP address to a server, in order for the server to initiate a new TCP connection back to the client.

When an internal application attempts to listen for a TCP connection initiated by an Internet application, WinSock Proxy uses the local IP address to which the application’s socket is bound to determine whether the listen should be redirected. If the local IP address is that of the internal computer’s interface (a private network IP address), the listen will be local (passed to the Windows Sockets DLL). If the IP address bound to the socket is that of the Microsoft Proxy Server’s Internet interface, the listen will be redirected.

When a listen() API is redirected, WinSock Proxy does the following:

Listens for a socket connection on the WinSock Proxy service’s Internet IP address and the same port specified as the local port in the internal application’s socket bind.
When an external site connects to the port, creates a socket connection between the internal application (on the local port specified by the application), and the WinSock Proxy service (on its internal IP address and an arbitrary port). This connection is initiated by the WinSock Proxy service, because the internal application is listening for an incoming connection.

Once an internal application’s socket is bound to the Microsoft Proxy Server’s Internet IP address and an inbound connection is established, a getsockname() API call by the application will return the proxy’s Internet IP address as the local IP address, and a getpeername() API call will return the Internet site’s IP address and port as the remote IP address and port.

The following illustration shows WinSock Proxy redirecting a TCP connection.

UDP Redirection

UDP offers connectionless communications, and supports multiple applications communicating with an application over the same UDP socket. A UDP-based application uses sendto() to send data, specifying the destination IP address, and recvfrom() to receive data, returning the source IP address.

When an internal application binds a UDP socket, the WinSock Proxy service binds a UDP socket to its Internet IP address, and the same local port as used by the client. This is the socket used for communications between all Internet peers for the internal application.

When an internal client computer receives a packet over UDP from the Internet:

The packet was actually forwarded by the WinSock Proxy service, and the source address will be that of the computer running Microsoft Proxy Server.
The WinSock Proxy client DLL needs to change the source port and IP address to that of the actual Internet source before the internal application receives the data. However, the problem is that for UDP there can be multiple sources of data sent to one destination socket.

In other implementations, this problem is sometimes handled by having the proxy service add a header (containing the original source port and IP address) to the data before forwarding it to the internal client. The client DLL then strips off this header and modifies the source IP address and port passed to the application. This solution requires significant work on every data packet, including a buffer copy, and may even result in a buffer larger than the maximum allowed size. In that case, splitting the data into multiple packets must be supported, as well as ordering and recombining them at the destination. Also, this solution prevents Win32 APIs from working. (On Windows NT, Win32 APIs are not passed to Windows Sockets.)

Instead, the problem of multiple-source IP addresses is solved by creating a separate UDP socket in Microsoft Proxy Server for each Internet peer sending data to the client. Each time the first data packet is received from a new Internet port and IP address, WinSock Proxy service creates a new UDP socket on a different local port, in Microsoft Proxy Server (bound to the proxy’s internal IP interface). The WinSock Proxy service maintains a table that maps Internet ports and IP addresses to the port number of the WinSock Proxy server’s socket for that Internet site. Each time it changes, the mapping table is forwarded to the WinSock Proxy client DLL by using the control channel.

When the WinSock Proxy service receives data from an Internet application destined for the client, it sends the data to the client by using the associated socket on the Microsoft Proxy Server. The WinSock Proxy client DLL looks at the source (remote) port number of the data packet (proxy-server port number), and uses the table to map that to an Internet application’s port and IP address. The internal application is handed the Internet port and IP address as the source.

The result is that handling UDP communications:

Does not require extra control channels.
Does not cause data packets to be modified.
Does not require use of the control channel when data is sent from an Internet peer that the WinSock Proxy service already knows about.

Win32 APIs also work for reading and writing the socket.

When the internal application sends data to one of the remote peers, the WinSock Proxy DLL uses the mapping table to map the destination port and IP address (specified by the internal application) to a WinSock Proxy server port, and sends the data to the appropriate UDP socket (port) on the WinSock Proxy server computer.
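On the client side, the mapping table is used in both directions. The following Python sketch shows one way such a table could be represented; the class name UdpPortMap, its methods, and the sample addresses are hypothetical and are not taken from the product.

```python
class UdpPortMap:
    """Client-side copy of the mapping table received over the control channel."""

    def __init__(self):
        self._peer_by_port = {}  # proxy server port -> (peer_ip, peer_port)
        self._port_by_peer = {}  # (peer_ip, peer_port) -> proxy server port

    def update(self, proxy_port, peer):
        """Record one mapping entry sent by the service."""
        self._peer_by_port[proxy_port] = peer
        self._port_by_peer[peer] = proxy_port

    def source_for(self, proxy_port):
        """Inbound: translate the proxy port a packet came from into the real source."""
        return self._peer_by_port[proxy_port]

    def proxy_port_for(self, peer):
        """Outbound: find the proxy port that stands in for an Internet peer."""
        return self._port_by_peer[peer]
```

Because the lookup is keyed only on port numbers and addresses, the data packets themselves are passed through unchanged.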

Because this mechanism requires a new socket for each Internet peer application, extra resources are used in the Microsoft Proxy Server when an internal application uses UDP to communicate with many remote peers. Most Internet client applications that use UDP (RealAudio, VDOLive, and so on), communicate with a single server application, so this is an efficient trade-off. For other UDP client applications, the number of servers communicating with the client is usually small. (WinSock Proxy will limit the number of mappings per socket, keeping the most recently used, and re-establishing mappings as needed.)
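A bounded table that keeps the most recently used mappings can be sketched as follows. The actual limit and eviction details used by WinSock Proxy are not documented here, so the limit parameter and the class name RecentMappings are assumptions made for illustration.

```python
from collections import OrderedDict

class RecentMappings:
    """Keep at most `limit` peer mappings, dropping the least recently used."""

    def __init__(self, limit):
        self.limit = limit
        self._entries = OrderedDict()  # peer -> proxy port, oldest first

    def touch(self, peer, proxy_port):
        """Record traffic with a peer; return the evicted peer, if any."""
        self._entries[peer] = proxy_port
        self._entries.move_to_end(peer)
        if len(self._entries) > self.limit:
            evicted, _ = self._entries.popitem(last=False)
            return evicted
        return None

    def peers(self):
        """Return the retained peers, least recently used first."""
        return list(self._entries)
```

An evicted peer is not lost permanently: if it sends data again, a new mapping is simply established for it.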

When an internal application calls getsockname() for a remoted UDP socket, the local IP address returned is that of the proxy’s Internet interface.

The following illustration depicts redirected UDP communications that use WinSock Proxy.

Using TCP/IP on the Internal Network

When the internal network runs TCP/IP, a TCP/IP application could try to communicate with a local (internal network) or remote (Internet) application. When the WinSock Proxy client DLL initializes, it receives from the WinSock Proxy service, using the control channel, a Local Address Table, which contains a list of IP addresses and subnets that are located on the private network. Future communication attempts by applications on the client computer with a specific IP address can be routed locally or remotely by the WinSock Proxy DLL, as appropriate. If communication is attempted with a local IP address, the WinSock Proxy DLL simply forwards the request to the real Windows Sockets DLL, with no special processing.

In some cases an application attempts to communicate, but the WinSock Proxy DLL cannot determine whether the application is trying to communicate with a local computer or a remote computer. For example, a typical server application binds a socket to all local IP addresses (IP_ANY), and then listens on that socket. If WinSock Proxy cannot in any way determine if a listen() API should be local or remote, it will assume local (the more secure of the two).

If multiple internal servers are to be set up to do redirected listen() APIs on the same port at the same time, they must use different IP addresses on the Internet interface of the computer running Microsoft Proxy Server. For example, if two servers running Microsoft Exchange (IMC) internally will be listening on the SMTP port (port 25), the Microsoft Proxy Server must have two IP addresses assigned to its Internet interface, one for each computer running Exchange (IMC) server. Because both internal servers listen() on the same port, and the two listen() instances must be redirected to the same port on the proxy computer, the only way to distinguish them on the proxy is by using different IP addresses.
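The constraint follows from the fact that a listening endpoint is identified by its IP address and port together. The sketch below models this with a table keyed on that pair; the class name ListenRegistry, the host names, and the addresses are hypothetical.

```python
class ListenRegistry:
    """Track which internal server owns each (Internet IP, port) endpoint."""

    def __init__(self):
        self._owners = {}

    def register(self, internet_ip, port, internal_host):
        """Claim an endpoint on the proxy's Internet interface."""
        key = (internet_ip, port)
        if key in self._owners:
            # Two redirected listens cannot share the same address and port.
            raise ValueError(f"{internet_ip}:{port} is already redirected")
        self._owners[key] = internal_host

    def owner(self, internet_ip, port):
        """Return the internal server that receives connections to this endpoint."""
        return self._owners[(internet_ip, port)]
```

With two Internet IP addresses, both internal servers can register port 25; with only one address, the second registration fails.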

Using IPX/SPX on the Internal Network

If the internal network does not run TCP/IP, it is assumed that all attempts by Windows Sockets applications to communicate over TCP/IP are to be redirected to the Internet.

When the internal network runs IPX/SPX, the principle of how communications are redirected is identical to that used when the internal network runs TCP/IP. When an internal application attempts to establish communications over TCP/IP, the WinSock Proxy DLL changes the Windows Sockets API parameters to those appropriate for IPX/SPX (address reformatting, and so on), and the communications between client and WinSock Proxy server are handled over IPX/SPX. The WinSock Proxy server acts as a protocol gateway, converting between IPX/SPX on the private network and TCP/IP on the Internet. In addition to standard redirection functionality, the following tasks are accomplished by the WinSock Proxy DLL when establishing remote communications with IPX/SPX:

socket() API. When an application specifies a protocol of TCP or UDP in a socket() API call, the WinSock Proxy DLL changes this to the appropriate IPX/SPX protocol.
bind() API. When an application specifies a local IP address to bind a socket to, WinSock Proxy converts this to a local IPX/SPX address. A request to bind to IP_ANY is also converted.
connect() API. When an application attempts to connect to an Internet application, the address passed to the Windows Sockets DLL needs to be the IPX address of the Microsoft Proxy Server’s internal interface.
sendto() API. The destination IP address needs to be converted to the IPX address of the internal interface of the computer running Microsoft Proxy Server.
recvfrom() API. The source address returned needs to be converted from the IPX address of the computer running Microsoft Proxy Server to the IP address of the Internet application.
The control channel uses IPX instead of UDP.

Windows Sockets APIs

socket()
bind()
connect()
listen() and accept()
recv() and send()
recvfrom() and sendto()

socket()

The socket() API is used by applications to establish a socket and associate it with a protocol (TCP, UDP, and so on). The socket() API requires no special processing by the WinSock Proxy DLL if the internal network runs TCP/IP; the API is simply passed to the standard Windows Sockets DLL for the local creation of the socket. If the local network runs IPX/SPX, the WinSock Proxy DLL needs to change the protocol specified in the socket() API call (UDP or TCP) to the appropriate IPX/SPX protocol.

bind()

After calling the socket() API, clients may call the bind() API to bind the socket to a specific local interface (IP address) and port. This API is intercepted by the WinSock Proxy DLL and forwarded to the WinSock Proxy service using the control channel, when one of the following is true:

The private network is running IPX, and not TCP/IP.
The private network is running TCP/IP, but the specified IP address is not in the LAT, and is not ADDR_ANY.
The application configuration specifies that bind is remote.
The application already has a connection using the proxy server, and the bind is not to PORT_ANY and ADDR_ANY. (Note that some FTP clients can redirect even when they bind to PORT_ANY and ADDR_ANY.)
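The conditions above can be expressed as a single decision function. The sketch below is one reading of the list, not the DLL's actual logic: the function name and parameters are hypothetical, and the last condition is interpreted as "the bind is not a wildcard on both address and port".

```python
ADDR_ANY = "0.0.0.0"
PORT_ANY = 0

def bind_is_remote(addr, port, *, uses_ipx, addr_in_lat,
                   config_remote=False, has_proxy_connection=False):
    """Decide whether a bind() call is forwarded to the WinSock Proxy service."""
    if uses_ipx:
        return True  # private network runs IPX, not TCP/IP
    if addr != ADDR_ANY and not addr_in_lat:
        return True  # explicit address that is not on the private network
    if config_remote:
        return True  # application configuration forces a remote bind
    wildcard = addr == ADDR_ANY and port == PORT_ANY
    if has_proxy_connection and not wildcard:
        return True  # application already uses the proxy, specific bind
    return False
```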

The WinSock Proxy service, in preparation for an attempt by the application to create a remote connection using this socket, creates one socket on the gateway computer for a UDP socket, and two sockets on the gateway computer for a TCP socket. This is done by calling the socket() and bind() APIs for the one or two new sockets. One new socket will be bound to the same port number that the client specified in its bind(), and the IP address of the gateway computer’s Internet interface. For TCP, the second socket is bound to the IP address of the gateway computer’s internal interface (and an arbitrary port). The client’s socket bind() request will then be passed to the Windows Sockets DLL on the client computer, for normal processing.

connect()

An application uses the connect() API to initiate an outbound TCP connection to a remote IP address and port pair. If the WinSock Proxy DLL determines, by looking up the IP address in the downloaded Local Address Table, that the application is attempting to connect to a remote (Internet) site (or if the local network runs IPX/SPX), the DLL forwards the request to the WinSock Proxy service using the control channel.

The WinSock Proxy service performs these actions:

It calls the connect() API on the socket that was previously bound to the Internet interface.
It calls the listen() and accept() APIs on the other socket, which was bound to the interface on the internal network, to establish the connection with the client.

The WinSock Proxy DLL passes the connect() API to the Windows Sockets DLL, but first changes the IP address of the remote computer to that of the gateway computer’s internal interface (or converts it to the gateway computer’s IPX address, if the local network runs IPX/SPX). The listen() and accept() APIs used by the WinSock Proxy service will complete the establishment of this connection.

The result is that the WinSock Proxy service on the gateway computer has two socket endpoints that represent communications channels with the two communicating applications.

listen() and accept()

When a client allows an inbound connection from a remote computer, it calls the listen() API. If the WinSock Proxy DLL determines from the Local Address Table, or configuration information, that the client is attempting to establish a connection with an Internet computer, the DLL forwards the listen() API to the WinSock Proxy service by using the control channel. listen() will be redirected only if the bind() call was remoted.

The WinSock Proxy service will do a listen() on the socket bound to its Internet interface when the client did its bind(). When the remote application attempts to connect to the WinSock Proxy service’s socket, the service will do an accept() to complete the connection process. The service will then do a connect() on the internal socket to establish a connection with the internal client application. The client application will then call the accept() API to complete the connection process.

The result is that the WinSock Proxy service on the gateway computer has two socket endpoints that represent the two communicating applications.

Note that if inbound access is not allowed, or if a site attempts to connect and that site is filtered (by using Microsoft Proxy Server Internet site filtering), the connection is discarded and will not reach the client.

recv() and send()

Once a TCP socket connection is established between internal application and the WinSock Proxy service, and a corresponding connection is established between WinSock Proxy service and remote application, the client can receive and send data with the recv() and send() APIs.

The WinSock Proxy service uses the recv() API on both connections for receiving data packets. When a client sends data, the data is actually sent to the WinSock Proxy service, because it has the socket endpoint of the client’s connection. The WinSock Proxy service simply receives the data on that connection, and sends it to the remote computer using the other connection.

When the remote application sends data to the client, the WinSock Proxy service receives this data and sends it to the client application.

Receiving and sending data on the client computer requires no special handling on the part of the WinSock Proxy DLL. The DLL simply passes these API calls on to the (renamed) Windows Sockets DLL. The WinSock Proxy service does all of the special handling, by passing the data to the associated connection. This results in high performance when sending and receiving data on the client. The Win32 APIs for reading and writing files, when applied to a redirected socket, will work successfully.

recvfrom() and sendto()

The recvfrom() and sendto() APIs are most often used with UDP connectionless communications. The sendto() API requires an IP address and port destination, and the recvfrom() API returns an IP address and port of the originator of the data.

When the internal application does a bind() on a UDP socket, the WinSock Proxy service binds a socket on the gateway computer’s external (Internet) interface, to send and receive data to and from remote applications. Once this socket is bound, remote servers can send data, destined for the internal application. Each time a UDP data packet is received from a new IP address and port pair, the WinSock Proxy service creates a new UDP socket on the gateway computer and binds it to a different port on the gateway’s internal interface.

The WinSock Proxy service maintains a mapping table of remote IP address and port pairs (that have sent data) with the port number of the corresponding internal-interface socket in the gateway computer. This mapping table is downloaded to the WinSock Proxy DLL on the client computer by using the control channel.
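On the service side, the table grows only when a previously unseen peer sends data. The following sketch models that behavior; the class name ServiceUdpTable and the allocate_port callable are hypothetical stand-ins for creating and binding a new socket on the internal interface.

```python
class ServiceUdpTable:
    """Service-side table of remote peers and their internal-interface ports."""

    def __init__(self, allocate_port):
        # allocate_port is a hypothetical callable that stands in for
        # creating a UDP socket bound to a new port on the internal interface.
        self._allocate_port = allocate_port
        self._port_by_peer = {}

    def port_for(self, peer):
        """Return (internal_port, is_new) for a remote (ip, port) pair.

        is_new is True when the table changed, which is the point at which
        the updated table would be sent to the client over the control channel.
        """
        if peer in self._port_by_peer:
            return self._port_by_peer[peer], False
        port = self._allocate_port()
        self._port_by_peer[peer] = port
        return port, True

    def snapshot(self):
        """Return a copy of the table suitable for sending to the client."""
        return dict(self._port_by_peer)
```

Packets from a known peer return is_new as False, so no control-channel traffic is needed for them.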

When a UDP data packet is received from a remote computer, the WinSock Proxy service looks up the internal socket (based on the remote computer’s IP address and port) to use for that remote computer, and sends the data to the internal computer by using the corresponding socket. The WinSock Proxy DLL on the client computer will receive the data packet from the gateway computer’s IP address, and the port that it was sent from can be used to look up the IP address, and port, of the originating remote computer. The WinSock Proxy DLL will replace the source information, making it appear that the data came directly from the remote computer.

When the client (internal) application sends data to a remote computer, the WinSock Proxy DLL intercepts the request and modifies the destination to send it to the WinSock Proxy service by using one of the WinSock Proxy service’s sockets bound to the internal IP interface. In order to determine which WinSock Proxy service socket to use, the WinSock Proxy DLL looks up the final destination IP address and port pair in the UDP mapping table, and that indicates which WinSock Proxy service socket (port) to send to.

WinSock Proxy Limitations

Version 1.0 of Microsoft Proxy Server can redirect Windows Sockets 1.1 applications. Redirection of Windows Sockets 2.0 APIs or applications is not supported. Almost all Windows Sockets 1.1 TCP/IP applications can be redirected. This section describes limitations that may prevent specific protocols or applications from working through the WinSock Proxy service.

When an internal application receives an inbound TCP connection from an Internet site, the internal application’s bind() API needs to be redirected to the WinSock Proxy service computer’s Internet interface. If multiple internal computers will be running that same application, and therefore listening on the same port at the same time, each one needs to use a different Internet IP address on the WinSock Proxy server in order to distinguish them (because the port is the same).

There are a small number of APIs that cannot be handled properly in a redirected environment. Following is a list of APIs that will not be redirected properly:

DuplicateHandle()
getsockopt()

Note The getsockopt() API returns the local information, which is usually equal to the remote information.

Also, WinSock Proxy does not support Out-Of-Band (OOB) data transfer. In general, OOB is implementation-dependent and may not work between different network stacks.