The TransmitFile() API

Windows NT 3.51 adds a new Windows Sockets API called TransmitFile(). This API is designed to allow applications and services to send file data very quickly over the network. It works by taking handles to a connected socket and to an open file. Then, in kernel mode, it reads data directly from the system cache and passes it off to the transport protocol. This avoids all the buffer copies, context switches, and kernel transitions associated with the typical methods of sending file data.

Another advantage of TransmitFile() is that it allows "head" and "tail" buffers to be specified along with the file handle. These buffers are sent before and after the file data, respectively. This is very efficient because it allows the transport protocol to combine the head buffer with the first chunk of file data and the tail buffer with the last chunk of file data.
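A minimal sketch of such a call follows, in C. It is an illustration under stated assumptions, not the code used for the tests in this article. The function send_framed_file(), the helper format_head(), and the "FILE name size" / "END" framing are hypothetical names and formats invented here. Only TransmitFile() and the TRANSMIT_FILE_BUFFERS structure come from the API itself (on current SDKs they are declared in mswsock.h). The Windows-specific part is guarded so that the helper also compiles elsewhere.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _WIN32
#include <winsock2.h>
#include <mswsock.h>   /* TransmitFile(), TRANSMIT_FILE_BUFFERS; link with mswsock.lib */
#endif

/* Hypothetical framing: a one-line text header naming the file and its size.
   Returns the header length, or 0 if the header does not fit in buf. */
size_t format_head(char *buf, size_t cap, const char *name, unsigned long size)
{
    int n;
    assert(buf != NULL && cap > 0);
    n = snprintf(buf, cap, "FILE %s %lu\r\n", name, size);
    return (n > 0 && (size_t)n < cap) ? (size_t)n : 0;
}

#ifdef _WIN32
/* Sends head + whole file + tail on a connected socket with a single call.
   Assumes WSAStartup() has already succeeded and s is connected. */
BOOL send_framed_file(SOCKET s, const char *path, const char *name)
{
    static char tail[] = "END\r\n";          /* hypothetical trailer */
    char head[512];
    TRANSMIT_FILE_BUFFERS bufs;
    BOOL ok = FALSE;

    HANDLE hFile = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
    if (hFile == INVALID_HANDLE_VALUE)
        return FALSE;

    bufs.Head = head;
    bufs.HeadLength = (DWORD)format_head(head, sizeof head, name,
                                         GetFileSize(hFile, NULL));
    bufs.Tail = tail;
    bufs.TailLength = (DWORD)(sizeof tail - 1);

    if (bufs.HeadLength != 0) {
        /* 0, 0: send the entire file using the default send size.
           NULL overlapped: synchronous call from the file's current offset. */
        ok = TransmitFile(s, hFile, 0, 0, NULL, &bufs, 0);
    }

    CloseHandle(hFile);
    return ok;
}
#endif
```

Passing NULL in place of &bufs sends the file data alone, with no head or tail.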

The performance effect of TransmitFile() is significant, as shown in the following chart:

This chart compares three common mechanisms for sending file data: calling the ReadFile() and send() APIs; memory-mapping the file, setting SO_SNDBUF to 0, and using overlapped WriteFile() calls (thereby avoiding the buffer copies); and calling TransmitFile(). These tests were run with a 486/33 on two Ethernets with Compaq Netflex cards.

The chart shows that the costs of buffer copies, kernel transitions, and context switches make ReadFile() and send() the slowest way to send file data. Every I/O incurs two buffer copies (into the user buffer for ReadFile() and from the user buffer for send()) in addition to the kernel transitions and context switches.
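The loop in question can be sketched as follows, in C. This is an illustrative version, not the test program behind the chart. The names send_file_by_copy(), loop_iterations(), and buffer_copies() are hypothetical, and the 4096-byte buffer is an arbitrary choice. The two counting helpers simply restate the "two buffer copies for every I/O" arithmetic and compile on any platform.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Number of ReadFile()/send() iterations needed for a file of file_size
   bytes read buf_size bytes at a time (rounds up; 0 for an empty file). */
unsigned long loop_iterations(unsigned long file_size, unsigned long buf_size)
{
    assert(buf_size > 0);
    return file_size / buf_size + (file_size % buf_size != 0);
}

/* Two copies per iteration: into the user buffer in ReadFile(),
   out of the user buffer in send(). */
unsigned long buffer_copies(unsigned long file_size, unsigned long buf_size)
{
    return 2UL * loop_iterations(file_size, buf_size);
}

#ifdef _WIN32
/* The conventional loop: read a chunk into a user buffer, then send it.
   Assumes s is a connected socket and hFile is open for reading. */
BOOL send_file_by_copy(SOCKET s, HANDLE hFile)
{
    char buf[4096];
    DWORD got;

    for (;;) {
        int off = 0;
        if (!ReadFile(hFile, buf, sizeof buf, &got, NULL))
            return FALSE;
        if (got == 0)
            return TRUE;                  /* end of file */
        while (off < (int)got) {          /* send() may accept only part of a chunk */
            int n = send(s, buf + off, (int)got - off, 0);
            if (n == SOCKET_ERROR)
                return FALSE;
            off += n;
        }
    }
}
#endif
```

With these assumptions, a 1 MB file and a 4 KB buffer work out to 256 iterations and 512 buffer copies, plus at least 512 calls into the kernel.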

Using a memory-mapped file avoids a buffer copy into the user buffer when retrieving data, and using WriteFile() with SO_SNDBUF set to 0 avoids the buffer copy when sending the data. Therefore, this mechanism achieves much better performance, especially for larger files. However, it must still incur the costs of the kernel transitions and context switches.
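A reduced sketch of this mechanism is shown below, in C. It is not the 213-line test program; it keeps only one overlapped write in flight at a time, whereas a fuller version would probably keep several outstanding. The names send_file_mapped() and slice_length() and the 64 KB slice size are assumptions made for illustration. The sketch also assumes a non-empty file smaller than 4 GB and a connected socket opened for overlapped I/O.

```c
#include <assert.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Length of the slice starting at offset when total bytes are sent in
   slices of at most slice bytes. */
unsigned long slice_length(unsigned long total, unsigned long offset,
                           unsigned long slice)
{
    assert(offset <= total);
    return (total - offset < slice) ? total - offset : slice;
}

#ifdef _WIN32
/* Maps the file and writes the view to the socket without intermediate copies. */
BOOL send_file_mapped(SOCKET s, HANDLE hFile)
{
    DWORD total = GetFileSize(hFile, NULL), offset = 0, sent = 0;
    int zero = 0;
    BOOL ok = FALSE;
    HANDLE hMap;
    const char *view = NULL;
    OVERLAPPED ov;

    if (total == INVALID_FILE_SIZE)
        return FALSE;
    /* SO_SNDBUF = 0: no socket send buffering, so data is not copied
       out of the mapped pages before it is sent. */
    if (setsockopt(s, SOL_SOCKET, SO_SNDBUF, (const char *)&zero,
                   (int)sizeof zero) != 0)
        return FALSE;

    hMap = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 0, NULL);
    if (hMap == NULL)
        return FALSE;
    view = (const char *)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
    ZeroMemory(&ov, sizeof ov);
    ov.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);

    if (view != NULL && ov.hEvent != NULL) {
        ok = TRUE;
        while (ok && offset < total) {
            DWORD len = slice_length(total, offset, 65536UL);
            /* Overlapped write on the socket handle, then wait for it. */
            if (!WriteFile((HANDLE)s, view + offset, len, NULL, &ov) &&
                GetLastError() != ERROR_IO_PENDING)
                ok = FALSE;
            else
                ok = GetOverlappedResult((HANDLE)s, &ov, &sent, TRUE) && sent > 0;
            if (ok)
                offset += sent;
        }
    }

    if (ov.hEvent != NULL) CloseHandle(ov.hEvent);
    if (view != NULL) UnmapViewOfFile(view);
    CloseHandle(hMap);
    return ok;
}
#endif
```

Even in this reduced form, each slice still costs a WriteFile() call and a wait, which is where the remaining kernel transitions and context switches come from.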

The fastest method is TransmitFile(), which stays in kernel mode, using highly optimized cache manager functions to retrieve the data and tight code paths to hand this data to the transport protocol. Using TransmitFile(), the 486/33 is nearly able to saturate the full bandwidth of two Ethernets with file data.

Another advantage of TransmitFile() for high-performance file transfer is that it is relatively simple to use. The mapped file mechanism used above took 213 lines of source code, while TransmitFile() used only 41.