Special Treatment of Far Functions

When you compile a C program, the compiler inserts prolog and epilog code in each function. This code sets up a “stack frame” for the function. The function uses the stack frame to retrieve parameters passed to the function on the stack and to temporarily store the function's own local variables. For a normal MS-DOS program, the prolog and epilog can be as simple as this:

    PUSH BP
    MOV BP, SP
    SUB SP, x
    [other program lines]
    MOV SP, BP
    POP BP
    RET y

I'm assuming here that stack overflow checking has been disabled. The value of x that the prolog subtracts from SP is the total size of the function's local nonstatic data (rounded up to the next even number).

After the three prolog instructions have been executed, the stack is organized as shown in Figure 7-4 (from higher addresses to lower). The function uses the BP register to reference both data passed to the function on the stack (which have positive offsets to BP) and local data declared within the function (which have negative offsets to BP). If the function uses the SI and DI registers, these registers will also be saved on the stack because they can be used by the caller for register variables. The DS register must also be preserved during a function call.

For a function declared as near, the return address is only one word—the offset address. For a far function, the return address is two words. A function declared as near or far normally has the same prolog and epilog, but the compiled function must take into account the size of the return address—one word or two—when accessing the parameters passed to the function.
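To make the one-word-versus-two-word difference concrete, here is a small C sketch (not compiler output) that computes where the parameter area begins relative to BP once the prolog has run. The names WORD_SIZE, first_param_offset, and param_slot_offset are mine, invented for illustration; the sketch assumes only the frame layout described above, with the saved BP at offset 0 and the return address directly above it.

```c
#include <assert.h>

enum { WORD_SIZE = 2 };  /* one 16-bit word */

/* Byte offset from BP of the first word above the return address,
   which is where the parameter area begins after PUSH BP / MOV BP, SP.
   (Hypothetical helper, for illustration only.) */
int first_param_offset(int is_far)
{
    int saved_bp = WORD_SIZE;                           /* word at [BP]      */
    int ret_addr = is_far ? 2 * WORD_SIZE : WORD_SIZE;  /* one or two words  */
    return saved_bp + ret_addr;
}

/* Byte offset from BP of the given word-sized parameter slot.
   Slot 0 is the word immediately above the return address. */
int param_slot_offset(int is_far, int slot)
{
    assert(slot >= 0);
    return first_param_offset(is_far) + slot * WORD_SIZE;
}
```

For a near function the parameter area starts at BP+4; for a far function it starts at BP+6. Which parameter occupies slot 0 depends on the order in which the caller pushed the parameters.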

At the end of the function, the value of SP is set equal to BP, and BP is popped off the stack to restore it to its original value. The function is now ready to return to the caller. This involves a near RET or a far RET, depending on whether the function is near or far. If the function is also declared as pascal, the y value in the code above is the total size of parameters passed to the function. Otherwise, y is not used, and the caller adjusts the stack pointer when the function returns.
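The bookkeeping in the normal prolog and epilog can be followed with a toy simulation. This is only a sketch under my own assumptions: the Machine structure, the 64-word stack, and the function names are hypothetical, the function body is empty, and the return offset pushed by the simulated CALL is an arbitrary value. It models a near function and reports the value of SP immediately after RET y, before any cleanup by the caller.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint16_t sp, bp;                /* simulated registers               */
    uint16_t stack[STACK_WORDS];    /* stack memory, indexed by address/2 */
} Machine;

static void push(Machine *m, uint16_t v) { m->sp -= 2; m->stack[m->sp / 2] = v; }
static uint16_t pop(Machine *m) { uint16_t v = m->stack[m->sp / 2]; m->sp += 2; return v; }

/* Size of the local data, rounded up to the next even number (the x value). */
uint16_t round_up_even(uint16_t n) { return (uint16_t)((n + 1u) & ~1u); }

/* Simulates calling a near function and returns SP right after RET y. */
uint16_t near_call_sp_after_ret(Machine *m, uint16_t param_bytes,
                                uint16_t local_bytes, int is_pascal)
{
    uint16_t x = round_up_even(local_bytes);
    uint16_t y = is_pascal ? param_bytes : 0;
    uint16_t i;

    assert(param_bytes % 2 == 0);
    assert(m->sp % 2 == 0 && m->sp <= 2 * STACK_WORDS);
    assert(m->sp >= param_bytes + 4 + x);

    for (i = 0; i < param_bytes; i += 2)
        push(m, 0);                 /* caller pushes the parameters      */
    push(m, 0x1234);                /* near CALL pushes the return offset */

    push(m, m->bp);                 /* PUSH BP    */
    m->bp = m->sp;                  /* MOV BP, SP */
    m->sp -= x;                     /* SUB SP, x  */

    /* [other program lines] */

    m->sp = m->bp;                  /* MOV SP, BP */
    m->bp = pop(m);                 /* POP BP     */
    (void)pop(m);                   /* RET pops the return offset...     */
    m->sp += y;                     /* ...and RET y discards y more bytes */
    return m->sp;
}
```

Starting with SP at 100 and 4 bytes of parameters, the simulated SP comes back to 100 for a pascal function. For a function that is not pascal it comes back to 96, leaving the 4 bytes of parameters for the caller to remove.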

That's a normal compilation. When you compile a Windows program with the -Gw switch (the Windows switch), every far function gets special prolog and epilog code that looks like this:

    PUSH DS
    POP AX ; set AX to DS
    NOP
    INC BP
    PUSH BP ; save incremented BP on stack
    MOV BP, SP
    PUSH DS ; save DS on stack
    MOV DS, AX ; set DS to AX
    SUB SP, x
    [other program lines]
    DEC BP
    DEC BP
    MOV SP, BP ; reset SP to point to DS on stack
    POP DS ; get back DS
    POP BP ; get back incremented BP
    DEC BP ; restore BP to normal
    RET y

Functions that are declared as near get the normal prolog and epilog even with the -Gw switch. Notice two points here: First, the prolog sets the value of AX equal to DS, saves the DS register on the stack, and then sets DS from AX. On exiting, DS is retrieved from the stack. As it stands, this code accomplishes nothing: DS ends up with the value it started with, so the instructions do no harm but take up space. Second, the previous value of the BP register is incremented before being saved on the stack and decremented after it is retrieved from the stack. This certainly looks mystifying right now. Figure 7-5 shows what the stack frame looks like for far functions compiled with the -Gw compiler switch after the prolog code is executed.
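You can step through the -Gw prolog and epilog with a toy simulation to confirm both observations. This is a sketch under my own assumptions: the Machine structure, the 64-word stack, and the name gw_far_call are hypothetical, the function body is empty, and the return address pushed by the simulated far CALL is arbitrary. Each statement is labeled with the instruction it stands for.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 64

typedef struct {
    uint16_t sp, bp, ds, ax;        /* simulated registers               */
    uint16_t stack[STACK_WORDS];    /* stack memory, indexed by address/2 */
} Machine;

static void push(Machine *m, uint16_t v) { m->sp -= 2; m->stack[m->sp / 2] = v; }
static uint16_t pop(Machine *m) { uint16_t v = m->stack[m->sp / 2]; m->sp += 2; return v; }

/* Simulates calling a far pascal function with y bytes of parameters and
   x bytes of local data. Returns the word held in the saved-BP slot of the
   stack frame while the function body would be running. */
uint16_t gw_far_call(Machine *m, uint16_t x, uint16_t y)
{
    uint16_t i, saved_bp_word;

    assert(x % 2 == 0 && y % 2 == 0);
    assert(m->sp % 2 == 0 && m->sp <= 2 * STACK_WORDS);
    assert(m->sp >= y + 4 + 2 + 2 + x);

    for (i = 0; i < y; i += 2)
        push(m, 0);                 /* caller pushes the parameters      */
    push(m, 0x1111);                /* far CALL pushes the segment...    */
    push(m, 0x1234);                /* ...and then the offset            */

    push(m, m->ds);                 /* PUSH DS    */
    m->ax = pop(m);                 /* POP AX     */
                                    /* NOP        */
    m->bp++;                        /* INC BP     */
    push(m, m->bp);                 /* PUSH BP    */
    m->bp = m->sp;                  /* MOV BP, SP */
    push(m, m->ds);                 /* PUSH DS    */
    m->ds = m->ax;                  /* MOV DS, AX */
    m->sp -= x;                     /* SUB SP, x  */

    saved_bp_word = m->stack[m->bp / 2];   /* [other program lines] */

    m->bp -= 2;                     /* DEC BP, DEC BP */
    m->sp = m->bp;                  /* MOV SP, BP */
    m->ds = pop(m);                 /* POP DS     */
    m->bp = pop(m);                 /* POP BP     */
    m->bp--;                        /* DEC BP     */
    (void)pop(m);                   /* far RET pops the offset...        */
    (void)pop(m);                   /* ...and the segment...             */
    m->sp += y;                     /* ...and RET y discards y bytes     */
    return saved_bp_word;
}
```

On return, SP, BP, and DS all hold the values they had before the call, and the word saved in the frame is the caller's BP plus 1.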

When LINK creates the program's .EXE file, it treats every far function in a moveable code segment as a “moveable entry point.” In the .EXE header, LINK builds an entry table that contains 6 bytes of information for every moveable entry point in the program. This information includes the segment ordinal number of the function (simply a sequential numbering of the program's segments) and the offset of the function within that segment. The entry table also includes the 2 bytes CDH and 3FH for every moveable entry point. These 2 bytes are actually the instruction INT 3FH. This same software interrupt shows up in non-Windows programs, where it is used for overlay management. In Windows, the interrupt performs a similar function in that it loads a program's code segment from disk into memory.

A flag in the entry table indicates whether the far function was listed in the EXPORTS section of the module definition (.DEF) file. As I've discussed, the EXPORTS section of the .DEF file must list all far functions in your program that are called by Windows. These include window procedures (such as the function I've been calling WndProc), call-back functions, window subclassing functions, and dialog box functions.
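The 6 bytes for one moveable entry point can be pictured with a short C decoder. Treat this as a sketch: the text above says only what the 6 bytes contain, so the order of the fields (flag byte, INT 3FH, segment number, offset) and the use of bit 0 as the exported flag are my assumptions, based on the published description of the new-style .EXE header. The names MoveableEntry, ENTRY_EXPORTED, and decode_moveable_entry are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum {
    MOVEABLE_ENTRY_SIZE = 6,
    ENTRY_EXPORTED      = 0x01   /* assumed: bit 0 marks an exported function */
};

typedef struct {
    uint8_t  flags;     /* includes the exported flag                  */
    uint8_t  segment;   /* segment ordinal number                      */
    uint16_t offset;    /* offset of the function within the segment   */
} MoveableEntry;

/* Decodes one 6-byte record. Returns 1 on success, or 0 if the
   INT 3FH bytes (CDH, 3FH) are not where this sketch expects them. */
int decode_moveable_entry(const uint8_t raw[MOVEABLE_ENTRY_SIZE],
                          MoveableEntry *out)
{
    assert(raw != 0 && out != 0);

    if (raw[1] != 0xCD || raw[2] != 0x3F)
        return 0;

    out->flags   = raw[0];
    out->segment = raw[3];
    out->offset  = (uint16_t)(raw[4] | (raw[5] << 8));  /* low byte first */
    return 1;
}
```

Reading the record byte by byte, rather than overlaying a structure on it, sidesteps any padding a modern compiler might insert between the fields.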

When your program's code contains references to the addresses of far functions, LINK has to leave the instruction incomplete. For example, if you call a far function, the compiler generates a far CALL instruction, but the address is not yet known because LINK doesn't know where the segment containing the function will be loaded in memory. LINK builds a relocation table in the .EXE file at the end of each code segment. This table lists all the references to far functions within your program as well as all references to Windows functions.
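What it means for an instruction to be left incomplete can be shown with a far CALL, which is the opcode byte 9AH followed by a 2-byte offset and a 2-byte segment. The sketch below fills in those 4 bytes once an address is known. It is a deliberate simplification and my own assumption: patch_far_call is a hypothetical helper, and it does not reproduce the layout of the relocation table that LINK writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { FAR_CALL_OPCODE = 0x9A, FAR_CALL_LENGTH = 5 };

/* Fills in the address bytes of the far CALL that begins at code[at].
   (Hypothetical helper; not the actual relocation mechanism.) */
void patch_far_call(uint8_t *code, size_t code_len, size_t at,
                    uint16_t offset, uint16_t segment)
{
    assert(code != NULL);
    assert(at + FAR_CALL_LENGTH <= code_len);
    assert(code[at] == FAR_CALL_OPCODE);

    code[at + 1] = (uint8_t)(offset & 0xFF);    /* offset, low byte first  */
    code[at + 2] = (uint8_t)(offset >> 8);
    code[at + 3] = (uint8_t)(segment & 0xFF);   /* segment, low byte first */
    code[at + 4] = (uint8_t)(segment >> 8);
}
```

A relocation table can be thought of as a list of the places in a code segment where a fix-up of this kind is still required.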

Now that the compiler and LINK have done all these strange things to your program, it's time for Windows to run the program and do its magic.