5.7.1 The .COM File Format

A .COM file contains an absolute image of a program—that is, the exact processor instructions and data that must be in memory in order to run the program. MS-DOS loads the .COM program by copying this image directly from the file into memory; it makes no changes.

To load a .COM program, MS-DOS first attempts to allocate memory. Since a .COM program must fit in one 64K segment, the size of the .COM file must not exceed 65,024 bytes (64K minus 256 bytes for a PSP and at least 256 bytes for an initial stack). If MS-DOS cannot allocate enough memory for the program, a PSP, and an initial stack, the attempt fails. Otherwise, MS-DOS allocates as much memory as possible (up to all remaining memory), even though the .COM program itself cannot be greater than 64K. Before attempting to run other programs or allocate additional memory, most .COM programs free any unneeded memory.
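The size limit above is simple arithmetic, sketched here in Python as an illustration. The constant and function names are hypothetical and are not part of MS-DOS.

```python
# Illustrative sketch only; names are hypothetical, values come from the text.
SEGMENT_SIZE = 64 * 1024   # one 64K segment (65,536 bytes)
PSP_SIZE = 256             # bytes reserved for the PSP
MIN_STACK_SIZE = 256       # minimum initial stack

# Largest .COM file that still leaves room for the PSP and a minimal stack.
MAX_COM_FILE_SIZE = SEGMENT_SIZE - PSP_SIZE - MIN_STACK_SIZE


def com_file_fits(file_size: int) -> bool:
    """Return True if a .COM file of file_size bytes is within the limit."""
    return file_size <= MAX_COM_FILE_SIZE
```

A program at this limit has only the minimum 256-byte stack, so a program that uses the stack heavily needs a correspondingly smaller image.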

After allocating memory, MS-DOS builds a PSP in the first 256 bytes of that memory, setting the AL register to 00h if the first FCB in the PSP contains a valid drive identifier or to 0FFh if it does not. MS-DOS also sets the AH register to 00h or to 0FFh, depending on whether the second FCB contains a valid drive identifier.
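The following Python sketch shows how the two FCB checks combine into the initial value of the AX register. The function name and its Boolean parameters are hypothetical; the sketch assumes the drive-identifier checks have already been made and models only the resulting register value.

```python
def initial_ax(first_fcb_drive_valid: bool, second_fcb_drive_valid: bool) -> int:
    """Return the 16-bit AX value implied by the two FCB drive checks."""
    al = 0x00 if first_fcb_drive_valid else 0xFF    # AL reflects the first FCB
    ah = 0x00 if second_fcb_drive_valid else 0xFF   # AH reflects the second FCB
    return (ah << 8) | al                           # AH is the high byte of AX
```

For example, a valid first drive identifier combined with an invalid second one gives AX = 0FF00h.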

After building the PSP, MS-DOS loads the .COM file, starting immediately after the PSP (offset 100h). It sets the SS, DS, and ES registers to the segment address of the PSP and then creates a stack. To create a stack, MS-DOS sets the SP register to 0000h if at least 64K of memory has been allocated; otherwise, it sets the register to two more than the total number of bytes allocated. Finally, it pushes 0000h onto the stack to ensure compatibility with programs designed for very early versions of MS-DOS. Such programs end with a near return, which pops this value and transfers control to offset 0000h in the PSP, where a terminate-program instruction (int 20h) is located.

MS-DOS starts the program by transferring control to the instruction at offset 100h. Programmers must ensure that the first instruction in the .COM file is the program's entry point.
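The loading steps described above can be modeled in a few lines of Python. This is a simplified sketch and not the MS-DOS loader itself: it assumes a full 64K segment has been allocated, leaves the PSP contents zeroed, and uses the hypothetical names LoadedProgram and load_com_image. The case in which less than 64K is available is not modeled.

```python
from dataclasses import dataclass

PSP_SIZE = 0x100        # the PSP occupies offsets 0000h through 00FFh
SEGMENT_SIZE = 0x10000  # this sketch assumes a full 64K allocation
MIN_STACK_SIZE = 0x100  # minimum initial stack


@dataclass
class LoadedProgram:
    memory: bytearray   # the single segment addressed by SS, DS, and ES
    sp: int             # stack pointer after the initial push
    ip: int             # offset at which execution starts


def load_com_image(image: bytes) -> LoadedProgram:
    """Copy a .COM image to offset 100h and set up the initial stack."""
    if len(image) > SEGMENT_SIZE - PSP_SIZE - MIN_STACK_SIZE:
        raise ValueError("image too large for a .COM program")

    memory = bytearray(SEGMENT_SIZE)                 # PSP left zeroed here
    memory[PSP_SIZE:PSP_SIZE + len(image)] = image   # unmodified copy of the file

    sp = 0x0000                                      # 64K case: SP starts at 0000h
    sp = (sp - 2) & 0xFFFF                           # push wraps SP to 0FFFEh
    memory[sp:sp + 2] = (0x0000).to_bytes(2, "little")  # the pushed word 0000h

    return LoadedProgram(memory=memory, sp=sp, ip=PSP_SIZE)
```

In this model, loading a one-byte image leaves that byte at offset 100h, SP at 0FFFEh, and the starting offset at 100h.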

Notice that, because the program is loaded at offset 100h, all code and data offsets must be relative to 100h. Assembly-language programmers can ensure this by setting the program's origin to 100h (for example, by using the statement org 100h at the beginning of the source program).
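To illustrate the effect of the 100h origin, the following Python sketch assembles a very small .COM image by hand. The function name build_hello_com is hypothetical, and the program's use of Interrupt 21h Functions 09h (display a "$"-terminated string) and 4Ch (end program) is an assumption made for the example rather than something described in this section. The point to notice is that the address placed in DX is the data's position in the file plus 100h.

```python
ORIGIN = 0x100  # offset at which MS-DOS loads the first byte of the file


def build_hello_com(message: bytes = b"Hello, world!\r\n") -> bytes:
    """Return the bytes of a minimal .COM program that displays message."""
    code_length = 12                        # size of the instructions below
    message_offset = ORIGIN + code_length   # file position 12 becomes offset 10Ch

    code = bytes([
        0xBA, message_offset & 0xFF, message_offset >> 8,  # mov dx, message_offset
        0xB4, 0x09,                                        # mov ah, 09h
        0xCD, 0x21,                                        # int 21h
        0xB8, 0x00, 0x4C,                                  # mov ax, 4C00h
        0xCD, 0x21,                                        # int 21h
    ])
    assert len(code) == code_length

    # The first byte of the file is the entry point; data follows the code.
    return code + message + b"$"
```

Writing the returned bytes to a file with a .COM extension would give an image that MS-DOS loads unchanged. If the origin were omitted and the message address computed as 000Ch, DX would point into the PSP instead of at the message.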