An Example .COM Program

The HELLO.COM program listed in Figure 3-3 demonstrates the structure of a simple assembly-language program that is destined to become a .COM file. (You may find it helpful to compare this listing with the HELLO.EXE program later in this chapter.) Because this program is so short and simple, a relatively high proportion of the source code is actually assembler directives that do not result in any executable code.

The NAME statement simply provides a module name for use during the linkage process. This aids understanding of the map that the linker produces. In MASM versions 5.0 and later, the module name is always the same as the filename, and the NAME statement is ignored.

The PAGE command, when used with two operands, as in line 2, defines the length and width of the listing page. These default respectively to 66 lines and 80 characters. If you use the PAGE command without any operands, a formfeed is sent to the printer and a heading is printed. In larger programs, use the PAGE command liberally to place each of your subroutines on a separate page for easy reading.
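As a brief illustration of the two forms of PAGE, the following fragment is a minimal sketch meant to sit inside a code segment. The procedure names get_key and show_msg are hypothetical and are not part of HELLO.COM.

```asm
        page    55,132          ; 55 lines per page, 132 characters per line

get_key proc    near            ; hypothetical first subroutine
        mov     ah,8            ; function 08h = read character, no echo
        int     21h             ; character is returned in AL
        ret
get_key endp

        page                    ; no operands: start a new listing page here

show_msg proc   near            ; hypothetical second subroutine
        mov     ah,9            ; function 09h = display '$'-terminated string
        int     21h             ; caller supplies DS:DX = address of string
        ret
show_msg endp
```

PAGE affects only the printed listing; it generates no code, so it can be added or removed without changing the program.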

The TITLE command, in line 3, specifies the text string (limited to 60 characters) that is to be printed at the upper left corner of each page. The TITLE command is optional and cannot be used more than once in each assembly-language source file.

 1:         name    hello
 2:         page    55,132
 3:         title   HELLO.COM--print hello on terminal
 5: ;
 6: ; HELLO.COM:    demonstrates various components
 7: ;               of a functional .COM-type assembly-
 8: ;               language program, and an MS-DOS
 9: ;               function call.
10: ;
11: ; Ray Duncan, May 1988
12: ;
13:
14: stdin   equ     0               ; standard input handle
15: stdout  equ     1               ; standard output handle
16: stderr  equ     2               ; standard error handle
17:
18: cr      equ     0dh             ; ASCII carriage return
19: lf      equ     0ah             ; ASCII linefeed
20:
21:
22: _TEXT   segment word public 'CODE'
23:
24:         org     100h            ; .COM files always have
25:                                 ; an origin of 100h
26:
27:         assume  cs:_TEXT,ds:_TEXT,es:_TEXT,ss:_TEXT
28:
29: print   proc    near            ; entry point from MS-DOS
30:
31:         mov     ah,40h          ; function 40h = write
32:         mov     bx,stdout       ; handle for standard output
33:         mov     cx,msg_len      ; length of message
34:         mov     dx,offset msg   ; address of message
35:         int     21h             ; transfer to MS-DOS
36:
37:         mov     ax,4c00h        ; exit, return code = 0
38:         int     21h             ; transfer to MS-DOS
39:
40: print   endp
41:
42:
43: msg     db      cr,lf           ; message to display
44:         db      'Hello World!',cr,lf
45:
46: msg_len equ     $-msg           ; length of message
47:
48:
49: _TEXT   ends
50:
51:         end     print           ; defines entry point

Figure 3-3. The HELLO.COM program listing.

Dropping down past a few comments and EQU statements, we come to a declaration of a code segment that begins in line 22 with a SEGMENT command and ends in line 49 with an ENDS command. The label in the leftmost field of line 22 gives the code segment the name _TEXT. The operand fields at the right end of the line give the segment the attributes WORD, PUBLIC, and 'CODE'. (You might find it helpful to read the Microsoft Macro Assembler manual for detailed explanations of each possible segment attribute.)

Because this program is going to be converted into a .COM file, all of its executable code and data areas must lie within one code segment. The program must also have its origin at offset 0100H (immediately above the program segment prefix), which is taken care of by the ORG statement in line 24.

Following the ORG statement, we encounter an ASSUME statement on line 27. The concept of ASSUME often baffles new assembly-language programmers. In a way, ASSUME doesn't "do" anything; it simply tells the assembler which segment registers you are going to use to point to the various segments of your program, so that the assembler can provide segment overrides when they are necessary. It's important to notice that the ASSUME statement doesn't take care of loading the segment registers with the proper values; it merely notifies the assembler of your intent to do that within the program. (Remember that, in the case of a .COM program, MS-DOS initializes all the segment registers before entry to point to the PSP.)
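To make the effect of ASSUME more concrete, here is a minimal sketch of a .COM-style source file in which the same instruction is assembled under two different ASSUME settings. The variable flag and the labels start and begin are hypothetical. The override behavior noted in the comments is what I would expect from MASM, but it is worth confirming in the listing file produced by your own assembler version.

```asm
_TEXT   segment word public 'CODE'

        org     100h

        assume  cs:_TEXT        ; promise only that CS points to _TEXT

start:  jmp     short begin     ; entry point at 0100h; skip over the data

flag    db      0               ; hypothetical variable, defined before use

begin:  mov     al,flag         ; DS is not assumed to reach _TEXT, so the
                                ; assembler should emit a CS: override here

        assume  ds:_TEXT        ; now promise that DS also points to _TEXT

        mov     al,flag         ; no override needed; default DS is used

        mov     ax,4c00h        ; exit, return code = 0
        int     21h             ; transfer to MS-DOS

_TEXT   ends

        end     start
```

Both loads fetch the same byte at run time, because MS-DOS sets CS and DS to the same value for a .COM program. The only difference is the extra prefix byte in the first instruction, which shows that ASSUME changes what the assembler generates and not what the registers contain.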

Within the code segment, we come to another type of block declaration that begins with the PROC command on line 29 and closes with ENDP on line 40. These two directives declare the beginning and end of a procedure, a block of executable code that performs a single distinct function. The label in the leftmost field of the PROC statement (in this case, print) gives the procedure a name. The operand field gives it an attribute. If the procedure carries the NEAR attribute, only other code in the same segment can call it, whereas if it carries the FAR attribute, code located anywhere in the CPU's memory-addressing space can call it. In .COM programs, all procedures carry the NEAR attribute.
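The NEAR attribute also determines how CALL and RET are assembled. The fragment below is a sketch of how print might call a second NEAR procedure in the same segment. The helper newline is hypothetical and does not appear in Figure 3-3, and the fragment is assumed to sit inside the _TEXT segment of a .COM program.

```asm
print   proc    near            ; entry point from MS-DOS

        call    newline         ; near call: only the return offset is pushed

        mov     ax,4c00h        ; exit, return code = 0
        int     21h             ; transfer to MS-DOS

print   endp

newline proc    near            ; hypothetical helper procedure

        mov     ah,2            ; function 02h = character output
        mov     dl,0dh          ; ASCII carriage return
        int     21h             ; transfer to MS-DOS

        mov     ah,2            ; reload function number to be safe
        mov     dl,0ah          ; ASCII linefeed
        int     21h             ; transfer to MS-DOS

        ret                     ; assembled as a near return, because the
                                ; enclosing PROC carries the NEAR attribute
newline endp
```

Had newline been declared FAR, the assembler would generate a far return that pops both an offset and a segment, and every caller would have to use a far call to match.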

For the purposes of this example program, I have kept the print procedure ridiculously simple. It calls MS-DOS Int 21H Function 40H to send the message Hello World! to the standard output device (normally the video screen), and calls Int 21H Function 4CH to terminate the program.
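A slightly less trusting version of print could check the result of the write. Function 40H returns with the carry flag set and an error code in AX if the call fails; otherwise AX holds the number of bytes actually written. The following sketch is a possible replacement for lines 29 through 40 and relies on stdout, msg, and msg_len from Figure 3-3. The label failed and the choice of 1 as the failure return code are assumptions made for illustration.

```asm
print   proc    near            ; entry point from MS-DOS

        mov     ah,40h          ; function 40h = write
        mov     bx,stdout       ; handle for standard output
        mov     cx,msg_len      ; length of message
        mov     dx,offset msg   ; address of message
        int     21h             ; transfer to MS-DOS

        jc      failed          ; carry set = error, AX = error code
        cmp     ax,msg_len      ; were all the bytes written?
        jne     failed          ; no, treat a short write as a failure

        mov     ax,4c00h        ; exit, return code = 0
        int     21h             ; transfer to MS-DOS

failed: mov     ax,4c01h        ; exit, return code = 1 (arbitrary choice)
        int     21h             ; transfer to MS-DOS

print   endp
```

A batch file can then test the outcome with IF ERRORLEVEL 1 after running the program.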

The END statement in line 51 tells the assembler that it has reached the end of the source file and also specifies the entry point for the program. If the entry point is not a label located at offset 0100H, the .EXE file resulting from the assembly and linkage of this source program cannot be converted into a .COM file.
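One common way to satisfy the entry-point requirement when data is placed ahead of the code is to make the first instruction a jump. The listing below is a minimal sketch of that arrangement, not a replacement for Figure 3-3; the label entry is hypothetical.

```asm
_TEXT   segment word public 'CODE'

        org     100h            ; .COM files always have an origin of 100h

        assume  cs:_TEXT,ds:_TEXT,es:_TEXT,ss:_TEXT

entry:  jmp     print           ; first instruction, located at offset 0100h

msg     db      0dh,0ah         ; message to display
        db      'Hello World!',0dh,0ah

msg_len equ     $-msg           ; length of message

print   proc    near

        mov     ah,40h          ; function 40h = write
        mov     bx,1            ; handle for standard output
        mov     cx,msg_len      ; length of message
        mov     dx,offset msg   ; address of message
        int     21h             ; transfer to MS-DOS

        mov     ax,4c00h        ; exit, return code = 0
        int     21h             ; transfer to MS-DOS

print   endp

_TEXT   ends

        end     entry           ; entry point is the label at offset 0100h
```

Naming print in the END statement here would place the entry point after the message data, at an offset other than 0100H, and the conversion to a .COM file would fail for the reason described above.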