How to Calculate a Segmented Executable Checksum Value

ID Number: Q71971

5.0x 5.10 5.11 5.13 | 5.0x 5.10 5.11 5.13

MS-DOS | OS/2

Summary:

In the header of an .EXE file created by LINK, there are two different
checksum fields. The first (at offsets 12-13h in the old .EXE header)
is a complemented checksum of all the words in the file. The second
(at offset 08h in the new .EXE header) is a 32-bit checksum of the
DWORDs in the file. The information below describes how to calculate
these checksums.

More Information:

The following information on how to calculate the first checksum is
from the "MS-DOS Encyclopedia" on page 122:

   12-13H (Complemented Checksum) This word contains the one's
   complement of the summation of all words in the .EXE file. Current
   versions of MS-DOS basically ignore this word when they load a .EXE
   program; however, future versions might not. When LINK generates a
   .EXE file, it adds together all the contents of the .EXE file
   (including the .EXE header) by treating the entire file as a long
   sequence of 16-bit words. During this addition, LINK gives the
   Complemented Checksum word (12-13H) a temporary value of 0000H. If
   the file consists of an odd number of bytes, then the final byte is
   treated as a word with a high byte of 00H. Once LINK has totaled
   all words in the .EXE file, it performs a one's complement
   operation on the total and records the answer in the .EXE file
   header at offsets 12-13H. The validity of a .EXE file can then be
   checked by performing the same word-totaling process as LINK
   performed. The total should be FFFFH, because the total will
   include LINK's calculated complemented checksum, which is designed
   to give the file the FFFFH total.
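
As a small in-memory illustration of the procedure quoted above, the following sketch totals a buffer as 16-bit words and derives the complemented checksum. The function names sum_words16 and complemented_checksum are hypothetical, and the sketch assumes that carries out of 16 bits are simply discarded and that words are stored low byte first.

```c
#include <stddef.h>
#include <stdint.h>

/* Total a buffer as little-endian 16-bit words, discarding carries
 * out of 16 bits (an assumption of this sketch). An odd trailing
 * byte is treated as a word whose high byte is 00H. */
uint16_t sum_words16(const unsigned char *buf, size_t len)
{
    uint16_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum = (uint16_t)(sum + (buf[i] | (buf[i + 1] << 8)));

    if (len & 1)
        sum = (uint16_t)(sum + buf[len - 1]);

    return sum;
}

/* Value to store at offsets 12-13H, given an image whose checksum
 * word currently holds the temporary value 0000H. */
uint16_t complemented_checksum(const unsigned char *buf, size_t len)
{
    return (uint16_t)~sum_words16(buf, len);
}
```

Once the value returned by complemented_checksum is stored low byte first at offsets 12-13H, running sum_words16 over the same bytes should return FFFFH.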

The sample code below shows one implementation of this.

Note that some versions of the Microsoft linker do not calculate the
checksum correctly if the /E (pack executable) or /CO (include
CodeView information) switch is used during linking. Because MS-DOS
does not check the checksum, this does not normally present a problem.

The old .EXE header checksum is used whenever a plain MS-DOS program
is created. However, for almost any other type of executable file
(Windows .EXEs and .DLLs, OS/2 1.x .EXEs and .DLLs, bound
applications), the 32-bit checksum is calculated for the new .EXE
header. This calculation is only slightly more complicated.

First, the checksum is not a true sum of all DWORDs in the physical
file, but rather of all the DWORDs in the executable part of it. As
resources, symbolic information, and so forth are added to the .EXE
proper, these items are merely appended to the end of the file. You
can find the size of the executable file by examining two words in the
old .EXE header: the word at offset 02h holds the number of bytes in
the last 512-byte page, and the word at offset 04h holds the page
count.
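
The size calculation from those two header words can be sketched as follows. The function name exe_image_size is hypothetical. The arithmetic mirrors what the sample program in this article does; the early return for a page count of zero is an added assumption of this sketch, not something the article specifies.

```c
#include <stdint.h>

/* last_page_bytes: word at offset 02h of the old .EXE header.
 * page_count:      word at offset 04h of the old .EXE header.
 * Returns the size in bytes of the executable part of the file. */
uint32_t exe_image_size(uint16_t last_page_bytes, uint16_t page_count)
{
    /* Assumption of this sketch: treat a zero page count as empty. */
    if (page_count == 0)
        return 0;

    /* A last-page size of zero means the final page is full. */
    if (last_page_bytes == 0)
        return (uint32_t)page_count * 512u;

    /* Otherwise all pages but the last are full. */
    return ((uint32_t)page_count - 1u) * 512u + last_page_bytes;
}
```

For instance, a page count of 3 with a last-page size of 100 gives 2 * 512 + 100 = 1124 bytes.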

Second, the 32-bit checksum is a sum of all DWORDs in the executable
file EXCEPT the checksum itself. The linker performs no complement
operation, so the result is NOT FFFFFFFFH when all the DWORDs are
added together.
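
The two rules above can be sketched over an in-memory copy of the executable part of the file. The names dword_at and sum_dwords32 are hypothetical. The sketch assumes the new header offset is a multiple of 4, so that the checksum field at new header offset + 08h falls on a DWORD boundary, and that any trailing bytes form a DWORD whose missing high-order bytes are zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a little-endian DWORD at `off`; bytes past `len` count as 0. */
static uint32_t dword_at(const unsigned char *buf, size_t len, size_t off)
{
    uint32_t v = 0;
    size_t k;

    for (k = 0; k < 4 && off + k < len; k++)
        v |= (uint32_t)buf[off + k] << (8 * k);

    return v;
}

/* Sum every DWORD in the first `len` bytes except the checksum
 * field itself, which sits at new_hdr_off + 08h. No complement
 * is applied to the result. */
uint32_t sum_dwords32(const unsigned char *buf, size_t len,
                      size_t new_hdr_off)
{
    uint32_t sum = 0;
    size_t off;

    for (off = 0; off < len; off += 4) {
        if (off == new_hdr_off + 8)
            continue;               /* skip the stored checksum */
        sum += dword_at(buf, len, off);
    }

    return sum;
}
```

The value returned can then be compared directly with the DWORD stored at new header offset + 08h.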

Sample Code
-----------

/* Compile options needed: none
*/

#include <stdio.h>
#include <stdlib.h>
#include <io.h>

void Calc16ChkSum(FILE *fp);
void Calc32ChkSum(FILE *fp);
void main(int, char **);

FILE * fp;
unsigned long NewHdrOffset, FileSize = 0L;
unsigned int PageCnt;

#define NEWHDROFFSET 0x3C   /* Location in Old Header with offset
                               of new header in EXE */

void main (int argc, char * argv[])
{
    if ( argc != 2 )
    {
        printf("\n\nUsage: %s <EXEfilename>\n\n", argv[0]);
        exit (-1);
    }

    if ( (fp = fopen (argv[1], "rb")) == NULL )
    {
        printf("\n\nError: Unable to open file : %s\n\n", argv[1]);
        exit (-1);
    }

    fread(&PageCnt, sizeof(int), 1, fp);   /* Read past the signature */
    fread(&PageCnt, sizeof(int), 1, fp);   /* Read the last page size */
    FileSize = PageCnt;
    fread(&PageCnt, sizeof(int), 1, fp);   /* Read the full page count */

    /* Cast before multiplying so the product is computed as a long
     * and cannot overflow a 16-bit int for files larger than 64K. */
    if ( FileSize == 0L )
        FileSize = (unsigned long)PageCnt * 512L;
    else
        FileSize += ((unsigned long)PageCnt - 1L) * 512L;

    fseek(fp, NEWHDROFFSET, SEEK_SET);   /* Locate the New EXE offset */
    fread(&NewHdrOffset, sizeof(long), 1, fp);   /* and read it */

    if ( NewHdrOffset == 0L )
        Calc16ChkSum(fp);
    else
        Calc32ChkSum(fp);

    fcloseall();
    return;
}

void Calc16ChkSum(FILE *fp)
{
    unsigned int  sum16, NxtInt;
    unsigned long x;   /* long, because the word count can exceed 65535 */
    unsigned char NxtChar;

    sum16 = 0;
    fseek(fp, 0, SEEK_SET);

    for ( x = 0L ; x < FileSize / 2L ; x++ )
    {
        fread(&NxtInt, sizeof(int), 1, fp);
        sum16 += NxtInt;
    }

    /* Make sure to get the last byte if the size is odd... */
    if ( FileSize % 2 )
    {
        fread(&NxtChar, sizeof(char), 1, fp);
        sum16 += (unsigned int)NxtChar;
    }

    printf("\nThe 16-bit checksum should be FFFF, it is %x\n\n",
           sum16);
}

void Calc32ChkSum(FILE *fp)
{
    unsigned long sum32, NxtLong, CheckSum32, x;
    unsigned char NxtByte, y;

    sum32 = 0;
    fseek(fp, 0, SEEK_SET);

    /* Calculate the number of DWORDs before the checksum, and add
     * them together.
     * (Note: The checksum will *always* start on a DWORD boundary.) */
    x = (NewHdrOffset + 8) / 4;
    for ( ; x ; x-- )
    {
        fread(&NxtLong, sizeof(long), 1, fp);
        sum32 += NxtLong;
    }

    /* Read the actual check sum... */
    fread(&CheckSum32, sizeof(long), 1, fp);

    /* Then the rest of the DWORDs in the file. */
    for ( x = 0 ; x < (FileSize - NewHdrOffset - 12) / 4; x++ )
    {
        fread(&NxtLong, sizeof(long), 1, fp);
        sum32 += NxtLong;
    }

    /* We have to account for the extra bytes in the file. Basically,
     * they are used to form a long with the high order bytes set to
     * zero. */
    if ( 0L != (x = FileSize % 4L) )
    {
        NxtLong = 0L;
        for ( y = 0 ; y < x ; y++ )
        {
            fread(&NxtByte, sizeof(char), 1, fp);
            NxtLong += (unsigned long)NxtByte << (8 * y);
        }
        sum32 += NxtLong;
    }

    printf("\nThe 32-bit checksum should be %lx, it is %lx\n\n",
           CheckSum32, sum32);
}
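
The sample above reads header fields with sizeof(int) and sizeof(long), which yields 2 and 4 bytes only on compilers with a 16-bit int and a 32-bit long. On other compilers, one possible approach is to read the fields byte by byte. The helper names read_le16 and read_le32 below are hypothetical; this is a sketch under the assumption that the header fields are stored low byte first, not part of the original sample.

```c
#include <stdio.h>
#include <stdint.h>

/* Read a little-endian 16-bit value from fp.
 * Returns 0 on success, -1 if fewer than 2 bytes could be read. */
int read_le16(FILE *fp, uint16_t *out)
{
    unsigned char b[2];

    if (fread(b, 1, 2, fp) != 2)
        return -1;

    *out = (uint16_t)(b[0] | (b[1] << 8));
    return 0;
}

/* Read a little-endian 32-bit value from fp.
 * Returns 0 on success, -1 if fewer than 4 bytes could be read. */
int read_le32(FILE *fp, uint32_t *out)
{
    unsigned char b[4];

    if (fread(b, 1, 4, fp) != 4)
        return -1;

    *out = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    return 0;
}
```

With helpers like these, the page-size and page-count words would be read with read_le16 and the new header offset at 3Ch with read_le32, and the return values checked so that a truncated file is reported rather than summed.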