C 6.0a/6.0ax Generate Bad Code on Multiple Cast of a Constant

ID Number: Q72888

6.00a 6.00ax | 6.00a

MS-DOS | OS/2

buglist6.00a buglist6.00ax fixlist7.00

Summary:

Microsoft C versions 6.0a and 6.0ax truncate or sign extend all but

the lowest byte or word of an expression containing a constant in

certain cases involving multiple casts. If, within a single

expression, a constant is cast to (void *), (char *), or (int *), and

then ultimately cast to another type, all but the lowest byte or word

will be lost; the low byte remains with (void *) and (char *), the low

word remains with (int *). For example, incorrect code is generated

for the following statement:

unsigned long lval = (unsigned long)(char far *)0xffffff03L;

Because the intermediate cast is (char *), only the lowest byte is

retained and the value assigned to lval is 0x03.

More Information:

What actually occurs is that when the compiler performs the second

cast, it incorrectly considers the constant that was cast to a pointer

in the first cast to be the type the pointer points to, rather

than a pointer type. Thus, with multiple casts, a (void *) is

considered a char, a (char *) is considered a char, and an (int *) is

considered an int.
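
The effect described above can be modeled on any compiler with a short sketch. The names correct_cast, faulty_cast, and pointee_t are hypothetical and exist only for illustration; the sketch assumes a 16-bit int, as on the affected targets, and models truncation only, not sign extension.

```c
#include <stdint.h>

/* Pointee kinds named in the article; this enum is illustrative only. */
typedef enum { POINTEE_VOID, POINTEE_CHAR, POINTEE_INT } pointee_t;

/* What a correct compiler yields for (unsigned long)(T far *)constant:
   all 32 bits of the constant survive both casts. */
uint32_t correct_cast(uint32_t constant)
{
    return constant;
}

/* Model of the described problem: the pointer is treated as if it had
   the pointed-to type, so only that type's width survives.
   Assumption: int is 16 bits; sign extension is not modeled. */
uint32_t faulty_cast(uint32_t constant, pointee_t pointee)
{
    if (pointee == POINTEE_INT)
        return constant & 0xFFFFu;  /* (int *): low word remains */
    return constant & 0xFFu;        /* (void *), (char *): low byte remains */
}
```

With these definitions, faulty_cast(0xffffff03UL, POINTEE_CHAR) returns 0x03, which matches the value assigned to lval in the Summary, while correct_cast returns 0xffffff03.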

This type of multiple casting is often used in Presentation Manager

(PM) programming where a number of window class constants (such as

WC_BUTTON) may be cast to unsigned longs to eliminate compiler

warnings. Because these constants are defined as long constants cast

to far pointers, any cast of one of these constants to a long results

in the above scenario. For example, WC_BUTTON is defined in PMWIN.H as

follows:

#define WC_BUTTON ((PSZ)0xffff0003L)

PSZ is a typedef defined in OS2DEF.H as follows:

typedef unsigned char far *PSZ;

Thus, casting WC_BUTTON to an unsigned long results in the following

expression, which meets the conditions that cause the compiler to

"lose" the three highest bytes:

(unsigned long)(unsigned char far *)0xffff0003L

There are three possible workarounds for this problem:

1. Do not cast the constant. You will get a compiler warning (C4047)

but it will not affect the behavior of the program.

-or-

2. Add an additional intermediate cast so that the final cast will not

meet the conditions specified above.

-or-

3. In the PMWIN.H header file, change the cast in the definition of

WC_xxx from (PSZ) to (long _far *), where WC_xxx is the window class

constant in use.

These three workarounds are illustrated in the sample code below,

where workarounds 1, 2, and 3 above are demonstrated with the

variables a, b, and c, respectively. All three variables are assigned

the value 4294901763, while variable d shows the problem condition.
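
If many constants need the conversion, workaround 2 could be wrapped in a macro so that the extra intermediate cast is applied consistently. This is only a sketch: PTRCONST_TO_ULONG is a hypothetical name, not part of PMWIN.H or WINDOWS.H, and the M_I86 test assumes that symbol is predefined only by the 16-bit Microsoft compilers.

```c
/* Assumption: M_I86 is predefined only by the 16-bit Microsoft
   compilers. Elsewhere _far is not a keyword, so it is defined
   to expand to nothing and the code still compiles. */
#if !defined(M_I86)
#define _far
#endif

/* Hypothetical helper: workaround 2 as a macro. The added
   (long _far *) cast becomes the last pointer cast before the
   conversion to unsigned long, the same form as variable b in
   the sample code. */
#define PTRCONST_TO_ULONG(p) ((unsigned long)(long _far *)(p))
```

For example, PTRCONST_TO_ULONG(WC_BUTTON) expands to the same chain of casts used for variable b, so it should produce 4294901763 rather than 3.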

Note that this problem may also arise in Windows programming because

this same type of casting may occur with some common macros and

typedefs. For example, consider the following definitions in

WINDOWS.H:

typedef unsigned int WORD;

typedef unsigned long DWORD;

typedef char far *LPSTR;

#define MAKEINTRESOURCE(i) (LPSTR)((DWORD)((WORD)(i)))

#define IDC_IBEAM MAKEINTRESOURCE(32513)

If you cast IDC_IBEAM to an unsigned long, the value produced will be

0x01, when it should be 32513 (0x7F01). This occurs because the cast

ultimately results in the following expression:

(unsigned long)(char far *)((unsigned long)((unsigned int)(32513)))

This expression also meets the conditions described above and the

generated code retains only the lowest byte.
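
The arithmetic for IDC_IBEAM can be checked with a small sketch. The function names resource_id_expected and resource_id_observed are hypothetical; unsigned int is modeled as 16 bits and unsigned long as 32 bits, matching the WORD and DWORD typedefs on the affected targets.

```c
#include <stdint.h>

/* Value a correct compiler produces for
   (unsigned long)(char far *)((unsigned long)((unsigned int)(i))). */
uint32_t resource_id_expected(uint32_t i)
{
    uint16_t word = (uint16_t)i;  /* (WORD)(i): keep the low 16 bits */
    return (uint32_t)word;        /* (DWORD), then through the pointer and back */
}

/* Value the article reports: the (char far *) is treated as a char,
   so only the low byte of the expected value is kept. */
uint32_t resource_id_observed(uint32_t i)
{
    return resource_id_expected(i) & 0xFFu;
}
```

For i = 32513 the expected value is 0x7F01 and the observed value is 0x01, because 32513 is 0x7F01 and its low byte is 0x01.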

Microsoft has confirmed this to be a problem in C versions 6.0a and

6.0ax. This problem was corrected in C/C++ version 7.0.

Sample Code

-----------

/* Compile options needed: none

*/

#include <stdio.h>

void main( void)

{

unsigned long a, b, c, d;

a = (char _far*) 0xffff0003L; // produces C4047 warning

b = (unsigned long)(long _far*)(char _far*) 0xffff0003L;

c = (unsigned long)(long _far*) 0xffff0003L;

d = (unsigned long)(char _far*) 0xffff0003L; // problem line

printf( "\nOutput should be 4294901763 for all 4 values\n\n");

printf( "a = %lu\nb = %lu\nc = %lu\nd = %lu\n", a, b, c, d);

}

Program Output

--------------

Output should be 4294901763 for all 4 values

a = 4294901763

b = 4294901763

c = 4294901763

d = 3