PRB: Rounding Error Casting Double to Long

ID Number: Q12297

3.00 4.00 5.00 5.10 6.00 6.00a 6.00ax 7.00 | 5.10 6.00 6.00a

MS-DOS | OS/2

Summary:

SYMPTOMS

Casting a double expression directly to a long can give a different

result than casting the same value after it has been stored in a

double variable. In the following program example, Long2 and Long4

give incorrect results of -4049, while Long1 and Long3 give

correct results of -4050:

#include <stdio.h>

int main(void)

{

    long val1, val2, val3 ;

    double mul1, mul2 ;

    val1 = 45000 ;

    mul1 = 0.09 ;

    /* Storing the product in a double rounds it to 8 bytes. */

    mul2 = (double)val1 * mul1 * -1.00 ;

    printf("%7ld Long1 ",(long)mul2);

    val2 = (long)mul2 ;

    /* Casting the expression directly converts the 10-byte result. */

    printf("%7ld Long2 ",(long)((double)val1 * mul1 * -1.00));

    printf("%7ld Long3 ",val2);

    val3=(long)((double)val1*mul1*-1.00) ;

    printf("%ld Long4 \n",val3) ;

    return 0 ;

}

CAUSE

This is an arithmetic precision problem. The result "-4049" is

obtained by converting a 10-byte real to a long, and the result

"-4050" is obtained by first converting to an 8-byte real and then

converting to a long.

By the rules of type conversion:

The conversion from 10-byte to 8-byte form results in a rounding

of the number from -4049.99999999999985 to -4050.0. When that is

converted to a long, the result is -4050.

Conversion of a floating-point value to a long truncates toward

zero; for example, -4049.99999999999985 becomes -4049.
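
The truncation rule can be checked in isolation. The following is a
minimal sketch and is not part of the original example; the function
name truncate_to_long is hypothetical, and the sample values are
chosen to be exactly representable so that only the conversion rule
is exercised:

```c
/* Hypothetical helper: converts a double to a long with a plain cast.
   The C conversion discards the fractional part, that is, it
   truncates toward zero. Assumes the value fits in a long. */
long truncate_to_long(double d)
{
    return (long)d;
}
```

With this helper, truncate_to_long(-4049.75) yields -4049 and
truncate_to_long(4049.75) yields 4049. In both cases the fraction is
dropped rather than rounded.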

Many numbers (such as .09) cannot be represented exactly in any

finite number of binary digits. Therefore, any representation of these

numbers will come in a little under or a little over the "true"

value. When these values are involved in calculations, that

representation error propagates. This error is in the very lowest

digits, so the "correct" answer is found only when precision in

an intermediate value is lost.
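
The representation error in .09 can be made visible with exact
integer arithmetic. The following is a sketch under stated
assumptions: doubles are IEEE 754 8-byte values, and the compiler
provides a 64-bit long long (C99 or later, which the compiler
versions listed in this article do not). The function name
compare_with_nine_hundredths is hypothetical:

```c
/* Sketch, assuming IEEE 754 8-byte doubles and a 64-bit long long.
   Returns -1, 0, or 1 when the stored value of x is below, equal to,
   or above the true value 9/100. Intended only for x in the range
   [0.0625, 0.125), which contains 0.09. */
int compare_with_nine_hundredths(double x)
{
    /* 2^56; multiplying by a power of two does not round. */
    double two_to_56 = 72057594037927936.0;

    /* For x in [0.0625, 0.125), x * 2^56 is an integer below 2^53. */
    long long scaled = (long long)(x * two_to_56);

    /* Compare 100 * x * 2^56 with 9 * 2^56 using exact integers. */
    long long lhs = scaled * 100;
    long long rhs = 9LL << 56;

    return (lhs > rhs) - (lhs < rhs);
}
```

Under those assumptions, compare_with_nine_hundredths(0.09) returns
-1: the stored double is slightly below .09, which is consistent with
the intermediate product of -4049.99999999999985 described above.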

RESOLUTION

This is not a problem with Microsoft C. When converting from

floating-point numbers to integers in C, the rule is to truncate

towards zero. Therefore, when rounding is desirable, you should

add or subtract 0.5 so that truncation gives the required result.

The following is an example:

#define ROUNDL( d ) ((long)((d) + ((d) > 0 ? 0.5 : -0.5)))

printf("%7ld Long2 ",ROUNDL((double)val1 * mul1 * -1.00));
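
Because ROUNDL is a macro, its argument appears twice in the
expansion and is therefore evaluated twice. A function form avoids
that. The following is a sketch of the same add-or-subtract-0.5
technique, not a replacement prescribed by this article; the name
round_to_long is hypothetical:

```c
/* Hypothetical function form of ROUNDL: rounds to the nearest long by
   adding or subtracting 0.5 and then letting the cast truncate toward
   zero. The argument is evaluated only once. Assumes the rounded
   value fits in a long. */
long round_to_long(double d)
{
    return (long)(d + (d > 0 ? 0.5 : -0.5));
}
```

For example, round_to_long(-4049.75) gives -4050, and
round_to_long(4049.25) gives 4049.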

For more information on conversions from floating point, refer to

the section on type conversions in the "Microsoft C Language

Reference" manual.

Additional reference words: 5.00 5.10 6.00 6.00a 6.00ax 7.00