The information in this article applies to:
SYMPTOMSAn application produces incorrect results when it casts an expression of type double to type long. CAUSEThe error is caused by differences in arithmetic precision. STATUSThe C and C++ standards specify that when an application converts a floating-point number to an integer, the number is truncated toward zero. When rounding is the preferred behavior, the application can add or subtract 0.5 as appropriate so truncation provides the correct result. The text below presents a macro for this purpose. MORE INFORMATIONThe following code example demonstrates this behavior when compiled with any floating point math option other than /FPa. The application displays -4049 as the value for Long2 and Long4, which is incorrect. The application displays the correct value -4050 for Long1 and Long3. Sample Code
The application produces the incorrect results by converting a 10-byte
real value to a long; the application produces correct results by
converting a 10-byte real to an 8-byte real and converting that value
to a long.According to the type conversion rules, the conversion from a 10-byte real to an 8-byte real rounds the number from -4049.99999999999985 to -4050.0. When the application converts this value to a long, the value -4050 results. However, when the application directly converts the double value to a long, the application truncates toward zero. In this example, -4049.99999999999985 becomes -4049. Many numbers (such as .01) are repeating fractions in the binary numbering system which cannot be represented exactly. Any representation of these numbers is slightly more or less than the "true" value. When a calculation involves one of these values, the representation error propagates and can be magnified. Because the error is present only in the least significant part of the number, errors occur only when a calculation loses precision in an intermediate value. The conversions always truncate toward zero. The following macro effectively rounds the number by increasing the magnitude of the number by 0.5 then converting the number to an integer.
The problem does not occur with the Microsoft Visual C++ 32-bit Editions as
Win32 does not support the long double data type for compatibility reasons.For more information on converting floating-point numbers to integers, see the type conversions section of the "C Language Reference" manual. Additional query words: 1.00 1.50 5.10 6.00 6.00a 6.00ax 7.00
Keywords : kb16bitonly |
Last Reviewed: January 3, 2000 © 2000 Microsoft Corporation. All rights reserved. Terms of Use. |