Dr. GUI Online
September 7, 1998
First it was Y2K. Then the Euro conversion. And now, as if the Y2K and Euro problems weren't bad enough, there's the year 2038 bug. Dr. GUI got a letter from Mahmoud Saleh alerting him (reminding him, actually) of a similar problem that will face C and C++ programmers in coming years: we can call it the Y2.038K bug.
The problem stems from the common definition of time_t as an integer containing the number of seconds since midnight, January 1, 1970. Most C/C++ runtime libraries define time_t as a long int. On most systems, long int is 32 bits, which means that we've got a range of 2^31-1 (2,147,483,647) seconds, running out at 03:14:07 UTC on January 19, 2038. (In Redmond that's still the evening of January 18. Assuming Dr. GUI's Windows CE Palm-size PC has it right, that's a Monday. Figures.) When the clock rolls over, it'll be back to the '70s for everyone. Get your leisure suits ready 'cuz you'll be catching Boogie Fever and Nixon will be President again. Four more years indeed!
Anything that uses time_t is also in trouble. That includes the timeb structure (not commonly used, anyway) and, very unfortunately, the MFC CTime class. Code that uses time_t, directly or indirectly, will need to be changed sometime before it starts dealing with dates after January 19, 2038. (Note that if your program deals with, say, 40-year bonds, you're in trouble today.)
Dr. GUI prescribes data types that don't cause problems for the foreseeable future. Two that will do the work are the Win32 SYSTEMTIME structure, which stores the year part of the date as a 16-bit integer, and the Win32 FILETIME structure, which stores the date as the number of 100-nanosecond intervals since January 1, 1601. The problem is that neither of these structures has many supporting functions.
Better yet is to use the automation DATE object. DATE is typedef'ed as double, so there are 53 bits of precision—enough for your program's lifetime. The whole part of the double number represents the number of days since midnight, December 30, 1899. (Negative numbers are before 12/30/1899.) The absolute value of the fractional part represents the time in the day: midnight is zero, noon is 0.5, etc. You can convert the automation DATE objects to other formats with various variant API functions.
Note: The fact that the DATE representation uses fractional binary floating point means that it cannot represent all times exactly. The date will always be exactly right, since it's encoded as the whole-number part. But the times may be off by a tiny fraction (up to a microsecond or so for the next century or so) when you convert them and do math with them, since they can't be represented exactly. (Remember what happens if you add 0.001 a thousand times: rarely do you get exactly 1.000; usually you get something like 1.0000000000000007 instead. That's because binary floating point can't represent 0.001 exactly, and adding it 1,000 times magnifies the error.)
Dr. GUI doesn't expect this to be an issue often since we'll be using at most 16 bits to represent the number of days, leaving 36 bits to represent the time, but you may want to test to make sure this isn't an issue for you. If it is, use a format that represents the smallest time increment you need as an integer rather than in binary floating point.
If you're using MFC, use the COleDateTime class rather than a DATE object. The COleDateTime class wraps a DATE object and provides a bunch of handy member functions. But it shares the binary floating-point problem with DATE.
Dr. GUI recommends using robust date and time data structures rather than ones based on time_t for all new code—and updating existing code as needed.
If you program primarily in Visual Basic or Visual J++, you're in luck: Visual Basic's Date data type (the same as the automation DATE, actually) and Java's Date class both go well beyond your program's reasonable lifetime. Visual FoxPro uses a robust date representation as well, although you might have to adjust your program to be Y2K compliant.
Henrik Vallgren and Peter Schaeffer took the time to let Dr. GUI know that he's dead wrong about there being enough GUIDs for each atom in the universe to have its very own. The number of particles (not atoms) in the universe is somewhere around 10^79 or so—even Dr. GUI isn't sure of exactly how many.
A GUID comprises 128 bits, so the number of GUIDs is 2^128. That's only about 10^38, or 100,000,000,000,000,000,000,000,000,000,000,000,000. And that's a factor of 10^41 short of 10^79. So a GUID could hold a unique number for only 1 out of every 10^41 particles in the universe. Thankfully, each particle doesn't need its own interface ID and/or class ID. (IElectron? IQuark?) So there are still plenty of GUIDs for our uses.
Henrik's line of reasoning was especially interesting:
"Consider the sun's mass 2*10^30 kg. Assume that the sun consists of only hydrogen. One gram of hydrogen is 6*10^23 (Avogadro's number) atoms, one kilogram 6*10^26 atoms.
"The number of atoms in the sun is thus approximately 10^54 (~2^179).
"The hydrogen assumption is wrong so that number should be divided by the true average atomic weight, of which I have no idea. [But it's certainly less than 100—perhaps less than 10—so the assumption of a pure hydrogen sun doesn't affect these numbers much. Like a factor of 100 isn't much.—Dr. GUI]
"Then consider billions and billions of stars in the Milky Way. And quite a few galaxies on top of that. You'll get the picture ... "
Indeed, the good doctor does get the picture. Dr. GUI begs your forgiveness 10^41 times.
Note that if the GUIDs were somewhat bigger, they WOULD be big enough to give each particle its very own GUID, or PID (for particle ID). It would take about 264 bits—33 bytes for each GUID.
Thanks again to Henrik and Peter for taking the time to write.