http://www.perlmonks.org?node_id=249801

toma has asked for the wisdom of the Perl Monks concerning the following question:

I've been writing code lately that uses dates in the form of seconds since the epoch. This code will all break in 2038, I believe. I don't think there's much chance that the code or I will survive until then, but on the off chance that we do, I would like to leave some comments for the maintainer.

Is there a standard way to mark this type of code, or is there a simple way to design code to make this easy to fix?

It should work perfectly the first time! - toma

Re: 2038 bug
by tachyon (Chancellor) on Apr 11, 2003 at 05:19 UTC

    Put all your time manipulation code in a module called Y2038::Problem ???

    Seriously, if you abstract your time code to a single location it will be easy to modify. But realistically Y2038 is only an issue if time_t is still a 4 byte int in 2038. It is quite reasonable to expect that 64 bit machines will be the norm, and quite possibly 8 byte ints too. In Perl there is unlikely to be a major issue, as our use of time is abstracted from the raw 4 byte dependent time_t. As a result, in Perl 6.01 in 2038 (oh, what a cynic) the basic time() function could quite easily be modified to return an 8 byte int into a Perl scalar, and thus there is no real issue provided that Perl was compiled with 8 byte int support, etc.

    The ugly Y2K-style hack is to assume that a time less than, say, 2147483647 / 2 (make the threshold as small as the number of extra years you want to wring out of your code) represents a rollover, and proceed accordingly. This potentially gives you another 34 years' worth of mileage, assuming of course that there are no valid dates before 2004 that you are processing in 2038. You could facilitate such ugly hacks by putting all time functions in a module, i.e.

    # get the difference between two epoch times and cope with 2038
    sub diff_time {
        my ( $begin, $end ) = @_;
        # for now all we need is:
        return $begin - $end;
        # Jan 19 03:14:07 2038 we have potential rollovers so
        require Math::BigInt;
        # blah
    }
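
A minimal sketch of the "put it all in one module" idea, for anyone who wants to see the shape of it. The package name is borrowed from the joke above; the function names now() and elapsed() are hypothetical, and the bodies are deliberately trivial. The point is only that nothing else in the program calls time() directly.

```perl
use strict;
use warnings;

{
    # Package name borrowed from the joke above; now() and elapsed()
    # are hypothetical names chosen for this sketch.
    package Y2038::Problem;

    # The only place in the program that calls time() directly.
    sub now { return time() }

    # Seconds elapsed between two epoch values. If time() ever needs
    # rollover handling, this is the one place to add it.
    sub elapsed {
        my ( $begin, $end ) = @_;
        return $end - $begin;
    }
}

my $start = Y2038::Problem::now();
my $stop  = $start + 90;    # pretend 90 seconds have passed
print "Elapsed: ", Y2038::Problem::elapsed( $start, $stop ), " seconds\n";
```

If the representation of time ever has to change, only the two subs inside the package need to be touched.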

    For those who might wonder why 2038.....

    my $zero_hour = 2**31 - 1;
    print "Zero hour is $zero_hour\n";
    print scalar gmtime($zero_hour), "\n";
    print gmtime($zero_hour+1) ? "OK" . scalar gmtime($zero_hour+1) : 'Oh dear!';

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      You'd make a mistake if you think the 2038 problem will magically resolve itself if time_t is suddenly a 64 bit integer. If it were that simple, wouldn't you think many operating systems already had done so? Some obvious problems:
      • Unless all systems switch from 32 bit time_t to 64 bit time_t at the same time, you will have problems when two systems share information (say, using NFS).
      • Modifying your OS to use a 64 bit time_t doesn't magically change your data. Your file system uses 32 bit timestamps. What do you think would happen if you upgrade your OS to use 64 bit time_t, and it starts assuming your inodes are 12 bytes longer?

      2038 is not going to be a disaster, unless too many people think that just upgrading to a 64 bit time_t will magically solve all problems. (If it were that simple, we could have let localtime() return a 4 digit year to avoid Y2K.)

      Some time ago, I saw the timetable Sun is going to use to introduce a 64 bit time_t value. It's going to be a process of 10+ years.

      Abigail

        By an odd coincidence, I was watching a Dilbert rerun last night centering around the Y2K bug. The issue was that there was a single ancient mainframe that was linked to all the other more modern machines. Since this one system hadn't been upgraded since the 70s, it was vulnerable to the bug and threatened to destroy them all. The code itself was undocumented, and the only person who might remember where the time-handling code might be was...Wally.

        The rationale given for why it hadn't been upgraded with all the other systems was that there was a short term advantage in cost savings, and by the time the problem surfaced, the executives who made the decisions would be long gone. Sure, it's a Dilbertism, but in the wake of the corporate scandals of 2001, doesn't it ring true?

        Assuming that we'll all be on 64-bit (or 128-bit, or embryonic monkey-brain) processors at some point in the nebulous future is a mistake. Someone out there will decide not to upgrade. Some of my company's customers still use Windows 95. Last year I was at an airport, and the application displaying the arrivals and departures had crashed. The terminal was showing a Windows 98 desktop.

        Even if you decide that someone can fix it "later", possibly in the mad dash to fix legacy unix code in Fall of 2037, please document the issue.

        -Logan
        "What do I want? I'm an American. I want more."

        What I don't really understand is why you could not make time_t an unsigned 4 byte int and thus get another 68 odd years out of it, leaving aside the problems with C code that expects it to be signed. Why was it signed in the first place anyway?

        As you say there are a lot of issues to be resolved. My main point was to abstract the time handling so at least that can of worms is all in the same place which will make it easier to deal with - all other things being equal.

        cheers

        tachyon


      But realistically Y2038 is only an issue if time_t is still a 4 byte int in 2038.

      I'm not trying to be a panic monger (lord knows the Y2K 'bug' brought enough of them out of the woodwork!) but the problem is real now. There are plenty of reasons you might want to do a date calculation that went beyond 2038, for example printing a table of predicted values by year for a retirement investment policy, or calculating a mortgage amortisation table, or working out what day of the week to schedule your retirement party :-).

      Don't get me wrong, I'm not suggesting these things are impossible or even hard to do. I'm merely saying that the standard tools like time(), localtime() and Time::Local() will let you down. Dave Rolsky's recent article on Perl.com provides some answers.

        There are plenty of reasons you might want to do a date calculation that went beyond 2038, for example printing a table of predicted values by year for a retirement investment policy, or calculating a mortgage amortisation table, or working out what day of the week to schedule your retirement party.
        Sure, but those are dates. One doesn't calculate a mortgage amortisation table with a precision of a second, nor are parties planned to start at a specific second.

        Abigail

      (After I read tachyon's post again, I thought I should add this comment. I should not second-guess his thinking too much, or oversimplify it. Anyway, I am leaving this post here as a sort of companion to his.)

      The $begin - $end approach would not work, assuming that we get both values from a function like time().

      The problem is that once time() steps past the biggest positive integer, it goes directly to the smallest negative integer, i.e. the negative integer with the biggest absolute value.

      For example, if $begin is 2038-01-18-20:14:06 and $end is 2038-01-18-20:14:07, there is only 1 second of difference, but by calling time() and doing $begin - $end you will get:

      2 ** 31 - 1 - (- 2 ** 31) = 2 ** 32 - 1
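
The arithmetic above can be checked on a modern machine without touching the system clock. This is only a sketch: wrap32() is a hypothetical helper that simulates what a signed 32-bit time_t would hold, and it assumes a Perl built with 64-bit integers.

```perl
use strict;
use warnings;

# Hypothetical helper: fold an integer into the signed 32-bit range,
# the way a 32-bit time_t would wrap. Assumes 64-bit Perl integers.
sub wrap32 {
    my ($n) = @_;
    my $w = $n % 2**32;
    $w -= 2**32 if $w >= 2**31;
    return $w;
}

my $begin = wrap32( 2**31 - 1 );    # last representable second
my $end   = wrap32( 2**31 );        # one second later, wrapped negative

print "begin       = $begin\n";
print "end         = $end\n";
print "begin - end = ", $begin - $end, "\n";    # 2**32 - 1, not 1 second
```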

        Of course it will work. I said abstract *all time manipulation* into a module. This obviously includes getting the current time from time(). The abstraction is the main thing as it lumps all the problems in one festering little bucket.

        cheers

        tachyon


Re: 2038 bug
by hossman (Prior) on Apr 11, 2003 at 06:43 UTC
    I personally think there are only two ways of dealing with a problem like this in advance:
    • Don't Care.
      Just ignore the problem and trust that it's not ever going to be an issue.
    • Use a time format that doesn't have the limitation.
      The bottom line is, you can represent a date as a string that has no limitations to how far in the future it can go -- but then you need something to parse those dates for doing calculations ... Date::Manip can handle any 4 digit year ... Date::Calc claims to work for dates up to at least the year "32767" (depending on the size of int for your platform) ... Presumably DateTime has been designed to not have any inherent limitations (so if you use it, you can trust that even if it's built on top of a "seconds since epoch" right now, eventually someone will change the implementation later on) (yep .. i just checked, see update), etc...

    I don't see a lot of middle ground on an issue like this.

    If you're going to worry about it, worry about it all the way and make sure it works ... otherwise just put a conservative comment in the User Manual that says something like "Use this software with dates greater than 2010 at your own risk" and don't make a lot of work for yourself that may or may not actually ever be useful to someone who may or may not be looking at your code.

    Update: DateTime works great...

    laptop:~> perl -MDateTime -MDateTime::Duration -l -w
    my $d = DateTime->now;
    $d->add(years => 200);
    print $d->iso8601();
    ^D
    2203-04-11T06:52:55
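
As a small illustration of the "store it as a string" option, here is a sketch that needs no modules at all. It assumes the timestamps are fixed-width ISO 8601 strings with four-digit years, all in the same time zone; under that assumption plain string comparison orders them chronologically, with no epoch value involved. The sample values are made up for the example.

```perl
use strict;
use warnings;

# Sample timestamps (made up for this sketch), all assumed to be UTC
# and in the same fixed-width ISO 8601 layout.
my @stamps = (
    '2203-04-11T06:52:55',
    '2003-04-11T06:52:55',
    '2038-01-19T03:14:08',
);

# Default sort is a string sort, which is chronological for this layout.
my @sorted = sort @stamps;
print "$_\n" for @sorted;

# Comparing two dates is just a string comparison as well.
print "2038 rollover is before 2203\n" if $stamps[2] lt $stamps[0];
```

Doing arithmetic on such strings (adding days, finding weekdays) still needs a date module, which is the trade-off described above.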

Re: 2038 bug
by IndyZ (Friar) on Apr 11, 2003 at 05:50 UTC
    I'd do something like:

    # FIXME - time() will break in 2038
    # ... your code ...

    The only group Perl project that I've ever worked on decided to use FIXME with an explanation for known broken code. It made it simple to do an update from CVS, then run a "grep FIXME" on the source.
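
If grep is not to hand, the same search is a few lines of Perl. This is a sketch only: the "Y2038" tag after FIXME is a hypothetical convention, not something the project above used, and the source text is an in-memory stand-in for a real file.

```perl
use strict;
use warnings;

# Hypothetical source text standing in for a real file.
my $source = <<'END_SOURCE';
my $now = time();  # FIXME Y2038 - 32-bit time_t wraps on 2038-01-19
my $name = 'toma';
my $later = $now + 86400;  # FIXME Y2038 - epoch arithmetic
END_SOURCE

# Collect the line numbers that carry the marker.
my @hits;
my $line_no = 0;
for my $line ( split /\n/, $source ) {
    $line_no++;
    push @hits, $line_no if $line =~ /\bFIXME\b.*\bY2038\b/;
}

print "Y2038 FIXME markers on lines: @hits\n";
```

A dedicated tag keeps the 2038 markers separate from every other FIXME in the tree, so the future maintainer can list just these.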

    --
    IndyZ
Re: 2038 bug (consider TAI64)
by grinder (Bishop) on Apr 11, 2003 at 10:09 UTC

    You might be interested in looking at Daniel J. Bernstein's implementation of TAI (temps atomique international), a 64-bit representation of time values. It can deal with second, nanosecond or attosecond precision, depending on your requirements.

    cr.yp.to/libtai/tai64.html

    He has a number of essays about time that are worth reading. It should be noted that djb is considered a controversial character in some circles.

    _____________________________________________
    Come to YAPC::Europe 2003 in Paris, 23-25 July 2003.

Re: 2038 bug
by pg (Canon) on Apr 11, 2003 at 06:49 UTC
    One solution to make your code survive longer is: (solution is tested with various dates set to points after 2038)
    1. turn on big int for the right lexical scopes, and
    2. do this conversion:
      my $t = time();
      $t += 2 ** 32 if ($t < 0);
      This would make the epoch value increase smoothly even after 2038 (but not forever).

      The logic behind the conversion: time() returns (-2 ** 31) for 2038-01-18-20:14:07, which should be 2 ** 31; it returns (-2 ** 31 + 1) for 2038-01-18-20:14:08, which should be (2 ** 31 + 1); and so on. So the difference between what time() returns and what it should be is always 2 ** 32.
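
The conversion can be tried out with hard-coded values instead of a reset system clock. A sketch, with two assumptions: unwrap_time() is a hypothetical name, and a negative value is taken to mean "wrapped past 2 ** 31 - 1", so legitimate pre-1970 times must never be passed in.

```perl
use strict;
use warnings;

# Hypothetical helper implementing the correction above.
# Assumes a negative value can only mean the 32-bit counter wrapped,
# i.e. no legitimate pre-1970 times are ever passed in.
sub unwrap_time {
    my ($t) = @_;
    $t += 2**32 if $t < 0;
    return $t;
}

# Hard-coded stand-ins for what a wrapped 32-bit time() would return.
for my $raw ( 2**31 - 1, -2**31, -2**31 + 1 ) {
    print "$raw -> ", unwrap_time($raw), "\n";
}
```

The corrected values run 2147483647, 2147483648, 2147483649, so differences between them come out right again until the counter wraps a second time.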

Y2K once again!
by htoug (Deacon) on Apr 11, 2003 at 08:10 UTC
    We had problems with the 2038 bug several years ago (try to find the 67th birthday of someone aged approx 18, that will fail long before 2038 - in 1970-something), just as we had all Y2K stuff weeded out long before 1999.
    The best solution is to go with a module that allows much greater range of dates than the 1970 to 2038 range that time_t allows with 32 bits.
    We changed to Date::Calc (and used the C-library for all C and C++ programs) so we won't have problems before sometime around year 2380 (give or take 70 or so years) when our RDBMS's date type goes out of range!
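
For the birthday case specifically, the calculation never needs epoch seconds at all. The following is a sketch in plain Perl rather than Date::Calc; nth_birthday() and is_leap_year() are hypothetical helpers, and mapping a 29 February birthday to 1 March in non-leap years is an assumed policy, not a rule from this thread.

```perl
use strict;
use warnings;

# Gregorian leap year test.
sub is_leap_year {
    my ($y) = @_;
    return 1 if $y % 400 == 0;
    return 0 if $y % 100 == 0;
    return $y % 4 == 0 ? 1 : 0;
}

# Hypothetical helper: the Nth birthday as a YYYY-MM-DD string.
# Assumed policy: a 29 February birthday falls on 1 March in
# years that are not leap years.
sub nth_birthday {
    my ( $y, $m, $d, $n ) = @_;
    my $target_year = $y + $n;
    if ( $m == 2 && $d == 29 && !is_leap_year($target_year) ) {
        ( $m, $d ) = ( 3, 1 );
    }
    return sprintf '%04d-%02d-%02d', $target_year, $m, $d;
}

print nth_birthday( 1985, 4, 11, 67 ), "\n";    # someone aged 18 in 2003
print nth_birthday( 1984, 2, 29, 67 ), "\n";    # leap-day birthday
```

Because only the year, month and day fields are touched, the result is the same whatever the width of time_t.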
Re: 2038 bug
by zenn (Sexton) on Apr 11, 2003 at 13:45 UTC
    You can find this comment in time.c in the Linux kernel:
     * WARNING: this function will overflow on 2106-02-07 06:28:16 on
     * machines were long is 32-bit! (However, as time_t is signed, we
     * will already get problems at other places on 2038-01-19 03:14:08)
     */

    This means that, depending on your platform, the bug can bite early enough that it is worth thinking about now.
    Nevertheless, this depends on the size of your time_t type, which is usually based on the C long type (on Linux, I mean). Since kernel 2.3 the developers have been thinking about extending this to 64 bits to avoid this kind of problem. I don't know about other platforms, but I think the problem is easy enough to handle, and it will be addressed soon.
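
Both dates in that kernel comment fall straight out of the arithmetic: 2 ** 31 seconds after the epoch for a signed 32-bit counter, 2 ** 32 for an unsigned one. A sketch to check them, assuming Perl 5.12 or later, where gmtime no longer depends on the width of the system time_t; utc_stamp() is a hypothetical helper.

```perl
use strict;
use warnings;

# Hypothetical helper: format an epoch value as a UTC timestamp.
# Assumes Perl 5.12+, whose gmtime handles values beyond 32 bits.
sub utc_stamp {
    my ($epoch) = @_;
    my ( $sec, $min, $hour, $mday, $mon, $year ) = gmtime($epoch);
    return sprintf '%04d-%02d-%02d %02d:%02d:%02d',
        $year + 1900, $mon + 1, $mday, $hour, $min, $sec;
}

print "signed 32-bit limit passed at:   ", utc_stamp( 2**31 ), "\n";
print "unsigned 32-bit limit passed at: ", utc_stamp( 2**32 ), "\n";
```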

    Zenn
Re: 2038 bug
by hatter (Pilgrim) on Apr 11, 2003 at 11:00 UTC
    Not so much a way to fix the code, but a way to fix the program... Document it. If the code is complicated enough to worry about the 2038 issue, then it should be complicated enough to require proper specification and other documents. Add "This code will choke on dates past 2038" (as someone else pointed out, you can hit these bugs well before 2038 if you're calculating dates into the future). If people are aware of this every time they read the docs, it'll make it hard for them to build things on top which will inherit the same problems and break in difficult-to-diagnose ways.

    the hatter
Re: 2038 bug
by mattr (Curate) on Apr 13, 2003 at 15:17 UTC
    You are referring to this problem I think..
    $ perl -e 'print scalar localtime time*3;'
    Thu Sep 28 21:26:38 1933 Ouch!

    There are some interesting pages at:
    language.perl.com/news/y2k.html
    www.2038.org
    www.dewtronics.com/y2038.html
    vancouver-webpages.com/vanlug/linux-y2k.html
    www.tiretracking.com/y2kvsy2038.htm

    The last link has specific advice for your case:

    "What can be done?"

    There are two approaches to fixing this problem. The first is to not use 'time' and its companion functions. Instead, use more modern function calls which will not have this rollover problem for millenniums to come. The second approach is to develop your own version of 'time' which returns a data type with greater capacity. This means that you will also have to rewrite the companion functions. Additionally, you will have to check your code to make sure that it does not assume a certain data type. 'time' is defined such that it returns a data type of 'time_t' . You can redefine 'time_t' to a larger capacity data type.

    Hopefully, compilers will address this issue within the next 10-20 years. In that case, you will not need to rewrite the 'time' functions. However, at the very least you will need to recompile all of your old applications. More likely, you will have to examine all of your code to make sure that your developers did not cast the 'time_t' data into a basic (or integral) data type such as an int or long int. You will probably find at least one instance of this in every large project. And one instance of this bug is enough to stop your software in its tracks.

    We would not recommend that you spend a great amount of money eliminating this software problem in your existing software. However, you would be advised to at least distance yourself from 'time' and its companion functions in current and future development efforts. Using an object-oriented approach, you can still make use of these functions without risking a Y2K type phenomenon. Better yet, you can begin to use API's that are a bit more far-sighted. At MPC we have already put this into practice in our software development efforts.

    But the above rollover will only happen if you are using the same perl on the same machine/OS/BIOS, I think. The source code of your program would not have to change if it is interpreted by a (yet to be created?) more modern version of perl.

    That said, it would be nice if we had a way to explain perl modules to programs with an aim to automated software integration.

    Some keywords- "web annotation","syntactic web", "natural language processing" will probably be full grown by then (see some in google directory). But that seems to be in early stages, we could use something perlish or corbaish even now. So I think your question is very relevant to us now too.

    Even with today's technology we should be able to describe characteristics of our programs in a way that a machine can understand. Also, there is currently no standardized way of describing Y2.038K problems. Ultimately things like failure modes of 2038-challenged unix software and embedded systems can be described with some kind of interface description language - maybe perl is good for this (Consider perl Makefile.PL as a forerunner). That way we can hope that when nearly every piece of hardware in existence today crashes and burns, the important things can be virtualized on 64+ bit machines that scan these descriptions of dead systems and rescue us. Doesn't seem like science fiction to me.. Anyway I would like to see something more parseable in manpages and whatnot, perldocs could gain another section for machine-readable descriptions too.

    Anyway to directly answer your question (how to annotate your program) maybe it would be useful to put tags in comments to indicate dangerous lines, and provide a separate file which describes the problems. You could also try describing exactly what is dangerous about it in simplified english inside <Y2.038K> </Y2.038K> tags, it might just work!

Re: 2038 bug
by Anonymous Monk on Apr 11, 2003 at 05:03 UTC

    Why not just fix it? Post the code, we'll help out, honest.

      He cannot really fix it on his own. Try this: on your PC, set the date to 01-01-2039 by doing this:
      date 01-01-2039
      then run this perl program:
      print time();
      You will get a negative number back. That is a limitation of the platform, not a bug in his code. Try this code:
      print time(); # set time to 01-01-2038 before trying this code
      print(2 ** 31 - 1);
      Look at how close those two numbers are; now you realize that is the upper limit of positive integers on a 32 bit machine.

      I don't worry about this too much, most likely when 2038 approaches, 64 bit or something even better will dominate.

      (Set time to 2038-01-18-20:14:06, and see what happens to the above demo, also 2038-01-18-20:14:07)

        Thanks for the correction, the thought had crossed my mind, but I didn't think we were anywhere near that close (guess I should do my math next time ;). time to dust off the ol' quantum computer, only 35 years to get it working! :)

Re: 2038 bug
by bart (Canon) on Apr 11, 2003 at 22:09 UTC
    I don't think you need to worry. Sure, in current day programs, time() is a 32 bit integer. But it needn't be. Even if today it was implemented, not using a 64 bit integer but using a (double) float instead of an integer, you'd have a precision of 53 bits. Surely that would be enough.
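
To put a number on "surely that would be enough": a double represents every integer up to 2 ** 53 exactly, so a quick sketch of how long a whole-second counter would last is below. The 365.25-day year is an approximation used only for this estimate.

```perl
use strict;
use warnings;

# A double holds every integer up to 2**53 exactly.
my $max_exact_seconds = 2**53;

# Julian year of 365.25 days, an approximation for this estimate only.
my $seconds_per_year = 365.25 * 24 * 60 * 60;

my $years = $max_exact_seconds / $seconds_per_year;

printf "A double counts whole seconds exactly for about %.0f million years\n",
    $years / 1e6;
```

That works out to roughly 285 million years past 1970, against 68 years for a signed 32 bit integer.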

    And I really don't see a reason why the epoch couldn't just stay the same, Jan 1 1970 00:00:00 GMT. In other words: I don't really expect code like yours to break, provided your Perl port will be upgraded by then — and, as Abigail wrote, your system can handle it. Honestly, I do expect that that will turn out to be the biggest problem: timestamps on file systems.

Re: 2038 bug
by dragonchild (Archbishop) on Apr 14, 2003 at 18:10 UTC
    Why not find out what the people working with the 30-year Treasury note are planning to do in 2008? There were Y2K-compliant mainframes in the year 1970 - all of them that had to work with the 30-year T-Bill. I can't imagine that there isn't a known solution to this. (Inline::Java anyone?)

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.