
epoch reduction

by neilwatson (Priest)
on Apr 19, 2012 at 17:41 UTC (#966022=perlquestion)
neilwatson has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, I have a long array of epoch times: many entries per day over many days. I want to reduce them to just one per day, the largest. I've thought about converting to a date string and grouping via regex. I'd like to hear what the monastery would do.

Neil Watson

Replies are listed 'Best First'.
Re: epoch reduction
by ikegami (Pope) on Apr 19, 2012 at 17:52 UTC
    use List::Util qw( max );
    use POSIX qw( strftime );

    my %by_date;
    for my $time (@times) {
        my $date = strftime('%Y-%m-%d', localtime($time));
        $by_date{$date} = max($by_date{$date}||0, $time);
    }

    my @filtered_times = map $by_date{$_}, sort keys(%by_date);

    On second thought, the following seems better, especially if the initial list of times is already sorted.

    use POSIX qw( strftime );

    my $last_date = '';
    my @filtered_times;
    for my $time (sort { $a <=> $b } @times) {
        my $date = strftime('%Y-%m-%d', localtime($time));
        if ($date eq $last_date) {
            $filtered_times[-1] = $time;
        } else {
            push @filtered_times, $time;
            $last_date = $date;
        }
    }
      I like this solution.

      However, my suggestion would be to use gmtime instead of localtime and use the same algorithm. But that of course depends upon what the OP is really trying to do. I didn't see anything in the problem statement about local time. Epoch time is a monotonically increasing number of seconds from an arbitrary start time. Weird things can happen when translating UTC (GMT) based time back into a local time.

      For a wild example: Samoa changing its side of the date line. Smaller versions of this happen when we change between summer and winter time.
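To make the gmtime versus localtime point concrete, here is a minimal sketch showing one epoch value falling on two different calendar dates. The sample value and the fixed `EST5` zone (five hours west of UTC, written as a POSIX TZ string) are assumptions chosen for illustration; they do not come from the original question.

```perl
use strict;
use warnings;
use POSIX qw( strftime tzset );

# Hypothetical sample: 1334880000 is 2012-04-20 00:00:00 UTC.
my $time = 1334880000;

# Assumption: a fixed zone five hours west of UTC, given as a POSIX TZ string.
$ENV{TZ} = 'EST5';
tzset();

my $utc_date   = strftime('%Y-%m-%d', gmtime($time));
my $local_date = strftime('%Y-%m-%d', localtime($time));

print "gmtime:    $utc_date\n";
print "localtime: $local_date\n";
```

On a system that honours POSIX TZ strings, the two dates differ by one day, so which function is used decides which "day" a timestamp near midnight is grouped into.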

        I didn't see anything in the problem statement about local time

        I do not consider "sometimes 8pm to 8pm and sometimes 7pm to 7pm" to be the default definition of "day".

        I would have thought that if someone says today without qualification, that means midnight to midnight where he is observing. Even if every day is not the same length. I don't know why you're assuming he wants the date somewhere else in the world.

        But yes, any time zone's day could be used by the solution I posted.

Re: epoch reduction
by Marshall (Abbot) on Apr 19, 2012 at 17:56 UTC
    By "largest" do you mean the last time on a particular day? If the array is sorted, I would run down the list, converting each time to day, month, year. When the day, month, year changes, the previous time was the last one on that day. I don't see what regex would have to do with this.

    If you could post a short example with the code that you have so far, that would be very helpful.

Re: epoch reduction
by jwkrahn (Monsignor) on Apr 19, 2012 at 21:43 UTC

    You could probably do something like this:

    # assuming @epoch_times contains the data
    use constant ONE_DAY => 86_400; # seconds per day (24 * 60 * 60)

    my %per_day;
    for my $time ( @epoch_times ) {
        my $key = int( $time / ONE_DAY ); # day number since the epoch (UTC)
        if ( exists $per_day{ $key } ) {
            $per_day{ $key } = $time if $per_day{ $key } < $time;
        }
        else {
            $per_day{ $key } = $time;
        }
    }

    # hash order is unspecified, so sort the survivors numerically
    @epoch_times = sort { $a <=> $b } values %per_day;
Re: epoch reduction
by Not_a_Number (Prior) on Apr 19, 2012 at 20:00 UTC
    say for sort values %{{map {join( ' ', ( gmtime($_) )[3..5] ) => $_} @times}};

    But only if your input data is already sorted...


    Update: And assuming (cf. Marshall's question above) that by "largest" you mean "latest".

    Update 2: And of course if your date-times go back to more than 10 years ago (more precisely, to before 9 Sep 2001 01:46:40 UTC, when the epoch value went from nine digits to ten), the sort needs to be replaced by sort { $a <=> $b }.
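The reason behind that last update is that Perl's default sort compares values as strings. A small sketch, using two hypothetical epoch values on either side of the digit-count boundary:

```perl
use strict;
use warnings;

# Hypothetical epoch values: one with ten digits, one with nine.
my @times = (1_000_000_000, 999_999_999);

my @as_strings = sort @times;                # default sort compares as strings
my @as_numbers = sort { $a <=> $b } @times;  # explicit numeric comparison

print "string sort:  @as_strings\n";
print "numeric sort: @as_numbers\n";
```

The string sort puts the ten-digit value first because "1" sorts before "9", while the numeric sort keeps chronological order.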

Re: epoch reduction
by JavaFan (Canon) on Apr 19, 2012 at 20:20 UTC
    I'd split the date and time parts, and use a hash, keyed on date, and keep track of the highest time value seen for that date. If the time part is in HH:MM:SS format, it's trivial to find the highest value.
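A minimal sketch of that idea, assuming the input were 'YYYY-MM-DD HH:MM:SS' strings rather than raw epoch numbers. The sample data and the variable names (`@stamps`, `%latest`) are hypothetical:

```perl
use strict;
use warnings;

# Assumption: each entry is a 'YYYY-MM-DD HH:MM:SS' string (made-up sample data).
my @stamps = (
    '2012-04-19 08:15:00',
    '2012-04-19 17:41:00',
    '2012-04-20 05:33:20',
    '2012-04-20 00:00:00',
);

my %latest;   # date => highest time seen for that date
for my $stamp (@stamps) {
    my ($date, $time) = split ' ', $stamp;
    # Zero-padded HH:MM:SS compares correctly as a plain string.
    $latest{$date} = $time
        if !exists $latest{$date} || $time gt $latest{$date};
}

my @result = map { "$_ $latest{$_}" } sort keys %latest;
print "$_\n" for @result;
```

Because the time part is fixed-width and zero-padded, a string comparison with `gt` is enough and no date parsing is needed.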
Re: epoch reduction
by fullermd (Priest) on Apr 20, 2012 at 06:39 UTC

    Assuming that by 'epoch time' you mean time_t, you don't need to convert to some other form. Since time_t is a lying liar and pretends every day is exactly 86400 seconds long, you can just do some /'s and %'s to find which fit in the same day and the highest for the day.

    (of course, because of the undefined behavior around leap seconds, you can't really be sure whether % 86_400 == 0 is actually 00:00:00 or 23:59:60, but I think most systems repeat the 86399th instead. And since you can't tell anyway, it probably doesn't matter)
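As a rough illustration of the "/'s and %'s" mentioned above, this sketch splits one epoch value into a day number and an offset within that day. The sample value and the constant name `SECONDS_PER_DAY` are assumptions, and the day boundaries here are UTC ones:

```perl
use strict;
use warnings;

use constant SECONDS_PER_DAY => 86_400;

# Hypothetical sample: 1334857260 is 2012-04-19 17:41:00 UTC.
my $time = 1334857260;

my $day_number     = int( $time / SECONDS_PER_DAY );  # whole days since the epoch
my $seconds_in_day = $time % SECONDS_PER_DAY;         # offset within that UTC day

print "day $day_number, second $seconds_in_day\n";
```

Two timestamps belong to the same UTC day exactly when their day numbers match, so the day number can serve directly as a hash key.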

Re: epoch reduction
by Anonymous Monk on Apr 19, 2012 at 23:40 UTC


    Perhaps sorting the date-times in reverse order?

    Then each time the date part changes, you have the latest time for that day.
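One way the reverse-order idea could look, as a sketch only. The sample epoch values are made up, and using gmtime (UTC days) rather than localtime is an assumption:

```perl
use strict;
use warnings;
use POSIX qw( strftime );

# Hypothetical sample data; dates are taken in UTC via gmtime (an assumption).
my @times = (1334857260, 1334800000, 1334880000, 1334900000);

my $last_date = '';
my @latest;
for my $time (sort { $b <=> $a } @times) {      # newest first
    my $date = strftime('%Y-%m-%d', gmtime($time));
    next if $date eq $last_date;                # already have this day's latest
    push @latest, $time;
    $last_date = $date;
}
@latest = reverse @latest;                      # back to ascending order

print "@latest\n";
```

With the list sorted newest first, the first entry seen for each date is already that day's latest, so later entries for the same date can simply be skipped.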


Node Type: perlquestion [id://966022]
Approved by ikegami