Endless has asked for the wisdom of the Perl Monks concerning the following question:
Hello my new favorite friends,
As part of a lexical processing project I'm working on, I'm parsing millions of dates and converting them to epoch time. However, Devel::NYTProf showed me that I was losing massive amounts of time in Date::Parse's str2time. I guess that's the price you pay for something that seemed like the perfect, effortless way to parse the dates.
So, my question for those of you who have a sense of the benchmarks: how can I parse these dates most efficiently? Here was the WRONG way (removing it doubled my speed!):
# Dates of form 'Fri, 01 Mar 2013 01:21:14 +0000'
my $created_at = str2time($value);
Update: Solution
Thanks to the discussion between BrowserUk and rjt, a high-speed solution emerged that looked something like this:
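use feature 'say';    # needed for say() below
use Inline C => q@
int epoch_sec(char * date) {
    char *tz_str = date + 26;   /* "+0200" starts at offset 26 */
    struct tm tm;
    int tz;
    /* Expect exactly 31 characters: 'Fri, 01 Mar 2013 01:21:14 +0200' */
    if ( strlen(date) != 31
      || strptime(date, "%a, %d %b %Y %T", &tm) == NULL
      || sscanf(tz_str, "%d", &tz) != 1) {
        printf("Invalid date %s\n", date);
        return 0;
    }
    /* Treat the fields as UTC, then subtract the +HHMM/-HHMM offset in seconds */
    return timegm(&tm) - (tz < 0 ? -1 : 1)*(abs(tz)/100*3600 + abs(tz)%100*60);
}
@;
our $date = "Fri, 01 Mar 2013 01:21:14 +0200";
my $newDate = epoch_sec($date);
say $newDate;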
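For anyone who can't use Inline::C, the same fixed-format idea can be sketched in pure Perl with the core Time::Local module. This is only an illustration, not code from the discussion: parse_epoch and %MONTH are hypothetical names, it assumes every date has exactly the form 'Fri, 01 Mar 2013 01:21:14 +0000', and it has not been benchmarked against str2time or the C version, so profile it on your own data before relying on it.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Month abbreviation -> zero-based month index, as timegm() expects.
my %MONTH = (
    Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
    Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11,
);

# parse_epoch() is a hypothetical helper, not part of any module.
# It accepts only dates shaped like 'Fri, 01 Mar 2013 01:21:14 +0000'
# and returns undef for anything else.
sub parse_epoch {
    my ($date) = @_;

    my ($day, $mon, $year, $h, $m, $s, $sign, $tzh, $tzm) = $date =~
        /\A\w{3}, (\d{2}) (\w{3}) (\d{4}) (\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\z/
        or return;

    my $mi = $MONTH{$mon};
    return unless defined $mi;

    # timegm() croaks on out-of-range fields (e.g. day 31 in Feb),
    # so trap that and report failure as undef instead.
    my $utc = eval { timegm($s, $m, $h, $day, $mi, $year) };
    return unless defined $utc;

    # The timestamp is local to the stated offset; subtract it to get UTC.
    my $offset = ($tzh * 3600 + $tzm * 60) * ($sign eq '-' ? -1 : 1);
    return $utc - $offset;
}

print parse_epoch('Fri, 01 Mar 2013 01:21:14 +0200'), "\n";   # prints 1362093674
```

The regex is anchored at both ends so malformed input fails fast, and the weekday is matched but ignored, since it adds nothing to the epoch value.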
Thanks! You guys are incredible.
Replies are listed 'Best First'.
Re: High-speed Date Formatting
by rjt (Curate) on Jul 12, 2013 at 00:39 UTC
by Endless (Beadle) on Jul 12, 2013 at 15:48 UTC
by erh (Initiate) on Apr 17, 2014 at 21:41 UTC
Re: High-speed Date Formatting
by tobyink (Canon) on Jul 12, 2013 at 01:12 UTC
Re: High-speed Date Formatting
by BrowserUk (Patriarch) on Jul 12, 2013 at 00:23 UTC
by rjt (Curate) on Jul 12, 2013 at 00:54 UTC
by BrowserUk (Patriarch) on Jul 12, 2013 at 01:39 UTC
by hdb (Monsignor) on Jul 12, 2013 at 08:58 UTC
by sundialsvc4 (Abbot) on Jul 12, 2013 at 02:59 UTC
Re: High-speed Date Formatting
by fullermd (Priest) on Jul 12, 2013 at 07:20 UTC
by rjt (Curate) on Jul 12, 2013 at 08:34 UTC