PerlMonks  

Re: sorting entires by date

by BrowserUk (Pope)
on Jan 02, 2004 at 00:17 UTC


in reply to sorting entires by date

If, as your sample data indicates, your lines are of a consistent fixed format, then a simpler sort using substr would suffice.

my @sorted = sort{ substr( $a, 19 ) cmp substr( $b, 19 ) } <FILE>;

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail


Re^2: sorting entires by date
by Aristotle (Chancellor) on Jan 02, 2004 at 00:27 UTC
    He has filenames in there, so assuming that the format is fixed is more than likely moot. But seeing as it is the last field we're interested in, the following will work:
    my @sorted = sort { substr( $a, 1 + rindex( $a, ':' ) ) cmp substr( $b, 1 + rindex( $b, ':' ) ) } <>;
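A minimal sketch of the rindex approach on made-up records. The sample lines and the last_field helper are hypothetical, since the original poster's data is not shown in this node; the only assumption carried over is that the date is the last colon-separated field.

```perl
use strict;
use warnings;

# Hypothetical records: variable-width leading fields, epoch time last.
my @lines = (
    "a.txt:alice:1072915200\n",
    "some/longer/name.log:bob:1072828800\n",
    "b:carol:1073001600\n",
);

# Return everything after the final ':' in a line. The trailing newline
# is part of the key, which is harmless while all keys have equal width.
sub last_field {
    my ($line) = @_;
    return substr( $line, 1 + rindex( $line, ':' ) );
}

# Both keys are recomputed on every comparison.
my @sorted = sort { last_field($a) cmp last_field($b) } @lines;

print @sorted;
```

Wrapping the key extraction in a named sub keeps the comparison block readable, at the cost of a sub call per comparison.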

    Makeshifts last the longest.

      The problem with this being that you have to substr twice for every comparison, which, when the file becomes large, is substantially more time-consuming than the ST (Schwartzian Transform) or GRT (Guttman-Rosler Transform), which do a substr only once per line. He explains that in this Unix Review Column.
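For reference, a Schwartzian Transform along these lines might look like the sketch below. The sample records are hypothetical, and the key is pulled out with a regex rather than substr so that the trailing newline is not part of it; the point is only that the key is extracted once per line, not once per comparison.

```perl
use strict;
use warnings;

# Hypothetical records with the epoch time as the last ':' field.
my @lines = (
    "a.txt:alice:1072915200\n",
    "some/longer/name.log:bob:999999999\n",
    "b:carol:1073001600\n",
);

# Read bottom to top: decorate, sort, undecorate.
my @sorted =
    map  { $_->[0] }                   # 3. keep only the original line
    sort { $a->[1] <=> $b->[1] }       # 2. compare the precomputed keys
    map  { [ $_, /:(\d+)\s*$/ ] }      # 1. pair each line with its key, once
    @lines;

print @sorted;
```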


      Who is Kayser Söze?
      Code is (almost) always untested.

        In a simplistic benchmark of a thousand records the straight substr $a cmp substr $b comes out over 180% quicker than the ST version. Using a GRT saves another 14%.

        The cost of the split in the ST outweighs the repeated substr in this case.

        #! perl -slw
        use strict;
        use Benchmark qw[ cmpthese ];

        open IN, '<', 'test.dat' or die $!;
        our @lines = <IN>;
        close IN;

        print "Sorting ", scalar @lines, $/;

        cmpthese( -3, {
            ST => q[
                my @sorted = map $_->[0],
                    sort{ $a->[4] <=> $b->[4] }
                    map [ $_, split /:/ ], @lines;
            ],
            XX => q[
                my @sorted = sort{
                    substr( $a, 19 ) cmp substr( $b, 19 )
                } @lines;
            ],
            GRT=> q[
                my @sorted = map{
                    substr $_, 10
                } sort map{
                    substr( $_, 20 ) . $_
                } @lines;
            ],
        });

        __END__
        P:\test>318176
        Sorting 1000

              Rate   ST   XX  GRT
        ST  6.70/s   -- -65% -69%
        XX  19.1/s 186%   -- -13%
        GRT 21.9/s 227%  14%   --
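The GRT in the benchmark relies on fixed column offsets. Where the leading fields vary in width, the same idea can be sketched by prefixing each line with a zero-padded key of known length. The records and the 10-digit width below are assumptions for illustration, not part of the benchmark.

```perl
use strict;
use warnings;

# Hypothetical records with the epoch time as the last ':' field.
my @lines = (
    "a.txt:alice:1072915200\n",
    "some/longer/name.log:bob:999999999\n",
    "b:carol:1073001600\n",
);

# Read bottom to top: prepend a fixed-width key, sort the plain strings
# with the default comparison, then strip the key again.
my @sorted =
    map  { substr $_, 10 }             # 3. drop the 10-character key
    sort                               # 2. default string sort, no block
    map  { my ($t) = /:(\d+)\s*$/; sprintf( '%010d', $t ) . $_ }  # 1. pad key
    @lines;

print @sorted;
```

Zero-padding makes every key the same length, so the default string sort orders them the same way a numeric sort would.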

        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "Think for yourself!" - Abigail

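      If you have dates both before and after time 1000000000 (Sept 2001), you need to use <=> instead of cmp (assuming traditional Unix epoch), because the timestamps then differ in length and a string comparison puts the 10-digit values ahead of the 9-digit ones.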
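A small illustration of the difference, using two made-up epoch values on either side of that boundary:

```perl
use strict;
use warnings;

# Two epoch times straddling 1_000_000_000 (September 2001).
my @times = ( 1000000001, 999999999 );

my @as_strings = sort { $a cmp $b } @times;   # character-by-character
my @as_numbers = sort { $a <=> $b } @times;   # by numeric value

print "cmp: @as_strings\n";   # cmp: 1000000001 999999999
print "<=>: @as_numbers\n";   # <=>: 999999999 1000000001
```

With cmp, the leading "1" sorts before the leading "9", so the later date comes first.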

      Being able to do a numeric comparison on two substrings is one of those things that makes Perl Perl.
