Re^2: Date to be sorted in descending and time in ascending

by aaron_baugher (Curate)
on May 18, 2012 at 21:22 UTC (#971370)

This means that the substr function will be called four times for each comparison, which could add up if it's a large array. That makes it a good candidate for a Schwartzian Transform. (You probably know that; I'm adding it for the original poster.) UPDATE: I was completely wrong about this; see below.

#!/usr/bin/env perl
use Modern::Perl;
use Data::Dumper;

my @data = <DATA>;
@data = map  { $_->[0] }
        sort { $b->[1][0] <=> $a->[1][0] or $a->[1][1] <=> $b->[1][1] }
        map  { [ $_, [ substr( $_, 0, 8 ), substr( $_, 8 ) ] ] } @data;
say @data;

__DATA__
20090405022300
20080405022600
20090405022900
20080405023500
20050405005000
20080405022500
20090405022500
20020405081200
20010405000000
20090405022100

UPDATE: While I understand the Schwartzian Transform in theory and think it's one of the coolest things ever, I haven't had much call to actually use it, so I did a benchmark for this case against the simple sort of substr calls. I was a little surprised to see that the repeated substr calls beat my ST, testing with array sizes from 10 to 1_000_000. In fact, the ST took about twice as long in all tests. I guess four substr calls (or two if the first comparison returns a value so the second comparison isn't necessary) don't qualify as expensive enough to make the overhead of the ST worth it here. Darn it.
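
For readers who want to repeat the measurement, a minimal sketch using the core Benchmark module might look like the following. It is not the script used for the timings above; the random test data, the array size $n, and the sub names plain_sort and st_sort are assumptions made for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: random 14-digit YYYYMMDDhhmmss-style strings.
my $n    = 10_000;
my @data = map {
    sprintf '%04d%02d%02d%02d%02d%02d',
        2000 + int( rand 13 ), 1 + int( rand 12 ), 1 + int( rand 28 ),
        int( rand 24 ), int( rand 60 ), int( rand 60 )
} 1 .. $n;

# Plain sort: two to four substr calls per comparison.
sub plain_sort {
    return sort {
        substr( $b, 0, 8 ) <=> substr( $a, 0, 8 )
            or substr( $a, 8 ) <=> substr( $b, 8 )
    } @_;
}

# Schwartzian Transform: two substr calls per element, done once up front.
sub st_sort {
    return map  { $_->[0] }
           sort { $b->[1][0] <=> $a->[1][0] or $a->[1][1] <=> $b->[1][1] }
           map  { [ $_, [ substr( $_, 0, 8 ), substr( $_, 8 ) ] ] } @_;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese( -1, {
    plain => sub { my @sorted = plain_sort(@data) },
    st    => sub { my @sorted = st_sort(@data) },
} );
```

Changing $n shows how the ratio behaves at different array sizes; the numbers will depend on the Perl version and the machine.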

Aaron B.
Available for small or large Perl jobs; see my home node.

Re^3: Date to be sorted in descending and time in ascending
by roboticus (Chancellor) on May 19, 2012 at 12:14 UTC


    Hmmm, I'm surprised. I figured at a million strings that the Schwartzian Transform would win. But you show another good lesson: Measure, don't guess. While both you and I expected the transform to win at a million strings, measurement trumps expectation.


    When your only tool is a hammer, all problems look like your thumb.

      Yep. Out of further curiosity, I counted how many substr calls each version makes. For the ST, of course, you have two for each element: one to get the first 8 chars and one to get the rest. So for a million-element array, that's 2M calls. But for the sort-on-substr version, I got about 37M substr calls. That'll vary some depending on how unsorted the original array is, but that's probably a good ballpark number.

      That's a lot more substr calls, but I guess it's still less work than building an entire new million-element array (with each element a reference to a two-element array), as the ST requires.
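
      One way such a count could be taken is to route every substr call through a small wrapper that bumps a counter. This is only a sketch, not the code behind the numbers above; the helper names date_part and time_part, the fixed seed, and the 1000-element random data set are assumptions made for illustration.

      ```perl
      #!/usr/bin/env perl
      use strict;
      use warnings;

      my $calls = 0;

      # Hypothetical helpers: each wraps one substr call and counts it.
      sub date_part { $calls++; return substr( $_[0], 0, 8 ) }
      sub time_part { $calls++; return substr( $_[0], 8 ) }

      # Hypothetical test data: random 14-digit YYYYMMDDhhmmss-style strings.
      srand(42);
      my @data = map {
          sprintf '%04d%02d%02d%02d%02d%02d',
              2000 + int( rand 13 ), 1 + int( rand 12 ), 1 + int( rand 28 ),
              int( rand 24 ), int( rand 60 ), int( rand 60 )
      } 1 .. 1000;

      # Sort directly on substr: the helpers run inside every comparison.
      my @plain = sort {
          date_part($b) <=> date_part($a) or time_part($a) <=> time_part($b)
      } @data;
      my $plain_calls = $calls;

      # Schwartzian Transform: the helpers run once per element.
      $calls = 0;
      my @st = map  { $_->[0] }
               sort { $b->[1][0] <=> $a->[1][0] or $a->[1][1] <=> $b->[1][1] }
               map  { [ $_, [ date_part($_), time_part($_) ] ] } @data;
      my $st_calls = $calls;

      print "plain sort: $plain_calls substr calls\n";
      print "ST:         $st_calls substr calls\n";
      ```

      The ST count is always exactly twice the number of elements, while the plain count depends on how many comparisons the sort needs for that particular input.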

      It does give me a (very rough) rule of thumb, though: for the ST to be more efficient, the alternative probably needs to do the equivalent of 8-10 substr calls. So just a few core functions probably won't qualify, but a longer series of functions, or some fairly complex regexes, or certainly any sort of file or database lookups, probably will. And there's always measurement to see for sure.

      Aaron B.
      Available for small or large Perl jobs; see my home node.
