
Re^4: RFC:A brief tutorial on Perl's native sorting facilities.

by BrowserUk (Pope)
on Feb 06, 2007 at 17:46 UTC (node #598596)

in reply to Re^3: RFC:A brief tutorial on Perl's native sorting facilities.
in thread RFC:A brief tutorial on Perl's native sorting facilities.

That's not always possible. Eg. if it's a list of objects ...

If the data is a list of objects to be sorted

  1. then their class(es?) would need to overload cmp and/or <=> in order for sort to be able to sort them.

    And if that is the case, then each pair of objects would need to be compared using that overloaded operator and there would be no benefit from using any of the indirect sorting methods.

  2. Or the mapping step in what you describe would need to return a stringified or numified representation (key) of the objects (to populate the @keys array in your examples), and these would then be compared using the standard non-overloaded comparison operators.

    In which case, the GRT would work just fine and would avoid the creation of the keys array and the slice operation, that is, several intermediate arrays and lists. It also avoids the callback into Perl code (its main strength, and the reason for its performance), and so will normally be faster.
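To make the second case concrete, here is a minimal sketch of a GRT over plain strings. The "name:score" records and the choice of a 4-byte pack('N') key are assumptions made for illustration, not something taken from the thread; the point is only that the sort itself runs without a comparison block.

```perl
use strict;
use warnings;

# Hypothetical records: "name:score" strings, to be sorted by numeric score.
my @records = ( 'carol:250', 'alice:7', 'bob:31' );

my @sorted =
    map  { substr $_, 4 }                       # strip the 4-byte packed key
    sort                                        # default string sort, no callback
    map  { pack( 'N', (split /:/)[1] ) . $_ }   # prepend a big-endian 32-bit key
    @records;

print "$_\n" for @sorted;
```

Because the packed key has a fixed width and is big-endian, a plain string comparison orders the records by score. This only works as shown because the items being sorted are themselves strings.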

And even if it's just a list of references to some data structures, by serialization and deserialization you end up with copies of the data structures.

I do not understand what you mean by this. In particular, the GRT never requires the keys to be "deserialised".

Whatever you put into the @keys array has to be compared later using one of the standard comparison operators. With that being the case, using the GRT again avoids the need for callbacks and will be faster.

I realise that I am probably missing something here, but do you have any practical examples of when this indexed sort method will outperform a GRT?

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^5: RFC:A brief tutorial on Perl's native sorting facilities.
by Jenda (Abbot) on Feb 06, 2007 at 22:02 UTC

    Yes, you would ask the object to give you a key (or generate the key based on some of the object's properties) and use that to populate the @keys array in my example or the keys in the two-item arrays of the ST. The thing is that the GRT prepends the key (either using . or pack()) to the stringified values of the items, sorts the results and then strips the keys. It's not a matter of outperforming; it's a matter of either returning useless stuff or having to bend over backwards and ending up with copies.

    Let's assume you have an array containing hashes like this and want to sort them by, let's say, the birth_date:

    @list = (
        {fname => 'Jan', lname => 'Krynicky', birth_date => 'Sep 3 1975', #...
        },
        {fname => 'Pavel', lname => 'Krynicky', birth_date => 'Dec 25 1969', #...
        },
        {fname => 'Martin', lname => 'Krynicky', birth_date => 'Aug 24 1973', #...
        },
    );
    Sort this using GRT!

    use Date::Calc qw(Decode_Date_US);
    use Data::Dumper;

    sub convertdate {
        my ($string) = @_;
        return sprintf '%04d%02d%02d', Decode_Date_US($string);
    }

    # ST
    @sorted = map { $_->[1] }
        sort { $a->[0] <=> $b->[0] }
        map { [ convertdate($_->{birth_date}), $_ ] } @list;
    print Dumper(\@sorted);

    # @keys array
    {
        my @keys = map convertdate($_->{birth_date}), @list;
        @sorted = @list[ sort { $keys[$a] cmp $keys[$b] } (0..$#list) ];
    }
    print Dumper(\@sorted);

    # GRT
    @sorted = map {
        ## Chop off the bit we added.
        substr( $_, 8 )
    } sort map {
        ## Note: No comparison block callback.
        ## Extract the field as before, but concatenate it with the original element
        ## instead of building an anonymous array containing both elements.
        convertdate($_->{birth_date}) . $_
    } @list;
    print Dumper(\@sorted);
    $VAR1 = [
              'HASH(0x18db494)',
              'HASH(0x224e9c)',
              'HASH(0x224fa4)'
            ];

      Did you really expect that to work?

      The easiest way to do what you're trying to do is to sort the array indices, rather than the array elements directly.

      my @sorted = @list[
          map  { substr( $_, 8 ) }
          sort
          map  { convertdate( $list[$_]->{birth_date} ) . sprintf "%06d", $_ }
          0 .. $#list
      ];

      This is essentially identical to tye's fast, flexible, stable sort.
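For anyone who wants to try the index-based variant without installing Date::Calc, the following is a self-contained sketch. The convertdate() here is a stand-in written for this example (it assumes 'Mon D YYYY' input and a %month lookup table); it is not the Date::Calc-based helper used earlier in the thread, and the records are trimmed to two fields.

```perl
use strict;
use warnings;

# Stand-in for the thread's convertdate(); assumes 'Mon D YYYY' input.
my %month;
@month{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = 1 .. 12;

sub convertdate {
    my ( $mon, $day, $year ) = split ' ', shift;
    return sprintf '%04d%02d%02d', $year, $month{$mon}, $day;
}

my @list = (
    { fname => 'Jan',    birth_date => 'Sep 3 1975'  },
    { fname => 'Pavel',  birth_date => 'Dec 25 1969' },
    { fname => 'Martin', birth_date => 'Aug 24 1973' },
);

# Sort "key . index" strings, then use the recovered indices to slice @list.
my @sorted = @list[
    map  { substr $_, 8 }                     # drop the 8-character date key
    sort                                      # plain string sort, no callback
    map  { convertdate( $list[$_]{birth_date} ) . sprintf '%06d', $_ }
    0 .. $#list
];

print "$_->{fname} $_->{birth_date}\n" for @sorted;
```

Only the zero-padded index is concatenated with the key, so the hash references are never stringified and the slice returns the original references in sorted order. The "%06d" width limits this sketch to lists of fewer than a million elements.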

      A word spoken in Mind will reach its own level, in the objective world, by its own weight

        No, of course I did not expect it to work. I was showing BrowserUk that it doesn't. I feel stupid, though, for not noticing that I could merge the "sort indices" trick and the GRT to get the best of both worlds: the speed of the GRT and the generality of the other styles of complex sort. Silly me. Thanks!
