Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

sort performance

by le (Friar)
on Sep 12, 2000 at 02:04 UTC ( [id://32000]=perlquestion: print w/replies, xml ) Need Help??

le has asked for the wisdom of the Perl Monks concerning the following question:

I have a big hash of hashes that looks like this:
%hash = ( foo => { this => '...', that => '...', here => '...', there => '...' }, bar => { this => '...', that => '...', here => '...', there => '...' } );
What would be the most performant way to sort keys %hash on the values of 'this', for example?

Currently, I do it with

for (sort { $hash{$a}->{this} <=> $hash{$b}->{this} } keys %hash)
Is there anything faster/better?

Replies are listed 'Best First'.
RE: sort performance
by Adam (Vicar) on Sep 12, 2000 at 02:16 UTC
    my @sortedkeys = sort { $hash{$a}->{this} cmp $hash{$b}->{this} } keys + %hash
    You don't need the foreach loop, otherwise your method is fine (that's how I always do it.)

    Update: Oh, my mistake... I see that @sortedkeys is the array for the foreach loop. So you are doing fine, although for fun we should try to come up with a bunch of methods and then benchmark them.

Re (tilly) 1: sort performance
by tilly (Archbishop) on Sep 12, 2000 at 04:01 UTC
    You could avoid repeated double-lookups with a Schwartzian sort:
    my @sorted_keys = map {$_->[0]} sort {$a->[1] cmp $b->[1]} map {[$_, $hash{$_}{this}]} keys %hash;
    (Read it bottom to top and it will make more sense.)

    This is more memory intensive, but doing the double lookup in the sort block means that it happens O(n * log(n)) times. Doing it in a map means it only happens n times. (Where n is the number of keys.)

      The 'Schwartzian Transform' is for expensive operations.
      A hash lookup is not an expensive operation, so just do a plain sort.
        You are right. I just benchmarked it. The overhead of the maps far outweighs the cost of the sort. In fact it is so slow I will have to ask p5p about it. Here is my test:
        use strict; use Benchmark; use vars qw(%hash %test); for (1..1000) { $hash{"key$_"}{this} = "num$_"; } %test = ( schwartz => sub { my @sorted = map $_->[0], sort {$a->[1] cmp $b->[1]} map [$_, $hash{$_}{this}], keys %hash; }, straight => sub { my @sorted = sort { $hash{$a}{this} <=> $hash{$b}{this} } keys %hash; }, stupid => sub { my @sorted = sort { my $foo = [$a, $hash{$a}{this}]; $hash{$a}{this} <=> $hash{$b}{this} } keys %hash; }, ); timethese (-1, \%test);
        Believe it or not, stupid is several times faster for me than schwartz. IMO there is simply no possible good reason for that!

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://32000]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others imbibing at the Monastery: (2)
As of 2025-06-20 16:13 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?
    erzuuliAnonymous Monks are no longer allowed to use Super Search, due to an excessive use of this resource by robots.