
Custom, Reusable Sort Subroutine for Hashes?

by QM (Parson)
on Aug 30, 2017 at 11:01 UTC
QM has asked for the wisdom of the Perl Monks concerning the following question:

I want a custom sort sub that can sort any hash, according to whatever sort mechanism I dream up, with the hash determined at runtime.

The examples I've been able to find all assume a specific global hash. Sort subs don't allow any parameter passing, so I'm wondering how to accomplish this with grace and efficiency. Short of an anonymous sub created for a specific hash, I'm not sure how to proceed.

As a counterexample, this doesn't cut it:

sub by_score { $score{$b} <=> $score{$a} }

Here, %score is hardcoded. I may want to use the same sub on more than 1 hash throughout the script, and would prefer not to write one for each hash.
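For reference, a minimal sketch of the baseline the question wants to improve on: a wrapper that takes the hash as a reference and keeps the comparison inline. The name keys_by_score_desc and both sample hashes are made up for illustration. The hash is chosen at runtime, but the comparison is still fixed inside the wrapper.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the keys of any hash, highest value first.
sub keys_by_score_desc {
    my ($hash) = @_;    # hash reference chosen by the caller at runtime
    return sort { $hash->{$b} <=> $hash->{$a} } keys %$hash;
}

# Sample data, invented for this sketch.
my %score = ( alice => 90, bob => 72, carol => 85 );
my %age   = ( alice => 31, bob => 45, carol => 28 );

my @by_score = keys_by_score_desc(\%score);
my @by_age   = keys_by_score_desc(\%age);

print "@by_score\n";
print "@by_age\n";
```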

Quantum Mechanics: The dreams stuff is made of

Replies are listed 'Best First'.
Re: Custom, Reusable Sort Subroutine for Hashes?
by choroba (Bishop) on Aug 30, 2017 at 11:09 UTC
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    sub sort_hash (&\%) {
        my ($sub, $hash) = @_;
        sort { local($a, $b) = @$hash{$a, $b}; $sub->() } keys %$hash
    }

    my %hash = ( second => 2, fourth => 4, third => 3, first => 1 );
    say for sort_hash { $a <=> $b } %hash;

    I used a prototype to make the syntax of sort_hash similar to that of sort. The trick is replacing the magical variables $a and $b with the respective values from the hash.
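As a rough illustration of the reuse the question asks for, the same sort_hash can be applied to two different hashes with two different comparators. The hashes %count and %color are invented for this sketch. One assumption to keep in mind: $a and $b are package variables, so the comparator block and sort_hash should live in the same package.

```perl
use strict;
use warnings;

# sort_hash as posted: sorts the keys of a hash by comparing its values.
sub sort_hash (&\%) {
    my ($sub, $hash) = @_;
    sort { local($a, $b) = @$hash{$a, $b}; $sub->() } keys %$hash
}

# Sample data, invented for this sketch.
my %count = ( apple => 3,     pear => 10,       fig => 7 );
my %color = ( apple => 'red', pear => 'yellow', fig => 'purple' );

my @by_count_desc = sort_hash { $b <=> $a } %count;   # numeric, descending
my @by_color      = sort_hash { $a cmp $b } %color;   # string, ascending

print "@by_count_desc\n";
print "@by_color\n";
```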

    Update: explanation added.

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      This is very good, thanks.

      Just as an extension, can I use this with a named sort sub? I expect so, but I'll have to go off and experiment.


        > can I use this with a named sort sub?

        Sure, but you can't use the nice syntax anymore:

        sub numerically { $a <=> $b }
        say for sort_hash(\&numerically, %hash);
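A possible way to keep the block syntax with a named sub, sketched under the assumption that numerically and sort_hash are in the same package: call the named sub from inside the block. Because sort_hash localizes the package variables $a and $b, the named sub sees the hash values when it runs.

```perl
use strict;
use warnings;

# sort_hash as posted earlier in the thread.
sub sort_hash (&\%) {
    my ($sub, $hash) = @_;
    sort { local($a, $b) = @$hash{$a, $b}; $sub->() } keys %$hash
}

sub numerically { $a <=> $b }

my %hash = ( second => 2, fourth => 4, third => 3, first => 1 );

# The block only forwards to the named sub.
my @keys = sort_hash { numerically() } %hash;

print "@keys\n";
```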
Re: Custom, Reusable Sort Subroutine for Hashes?
by Corion (Pope) on Aug 30, 2017 at 11:15 UTC

    Personally, I would create an anonymous subroutine that knows the correct hash to use. I think the fancy term would be "currying":

    my @list = (qw( a b c d e f ));

    sub by_score {
        my( $score ) = @_;
        return sub($$) {
            my( $a,$b ) = @_;
            warn "$a / $b";
            $score->{$a} <=> $score->{$b}
        }
    }

    my $by = by_score({ a => 5, b => 4, c => 3, d => 6, e => -1 });
    print join ",", sort( {$by->($a,$b)} @list );

    $by = by_score({ a => -5, b => 4, c => -3, d => 6, e => -1 });
    print join ",", sort( {$by->($a,$b)} @list );

    I'm a bit unhappy with the (lack of) syntax that prevents me from inlining $by. I would have liked to write something like:

    $by = by_score({ a => -5, b => 4, c => -3, d => 6, e => -1 });
    print join ",", sort( $by, @list );

    ... but that's nothing that Perl likes, as Perl doesn't know whether $by should be part of @list or not.
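A sketch of what may come close to that wish: sort accepts a plain scalar variable holding a code reference in the SUBNAME position, written without a comma. Since the generated sub has the ($$) prototype, sort passes the two elements in @_. The list here is shortened so that every element has a score, and the lexicals are named $x and $y instead of $a and $b; both are choices made for this example only.

```perl
use strict;
use warnings;

sub by_score {
    my ($score) = @_;
    # With a ($$) prototype, sort hands over the two elements in @_.
    return sub ($$) {
        my ($x, $y) = @_;
        return $score->{$x} <=> $score->{$y};
    };
}

my @list = qw( a b c d e );
my $by   = by_score({ a => 5, b => 4, c => 3, d => 6, e => -1 });

# Note: no comma between $by and the list.
my @sorted = sort $by @list;

print join(",", @sorted), "\n";
```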

      Yes, very helpful.

      I was anticipating a sub generator of some sort, where the hash and comparison function are passed in, and a new anonymous sub is returned. This sub could then be used in the code block position of the sort command.

      Perhaps something like this, which steals some ideas from Choroba's post, though I've not tried it:

      sub make_sort_sub {
          my $coderef = shift;
          my $hashref = shift;
          my $sort_sub = sub { something goes here }
          return $sort_sub;
      }
      my %hash = ( a => 5, b => 4, c => 3 );
      my $keys_by_value = make_sort_sub( { $hashref->{$a} <=> $hashref->{$b} }, \%hash);
      my @keys_sorted_by_value = sort $keys_by_value %hash;

      Then $keys_by_value can be reused elsewhere on the same hash. But can't be used on a different hash. :(

      Hmmm, I'm thinking there's not a really elegant solution.
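One way the generator idea might be made to work, as a sketch rather than a settled answer: let the comparison take the hash reference and the two keys as ordinary arguments, and let make_sort_sub bind it to a particular hash. The calling convention ($hashref, $key_a, $key_b) and the hash %other are assumptions made for this example. The comparison is written once and bound to as many hashes as needed.

```perl
use strict;
use warnings;

# Hypothetical generator: $cmp is called as $cmp->($hashref, $key_a, $key_b).
sub make_sort_sub {
    my ($cmp, $hashref) = @_;
    # $a and $b are the package variables that sort sets before each call.
    return sub { $cmp->($hashref, $a, $b) };
}

# Written once, independent of any particular hash.
my $by_value = sub {
    my ($h, $x, $y) = @_;
    $h->{$x} <=> $h->{$y} or $x cmp $y;
};

my %hash  = ( a => 5, b => 4, c => 3 );
my %other = ( x => 1, y => 9, z => 4 );

my $sort_hash  = make_sort_sub($by_value, \%hash);
my $sort_other = make_sort_sub($by_value, \%other);

my @keys_hash  = sort $sort_hash keys %hash;
my @keys_other = sort $sort_other keys %other;

print "@keys_hash\n";
print "@keys_other\n";
```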


Re: Custom, Reusable Sort Subroutine for Hashes?
by QM (Parson) on Sep 05, 2017 at 16:58 UTC
    Combining some of these ideas, I came up with the following make_sort_sub:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use feature qw{ say };
    use Scalar::Util qw{ looks_like_number };

    sub make_sort_sub {
        my $code    = shift;
        my $hashref = shift;
        my $sort_sub = eval "sub { $code }";
        die $@ if $@;
        return $sort_sub;
    }

    # simple code snippet
    my %hash = ( a => 5, b => 4, c => 3, d => 5, e => 4 );
    my $keys_by_value = make_sort_sub( '$hashref->{$a} <=> $hashref->{$b}', \%hash);
    my @keys_sorted_by_value = sort $keys_by_value keys %hash;
    say @keys_sorted_by_value;

    # Try with curlies now
    $keys_by_value = make_sort_sub( '{$hashref->{$a} <=> $hashref->{$b}}', \%hash);
    @keys_sorted_by_value = sort $keys_by_value keys %hash;
    say @keys_sorted_by_value;

    # Naive compare by values as numbers, keys as strings
    my $keys_by_value_or_keys = make_sort_sub( '$hashref->{$a} <=> $hashref->{$b} or $a cmp $b', \%hash);
    my @keys_by_value_or_keys = sort $keys_by_value_or_keys keys %hash;
    say @keys_by_value_or_keys;

    # Compare as numbers then strings, values then keys
    my %hash2 = (a => 'a', b => 'b', c => 3, d => 4);
    my $keys_by_value_or_keys_mixed = make_sort_sub(
        'return $hashref->{$a} <=> $hashref->{$b}
             if ((looks_like_number($hashref->{$a}) and looks_like_number($hashref->{$b}))
                 and ($hashref->{$a} <=> $hashref->{$b}));
         return $hashref->{$a} cmp $hashref->{$b}
             if ($hashref->{$a} cmp $hashref->{$b});
         return $a <=> $b
             if ((looks_like_number($a) and looks_like_number($b)) and ($a <=> $b));
         return $a cmp $b;',
        \%hash2);
    my @keys_by_value_or_keys_mixed = sort $keys_by_value_or_keys_mixed keys %hash2;
    say @keys_by_value_or_keys_mixed;

    exit;


    cbeda
    cbeda
    cbead
    cdab

    I couldn't figure out a good way to include a real code block, because it couldn't reference the keys and values. Choroba's suggestion is nice, but can't really handle anything other than $a/$b (AFAIK). Chicken and egg.

    As an aside, I can't remember if there's a nice CPAN way to do the last sub ("keys by value or by keys, numbers or strings").
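Without reaching for CPAN, the last comparison could perhaps be written as an ordinary sub instead of a string passed to eval. This is only a sketch: mixed_cmp is a made-up name, and it tries to follow the same rule as the string version above (numeric comparison when both sides look like numbers and differ, string comparison otherwise), applied first to the values and then to the keys.

```perl
use strict;
use warnings;
use Scalar::Util qw{ looks_like_number };

# Hypothetical helper: compare numerically when both arguments look like
# numbers and differ numerically; otherwise fall back to string comparison.
sub mixed_cmp {
    my ($x, $y) = @_;
    if (looks_like_number($x) and looks_like_number($y)) {
        my $n = $x <=> $y;
        return $n if $n;
    }
    return $x cmp $y;
}

my %hash2 = ( a => 'a', b => 'b', c => 3, d => 4 );

# Values first, keys as the tie-breaker.
my @keys = sort {
    mixed_cmp($hash2{$a}, $hash2{$b}) or mixed_cmp($a, $b)
} keys %hash2;

print @keys, "\n";
```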



      «The Crux of the Biscuit is the Apostrophe»

      perl -MCrypt::CBC -E 'say Crypt::CBC->new(-key=>'kgb',-cipher=>"Blowfish")->decrypt_hex($ENV{KARL});'Help

        ++ for trying, but that doesn't seem to be easier.

        In fact, I think the doc page is one of those "well formed, but not very informative" examples. I was looking for an example in the synopsis, and it has this:

        use Sort::Maker ;
        my $sorter = make_sorter( ... );


Re: Custom, Reusable Sort Subroutine for Hashes?
by sundialsvc4 (Abbot) on Aug 30, 2017 at 20:36 UTC

    I recommend creating a package that contains all of your genuinely reusable “sort subs,” then use that package and import only the particular subroutine(s) you need. Even if two subroutines happen to be identical right now but are used in two unrelated situations, go ahead and duplicate them verbatim. Make sure that each one is clearly labeled and separately identifiable, and that changes to one will not have unforeseen maintenance consequences for the other.

    Or:   if a particular package introduces a particular data structure, perhaps this would be an appropriate place to insert a sort-sub (or subs) appropriate to that data structure.   This would not only provide a logical maintenance-point, but would also serve to document all of the ways that the structure is being sorted, elsewhere in the system.

    The situation that you specifically want to avoid – as you justifiably seek to “DRY = Don’t Repeat Yourself,™” – is coupling.   Two independent code-paths which share no business-meaning with each other, such that you might well wish to change one but not the other, ought not share code.   The mere fact that two unrelated subroutines are identical now, does not mean that they will remain so.

    A “too generic, too universal, too customizable” subroutine can actually be regarded as “coupling bait.”   And well-intentioned “customizability,” in some ways, is even worse because the tweaks that fundamentally affect the behavior of the subroutine are “somewhere else.”   The correct behavior of the (complicated) routine can’t be easily tested across all of the cases to which it has been made to apply.

    “Stupid Simple,™ Please.”
