Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Efficiently sorting array by non-alpha strings

by knowmad (Monk)
on Feb 20, 2004 at 15:53 UTC ( #330568=perlquestion: print w/ replies, xml ) Need Help??
knowmad has asked for the wisdom of the Perl Monks concerning the following question:

In case the subject isn't entirely clear, what I'm trying to achieve is a sort of an array of hashes by a field which contains the following three keywords -- Local, State, National. The trick is that I need to sort the records in that order. Thus a straight-forward alpha sort will not work.

I checked the Q&A, searched PM and Google and checked my various Perl books to no avail. I have a solution but am hoping a more clever monk could make it more efficient.

#!/usr/bin/perl use strict; use Data::Dumper; my @advocates = ( { fld_title => 'Rec1', fld_type => 'National', }, { fld_title => 'Rec2', fld_type => 'State', }, { fld_title => 'Rec3', fld_type => 'Local', }, ); sub by_locale { my $a_type; my $b_type; $a_type = 1 if ($a->{'fld_type'} =~ /local/i); $a_type = 2 if ($a->{'fld_type'} =~ /state/i); $a_type = 3 if ($a->{'fld_type'} =~ /national/i); $b_type = 1 if ($b->{'fld_type'} =~ /local/i); $b_type = 2 if ($b->{'fld_type'} =~ /state/i); $b_type = 3 if ($b->{'fld_type'} =~ /national/i); $a_type <=> $b_type; } my @sorted_advocates = sort by_locale @advocates; warn Dumper("unsorted_advocates", \@advocates); warn Dumper("sorted_advocates", \@sorted_advocates);

Thanks for any pointers,

-Wm

Comment on Efficiently sorting array by non-alpha strings
Download Code
Re: Efficiently sorting array by non-alpha strings
by Abigail-II (Bishop) on Feb 20, 2004 at 15:57 UTC

    It looks like you want a bucket sort. Something like this:

    my %bucket; map {push @{$bucket {$_ -> {fld_type}}} => $_} @advocates; my @sorted = map {@{$bucket {$_}}} qw /Local State National/;

    Abigail

      Thanks for the reply. I've not used the Schwarztian Transform before so tend to shy away from it. However, I read a bit more about it and discovered that it won't work for my actual code which is an array of objects.

      Your bucketsort is the elegant solution that I was seeking. Interesting though it's about twice as slow as my ugly version as indicated by this benchmark:

      Benchmark: timing 500000 iterations of bucketsort, sort... bucketsort: 2 wallclock secs ( 1.72 usr + 0.00 sys = 1.72 CPU) @ 29 +0697.67/s (n=500000) sort: 0 wallclock secs ( 0.89 usr + 0.00 sys = 0.89 CPU) @ 56 +1797.75/s (n=500000) Rate bucketsort sort bucketsort 290698/s -- -48% sort 561798/s 93% --

      Thanks
      -Wm

        Your bucketsort is the elegant solution that I was seeking. Interesting though it's about twice as slow as my ugly version as indicated by this benchmark:
        That's because you didn't fix my typos. Here's a different benchmark, with some of the other suggestions (I didn't include any that went beyond sorting on 'fld_type'):
        #!/usr/bin/perl use strict; use warnings; use Benchmark qw /timethese cmpthese/; our @advocates = map {{fld_title => 'Rec' . $_, fld_type => [qw /National State Local/] -> [ra +nd 3]}} 1 .. shift || 1000; sub by_locale { my $a_type; my $b_type; $a_type = 1 if ($a->{'fld_type'} =~ /local/i); $a_type = 2 if ($a->{'fld_type'} =~ /state/i); $a_type = 3 if ($a->{'fld_type'} =~ /national/i); $b_type = 1 if ($b->{'fld_type'} =~ /local/i); $b_type = 2 if ($b->{'fld_type'} =~ /state/i); $b_type = 3 if ($b->{'fld_type'} =~ /national/i); $a_type <=> $b_type; } my $order = "\0local\0state\0national\0"; sub by_locale_b { index ($order, "\0" . lc ($a -> {fld_type}) . "\0") <=> index ($order, "\0" . lc ($b -> {fld_type}) . "\0") } our (@knowmad, @abigail, @borisz, @pirate); cmpthese -1 => { knowmad => '@knowmad = sort by_locale @advocates', abigail => 'my %bucket; map {push @{$bucket {$_ -> {fld_type}}} => $_} @advo +cates; @abigail = map {@{$bucket {$_}}} qw /Local State Nat +ional/;', borisz => 'my %h = (National => 3, State => 2, Local => 1); @borisz = sort {$h {$a -> {fld_type}} <=> $h {$b -> {fld_type}}} @advocates;', pirate => '@pirate = map {$_ -> [1]} sort {$a -> [0] <=> $b -> [0]} map {[lc ($_ -> {fld_type}), $_]} @advoca +tes', browseruk => '@browseruk = sort by_locale_b @advocates', }; __END__ Rate knowmad browseruk borisz pirate abigail knowmad 39.4/s -- -48% -70% -83% -92% browseruk 75.5/s 91% -- -42% -67% -85% borisz 130/s 229% 72% -- -43% -74% pirate 229/s 480% 203% 76% -- -53% abigail 491/s 1147% 551% 279% 115% --

        Abigail

Re: Efficiently sorting array by non-alpha strings
by borisz (Canon) on Feb 20, 2004 at 16:05 UTC
    Use a hash to remap your field.
    my %h = ( National => 3, State => 2, Local => 1 ); my @sorted_advocates = sort { $h{$a->{fld_type}} <=> $h{$b->{fld_type} +} } @advocates;
    Boris
Re: Efficiently sorting array by non-alpha strings
by dreadpiratepeter (Priest) on Feb 20, 2004 at 16:09 UTC
    This ought to do it. WARNING: barely tested.
    #!/usr/bin/perl use strict; use Data::Dumper; my @advocates = ( { fld_title => 'Rec1', fld_type => 'National', }, { fld_title => 'Rec2', fld_type => 'State', }, { fld_title => 'Rec3', fld_type => 'Local', }, ); my %map = ( local => 1, state => 2, national => 3 ); my @sorted_advocates = map {$_->[1]} sort {$a->[0] <=> $b->[0]} map {[$map{lc($_->{fld_type})},$_]} @advocates; warn Dumper("unsorted_advocates", \@advocates); warn Dumper("sorted_advocates", \@sorted_advocates);


    -pete
    "Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
Re: Efficiently sorting array by non-alpha strings
by Limbic~Region (Chancellor) on Feb 20, 2004 at 16:09 UTC
    knowmad,
    I am assuming you want to sort first by locality type and second by title ASCIIBetically.
    #!/usr/bin/perl use strict; use warnings; my @advocates = ( {fld_title => 'Rec1', fld_type => 'National'}, {fld_title => 'Rec2', fld_type => 'State' }, {fld_title => 'Rec3', fld_type => 'Local' } ); my %order = (LOCAL => 'A', STATE => 'B', NATIONAL => 'C'); my @sorted = map { $_->[0] } sort { $a->[1] cmp $b->[1] || $a->[2] cmp $b->[2] } map { [ $_ , $order{uc $_->{fld_type}}, $_->{fld_title} ] +} @advocates;
    Cheers - L~R

    I am waiting for tye to comment on my persistent use of the ST.

      L~R,

      I had not considered the secondary sort by title. Nice touch!

      I added your example into my benchmark tests which now include my original hack, Abigail's bucketsort example and yours (which looks to me like a Schwarztian Transform, no?). Yours takes the cake:

      Benchmark: timing 500000 iterations of bucketsort, sort, st... bucketsort: 1 wallclock secs ( 1.73 usr + 0.00 sys = 1.73 CPU) @ 28 +9017.34/s (n=500000) sort: 1 wallclock secs ( 0.90 usr + 0.00 sys = 0.90 CPU) @ 55 +5555.56/s (n=500000) st: 0 wallclock secs ( 0.66 usr + 0.00 sys = 0.66 CPU) @ 75 +7575.76/s (n=500000) Rate bucketsort sort st bucketsort 289017/s -- -48% -62% sort 555556/s 92% -- -27% st 757576/s 162% 36% --

      Thanks,

      -Wm
        knowmad,
        Actually, you can avoid the ST all together, which should be even faster.
        #!/usr/bin/perl use strict; use warnings; my @advocates = ( {fld_title => 'Rec1', fld_type => 'National'}, {fld_title => 'Rec2', fld_type => 'State' }, {fld_title => 'Rec3', fld_type => 'Local' } ); my %order = (LOCAL => 'A', STATE => 'B', NATIONAL => 'C'); my @sorted = sort { $order{uc $a->{fld_type}} cmp $order{uc $b->{fld_t +ype}} || $a->{fld_title} cmp $b->{fld_title} } @advocates; print "$_->{fld_title}\n" for @sorted;
        Cheers - L~R
Re: Efficiently sorting array by non-alpha strings
by BrowserUk (Pope) on Feb 20, 2004 at 16:18 UTC

    This would do it and is easier to extend than the multiple ifs or an if/else cascade.

    #!/usr/bin/perl use strict; use Data::Dumper; my @advocates = ( { fld_title => 'Rec1', fld_type => 'National', }, { fld_title => 'Rec2', fld_type => 'State', }, { fld_title => 'Rec3', fld_type => 'Local', }, ); my $order = "\0local\0state\0national\0"; sub by_locale { index( $order, "\0" . lc( $a->{ fld_type } ) . "\0" ) <=> index( $order, "\0" . lc( $b->{ fld_type } ) . "\0" ) } my @sorted_advocates = sort by_locale @advocates; warn Dumper("unsorted_advocates", \@advocates); warn Dumper("sorted_advocates", \@sorted_advocates);

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Timing (and a little luck) are everything!

      I added your solution to my benchmark and it comes in side-by-side with my original sort solution. However, yours is ++ for elegance and extendability. Thanks for the input!

      -Wm

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://330568]
Approved by broquaint
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others studying the Monastery: (7)
As of 2014-08-29 01:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (275 votes), past polls