Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses

Re^2: When the Best Solution Isn't

by Aristotle (Chancellor)
on Sep 23, 2002 at 04:59 UTC ( #200008=note: print w/replies, xml ) Need Help??

in reply to Re: When the Best Solution Isn't
in thread When the Best Solution Isn't

Caveat: both our solutions take memory corresponding to the size of the input array.
#!/usr/bin/perl -w use strict; use Benchmark qw(cmpthese); sub xform { map {$_->[0]} sort { $a->[1] <=> $b->[1]} map {[$_, rand(1)]} @_; } sub slice { my @random; push @random, rand 1 for 0 .. $#_; @_[ sort { $random[$a] <=> $random[$b] } 0 .. $#_ ]; } my @array = 1 .. 1000; cmpthese(1000, { slice => sub { slice @array }, xform => sub { xform @array }, }); __END__ Benchmark: timing 1000 iterations of slice, xform... slice: 17 wallclock secs (16.96 usr + 0.08 sys = 17.04 CPU) @ 58 +.69/s (n=1000) xform: 28 wallclock secs (27.76 usr + 0.12 sys = 27.88 CPU) @ 35 +.87/s (n=1000) Rate xform slice xform 35.9/s -- -39% slice 58.7/s 64% --

Makeshifts last the longest.

Replies are listed 'Best First'.
Re: Re^2: When the Best Solution Isn't
by japhy (Canon) on Sep 23, 2002 at 12:23 UTC
    A slightly faster solution (on my machine) uses in-place assignment to @_:
    sub shuffle { my @r = map rand, 0 .. $#_; @_[sort { $r[$a] <=> $r[$b] } 0 .. $#_] = @_; }

    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      There was a recent thread on the ActiveState list which was never quite solved. In general it came down to assigning to @_ was causing some weirdness to occur and ActiveState perl 5.6.1 would slow down to a crawl after a few thousand iterations. Other perls like my OpenBSD 5.6.1 and Cygwin 5.8.0 behaved nicely.

      I'm not sure what the issue was but there may be lurking bug in assigning to @_. I'm warning against it. (or at least be aware so if you do run into trouble later...)

      Update: fixed some spelling

        Don't fret -- an equally speedy solution is thus:
        sub shuffle (\@) { my $d = shift; my @r = map rand, 0 .. $#$d; @$d[sort { $r[$a] <=> $r[$b] } 0 .. $#$d] = @$d; }
        It requires you send it an actual array, of course, but it is just as fast.

        Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
        s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

      You know (leaving the in-place assignment issue aside), it occured to me (duh) that this is a general purpose but faster rephrasing of the Schwartzian Transform:
      sub sxsorta (&;) { my $sort_key = shift; my @rank = map &$sort_key, @_; @_[ sort { $rank[$a] cmp $rank[$b] } 0 .. $#_ ]; } sub sxsortn (&;) { my $sort_key = shift; my @rank = map &$sort_key, @_; @_[ sort { $rank[$a] <=> $rank[$b] } 0 .. $#_ ]; }

      The gains over the regular ST comes from the fact that it doesn't create hundreds of tiny anonymous arrays, and that the sort callback is slightly faster because it only has to do an array lookup but no dereferencing.

      The drawback is that with a lexically scoped (as it should be) @rank, there's no way to pass a userspecified callback to the routines without inflicting a massive performance hit. To do more complex sorts, one therefor has to separately spell this idiom out in their own code.

      Makeshifts last the longest.

        That looks a lot like a method used in Effective Perl Programming (see item 14 in the idiomatic perl pdf)

        The goal is to sort /etc/passwd by the third field:

        open PASSWD, "/etc/passwd" or die; @passwd = <PASSWD>; @key = map # Create an array @key contain +ing { (split /:/)[2] } @passwd; # just the keys (user ids). @by_uid_index = sort { # Sort indices, using @key arr +ay. $key[$a] <=> $key[$b] } 0 .. $#key; @by_uid = @passwd[@by_uid_index]; # Now, rearrange the contents +of # @passwd into sorted order. Or just combine the last two statements: @by_uid = @passwd[sort { $key[$a] <=> $key[$b] } 0 .. $#key];
        Well, this will work, and it's efficient enough, but there are prettier, more Perl-ish ways to present the solution.
        It then goes on to describe the Orcish and Schwartzian under the heading of 'Advanced sorting: the cool ways'


        Just for the record this technique is sometimes called a Guttman Rosler Transform.


        <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://200008]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (4)
As of 2019-11-21 04:12 GMT
Find Nodes?
    Voting Booth?
    Strict and warnings: which comes first?

    Results (103 votes). Check out past polls.