Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options

Re: Re^2: When the Best Solution Isn't

by japhy (Canon)
on Sep 23, 2002 at 12:23 UTC ( #200068=note: print w/replies, xml ) Need Help??

in reply to Re^2: When the Best Solution Isn't
in thread When the Best Solution Isn't

A slightly faster solution (on my machine) uses in-place assignment to @_:
sub shuffle { my @r = map rand, 0 .. $#_; @_[sort { $r[$a] <=> $r[$b] } 0 .. $#_] = @_; }

Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Replies are listed 'Best First'.
Re: Re: Re^2: When the Best Solution Isn't
by diotalevi (Canon) on Sep 23, 2002 at 13:18 UTC

    There was a recent thread on the ActiveState list which was never quite solved. In general it came down to assigning to @_ was causing some weirdness to occur and ActiveState perl 5.6.1 would slow down to a crawl after a few thousand iterations. Other perls like my OpenBSD 5.6.1 and Cygwin 5.8.0 behaved nicely.

    I'm not sure what the issue was but there may be lurking bug in assigning to @_. I'm warning against it. (or at least be aware so if you do run into trouble later...)

    Update: fixed some spelling

      Don't fret -- an equally speedy solution is thus:
      sub shuffle (\@) { my $d = shift; my @r = map rand, 0 .. $#$d; @$d[sort { $r[$a] <=> $r[$b] } 0 .. $#$d] = @$d; }
      It requires you send it an actual array, of course, but it is just as fast.

      Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Faster than the speed of sound (was: When the Best Solution Isn't)
by Aristotle (Chancellor) on Sep 23, 2002 at 17:02 UTC
    You know (leaving the in-place assignment issue aside), it occured to me (duh) that this is a general purpose but faster rephrasing of the Schwartzian Transform:
    sub sxsorta (&;) { my $sort_key = shift; my @rank = map &$sort_key, @_; @_[ sort { $rank[$a] cmp $rank[$b] } 0 .. $#_ ]; } sub sxsortn (&;) { my $sort_key = shift; my @rank = map &$sort_key, @_; @_[ sort { $rank[$a] <=> $rank[$b] } 0 .. $#_ ]; }

    The gains over the regular ST comes from the fact that it doesn't create hundreds of tiny anonymous arrays, and that the sort callback is slightly faster because it only has to do an array lookup but no dereferencing.

    The drawback is that with a lexically scoped (as it should be) @rank, there's no way to pass a userspecified callback to the routines without inflicting a massive performance hit. To do more complex sorts, one therefor has to separately spell this idiom out in their own code.

    Makeshifts last the longest.

      That looks a lot like a method used in Effective Perl Programming (see item 14 in the idiomatic perl pdf)

      The goal is to sort /etc/passwd by the third field:

      open PASSWD, "/etc/passwd" or die; @passwd = <PASSWD>; @key = map # Create an array @key contain +ing { (split /:/)[2] } @passwd; # just the keys (user ids). @by_uid_index = sort { # Sort indices, using @key arr +ay. $key[$a] <=> $key[$b] } 0 .. $#key; @by_uid = @passwd[@by_uid_index]; # Now, rearrange the contents +of # @passwd into sorted order. Or just combine the last two statements: @by_uid = @passwd[sort { $key[$a] <=> $key[$b] } 0 .. $#key];
      Well, this will work, and it's efficient enough, but there are prettier, more Perl-ish ways to present the solution.
      It then goes on to describe the Orcish and Schwartzian under the heading of 'Advanced sorting: the cool ways'


        Interesting. It's more verbose with the extra array, and the Schwartzian Transform is such a common idiom one should definitely know it, but I wonder if the author(s) had any particular motive to recommend the ST over this meme?

        Makeshifts last the longest.

      Just for the record this technique is sometimes called a Guttman Rosler Transform.


      <Elian> And I do take a kind of perverse pleasure in having an OO assembly language...

        Sorry, that’s incorrect.

        1. The point of the GRT is to call upon sort without a callback attached.

          (This is done by packing all the sortkeys and the actual data into a single datum in such a way that a simple $a cmp $b, which is the default sort behaviour, will result in the desired order (hence its alternative name “packed-default sort”).)

        2. The GRT (and the ST alike) directly sorts the elements themselves, with an extra step to extract the payload from each element afterwards.

        The approach I’m using here differs on both counts. It uses a sort callback, unlike the GRT, and sorts indices, unlike both the ST and the GRT.

        Its similarity is in having an extra step before sorting and another after, where the before-sorting step has the same purpose in all cases, i.e. computing the sort key for each element only once. The storage used for the computed keys differs greatly for each approach though, and therefore so do their after-sorting steps.

        Makeshifts last the longest.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://200068]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (3)
As of 2020-10-28 10:13 GMT
Find Nodes?
    Voting Booth?
    My favourite web site is:

    Results (260 votes). Check out past polls.