Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

Re: About List::Util's pure Perl shuffle()

by runrig (Abbot)
on Jul 11, 2007 at 17:28 UTC ( #626049=note: print w/ replies, xml ) Need Help??


in reply to About List::Util's pure Perl shuffle()

I suppose taking references might save memory, but it doesn't save any time over this (on small or large lists):

sub shuffle { my @a=@_; my $n; my $ret; my $i=@_; map { $n = rand($i--); $ret = $a[$n]; $a[$n] = $a[$i]; $ret; } @_; }
I get about 16% better than the List::Util version (YMMV).

Update: Aha. It's large data being shuffled, not large lists, that makes List::Util the better answer in some instances. Thanks ikegami.


Comment on Re: About List::Util's pure Perl shuffle()
Download Code
Re^2: About List::Util's pure Perl shuffle()
by ikegami (Pope) on Jul 11, 2007 at 17:47 UTC
    Adding your optimization to BrowerUK's version, we get the best yet.
    sub ikegami (@) { my @a=@_; my $i=@a; my $n; my $x; map +( $n=rand($i--), $x=$a[$n], $a[$n]=$a[$i] )[1], @a }
    Rate naive listutil runrig broweruk ikegami naive 636/s -- -6% -19% -26% -37% listutil 675/s 6% -- -14% -21% -33% runrig 785/s 23% 16% -- -9% -22% broweruk 859/s 35% 27% 9% -- -15% ikegami 1010/s 59% 50% 29% 18% --

    Of course, this doesn't always help. For example, try shuffling somewhat long strings.

    our @data = map { 'x' x 1000 } 1..1000; cmpthese -3, { map { $_ => "$_ \@data" } qw/naive listutil broweruk ru +nrig ikegami/ };
    Rate runrig naive ikegami listutil broweruk runrig 118/s -- -10% -40% -43% -86% naive 131/s 11% -- -34% -37% -85% ikegami 197/s 68% 51% -- -5% -77% listutil 207/s 76% 58% 5% -- -76% broweruk 860/s 632% 558% 336% 315% --

    Update: runrig pointed out that my shuffle was broken. I changed $a[$n] to $x=$a[$n] to fix it. It slowed down my solution, but not by much.

      Update: runrig pointed out that my shuffle was broken. I changed $a[$n] to $x=$a[$n] to fix it. It slowed down my solution, but not by much.

      Ok, that's the same trick as BrowserUk's, except that in that form it's even more elegant. Anyway... c'mon! I'm not ashamed the very least to ask why does it work. I can't understand...

        It's similar to print($i, $i++, $i);, at least in effect. Perl must place some form of pointer* to $a[$n]'s value on a stack instead of making a new scalar to save speed and memory. The result is that when $a[$n] is changed by the third expression in the list slice, it also changes the value of the second expression.

        By explicitly copying the value of $a[$n] into another variable ($x), changes in $a[$n] no longer affect the result of the slice.

        * — An alias? A pointer to the SV?

Re^2: About List::Util's pure Perl shuffle()
by blazar (Canon) on Jul 11, 2007 at 23:13 UTC
    I suppose taking references might save memory, but it doesn't save any time over this (on small or large lists):
    [snip]
    I get about 16% better than the List::Util version (YMMV).

    Indeed I was about to yell: "well, but then let's just adapt BrowserUk's version simply removing the referencing and dereferencing":

    sub bzr (@) { my @a = @_; my $n; my $i = @_; map +($n=randi($i--), $a[$n], $a[$n]=$a[$i])[1], @_; }

    But that won't work any more! (So kids don't copy the code above, it's crap! ;-) OTOH if I compare your code (as 'run') with BrowserUk's, I get:

    Rate run buk run 7467/s -- -12% buk 8454/s 13% --

    So that seems the best both speed and possibly memory wise.

      That's strange, I compared mine against BrowserUK's and get that they are very nearly equal (mine ahead by a meaningless 1 to 4%), until the list gets large (around a million), then I'm ahead by ~10%. This is for lists containing data only a few characters long, but as ikegami points out, for larger data, BrowserUK's wins by a longshot.

      Here's a non-aliasing version that does work, but it doesn't perform very well. Why it works is left as an exercise.

      sub bukNoAlias (@) { my @a = @_; my( $n, $t ); my $i = @_; map+( $t=$a[ $n=rand($i--) ], $a[ $n ]=$a[ $i ] )[ 0 ], @_; }

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
      Ah, but the fix is easy. It's in my reply (which was posted and updated with the fix before you posted yours).

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://626049]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (6)
As of 2014-07-25 00:28 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (167 votes), past polls