Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

RE (tilly): Hashing the order out of an array (Re: Randomize an array)

by tilly (Archbishop)
on Sep 11, 2000 at 21:31 UTC ( #31937=note: print w/replies, xml ) Need Help??

in reply to Hashing the order out of an array (Re: Randomize an array)
in thread Randomize an array

Sick, slick...and imperfect. :-)

When hashing random elements you expect to see a certain number of collisions in buckets. The second one in the bucket will be ordered after the first element. Therefore your algorithm is not a perfect shuffle, and the rising sequences test that I described in RE (tilly) 2: Randomize an array should be able to detect that.

But even so, I like it. :-)

  • Comment on RE (tilly): Hashing the order out of an array (Re: Randomize an array)

Replies are listed 'Best First'.
RE: RE (tilly): Hashing the order out of an array (Re: Randomize an array)
by tye (Sage) on Sep 11, 2000 at 23:49 UTC

    Okay, here is some code to demonstrate the test that tilly described. I'm posting this because it gave me a couple of more chances to abuse the language.

    #!/usr/bin/perl -w use strict; exit main(); # This is not tilly's recommended test for how random an # ordering is. Repeatedly give a sorted list of numbers # (if not using numbers, swap the < line for the "lt" # line) to your randomizer and pass the results to # cntincseq(). If your randomizer is good, then # cntincseq() will return values that are distributed # around something close to ( list size / 2 ). sub cntincseq (\@) { my( $aList )= @_; my %seq; @seq{ 0..$#{$aList} }= @$aList; my( $idx, $cnt )= 0..1; while( ++$idx < @$aList ) { $cnt++ if $seq{$idx} < $seq{$idx-1} # $cnt++ if $seq{$idx} lt $seq{$idx-1} } return $cnt; } sub validate ($&$) { my( $cnt, $munge, $size )= @_; my @counts; for( 1..$cnt ) { my @list= 1..$size; &$munge( \@list ); push @counts, cntincseq( @list ); print " ",$counts[-1]; } return @counts; } sub fy { my( $aList )= @_; for( reverse 2..@$aList ) { my $i= rand($_); @$aList[$_-1,$i]= @$aList[$i,$_-1]; } } sub hash { my %hash; for( @{$_[0]} ) { $hash{rand()}= $_; } @{$_[0]}= values %hash; } sub main { $|= 1; my $tot; my @counts= validate( 10, \&fy, 10000 ); # And here we have a particularly nasty abuse # of C<map> for entertainment purposes only. # Do not use in production code! print "\nFisher-Yates: ", ($tot=0,map{$tot+=$_}@counts)[-1]/@counts, "\n"; @counts= validate( 10, \&hash, 10000 ); print "\nTye hash: ", ($tot=0,map{$tot+=$_}@counts)[-1]/@counts, "\n"; return 0; }

    Here are the results of comparing Fisher-Yates to my hash randomizer:

    4987 4974 5022 5012 5078 4967 4996 5007 4998 4990 Fisher-Yates: 5003.1 4368 4542 4558 4529 4586 4601 4581 4566 4549 4535 Tye hash: 4541.5

    Grumble... Back to that work on my bucketless hash table...

    Update: Having just barely posted this, I think perhaps my test is similar but not quite the same as the test tilly described. I started writing the described test then mistakenly thought I could make it work on a wider selection of inputs, fixed what looked like a typo, and got a reasonable but different test out of it.

    Update 2: And it isn't nearly as good of a test. I'll follow-up with some discussion in another node in a bit.

            - tye (but my friends call me "Tye")
      It is indeed not quite my test. However in this case your test sensitive to the non-randomness since a single hash bucket tends to form a rising string of numbers.

      However the test that I suggested is more sensitive to finding a fresh deck of cards that is not very well shuffled. (Which is where I first heard of this test. Trust me, I did not invent this one. :-)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://31937]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2019-07-17 16:38 GMT
Find Nodes?
    Voting Booth?

    No recent polls found