
Re^4: Hash order randomization is coming, are you ready?

by BrowserUk (Pope)
on Nov 29, 2012 at 14:12 UTC


in reply to Re^3: Hash order randomization is coming, are you ready?
in thread Hash order randomization is coming, are you ready?

two identical hashes don't produce the same keys/values order

I wonder about the justification for that? Not just fiddling for the sake of it, I hope :(


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

RIP Neil Armstrong


Re^5: Hash order randomization is coming, are you ready?
by demerphq (Chancellor) on Dec 02, 2012 at 09:55 UTC

    It has always been the case. Keys are returned in bucket order, then top to bottom. Keys that collide into the same bucket will be stored in LIFO order. When copying a hash like so:

    %copy= %orig;

    %copy will be identical to %orig only if the size of the bucket array is the same and no keys collide in a bucket. If the size of the bucket array is different, the key order will change.

    $ ./perl -Ilib -MHash::Util=bucket_array -MData::Dumper -le'my (%hash, %copy); keys(%copy)=16; %hash=(1..14); %copy=%hash; print Data::Dumper->new([bucket_array($_)])->Terse(1)->Indent(0)->Dump for \%hash, \%copy;'
    [['13','5'],1,['7'],['11'],2,['3','1'],['9']]
    [3,['11'],3,['9'],['5','13'],1,['7'],3,['1','3'],1]

    Even if the size of the bucket array is the same, items that collide will reverse their order relative to each other during the copy.

    $ ./perl -Ilib -MHash::Util=bucket_array -MData::Dumper -le'my (%hash, %copy); %hash=(1..14); %copy=%hash; print Data::Dumper->new([bucket_array($_)])->Terse(1)->Indent(0)->Dump for \%hash, \%copy;'
    [['11'],['7'],2,['13','5'],['9','3'],['1'],1]
    [['11'],['7'],2,['5','13'],['3','9'],['1'],1]

    None of this is new. The only thing that changes here is which keys collide, and the fact that, for a given list of keys, with hash randomization they will all eventually collide with each other. Before, if you were lucky and your keys didn't collide, such as in tests, then broken code might work. At least until some new key was added that changed the state of the hash.
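The two effects described above (LIFO chains reversing on copy, and a different bucket array size reshuffling everything) can be reproduced with a toy model. This is only a sketch: `toy_hash`, `toy_insert`, `toy_keys` and `toy_copy` are hypothetical names, the hash function is made up, and none of this is how perl's hv.c is actually written. It only assumes what the post states: keys come back in bucket order, top to bottom, and colliding keys are stored LIFO.

```perl
use strict;
use warnings;

# Made-up hash function for the toy model (NOT perl's real one):
# the sum of the character codes of the key.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h += ord for split //, $key;
    return $h;
}

# A "table" is an array ref of buckets; each bucket is an array ref of keys.
# New keys go to the front of their chain, i.e. LIFO within a bucket.
sub toy_insert {
    my ($table, $key) = @_;
    my $idx = toy_hash($key) % scalar @$table;
    unshift @{ $table->[$idx] }, $key;
}

# Iterate in bucket order, then top to bottom within each bucket.
sub toy_keys {
    my ($table) = @_;
    return map { @$_ } @$table;
}

# Copy by iterating the source and inserting into a fresh table.
sub toy_copy {
    my ($table, $nbuckets) = @_;
    my @copy = map { [] } 1 .. $nbuckets;
    toy_insert(\@copy, $_) for toy_keys($table);
    return \@copy;
}

my @orig = map { [] } 1 .. 4;
toy_insert(\@orig, $_) for qw(a e b c);   # 'a' and 'e' collide with 4 buckets

my $same_size = toy_copy(\@orig, 4);      # same bucket count: chain reverses
my $bigger    = toy_copy(\@orig, 8);      # different bucket count: reshuffled

print join(' ', toy_keys(\@orig)),     "\n";   # e a b c
print join(' ', toy_keys($same_size)), "\n";   # a e b c
print join(' ', toy_keys($bigger)),    "\n";   # a b c e
```

In the toy model the same four keys iterate in three different orders, and only the construction history differs.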

    ---
    $world=~s/war/peace/g

      First, thanks for the clarification.

      However, as far as I can tell, what you are saying comes down to:

      Two hashes containing identical keys and values, will iterate in different orders, unless they were constructed in exactly the same way.

      For example:

      $h1{ $_ } = 1 for 'a'..'z';;
      $h2{ $_ } = 1 for reverse 'a'..'z';;
      print %h1; print %h2;;
      w 1 r 1 a 1 x 1 d 1 j 1 y 1 u 1 k 1 h 1 g 1 f 1 t 1 i 1 e 1 n 1 v 1 m 1 s 1 l 1 c 1 p 1 q 1 b 1 z 1 o 1
      w 1 a 1 r 1 d 1 x 1 j 1 y 1 u 1 h 1 k 1 g 1 f 1 i 1 t 1 e 1 n 1 v 1 m 1 s 1 l 1 c 1 p 1 b 1 q 1 z 1 o 1

      And:

      @h1{ 'a'..'z', 'A'..'Z' } = (1)x52;;
      delete @h1{ 'A'..'Z' };;
      @h2{ 'a'..'z' } = (1)x26;;
      print %h1; print %h2;;
      a 1 d 1 j 1 y 1 u 1 k 1 g 1 t 1 e 1 v 1 s 1 c 1 q 1 b 1 z 1 w 1 r 1 x 1 h 1 f 1 i 1 n 1 m 1 l 1 p 1 o 1
      w 1 r 1 a 1 x 1 d 1 j 1 y 1 u 1 k 1 h 1 g 1 f 1 t 1 i 1 e 1 n 1 v 1 m 1 s 1 l 1 c 1 p 1 q 1 b 1 z 1 o 1

      And:

      @h{ 'a'..'z', 'A'..'Z' } = (1)x52;;
      delete @h{ 'A'..'Z' };;
      %h2 = %h;;
      print %h; print %h2;;
      a 1 d 1 j 1 y 1 u 1 k 1 g 1 t 1 e 1 v 1 s 1 c 1 q 1 b 1 z 1 w 1 r 1 x 1 h 1 f 1 i 1 n 1 m 1 l 1 p 1 o 1
      w 1 r 1 a 1 x 1 d 1 j 1 y 1 u 1 h 1 k 1 g 1 f 1 i 1 t 1 e 1 n 1 m 1 v 1 s 1 l 1 p 1 c 1 q 1 b 1 z 1 o 1

      In all cases above, two "identical" hashes were arrived at through a different sequence of operations; and that difference in the sequence of construction manifests itself in a different iteration sequence.

      But that has always been the case!

      The above is 5.10; but the same is also true going right back to my involvement with perl: 5.6.1.

      Which makes me wonder whether your meditation isn't a little a) redundant; b) slightly scare mongery?

      Please don't take that the wrong way; I'm simply trying to understand exactly what difference(s) the latest changes have actually made?
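Since iteration order depends on how a hash was built, a check for "identical" hashes has to ignore order. A minimal sketch, assuming flat hashes with defined string values; `same_contents` is a hypothetical helper name, not something from the thread or from core perl:

```perl
use strict;
use warnings;

# Compare two flat hashes by content, never by iteration order.
# Assumes all values are defined scalars that can be compared with eq.
sub same_contents {
    my ($x, $y) = @_;
    return 0 unless scalar(keys %$x) == scalar(keys %$y);
    for my $k (keys %$x) {
        return 0 unless exists $y->{$k} && $x->{$k} eq $y->{$k};
    }
    return 1;
}

# Same contents, three different construction histories.
my (%h1, %h2, %h3);
$h1{$_} = 1 for 'a' .. 'z';
$h2{$_} = 1 for reverse 'a' .. 'z';
@h3{ 'a' .. 'z', 'A' .. 'Z' } = (1) x 52;
delete @h3{ 'A' .. 'Z' };

print same_contents(\%h1, \%h2) ? "h1 and h2 match\n" : "h1 and h2 differ\n";
print same_contents(\%h1, \%h3) ? "h1 and h3 match\n" : "h1 and h3 differ\n";

# When an order is needed, impose one explicitly.
print join(' ', sort keys %h3), "\n";
```

The lookups go through `exists` and direct indexing, so the result does not depend on the order in which `keys` happens to return anything.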



        Which makes me wonder whether your meditation isn't a little a) redundant; b) slightly scare mongery?

        I think you missed the point. The order will change *every process*.

        $ for i in {1..10}; do ./perl -le'%h=(1..20); print "$]: ",join "-", keys %h'; done;
        5.017007: 1-13-5-15-19-9-17-11-7-3
        5.017007: 13-19-5-17-9-15-1-7-3-11
        5.017007: 13-7-19-15-5-1-11-17-3-9
        5.017007: 17-13-3-7-15-1-9-5-11-19
        5.017007: 17-9-3-11-7-15-1-19-5-13
        5.017007: 19-1-11-5-9-3-15-17-7-13
        5.017007: 9-19-3-17-7-11-13-15-1-5
        5.017007: 1-11-15-3-19-17-7-13-9-5
        5.017007: 19-7-13-1-5-17-9-3-11-15
        5.017007: 5-19-9-1-13-17-7-3-15-11

        $ for i in {1..10}; do perl -le'%h=(1..20); print "$]: ",join "-", keys %h'; done;
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5
        5.012004: 11-3-7-9-17-15-1-19-13-5

        The order returned by 5.12.4 is what you should see on pretty much every modernish perl that has been released, with the exception of 5.8.1 and of 5.17.6 and later. And obviously in 5.17.6 the order changes pretty much every time.

        What we discover when we randomize the key order per process is that people actually depend on the key order more than they realize. When we make it random, these dependencies become visible as bugs. I tend to consider such code buggy to begin with, as minor changes to the history of the hash will produce roughly the same results as per-process randomization.
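A typical hidden dependency of this kind is serializing a hash with a bare `keys` and then comparing the string against a fixed expectation in a test. A minimal sketch of the fragile and the order-safe variants; `fragile_line` and `stable_line` are hypothetical names used only for illustration:

```perl
use strict;
use warnings;

# Fragile: the output order is whatever order keys() returns, which can
# change with the hash's history and, on randomizing perls, per process.
sub fragile_line {
    my ($h) = @_;
    return join ',', map { "$_=$h->{$_}" } keys %$h;
}

# Order-safe: sort the keys so the output is the same on every run.
sub stable_line {
    my ($h) = @_;
    return join ',', map { "$_=$h->{$_}" } sort keys %$h;
}

my %config = (b => 2, a => 1, c => 3);

print stable_line(\%config), "\n";    # a=1,b=2,c=3
print fragile_line(\%config), "\n";   # some permutation of the same pairs
```

A test that compares `fragile_line` output to a literal string may pass for years on one perl build and fail on another; comparing against `stable_line`, or comparing the data structures themselves, does not have that problem.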

        BTW, you *did* see that I said "none of this is new" right? So why the emphasis on "But that has always been the case"?

