http://www.perlmonks.org?node_id=629429

dug has asked for the wisdom of the Perl Monks concerning the following question:

At $work a co-worker recently sent an obfu out to the tech list that relied on keys() returning the keys of a hash in the same order every time the piece of code was executed. My perl is 5.8.8, compiled without "-DUSE_HASH_SEED_EXPLICIT", and I don't have the "PERL_HASH_SEED" environment variable set.

Given my interpretation of `perldoc -f keys`, and `perldoc perlrun`, every time I call keys() on a hash, even between runs of perl, I will get a different order. This is the relevant piece of documentation from `perldoc -f keys`:

Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).

But when I run:

#!/usr/bin/perl use warnings; use strict; my %hash = ( A => "B", C => "D", E => "F", ); print keys %hash;

"ACE" is printed every time (at least every time that I've run it).

Is there a bug in the way I'm reading the documentation? If I remember correctly (which is unlikely), this was kind of a big deal when 5.8.1 came out, as there was a fair bit of code in the wild that disobeyed the documentation and relied on hash ordering.

My understanding of the current implementation given my version, compilation options and environment variables was that I could rely on hashes *not* being ordered the same between runs of perl. I'm clearly missing something. Can anyone shed some light?

Thanks,

-- Douglas Hunter

Replies are listed 'Best First'.
Re: hash randomization and keys()
by ikegami (Patriarch) on Jul 29, 2007 at 19:42 UTC

    Given my interpretation of `perldoc -f keys`, and `perldoc perlrun`, every time I call keys() on a hash, even between runs of perl, I will get a different order.

    No. The order of the items returned by keys, values and each for a given hash is the same for all three function, and the order doesn't change as long as the hash doesn't change.

    The order may change between runs, and the order may change if the contents of the hash changes.

    So your question remains. I don't know the answer, but I think what's random is the max bucket size. If so, you may not see a difference with just a few items.

      The hash seed is changing each time:

      PERL_HASH_SEED_DEBUG=1 perl -le 1

      Remember that a hash is mapping an arbitrary string into one of a finite number of buckets. Hash collisions are common, and so for two keys that hash to the same value, their contents will be found somewhere down in a linked list of values hanging off the bucket.

      The orginal attack was that one could concoct a series of keys that kept hashing to the same bucket. The post-5.8.1 hash seed initialises a constant in the hashing algorithm, which makes different buckets get used. But if you're using the same data, they'll follow a different but equivalent bucket usage, and so will appear "in the same order" when you pull them out again. You need something fairly low-level to view the buckets and SVs hanging off each one. I'm not sure off-hand what can be used: Devel::Peek doesn't go far enough.

      At another level, there is code in place to detect when single-bucket allocation is occurring, and then the hash is rehashed to break up the list across many buckets. Since your data is insufficiently pathological, you're not seeing the rehashing occurring from run to run.

      At least, I think that's what's going on...

      • another intruder with the mooring in the heart of the Perl

      No. The order of the items returned by keys, values and each for a given hash is the same for all three function, and the order doesn't change as long as the hash doesn't change.

      Ahh, of course. I saw that in the documentation as well, but apparently I wanted to believe my wrong interpretation of other parts of the documentation bad enough that I glossed over it.

      Thanks.

      -- Douglas Hunter
Re: hash randomization and keys()
by Anonymous Monk on Jul 30, 2007 at 05:29 UTC
    Perl has never guaranteed any ordering of the hash keys, and the ordering has already changed several times during the lifetime of Perl 5. Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order. Also note that while the order of the hash elements might be randomised, this "pseudoordering" should not be used for applications like shuffling a list randomly (use List::Util::shuffle() for that, see List::Util, a standard core module since Perl 5.8.0; or the CPAN module Algorithm::Numerical::Shuffle), or for generating permutations (use e.g. the CPAN modules Algorithm::Permute or Algorithm::FastPermute), or for any cryptographic applications.