http://www.perlmonks.org?node_id=913714

citromatik has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I think the answer is "yes", but I want to be sure...

Is it guaranteed that calling keys and values on the same hash (with no insertions or deletions between the two calls) will give the same order?

Thanks in advance

citromatik

Re: keys and values order on a hash
by BrowserUk (Patriarch) on Jul 11, 2011 at 16:11 UTC

    Yes.

    But if you need both sets, it would usually be better to get them with each as it will only need to iterate over the hash once, whereas requesting keys then values (in either order) requires iterating the entire hash twice.
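As a rough sketch of the single-pass approach (the hash contents and variable names are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the thread.
my %h = (one => 1, two => 2, three => 3);

keys %h;    # void context: resets the iterator before we start

my (@k, @v);
while (my ($key, $value) = each %h) {
    push @k, $key;
    push @v, $value;
}

# @k and @v are index-aligned: $v[$i] is the value stored under $k[$i].
print "$k[$_] => $v[$_]\n" for 0 .. $#k;
```

The order of the printed pairs is whatever the hash happens to produce, so it should not be relied on across runs.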


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      each() has its own problems, and I generally recommend against it in persistent environments like mod_perl. I once ran across a situation where a mod_perl process didn't reset the internal variable used to keep track of the current location for each() (it was a long time ago, so I don't remember the specifics). As a result, the next request on that process started the search where the previous one left off instead of at the beginning.


      "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

        You'll get that situation if you have something like

        while (...each(%h)...) { ... last if ...; ... }

        The solution is

        keys(%h); # Reset iterator.
        while (...each(%h)...) { ... last if ...; ... }

        Note that keys does not actually compute the list of keys in void context.
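A small sketch of the effect, assuming a made-up three-key hash: one loop resumes after an early exit, the other runs after a reset with keys in void context.

```perl
use strict;
use warnings;

my %h = (one => 1, two => 2, three => 3);   # hypothetical data

# Leave the loop early: the iterator stays parked after the first pair.
while (my ($k, $v) = each %h) { last }

# Without a reset, this loop resumes where the previous one stopped.
my $resumed = 0;
while (my ($k, $v) = each %h) { $resumed++ }

# Bail out early again, but this time reset before the next loop.
while (my ($k, $v) = each %h) { last }
keys %h;    # void context: resets the iterator, builds no list

my $after_reset = 0;
while (my ($k, $v) = each %h) { $after_reset++ }

print "resumed loop saw $resumed pairs, reset loop saw $after_reset\n";
# prints "resumed loop saw 2 pairs, reset loop saw 3"
```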

        I don't use mod_perl, but I would have assumed that the scenario you describe would only occur if the hash was a global?

        And isn't using globals in a persistent environment one of the (many) no-nos? Or at least something you are warned that you must take extra care with.


      each has its own problems in terms of maintainability. Bailing out of the loop early (perhaps an exception was hit, perhaps there was a return or last) will leave the iterator in the middle of the hash.

      > perl -wE '
          sub p { my($h) = @_; while (my($k, $v) = each %$h) { return "$k=$v" } }
          my %h = qw( one 1 two 2 three 3 );
          say p(\%h); say p(\%h); say p(\%h);
        '
      three=3
      one=1
      two=2

      This can lead to some subtle, hard-to-trace bugs. I've been personally bitten by this several times, and I've essentially dropped each from my toolkit; it's simply too easy to leave the iterator in a strange spot.
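For comparison, a sketch of the same early-return pattern written with keys instead of each. The sub name first_pair is hypothetical; the data is the same as in p() above. Because the loop walks a list rather than the hash's iterator, an early return leaves nothing behind for the next call.

```perl
use strict;
use warnings;

# Same shape as p(), but iterating over the list returned by keys,
# so no iterator is left parked in the middle of the hash.
sub first_pair {
    my ($h) = @_;
    for my $k (keys %$h) {
        return "$k=$h->{$k}";
    }
    return;
}

my %h = qw( one 1 two 2 three 3 );
my @seen = map { first_pair(\%h) } 1 .. 3;
print "$_\n" for @seen;
```

Which pair comes first still depends on the hash's internal order, but all three calls report the same one.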

      I also wouldn't be too sure of your optimization suggestion. With a pretty basic benchmark I slapped together (source here) it would appear separate keys/values is faster:

      > ./each-vs-key-values.pl
      BEGIN, hash: short
      Benchmark: running each, keys_values for at least 5 CPU seconds...
             each:  6 wallclock secs ( 5.26 usr + 0.01 sys = 5.27 CPU) @ 44213.85/s (n=233007)
      keys_values:  6 wallclock secs ( 5.63 usr + 0.01 sys = 5.64 CPU) @ 107908.33/s (n=608603)
                      Rate        each keys_values
      each         44214/s          --        -59%
      keys_values 107908/s        144%          --
      END, hash: short
      BEGIN, hash: long
      Benchmark: running each, keys_values for at least 5 CPU seconds...
             each:  7 wallclock secs ( 5.83 usr + 0.00 sys = 5.83 CPU) @ 166.38/s (n=970)
      keys_values:  6 wallclock secs ( 5.30 usr + 0.00 sys = 5.30 CPU) @ 695.09/s (n=3684)
                   Rate        each keys_values
      each        166/s          --        -76%
      keys_values 695/s        318%          --
      END, hash: long
      BEGIN, hash: alphabet
      Benchmark: running each, keys_values for at least 5 CPU seconds...
             each:  5 wallclock secs ( 5.24 usr + 0.01 sys = 5.25 CPU) @ 3626.29/s (n=19038)
      keys_values:  5 wallclock secs ( 5.20 usr + 0.00 sys = 5.20 CPU) @ 13917.12/s (n=72369)
                     Rate        each keys_values
      each         3626/s          --        -74%
      keys_values 13917/s        284%          --
      END, hash: alphabet

      This is with a Debian Lenny perl 5.10.1. I'm speculating the overhead of the multiple each calls is killing any gains you get from not iterating a second time; that, or the iteration is cheap because of how the HV is built.
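The benchmark script itself is not reproduced in the thread. A minimal sketch of a comparable test, assuming the core Benchmark module and a made-up 1000-key hash, might look like this. The sub names are hypothetical, and the numbers will vary with perl version and machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test hash; the original script's data sets are not shown.
my %h = map { ("key$_" => $_) } 1 .. 1000;

# Collect keys and values in a single pass with each.
sub with_each {
    my (@k, @v);
    keys %h;    # reset the iterator first
    while (my ($key, $value) = each %h) {
        push @k, $key;
        push @v, $value;
    }
    return (\@k, \@v);
}

# Collect them with two separate list-returning calls.
sub with_keys_values {
    my @k = keys %h;
    my @v = values %h;
    return (\@k, \@v);
}

# A negative count means "run each candidate for at least that many CPU seconds".
cmpthese(-1, {
    each        => \&with_each,
    keys_values => \&with_keys_values,
});
```

Both subs return the same two lists in the same order, so the comparison measures only how the lists are collected.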

        Once you've got all those keys and values into those two separated arrays, what are you going to do with them?


Re: keys and values order on a hash
by choroba (Cardinal) on Jul 11, 2011 at 16:18 UTC
    See keys:
    The keys are returned in an apparently random order. The actual random order is subject to change in future versions of perl, but it is guaranteed to be the same order as either the "values" or "each" function produces (given that the hash has not been modified). Since Perl 5.8.1 the ordering is different even between different runs of Perl for security reasons (see "Algorithmic Complexity Attacks" in perlsec).
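A short sketch of what that guarantee allows, using made-up data: the two lists can be paired back up by position, for example with a hash slice.

```perl
use strict;
use warnings;

my %h = (apple => 3, pear => 5, plum => 7);   # hypothetical data

my @k = keys %h;
my @v = values %h;

# Both lists come from the same unmodified hash, so $v[$i] belongs to $k[$i].
my %copy;
@copy{@k} = @v;    # hash slice: pairs the two lists back up by position

for my $key (sort keys %h) {
    print "$key: original=$h{$key} copy=$copy{$key}\n";
}
```

The sort is only there to make the printed output stable; the pairing itself does not depend on it.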

      given that the hash has not been modified

      Specifically, changing the hash by adding or removing a key is what voids the guarantee, and so does assigning to keys. Changing a value does not affect the order.

      Adding or removing a key and then undoing the action still counts as a change. That means you can get different orders from two hashes with identical keys.
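When two hashes need to be walked in step, one way to sidestep the question is to impose an order explicitly. A minimal sketch, with made-up hashes that end up with the same key set by different routes:

```perl
use strict;
use warnings;

# Two hashes with the same keys, built by different histories (hypothetical data).
my %first  = (one => 1, two => 2, three => 3);
my %second = (three => 3, extra => 0, two => 2, one => 1);
delete $second{extra};    # same key set as %first again, but %second was modified

# Do not assume keys %first and keys %second line up; sort when order matters.
my @kf = sort keys %first;
my @ks = sort keys %second;

print "first:  @kf\n";
print "second: @ks\n";
```

Sorting costs more than taking the keys as they come, so it is only worth doing where the relative order across hashes actually matters.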

Re: keys and values order on a hash
by Anonymous Monk on Jul 12, 2011 at 12:34 UTC
    I would not feel warm and fuzzy writing code with dependencies like that. I would find another way to do it. "Code like that" is what tends to bite you in the punctuation * mark.