Re: keys and values order on a hash

by BrowserUk (Pope)
on Jul 11, 2011 at 16:11 UTC


in reply to keys and values order on a hash

Yes.

But if you need both sets, it would usually be better to get them with each as it will only need to iterate over the hash once, whereas requesting keys then values (in either order) requires iterating the entire hash twice.
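
A minimal sketch of the two approaches, assuming a small example hash (the hash contents and variable names are illustrative, not from the original question). It relies on the documented guarantee that keys, values and each all walk an unmodified hash in the same order:

```perl
use strict;
use warnings;

# Illustrative data; any hash will do.
my %h = (one => 1, two => 2, three => 3, four => 4);

# Two passes: keys and values come back in the same (arbitrary) order,
# provided the hash is not modified between the two calls.
my @keys   = keys %h;
my @values = values %h;

# One pass: each hands back one key/value pair per call.
my (@each_keys, @each_values);
keys %h;    # void context: reset the iterator before starting
while (my ($k, $v) = each %h) {
    push @each_keys,   $k;
    push @each_values, $v;
}

# Position $i in @keys corresponds to position $i in @values.
for my $i (0 .. $#keys) {
    print "$keys[$i]=$values[$i]\n";
}
```

Both approaches pair every key with its own value; which one is faster is a separate question that only measurement can answer.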


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^2: keys and values order on a hash
by hardburn (Abbot) on Jul 11, 2011 at 17:29 UTC

    each() has its own problems, and I generally recommend against it in persistent environments like mod_perl. I once ran across a situation where a mod_perl process didn't reset the internal iterator that each() uses to track its current position in the hash (it was a long time ago, so I don't remember the specifics). As a result, the next request handled by that process started iterating where the previous one left off instead of at the beginning.


    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

      You'll get that situation if you have something like

      while (...each(%h)...) { ... last if ...; ... }

      The solution is

      keys(%h); # Reset iterator.
      while (...each(%h)...) { ... last if ...; ... }

      Note that keys does not actually compute the list of keys in void context.
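
A small sketch of that behaviour, assuming a three-element hash and a hypothetical helper count_pairs (neither is from the posts above). An early last leaves the iterator in mid-hash; the next each loop then sees only the remaining pairs unless keys is called first:

```perl
use strict;
use warnings;

my %h = (one => 1, two => 2, three => 3);

# Hypothetical helper: count the pairs a full each() loop visits,
# optionally resetting the iterator first.
sub count_pairs {
    my ($h, $reset) = @_;
    keys %$h if $reset;    # void context: resets the iterator, builds no list
    my $n = 0;
    while (my ($k, $v) = each %$h) { $n++ }
    return $n;
}

# Leave the iterator in mid-hash by exiting the loop after one pair.
while (my ($k, $v) = each %h) { last }
my $without_reset = count_pairs(\%h, 0);    # resumes after the first pair

# Same early exit again, but this time the helper resets first.
while (my ($k, $v) = each %h) { last }
my $with_reset = count_pairs(\%h, 1);       # starts from the beginning

print "without reset: $without_reset\n";    # prints "without reset: 2"
print "with reset:    $with_reset\n";       # prints "with reset:    3"
```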

        Indeed, you can insert a keys call before every use of 'each', provided you're not already in the middle of an 'each' loop on that same hash. If you are, the call resets the iterator mid-stream:

        my %h = qw( one 1 two 2 three 3 four 4 );
        while (my($k, $v) = each %h) {
            say "$k=$v";
            frobnicate(\%h);
        }

        sub frobnicate { keys %{ $_[0] } }

        Admittedly, this example is a bit contrived, but it's generally unexpected that your loop becomes infinite just from simple calls that don't appear to modify the hash. The workaround is obvious, if restrictive: don't pass the hashref to any functions (including functions with prototypes that auto-enreference it). It's even worse if the hash is global; you essentially can't call any functions, for fear that, somewhere in the call chain, something will call 'keys' on the hash.

        In my opinion, it's too many caveats to make 'each' very useful.
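
For comparison, a sketch of the same loop written with foreach over keys, assuming the same example hash and a frobnicate stand-in. The key list is built once before the loop starts, so a callee that resets the hash iterator cannot restart it:

```perl
use strict;
use warnings;

my %h = qw( one 1 two 2 three 3 four 4 );

# Stand-in for any callee that happens to call keys on the hash.
sub frobnicate { return scalar keys %{ $_[0] } }

my @pairs;
for my $k (keys %h) {        # the key list is built once, up front
    my $v = $h{$k};
    push @pairs, "$k=$v";
    frobnicate(\%h);         # resets the each() iterator, which this loop never uses
}

print scalar(@pairs), " pairs visited\n";    # prints "4 pairs visited"
```

The trade-off is the temporary list of keys and one hash lookup per key, which is usually acceptable unless the hash is very large.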

      I don't use mod_perl, but I would have assumed that the scenario you describe would only occur if the hash was a global?

      And isn't using globals in a persistent environment one of the (many) no-nos? Or at least, something you are warned about and must take extra care with.


Re^2: keys and values order on a hash
by Somni (Friar) on Jul 11, 2011 at 18:22 UTC
    each has its own problems in terms of maintainability. Bailing out of the loop early (perhaps an exception was hit, perhaps there was a return or last) will leave the iterator in the middle of the hash.

    > perl -wE '
    sub p {
        my($h) = @_;
        while (my($k, $v) = each %$h) {
            return "$k=$v"
        }
    }

    my %h = qw( one 1 two 2 three 3 );
    say p(\%h);
    say p(\%h);
    say p(\%h);
    '
    three=3
    one=1
    two=2

    This can lead to some subtle, hard-to-trace bugs. I've been personally bitten by this several times, and I've essentially dropped each from my toolkit; it's simply too easy to leave the iterator in a strange spot.

    I also wouldn't be too sure of your optimization suggestion. With a pretty basic benchmark I slapped together (source here) it would appear separate keys/values is faster:

    > ./each-vs-key-values.pl
    BEGIN, hash: short
    Benchmark: running each, keys_values for at least 5 CPU seconds...
           each:  6 wallclock secs ( 5.26 usr +  0.01 sys =  5.27 CPU) @ 44213.85/s (n=233007)
    keys_values:  6 wallclock secs ( 5.63 usr +  0.01 sys =  5.64 CPU) @ 107908.33/s (n=608603)
                    Rate        each keys_values
    each         44214/s          --        -59%
    keys_values 107908/s        144%          --
    END, hash: short
    BEGIN, hash: long
    Benchmark: running each, keys_values for at least 5 CPU seconds...
           each:  7 wallclock secs ( 5.83 usr +  0.00 sys =  5.83 CPU) @ 166.38/s (n=970)
    keys_values:  6 wallclock secs ( 5.30 usr +  0.00 sys =  5.30 CPU) @ 695.09/s (n=3684)
                 Rate        each keys_values
    each        166/s          --        -76%
    keys_values 695/s        318%          --
    END, hash: long
    BEGIN, hash: alphabet
    Benchmark: running each, keys_values for at least 5 CPU seconds...
           each:  5 wallclock secs ( 5.24 usr +  0.01 sys =  5.25 CPU) @ 3626.29/s (n=19038)
    keys_values:  5 wallclock secs ( 5.20 usr +  0.00 sys =  5.20 CPU) @ 13917.12/s (n=72369)
                   Rate        each keys_values
    each         3626/s          --        -74%
    keys_values 13917/s        284%          --
    END, hash: alphabet

    This is with a Debian Lenny perl 5.10.1. I'm speculating the overhead of the multiple each calls is killing any gains you get from not iterating a second time; that, or the iteration is cheap because of how the HV is built.
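
The linked script isn't reproduced here, so the following is only a rough sketch of how such a comparison might be set up with the core Benchmark module. It is not the script that produced the numbers above; the test data and the sub names via_each and via_keys_values are assumptions:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: 1000 integer keys mapped to their squares.
my %h = map { ($_ => $_ * $_) } 1 .. 1000;

# Collect keys and values in a single pass with each.
sub via_each {
    my (@k, @v);
    keys %h;    # start from a known iterator position
    while (my ($k, $v) = each %h) {
        push @k, $k;
        push @v, $v;
    }
    return scalar @k;
}

# Collect keys and values in two separate passes.
sub via_keys_values {
    my @k = keys %h;
    my @v = values %h;
    return scalar @k;
}

# A negative count asks Benchmark to run each sub for at least that many CPU seconds.
cmpthese(-1, { each => \&via_each, keys_values => \&via_keys_values });
```

Results will vary with the perl version, hash size and hardware, so treat any single run as indicative at best.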

      Once you've got all those keys and values into those two separated arrays, what are you going to do with them?


        Nothing, it's a benchmark. It's the nature of benchmarks to consider code in isolation, with as few side effects as possible, in order to measure the differences. It doesn't measure a whole program, or what effect each solution may have on one.

        However, this is also the nature of premature and micro-optimizations. The point of the benchmark was simply to show that assuming keys/values is slower than each is not always correct. Any decisions beyond that should only be measured based on actual code, with a profile, and an indication that this part of the code is the bottleneck.
