Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Ouch! Each! Reentrant it is not

by Thilosophy (Curate)
on Jul 15, 2005 at 02:25 UTC ( #475106=perlmeditation: print w/ replies, xml ) Need Help??

Fellow Monks,

I have just spent an hour debugging a very strange problem, only to discover that I have been bitten by the behaviour of each. While all this is properly documented (each), I was not aware of it, so I thought I'd bring it up here as a caveat to remember.

The subroutine in the following snippet works only once (or rather every other time).

use strict; use warnings; our %hash = ( 1=> 2, 3=>4 ); sub find_2{ while (my ($key,$val) = each(%hash)) { return "The key for 2 is $key\n" if $val == 2; } die "could not find 2"; } print find_2; print find_2;
After the first call, it dies:
The key for 2 is 1 could not find 2 at each.pl line 11.
What happens here is that the subroutine does not search through the whole hash (it short-cicuits for performance reasons by returning the solution as soon as it has found it), and as a result, the next call to each on the same hash (even in completely different parts of the code!) will not start at the beginning again, but pick up where the first call left off.

Solution: Either go through the whole hash, or rewind it after using each:

if ($val == 2){; #rewind each my $a = scalar keys %hash; return "The key for 2 is $key\n" }

Comment on Ouch! Each! Reentrant it is not
Select or Download Code
Re: Ouch! Each! Reentrant it is not
by Zaxo (Archbishop) on Jul 15, 2005 at 02:51 UTC

    That's a nice demonstration of remembered state with each. It can be a useful property to exploit for some kinds of iterators, but it's pretty fragile. As you say, calling keys on the hash resets the each state.

    You can avoid the scalar op and a useless variable by just calling keys in scalar context:

    keys %hash and return "The key for 2 is $key\n" if $val == 2;

    After Compline,
    Zaxo

      You can avoid the scalar op and a useless variable by just calling keys in scalar context

      Taking it one step further, it seems you can also call it in void context (which is maybe more efficient):

      if ($val == 2 ){ keys %hash; return "The key for 2 is $key\n"; }

      You can avoid the scalar op and a useless variable by just calling keys in scalar context:

      I'd want to see that documented in the code or someone will get bitten!

      keys %hash; # reset each to beginning of hash return "The key for 2 is $key\n" if $val == 2;

      --
      Murray Barton
      Do not seek to follow in the footsteps of the wise. Seek what they sought. -Basho

Re: Ouch! Each! Reentrant it is not
by BrowserUk (Pope) on Jul 15, 2005 at 03:01 UTC

    That's not really a "lack of reentrancy". You are never reentering.

    As you discovered, each is an iterator, and it will not 'reset' until you either 'wrap', having processed the entire list of contents, or you manually 'reset' it, but how could it be different?

    In your example you are calling each in a while loop, and expected that the iterator would 'reset' even though the logic of your code ensures that it will only do so 1/n occasions. It is perfectly legitimate to call an iterator outside of a loop construct:

    $key1 = each %hash; ... $key2 = each %hash; ... $key3 = each %hash;

    which makes it impossible for each to know when it should 'reset' the iterator without the programmer manually intervening.

    However, it seems that each is non-reentrant, in as much as, even localising the hash doesn't appear to give you a separate instance of the iterator:

    @h{ 'a'..'z' } = 1..26;; print each %h;; w 23 print each %h;; r 18 print each %h;; a 1 { local %h=%h; print "$k => $v" while ($k,$v) = each %h };; w => 23 a => 1 r => 18 d => 4 x => 24 j => 10 y => 25 u => 21 h => 8 k => 11 g => 7 f => 6 i => 9 t => 20 e => 5 n => 14 m => 13 v => 22 s => 19 l => 12 p => 16 c => 3 b => 2 q => 17 z => 26 o => 15 print each %h;; w 23

    which is a pain.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

      It would seem that if the iterator was scoped then this wouldn't be a problem. Falling out of scope (end of the for each, or sub) would cause it to be deleted. The next call would reset it, of course this couldn't be changed now because there is surely code depending on that behaviour. I wonder if perl6 handles this better. Maybe in it hashes can return an iterator (or maybe they should if they don't).


      ___________
      Eric Hodges

        I really expected that localising the hash would localise the iterator. If anything were to be done about the current situation, making it so that it was would seem to be the most transparent way least likely to interfere with existing code use.

        I'd be very surprised if this wasn't fixed in P6.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
      Localising does give you a local iterator. Try this code:
      %a = ( A=>0, B=>1, C=>2 ); %b = %a; print each %b, "\n"; { local %b = %a; each %b; } while(($k,$v)=each %b) { print $k, $v, "\n" }
      Then, try this:
      %h = ( A=>0, B=>1, C=>2 ); print each %h, "\n"; () = %h; print each %h, "\n";
      The problem with local %h = %h is not that the local copy shares the iterator, but that the original %h gets its iterator reset by being read. This is mentioned briefly in the documentation for the each function.

        Hmmm. So you would have to localise and copy before using the outer level iterator in order to preserve it across a (possible) inner level iterator? Not very useful.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: Ouch! Each! Reentrant it is not
by rinceWind (Monsignor) on Jul 15, 2005 at 07:08 UTC

    This brings to mind a meditation I posted some time ago, after a very similar experience: Iterating hashes safely and efficiently

    --

    Oh Lord, won’t you burn me a Knoppix CD ?
    My friends all rate Windows, I must disagree.
    Your powers of persuasion will set them all free,
    So oh Lord, won’t you burn me a Knoppix CD ?
    (Missquoting Janis Joplin)

Re: Ouch! Each! Reentrant it is not
by ysth (Canon) on Jul 15, 2005 at 09:09 UTC
    Rewind it before using each:
    sub find_2{ keys %hash; while (my ($key,$val) = each(%hash)) { return "The key for 2 is $key\n" if $val == 2; } die "could not find 2"; }
      Rewind it before using each.

      I did that at first, but then I changed it to rewind after use.

      Because if you rewind it only before, and then leave the iterator "open", your own piece of code is fine, but it will greatly confuse the next person who happens to call keys, values or each. So I decided that cleaning up after myself is the better way here.

      Update As pointed out by ysth, it would only greatly confuse any other users of each, and not affect keys or values.

        keys and values don't depend on the iterator, just each.
Re: Ouch! Each! Reentrant it is not
by ysth (Canon) on Jul 15, 2005 at 09:16 UTC
    This reminds me of a similar issue with scalar glob, only there there's no remedy but to use list-context instead:
    $ perl -wle' sub glob_limit { my ($pat, $max) = @_; my $i; print "pattern: $pat"; while ($val = glob($pat)) { print $val; last if ++$i == $max; } } glob_limit("{a,b,c}", 2); glob_limit("{p,d,q}", 3); ' pattern: {a,b,c} a b pattern: {p,d,q} c
Re: Ouch! Each! Reentrant it is not
by Limbic~Region (Chancellor) on Jul 15, 2005 at 12:34 UTC
    Thilosophy,
    Every 4 months or so, I think of ideas about how to expand Tie::Hash::Sorted when I see questions asked about how to do things with hashes (usually getting around limitations). Unfortunately, these things don't really have to do with sorted hashes necessarily and I think about sub-classing and well - I never really do anything. One of the ideas that I had was allowing for localizing of the each iterator so that it could be restored when needed. This isn't something hard to do and if you would like to implement a tied hash that does this - let me know and I can give you a few pointers.

    Cheers - L~R

      N.B. responsibility for the iterator on a tied hash is divided between the actual HV and the tied object; the HV keeps track of whether the next call should be to FIRSTKEY or NEXTKEY and the tied object keeps track of what to return for NEXTKEY.

      Given this, your idea sounds difficult but maybe not impossible.

Re: Ouch! Each! Reentrant it is not
by adrianh (Chancellor) on Jul 15, 2005 at 13:47 UTC
    Solution: Either go through the whole hash, or rewind it after using each:

    I once worked on a largish codebase that had this as the cause of several bugs in various places. We ended up using variations of things like:

    sub foreach_pair (&\%) { my ($coderef, $hashref) = @_; local ($a, $b); keys %$hashref; while ( ($a, $b) = each %$hashref ) { $coderef->($a, $b) }; }; my %foo = (a => 1, b => 2, c => 3); # if you don't mind remembering what $a and $b are foreach_pair { print "$a == $b\n" } %foo; # or if you prefer being explicit foreach_pair { my ($name, $value) = @_; print "name $name is $value\n" + } %foo;

    To avoid the problem and make things a bit more explict.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://475106]
Approved by BrowserUk
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (6)
As of 2014-11-23 16:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My preferred Perl binaries come from:














    Results (134 votes), past polls