PerlMonks  

Hashes: Deleting elements while iterating

by knexus (Hermit)
on Sep 02, 2003 at 20:22 UTC ( [id://288424]=perlquestion )

knexus has asked for the wisdom of the Perl Monks concerning the following question:

I looked at previous posts related to this matter, but they were from Nov. 2000 or so and with older versions of Perl (I am using 5.8.0). Plus they really didn't answer my question (although I could have missed something).
So, here goes...

What I am seeing is a difference in behavior depending on how I iterate, when deleting hash elements. In the test code below, using the first foreach seems to work (the elements "appear" to be deleted) but I get errors:

Use of uninitialized value in pattern match (m//) at line 30.
Attempt to free unreferenced scalar at line 27.

The other two iteration methods work fine without error. The main reason for this post is that I am still very new to perl (just a couple weeks) so I want to understand this behaviour (this may pass in time, when I just need to get it done... but for now ;-)

Note: I just commented out all but the one I wanted to test. Also, don't read too much into the names and comments in the code as it's superficial. :-)

Could this be caused by "buffering" (for lack of knowing a better term) by sort and each?

Thanks for any info/clarification and I gladly accept tips.

#!/usr/bin/perl -w
use strict;

my %index = (
    'node'                 => 1,
    'node-cat'             => 2,
    'node-cat-cat'         => 3,
    'node-cat-cat-cat'     => 4,
    'node-bat-bat-cat'     => 4,
    'node-cat-cat-cat-cat' => 5,
    'node-cat-bat-bat-bat' => 5,
    'node-cat-cat-bat-bat' => 2,
);
my $key;
my $test='node-cat-cat';

foreach $key (sort keys %index) {
    print "$key: $index{$key}\n";
}

#
# DELETE all child nodes.
#
print "\nDelete children of $test :\n";
foreach $key (%index){
#foreach $key (sort keys %index){
#while ($key = each %index) {
    delete $index{$key} if ($key =~ /$test.+/);
}

print "\nFinal Hash:\n";
foreach $key (sort keys %index) {
    print "$key: $index{$key}\n";
}

Replies are listed 'Best First'.
Re: Hashes: Deleting elements while iterating
by antirice (Priest) on Sep 02, 2003 at 21:36 UTC

    The answer the OP was seeking has been well-demonstrated already. However, in attempting to come up with examples to show him why this behaves the way it does, I came across some very interesting behavior.

    #!/usr/bin/perl -lw
    my %hash = qw(a b c d d c b a);

    foreach my $key (%hash) {
        print $key;
        delete $hash{$key};
    }
    # output:
    # c
    #
    # a
    # Segmentation fault (core dumped)

    foreach my $key (%hash) {
        print $key;
        delete $hash{"$key"};
    }
    # same output with seg fault

    foreach my $key ( @{[%hash]} ) {
        print $key;
        delete $hash{$key};
    }
    # output:
    # c
    # d
    # a
    # b
    # b
    # a
    # d
    # c

    I've tried it in both 5.6.1 and 5.8.0. From my understanding of hashes, this ought to work since according to perldoc -f delete the delete should just return undef if the hash element doesn't exist.

    antirice    
    The first rule of Perl club is - use Perl
    The ith rule of Perl club is - follow rule i - 1 for i > 1

      i tried the above too, and like antirice was also puzzled. however, my perl, 5.8.0, i586 linux, does not die.

      i think the oddities manifest are the result of possibly undefined behaviour when dealing with previously 'deleted' elements on treating the hash as a list. here is a brief pseudoexplanation:

      what happens is the following: taking
      1: foreach my $key (%hash) {
      2:     print $key;
      3:     delete $hash{$key};
      4: }
      as an example:

      the first iteration takes a real key, and deletes (line 3) it as if deleting from a hash. the effect of this is to nullify the value associated with the real key.

      when we go to the next iteration, which is done over a *list*, because (camel book) 'modification to (foreach) loop values can change the original values', we end up having undef as an iterator value. ie the ghost of the previously deleted key.

      this undef value remains in the foreach iterator's conception of the hash - presumably because it thinks of the hash as a list. so on next iteration, line 2 attempts to print an undef value, and line 3 attempts to delete it, which has no effect on the actual keys still in the hash.

      you can test this by including a  print join ":", keys %hash after line 3.

      next rinse, we delete a 'real' key from the hash again, and then attempt to iterate on what has become an undefined value.

      so we go through a "delete real key from hash, delete undef from hash" cycle, thanks to our iteration which treats the hash as a list.

      i guess what i am a little surprised about is that those undef values still remain in the concept of the list as seen by foreach. i guess this is a byproduct of treating the hash as a list, and forcing a peek at the phantom value.

      i would be interested if anyone could confirm/deny this was indeed happening. but more, if some kind soul would give an explanation of why the third antirice loop works, i could sleep easy. easier...

      wufnik

      -- in the world of the mules there are no rules --

        if some kind soul would give an explanation of why the third antirice loop works, i could sleep easy. easier...

        foreach my $key ( @{[%hash]} ) {
            print $key;
            delete $hash{$key};
        }

        What antirice has done here is to take a copy of the hash as a list. After this, iterating over the copy means that every second delete attempts to delete a key that never existed in the hash and fails quietly while all the others succeed.

        No doubt you're confused by the

        @{[%hash]}
        bit.

        Enclosing something in square brackets in Perl evaluates it in list context. It also generates an array reference to the result. So %hash here is copied into an anonymous array. This is then dereferenced by the @{}s because foreach expects a list, not a reference.

        Does this help?

        All the best,

        jarich
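
jarich's description of @{[%hash]} can be sketched roughly as follows. The sample data is hypothetical (not from the thread) and was chosen so that no value doubles as a key; the script only illustrates that the anonymous array is an independent snapshot of both keys and values, and that delete quietly returns undef for a string that is not a key.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data; no value is also a key.
my %hash = (apple => 'red', pear => 'green');

# [%hash] flattens the hash in list context and returns a reference
# to a new anonymous array; @{ ... } dereferences it back into a list.
my $copy     = [%hash];
my @snapshot = @{$copy};

print scalar(@snapshot), " elements in the snapshot (keys and values)\n";

foreach my $item (@snapshot) {
    # delete returns the removed value, or undef when no such key exists.
    my $removed = delete $hash{$item};
    printf "%-6s %s\n", $item, defined $removed ? "deleted" : "not a key";
}

# The hash is now empty, but the snapshot is untouched.
print scalar(keys %hash), " keys left; snapshot still has ",
      scalar(@snapshot), " elements\n";
```

The order of the printed lines depends on the hash's internal ordering, but each key is always followed by its own value.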

        My perl (AS 5.61 on Windows XP) also dies, but not every time. Using diagnostics, I get the following message, among others:

        Attempt to free unreferenced scalar at scratchpad2.pl line 5 (#2):
            (W internal) Perl went to decrement the reference count of a scalar to
            see if it would go to 0, and discovered that it had already gone to 0
            earlier, and should have been freed, and in fact, probably was freed.
            This could indicate that SvREFCNT_dec() was called too many times, or
            that SvREFCNT_inc() was called too few times, or that the SV was
            mortalized when it shouldn't have been, or that memory has been
            corrupted.

        which may help some, but I'm afraid is a little over my head.

        Concerning the third antirice loop:

        foreach my $key ( @{[%hash]} ) {
            # print "$key: @{[%hash]}\n";
            delete $hash{$key};
        }
        I thought it looked to be (conceptually) equivalent to something like:

        foreach my $key (my @ary = %hash) {
            # print "$key: @ary\n";
            delete $hash{$key};
        }

        (which also 'works', BTW), but apparently it isn't the same, since uncommenting the print lines shows that the anonymous array reference (vocab a bit shaky here...) [%hash] shrinks by two items with each iteration, while the 'straight' array @ary remains unchanged.

        Oh dear, not sure if all that helps at all...

        dave
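
One likely reading of dave's observation: the @{[%hash]} in the foreach header is evaluated once, before the first iteration, whereas a second @{[%hash]} inside the loop body is a separate expression that re-flattens the hash as it stands at that moment. The sketch below uses hypothetical data and records the size of a fresh copy on every pass.

```perl
use strict;
use warnings;

my %hash = (a => 1, b => 2, c => 3);   # hypothetical data; values are not keys

my @sizes;
foreach my $key ( @{[%hash]} ) {       # this copy is built once, up front
    # A fresh copy built now: it reflects every delete done so far.
    my @now = @{[%hash]};
    push @sizes, scalar @now;
    delete $hash{$key};                # a no-op when $key is really a value
}

print "sizes seen inside the loop: @sizes\n";
# prints: sizes seen inside the loop: 6 4 4 2 2 0
```

The loop still runs six times, because the list it walks was fixed before any delete happened; only the freshly built copies shrink, and only after an iteration that removed a real key.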

Re: Hashes: Deleting elements while iterating
by chromatic (Archbishop) on Sep 02, 2003 at 20:37 UTC

    You probably want keys in here:

    print "\nDelete children of $test :\n";
    foreach $key (%index) {

    That said, the sort loop won't do anything especially interesting to the loop body. The while construct is a little dodgy, as modifying a hash inside an iterator loop can often reset the iterator, effectively making an infinite loop. (Yes, each is an iterator. It grabs one bit of data at a time. foreach keys grabs all of the keys at once.)
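
For the each variant, perldoc -f each documents one exception to the "don't modify while iterating" advice: deleting the item most recently returned by each is always safe. A minimal sketch under that assumption, using a shortened version of the original data:

```perl
use strict;
use warnings;

my %index = (
    'node'                 => 1,
    'node-cat-cat'         => 3,
    'node-cat-cat-cat'     => 4,
    'node-cat-cat-bat-bat' => 2,
);
my $test = 'node-cat-cat';

# List assignment in the condition: the loop ends when each returns
# an empty list, i.e. when the iterator is exhausted.
while ( my ($key) = each %index ) {
    # Only the entry each just returned is ever deleted.
    delete $index{$key} if $key =~ /$test.+/;
}

print join(", ", sort keys %index), "\n";
# prints: node, node-cat-cat
```

Adding keys, or deleting any entry other than the current one, inside such a loop is what leaves the iterator in an unspecified state.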

      I think what chromatic was trying to say is:
      print "\nDelete children of $test :\n";
      foreach $key (keys %index) {
          ...
          delete $index{$key};
      }

      I suspect what is happening with your code:

      foreach $key (%index) {
          delete $index{$key};
      }
      is the following. Remember that when you're dealing with foreach loops the item that you've got ($key in this case) is the item in the hash. That is, $key becomes an alias to the memory address within the hash.

      So what is happening is that you're generating a list of all the keys and values in the hash, which is a list of aliases to those and then you're screwing it up. Remember that the list gets generated right at the very start of the for/foreach loop's execution.

      You see, you delete a key, which means that your value gets deleted as well. Then you go on to a value and attempt to reference/delete that. All of a sudden you're attempting to delete something that you have a kind of reference to (from the generated list) but that has already been deleted from the hash! Hence, no doubt, the Attempt to free unreferenced scalar at line 27. warning.

      No wonder that makes Perl sad!

      So, in conclusion, use keys. And remember that you should never fool around with the list you're iterating over. I mentioned this recently over here too, although in a different case. I do accept that the best way to do what you need to do is to iterate over your hash; just make sure that you're iterating over the keys of your hash, rather than the whole thing.

      All the very best and I hope this helped

      jarich

      Update: Upon further thought and a little discussion I should probably mention that delete doesn't work on values and wouldn't normally cause this error. As far as I can tell the warning is caused because we're trying to use a scalar (the value of a deleted key) that no longer exists. This was the idea I was trying to convey above, but I suspect I wasn't very clear.

      Update II: chromatic has corrected me: he was merely pointing out that keys was missing from the original poster's loop in this case, rather than showing where it should be used, which the original poster obviously already knew, having used keys elsewhere.

        Remember that when you're dealing with foreach loops the item that you've got ($key in this case) is the item in the hash. That is, $key becomes an alias to the memory address within the hash.

        I'm not sure that's accurate. Look at the last half of Perl_do_kv in doop.c and hv_iterkeysv and hv_iterval in hv.c. (I'm looking at something around 5.8.0.) The key iterator returns a mortal copy of the key, which is pushed onto the return stack. Sure, it points to the same data, but when you're done with it, it'll be ready for garbage collection without causing the original to be collected.

        The value is returned as a normal SV. I don't see any reference count increments, but there's very little you could do to cause it to be deleted from the hash without the associated key.

        I could really have misunderstood what you meant, but it doesn't sound quite right.

Re: Hashes: Deleting elements while iterating
by tcf22 (Priest) on Sep 02, 2003 at 20:37 UTC
    This seemed to work for me on Perl 5.6 on Win32 and 5.8 on Linux
    #!/usr/bin/perl -w
    use Data::Dumper;
    use strict;

    my %index = (
        'node'                 => 1,
        'node-cat'             => 2,
        'node-cat-cat'         => 3,
        'node-cat-cat-cat'     => 4,
        'node-bat-bat-cat'     => 4,
        'node-cat-cat-cat-cat' => 5,
        'node-cat-bat-bat-bat' => 5,
        'node-cat-cat-bat-bat' => 2,
    );
    my $key;
    my $test='node-cat-cat';

    print Dumper \%index;

    #
    # DELETE all child nodes.
    #
    print "\nDelete children of $test\n";
    foreach $key (sort keys %index){
        if ($key =~ /$test.+/){
            print "Deleting $key\n";
            delete $index{$key};
        }
    }

    print "\nFinal Hash:\n";
    print Dumper \%index;
    It outputted
    $VAR1 = {
              'node-cat-bat-bat-bat' => 5,
              'node-bat-bat-cat' => 4,
              'node-cat-cat-bat-bat' => 2,
              'node-cat-cat-cat' => 4,
              'node-cat' => 2,
              'node-cat-cat-cat-cat' => 5,
              'node-cat-cat' => 3,
              'node' => 1
            };

    Delete children of node-cat-cat
    Deleting node-cat-cat-bat-bat
    Deleting node-cat-cat-cat
    Deleting node-cat-cat-cat-cat

    Final Hash:
    $VAR1 = {
              'node-cat-bat-bat-bat' => 5,
              'node-bat-bat-cat' => 4,
              'node-cat' => 2,
              'node-cat-cat' => 3,
              'node' => 1
            };
    which is what you would expect.
      <Grin>I think I was not clear enough or chose a less than obvious way to show things.</Grin>

      Yes, that approach worked for me as well. It's when I use

      foreach $key (%index){
      that I get the errors.

      If you look closely in the code I posted there are 3 loops above the delete operation, 2 of which are commented out. The ones commented out work, the other does not. I just change which one is UNcommented to test a particular way of doing it. I suppose this may seem unorthodox but it's just my way of trying to learn the differences/nuances of perl...

      So, what's the problem you might ask? ;-)
      Just seeking enlightenment as to why the one method gets errors (at least for me).

      Thanks for testing it out for me.

Re: Hashes: Deleting elements while iterating
by hardburn (Abbot) on Sep 02, 2003 at 20:28 UTC

    You could use Data::Dumper to dump the contents of your hash instead of printing it yourself:

    use Data::Dumper;
    print Data::Dumper::Dumper(\%index);

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    Note: All code is untested, unless otherwise stated

Re: Hashes: Deleting elements while iterating
by runrig (Abbot) on Sep 02, 2003 at 20:38 UTC
    foreach $key (%index){
    #foreach $key (sort keys %index){
    Why do you have the correct line commented out? You will iterate over keys and values using this.
      Sorry, maybe I wasn't clear... or maybe I just chose poorly in the code.

      What I did was put 3 different ways of iterating over the hash when deleting. To test the different ones I would just COMMENT OUT the others and do a quick run.

      If I USE the first one I get errors. If I use the last 2 (the foreach w/ sort and the while) they work and no errors.

        Put in a print statement in the "faulty" for-loop and you'll see immediately what goes wrong.

        foreach $key (%index){
            print "deleting $key\n";
            # delete $index{$key} if ($key =~ /$test.+/);
        }

        Hope this helps, -gjb-

Re: Hashes: Deleting elements while iterating
by Elian (Parson) on Sep 03, 2003 at 19:55 UTC
    While there are some issues with your chosen method of iterating (not the least of which is $key will be set to both the keys and the values of the hash, since foreach puts the hash in list context and thus flattens it), there is no circumstance where you should get an attempt to free unreferenced scalar error. That's an indication of an internal error, and you'll only get that when something goes wrong inside perl. It's normally an indication that an XS module has an error in it, but in this case you seem to be tripping a perl bug.

    File a bug with the perl5-porters (use the perlbug command installed with perl) so someone can take a look at this and figure out what's wrong.

Re: Hashes: Deleting elements while iterating
by Roger (Parson) on Sep 05, 2003 at 06:38 UTC
    You could replace
    foreach $key (%index){
        delete $index{$key} if ($key =~ /$test.+/);
    }

    in the original code with a one-liner:

    foreach (keys %index) { delete $index{$_} if /$test.+/ }
    where the keys function returns a list of all the hash keys. The new version does not generate warnings.
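
Another way to write the same deletion, offered only as a sketch: select the matching keys with grep and remove them all at once with a hash slice. The ^ anchor and the \Q...\E quoting are additions that the original code does not have; they assume $test is meant as a literal prefix rather than as a regex.

```perl
use strict;
use warnings;

my %index = (
    'node'                 => 1,
    'node-cat'             => 2,
    'node-cat-cat'         => 3,
    'node-cat-cat-cat'     => 4,
    'node-bat-bat-cat'     => 4,
    'node-cat-cat-cat-cat' => 5,
    'node-cat-bat-bat-bat' => 5,
    'node-cat-cat-bat-bat' => 2,
);
my $test = 'node-cat-cat';

# keys returns the complete key list up front, so grep filters a snapshot.
# Assumption: $test is a literal prefix, hence the ^ anchor and \Q...\E.
my @children = grep { /^\Q$test\E.+/ } keys %index;

# Hash slice: delete every selected key in one statement.
delete @index{@children};

print "$_: $index{$_}\n" for sort keys %index;
```

Because the list of keys to remove is built before anything is deleted, the hash is never modified while it is being walked.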

Node Type: perlquestion [id://288424]
Approved by tcf22
Front-paged by broquaint