
Deleting undef values from a hash

by liz (Monsignor)
on Dec 14, 2003 at 00:29 UTC ( #314585=perlmeditation )

perldoc -f delete states:
Returns each element so deleted or the undefined value if there was no such element.
Which is of course nice, because it gives you a way to find out whether a key was really deleted from the hash. Or does it?

If the value associated with the key in the hash was undef, then being returned undef doesn't tell you anything. The undef could be from the hash, or it could be supplied by Perl to "indicate" that nothing was deleted from the hash.

So I decided to see whether deleting in array/list context would yield meaningful information in that respect. It doesn't:

    my %a = (foo => undef);
    print "existed = ".(() = delete $a{foo})."\n";
    print "notexisted = ".(() = delete $a{bar})."\n";
    __END__
    existed = 1
    notexisted = 1

It seems that Perl returns a list with one element for every key whose deletion was attempted, not one element for every key that was actually deleted. Observe:

    my %a = (foo => 1, bar => 2, baz => 3);
    $" = ',';
    print "existed = @{[delete $a{foo}]}\n";
    print "notexisted = @{[delete @a{qw(foo bar baz)}]}\n";
    __END__
    existed = 1
    notexisted = ,2,3

Anyway, that's what I learned today.

Since this behaviour goes back to at least Perl 5.00503, I think I'll provide a documentation patch, so that at least I will understand this the next time I read it.


Just to be clear: I knew about exists() ;-) My reason for using undef as a value in the hash, is that it uses less memory than a defined value:

    use Devel::Size qw(size total_size);
    $a{1} = undef;
    $b{1} = 0;
    $c{1} = 1;
    $d{1} = 10;
    $e{1} = 'abcd';
    print "$_: ".(total_size( \%{$_} ) - size( \%{$_} ))."\n" for a..e;
    __END__
    a: 12
    b: 16
    c: 16
    d: 16
    e: 29
This is still a lot more than I would have hoped ;-( Especially if you have millions of keys.

Replies are listed 'Best First'.
Re: Deleting undef values from a hash
by Zaxo (Archbishop) on Dec 14, 2003 at 00:55 UTC

    I agree that a documentation patch is the best response. Since the value for a key can be literally any scalar, it is impossible to think of a return value that would serve to distinguish between the two cases (at least without some perl magic).

    We can imagine delete returning a hash or a list of key-value pairs, but (besides breaking compatibility) that would injure idioms that are more useful than knowledge of which keys existed before a deletion. The idiom for re-keying a value, $foo{'bar'} = delete $foo{'baz'}; would be wrecked, for instance.
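A minimal sketch of the re-keying idiom mentioned above, assuming a throwaway hash %foo with a single made-up entry. It only works because delete hands back the removed value itself rather than a key-value pair.

```perl
use strict;
use warnings;

# Illustrative data only; the key names are taken from the idiom above.
my %foo = (baz => 42);

# delete returns the removed value, so it can be assigned straight to the new key.
$foo{bar} = delete $foo{baz};

print join(',', map { "$_=$foo{$_}" } sort keys %foo), "\n";   # prints "bar=42"
```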

    A workaround can be devised pretty easily if we need the existence information:

        my @actual_values = delete @a{ grep { exists $a{$_}} @somekeys};

    There are lots of variations possible on that snippet.
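One possible variation, sketched under the assumption that we also want to know which keys the returned values belonged to. The names @present and %deleted, and the sample data, are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative data: one key holding undef, one holding a value, one missing.
my %a        = (foo => undef, bar => 2);
my @somekeys = qw(foo bar baz);

# Decide which keys are really present *before* deleting anything.
my @present = grep { exists $a{$_} } @somekeys;

# Pair each key that existed with the value it held (undef included).
my %deleted;
@deleted{@present} = delete @a{@present};

printf "%d of %d keys actually deleted\n", scalar @present, scalar @somekeys;
```

Here exists $deleted{foo} is true even though its value is undef, which is exactly the information the plain return value of delete cannot carry.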

    After Compline,

      Rafael applied my documentation patch yesterday:
      -Returns each element so deleted or the undefined value if there was no such
      -element.  Deleting from C<$ENV{}> modifies the environment.  Deleting from
      +Returns a list with the same number of elements as the number of elements
      +for which deletion was B<attempted>.  Each element of that list consists of
      +either the value of the element deleted, or the undefined value.  In scalar
      +context, this means that you get the value of the last element deleted (or
      +the undefined value if that element did not exist).
      +
      +    %HASH = (foo => 11, bar => 22, baz => 33);
      +    $scalar = delete $HASH{foo};             # $scalar is 11
      +    $scalar = delete @HASH{qw(foo bar)};     # $scalar is 22
      +    @array  = delete @HASH{qw(foo bar baz)}  # @array  is (undef,undef,33)
      +
      +Deleting from C<$ENV{}> modifies the environment.  Deleting from
       a hash tied to a DBM file deletes the entry from the DBM file.  Deleting
       from a C<tie>d hash or array may not necessarily return anything.


Re: Deleting undef values from a hash
by ysth (Canon) on Dec 14, 2003 at 03:31 UTC
    Not all undefs are created equal:
        #!/usr/bin/perl -w
        use strict;
        use warnings;
        our %foo;
        @foo{'foo','baz'} = ();
        for my $key ('foo','bar','baz') {
            print "key $key ", (\undef == \delete $foo{$key} ? "didn't exist" : "existed"), "\n";
        }
    Doesn't work for tied hashes or magic hashes such as %SIG or %ENV.
      I think that you'll have a hard time finding that documented. ;-) (Though I have a pretty good idea why it works.)

      I also wouldn't want to trust it across Perl versions. That seems very specific to perl5, and I would not plan on that specific optimization finding its way into Ponie for instance.

        It should be ok for perl5, though. The division of responsibilities between hv_delete_* and pp_delete pretty much requires that it work. hv_delete_* will return null if the key didn't exist, which pp_delete will replace with the immortal undef value. If the key did exist, hv_delete_* will return the value (even undef) in a different sv.

        It's obviously better practice to check exists in the first place, outside of obfuscation and golf.

Re: Deleting undef values from a hash
by pg (Canon) on Dec 14, 2003 at 05:04 UTC

    This is probably the best result you can expect for now.

    Ideally, in the world of modern computing, deleting an element that does not exist could throw some sort of exception. The two cases would then be clearly distinguished:

    • If you delete an element that does not exist, you get an exception;
    • If it exists but has undef as its value, no exception is thrown and undef is returned (which is consistent with the way any defined value is handled).

    For now, if you need to clearly distinguish the two cases, use exists() to check first, and only delete when the element really exists.

        use strict;
        use warnings;
        $a = {"a" => 1, "b" => 2, "c" => undef};
        print "c exists\n" if (exists($a->{"c"})); # c exists, although its value is undef
        print "d exists\n" if (exists($a->{"d"})); # d does not exist
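The exception idea can be approximated today with a small wrapper. This is only a sketch: delete_or_die is a hypothetical helper name, not anything Perl provides, and it simply combines the exists() check with delete.

```perl
use strict;
use warnings;

# Hypothetical helper (not a Perl builtin): delete a key, but throw if it was never there.
sub delete_or_die {
    my ($hash, $key) = @_;
    die "no such key: $key\n" unless exists $hash->{$key};
    return delete $hash->{$key};
}

# Illustrative data: one existing key whose value is undef.
my %h = (c => undef);

# 'c' existed, so no exception is thrown; $val is simply undef.
my $val = delete_or_die(\%h, 'c');

# 'd' never existed, so the helper dies and eval returns undef.
my $ok = eval { delete_or_die(\%h, 'd'); 1 };
print "deleting 'd' threw: $@" unless $ok;
```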
Re: Deleting undef values from a hash
by davido (Archbishop) on Dec 14, 2003 at 07:47 UTC
    You make an interesting observation, and it sounds like a documentation patch may be in order, as you suggest, liz.


    I wanted to follow up by further exploring the exists, defined, and each functions.

    First, exists:

    The POD for exists states that 'exists' returns truth if a hash or array element has been initialized, even if its value is 'undef'.


        my %hash = ('First' => 1, 'Second' => undef );
        print "First\n" if exists $hash{'First'};
        print "Second\n" if exists $hash{'Second'};
        print "Third\n" if exists $hash{'Third'};
        __OUTPUT__
        First
        Second

    So if the hash element's value is undefined, but the element exists, exists correctly detects the existence of the element.

    defined can be a little tricky if you don't think it through. For existing keys with an undef value, defined will return false, telling you that 'Second' is not defined. However, defined will also return false for elements that don't exist ('Third', for example). So defined is not the right way to check for the existence of an element, because it returns false both for nonexistent elements and for existing elements with an undef value. This is old news, but worth mentioning in discussions such as this.
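To make the difference concrete, here is a small side-by-side sketch that reuses the keys from the example above and prints what each function reports.

```perl
use strict;
use warnings;

my %hash = (First => 1, Second => undef);

# 'Third' is deliberately absent from the hash.
for my $key (qw(First Second Third)) {
    printf "%-6s exists=%d defined=%d\n",
        $key,
        exists  $hash{$key} ? 1 : 0,
        defined $hash{$key} ? 1 : 0;
}
```

Only exists tells 'Second' and 'Third' apart; defined reports 0 for both.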

    Now for each (the real point of this followup): each knows which elements exist, even if an element's value is undef. So iterating over a hash with each will visit all existing elements, regardless of their value (or lack thereof). The POD says it's generally a bad idea to add or delete elements while iterating over a hash with each. There is one exception, though, which is also documented in the POD for each: It is always safe to delete the item most recently returned by each()...

    It turns out this is useful. Consider the following code which will remove existing hash elements with undef value:

        while ( my ( $key, $val ) = each %hash ) {
            delete $hash{$key} if not defined $val;
        }
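A self-contained run of that loop, with made-up sample data, followed by one alternative that avoids each entirely by deleting a hash slice. The %copy hash and the slice variant are additions for comparison, not part of the original snippet.

```perl
use strict;
use warnings;

# Illustrative data: two keys with values, two holding undef.
my %hash = (First => 1, Second => undef, Third => 3, Fourth => undef);

# Deleting the key most recently returned by each() is documented as safe.
while ( my ($key, $val) = each %hash ) {
    delete $hash{$key} if not defined $val;
}

print join(',', sort keys %hash), "\n";   # prints "First,Third"

# Alternative sketch: collect the undef-valued keys first, then delete them as a slice.
my %copy = (First => 1, Second => undef);
delete @copy{ grep { !defined $copy{$_} } keys %copy };

print join(',', sort keys %copy), "\n";   # prints "First"
```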



