PerlMonks  

Hashing it out: defined? exists?

by eff_i_g (Curate)
on Sep 02, 2005 at 19:35 UTC ( #488752=perlquestion )
eff_i_g has asked for the wisdom of the Perl Monks concerning the following question:

Most humble sages,

I am working with some scripts I've inherited. Currently, multiple files are opened and certain field values are placed into a hash like so:

%hash_of_a_field{$the_record_value} = 1;


Later on, these are checked using defined. This works fine. My question is geared toward a design / self-documenting approach.

Although this method works, would it not be better to set the key (no value) and use exists? Are both methods one and the same? I've thought of pushing the fields into an array, but I assume that using defined or exists is quicker than running a grep on the entire array?

Thank you for your advice :)
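
To make the "key with no value" idea concrete, here is a minimal sketch. The names @record_values and %seen are hypothetical stand-ins for whatever the inherited scripts read from their files. It fills the hash through a slice so that every key gets an undef value, then compares exists, defined, and a grep over the array.

```perl
use strict;
use warnings;

# Hypothetical field values, standing in for what is read from the files.
my @record_values = qw(alpha beta gamma);

# Hash slice: creates every key with an undef value, so the hash acts as a set.
my %seen;
@seen{@record_values} = ();

# exists checks for the key itself and ignores the (undef) value.
print "beta was seen\n"      if exists $seen{beta};
print "delta was not seen\n" unless exists $seen{delta};

# defined reports false for these keys, because their values are undef.
print "beta has no defined value\n" unless defined $seen{beta};

# The array equivalent scans every element on each lookup.
my $in_array = grep { $_ eq 'beta' } @record_values;
print "grep found beta $in_array time(s)\n";
```

With value-less keys, defined can no longer be used for the membership test, so switching to this style means switching every check to exists as well.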

Re: Hashing it out: defined? exists?
by halley (Prior) on Sep 02, 2005 at 19:41 UTC
    One, your syntax should use $hash{key} = 1, not %.

    Two, unless you're storing millions of these, and that's a real bottleneck for you, don't worry about all the extra scalar 1 references. The scalar 1 may already be special-cased to work the same way internally as the singleton scalar undef, but I don't know for sure.

    Three, yes, checking for exists is probably more suitable if you are concerned with the presence in the hash than using defined, which discriminates on the type of value associated with a key. After all, you could say $hash{key} = () and the key exists but the value is not defined.

    --
    [ e d @ h a l l e y . c c ]

Re: Hashing it out: defined? exists?
by ikegami (Pope) on Sep 02, 2005 at 19:42 UTC

    Three ways of checking the value in a hash:

    if (exists  $hash{$key})   # If exists
    if (defined $hash{$key})   # If exists, defined
    if (        $hash{$key})   # If exists, defined and true

    If the value is just a flag, then I wouldn't even use defined. I would just treat the value as boolean. It's less verbose, and it works for all three checks listed above.

    As for defined vs exists, there isn't really any difference as long as the value can't normally be undefined.
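
One case where the value can become undefined is when a flag is "cleared" by assigning undef instead of deleting the key. The following is only a sketch with a hypothetical %flag hash, not code from the scripts in question:

```perl
use strict;
use warnings;

# Hypothetical flag hash for illustration.
my %flag = (ready => 1);

# Assigning undef clears the value but leaves the key in place.
$flag{ready} = undef;
my $exists_after_undef  = exists  $flag{ready} ? 1 : 0;   # key is still there
my $defined_after_undef = defined $flag{ready} ? 1 : 0;   # value is gone

# delete removes the key itself, so both tests agree again.
delete $flag{ready};
my $exists_after_delete  = exists  $flag{ready} ? 1 : 0;
my $defined_after_delete = defined $flag{ready} ? 1 : 0;

print "after undef:  exists=$exists_after_undef defined=$defined_after_undef\n";
print "after delete: exists=$exists_after_delete defined=$defined_after_delete\n";
```

If the code only ever sets keys to 1 and removes them with delete, the two tests stay interchangeable.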

      Which of these approaches autovivifies a hash entry? I know from the exists docs that the exists test won't autovivify. I'm pretty sure the last test will autovivify. I'm not sure about defined, though.

        None of them will.

        {
           my %h;
           1 if exists $h{key};
           print("exists: ", (%h ? "auto-vivi" : "empty"), "\n");
        }

        {
           my %h;
           1 if defined $h{key};
           print("defined: ", (%h ? "auto-vivi" : "empty"), "\n");
        }

        {
           my %h;
           1 if $h{key};
           print("true: ", (%h ? "auto-vivi" : "empty"), "\n");
        }

        __END__
        exists: empty
        defined: empty
        true: empty

        The following are the only forms of auto-vivification that come to mind:

        # If $h{$i} and $a[$i] are not defined,
        $h{$i}{$j}   # Creates a hash and stores a ref to it in $h{$i}.
        $h{$i}[$j]   # Creates an array and stores a ref to it in $h{$i}.
        $a[$i]{$j}   # Creates a hash and stores a ref to it in $a[$i].
        $a[$i][$j]   # Creates an array and stores a ref to it in $a[$i].
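
This matters for the exists question once the hash is nested: testing an inner key makes Perl dereference the outer one, which creates it. A small sketch, assuming two-level hashes with hypothetical keys outer and inner:

```perl
use strict;
use warnings;

my %h;

# Perl has to dereference $h{outer} to look inside it,
# so it first creates $h{outer} as an empty hash ref.
my $found = exists $h{outer}{inner};

print "inner found: ", ($found ? "yes" : "no"), "\n";
print "outer exists now: ", (exists $h{outer} ? "yes" : "no"), "\n";

# Checking the outer level first avoids the side effect.
my %g;
my $safe = exists $g{outer} && exists $g{outer}{inner};
print "keys in g: ", scalar(keys %g), "\n";
```

The inner key itself is never created by the test. Only the intermediate level springs into existence.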

      Just to add a little bit for the OP to understand:

      my %hash = ("a" => undef, "b" => 0, "c" => 2);

      testing("a");
      testing("b");
      testing("c");
      testing("d");

      sub testing {
          my $key = shift;
          print "$key exists\n" if (exists $hash{$key});
          print "the value of $key is defined\n" if (defined $hash{$key});
          print "the value for $key is true" if ($hash{$key});
      }

      This prints:

      a exists
      b exists
      the value of b is defined
      c exists
      the value of c is defined
      the value for c is true

      Here is how I tested these (correctly, I hope):
      #!/usr/bin/perl -w
      use strict;
      use Benchmark qw( timethese );
      use vars qw( %hash );

      @hash{ 'A' .. 'Z', 'a' .. 'z' } = (1) x 52;

      timethese 100000, {
          'defined' => sub { if (defined $hash{X}) { }; },
          'exists'  => sub { if (exists $hash{X}) { }; },
          'as is'   => sub { if ($hash{X}) { }; },
      };


      the results:

      Benchmark: timing 100000 iterations of as is, defined, exists...
           as is:  1 wallclock secs ( 0.16 usr +  0.00 sys =  0.16 CPU)
                  (warning: too few iterations for a reliable count)
         defined:  0 wallclock secs ( 0.16 usr +  0.00 sys =  0.16 CPU)
                  (warning: too few iterations for a reliable count)
          exists:  0 wallclock secs ( 0.16 usr +  0.00 sys =  0.16 CPU)
                  (warning: too few iterations for a reliable count)
      
        Benchmark: timing 500000 iterations of as is, defined, exists...
             as is:  0 wallclock secs ( 0.79 usr +  0.00 sys =  0.79 CPU)
           defined:  2 wallclock secs ( 0.82 usr +  0.00 sys =  0.82 CPU)
            exists:  1 wallclock secs ( 0.76 usr +  0.00 sys =  0.76 CPU)
        

        Changing 100000 to -3 (which means 3 seconds) and changing timethese to cmpthese gives:

                     Rate defined exists as is
        defined 1961549/s      --   -10%  -12%
        exists  2181802/s     11%     --   -2%
        as is   2236990/s     14%     3%    --

        Oddly enough, defined is slower. Then again, this isn't a very accurate test since the condition always evaluates to true. Also, any time lost is so minute compared to losses from other inefficiencies that it doesn't matter. Deciding which of these to use based on efficiency is like trying to decide what kind of gum to buy based on how long it takes to remove the packaging.

Re: Hashing it out: defined? exists?
by eff_i_g (Curate) on Sep 02, 2005 at 19:45 UTC
    Thanks :) The % in the code was my typo.

Node Type: perlquestion [id://488752]
Approved by friedo