PerlMonks  

Hash checking

by Anonymous Monk
on Apr 25, 2007 at 19:10 UTC ( #612075=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm attempting to debug a program that I can not get to work. Basically, I have a list of integer values that I read into a hash. I need to output every integer between 1 and 5000 that is not in the hash. My problem arises when I attempt to check if a value exists in the hash. No matter what I do, the condition always seems to evaluate as false. I'm obviously doing something wrong, but I can not figure out what. Can anyone shed some light on the situation?

The code is below...
my %faclist;
my $i;
open(DATA, "AXP_FACS.DAT");
while(<DATA>){
    my $line = $_;
    $faclist{$line} = "";
}
close(DATA);
for ($i=1;$i<5001;$i++){
    if exists $faclist{$i}{#THIS IS WHERE THE PROBLEM IS
        print "$i\n";
    }
}

Replies are listed 'Best First'.
Re: Hash checking
by kyle (Abbot) on Apr 25, 2007 at 19:20 UTC

    Each line you read in during your input loop has a newline character at the end. Use chomp to take it off.

    while(<DATA>){
        chomp;
        my $line = $_;
        $faclist{$line} = "";
    }

    After that, I think it should work.

    Update: I notice also that this line:

    if exists $faclist{$i}{#THIS IS WHERE THE PROBLEM IS

    ...has a syntax error. It should be:

    if ( exists $faclist{$i} ) {#THIS IS WHERE THE PROBLEM IS

    Note the parentheses.

      After that, I think it should work.

      I disagree (I may be wrong). Consider exists:

      Given an expression that specifies a hash element or array element, returns true if the specified element in the hash or array has ever been initialized, even if the corresponding value is undefined.

        I think maybe we're misunderstanding each other somehow. I think the reason the OP's search loop is not finding anything in the hash is that each of the hash keys has an extraneous newline at the end. Take that off (with chomp), and it should be fine.

        use Test::More 'tests' => 4;

        my $line;
        my %faclist;

        $line = "17\n";
        $faclist{$line} = "";

        $line = "23\n";
        chomp $line;
        $faclist{$line} = "";

        ok( ! exists $faclist{17},   'number 17 does not exist' );
        ok( ! exists $faclist{'17'}, 'string 17 does not exist' );
        ok( exists $faclist{23},     'number 23 exists' );
        ok( exists $faclist{'23'},   'string 23 exists' );

      Thank you kyle, that helped a lot. Now, another problem has appeared.

      I need to check for the presence of some values in the hash. For example, I am sure that 1090 is in the hash. But the following code never evaluates as true and thus never does what's inside of the if statement.

      for ($i=1;$i<5001;$i++){
      if (exists $faclist{$i}){
      print "$i\n";
      }
      }
        just how sure are you, and how are you sure?
        use Data::Dumper;
        print Dumper(\%faclist);
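
A plain Dumper call can hide the very characters that break the lookup, because a stray carriage return or tab prints invisibly. The sketch below assumes the hash may hold such keys and sets $Data::Dumper::Useqq so they appear as escapes. The helper name dump_visible and the sample keys are made up for illustration; they are not from the original program.

```perl
use strict;
use warnings;
use Data::Dumper;

# Return a dump of the hash with non-printing characters escaped.
sub dump_visible {
    my ($hash_ref) = @_;
    local $Data::Dumper::Useqq    = 1;  # show "\r", "\n", "\t" as escapes
    local $Data::Dumper::Sortkeys = 1;  # stable key order for reading
    return Dumper($hash_ref);
}

# Hypothetical data: one clean key, one with a stray carriage return.
my %faclist = ( "17" => "", "1090\r" => "" );
print dump_visible( \%faclist );
```

If a key shows up as "1090\r" rather than "1090", the data still carries something that chomp did not remove.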
Re: Hash checking
by GrandFather (Saint) on Apr 25, 2007 at 21:08 UTC

    There are a few small items worth pointing out. As can be seen in the sample below, DATA is special: avoid using it in other roles to avoid confusion.

    It is strongly recommended that you use the three parameter version of open. The intent is clearer and where a file name is provided the three parameter open is much safer. It looks like open (INFILE, '<', 'AXP_FACS.DAT'); (many people omit the parentheses).

    Using a Perl for loop is generally much preferred over the C style for. Combined with the range operator the intent is much clearer and less prone to off by one errors: for (1 .. 5000) {.

    A cleaned up version of the code might look like:

    use strict;
    use warnings;

    my %faclist;

    while (<DATA>) {
        chomp;
        $faclist{$_}++;
    }

    for (1 .. 5000) {
        if (exists $faclist{$_}) {
            print "Found $faclist{$_} of $_\n";
        }
    }

    __DATA__
    1
    1090
    wibble
    1

    Prints:

    Found 2 of 1
    Found 1 of 1090

    An interesting variation of the print loop that you might like to ponder is:

    print "Found $faclist{$_} of $_\n" for sort {$a <=> $b} grep {/^\d+$/} keys %faclist;

    DWIM is Perl's answer to Gödel
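
The original question asked for the integers that are not in the hash, while the loops in the thread print the ones that are. A minimal sketch of the inverse check follows. The helper missing_numbers and the in-memory sample data are assumptions made for illustration, standing in for AXP_FACS.DAT.

```perl
use strict;
use warnings;

# Return the integers in [$low, $high] that are NOT keys of the hash.
sub missing_numbers {
    my ( $seen, $low, $high ) = @_;
    return grep { !exists $seen->{$_} } $low .. $high;
}

# Hypothetical input standing in for AXP_FACS.DAT.
my $sample = "1\n3\nwibble\n5\n";
open my $in, '<', \$sample or die "Cannot open in-memory file: $!";

my %faclist;
while ( my $line = <$in> ) {
    chomp $line;          # strip the newline so keys match plain integers
    $faclist{$line}++;
}
close $in;

# Report every integer in the range that was not seen in the input.
print "$_\n" for missing_numbers( \%faclist, 1, 6 );
```

For the real task the range would be 1 .. 5000 and the filehandle would be opened on the data file with the three parameter open.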
Re: Hash checking
by shigetsu (Hermit) on Apr 25, 2007 at 19:22 UTC

    I assume you're struggling to have $faclist{$line} not contain a previous entry.

    Then you should be using delete as in:

    delete $faclist{$line};

Re: Hash checking
by njcodewarrior (Pilgrim) on Apr 25, 2007 at 20:58 UTC

    No need to use a hash if you're just reading numbers from a file. Use an array instead:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $file = 'AXP_FACS.DAT';
    open my $FH, '<', $file or die "Error opening file: $!";
    my @faclist = (<$FH>);   # Read all lines into the array
    close $FH;
    chomp @faclist;          # Remove newlines from each entry.

    my @numbers = 1..5000;
    foreach my $integer ( @numbers ) {
        unless ( grep { /\b$integer\b/ } @faclist ) {
            print "Not found: $integer\n";
        }
    }

    The '\b' at the start and end of the grep regular expression matches the entire number, not just a single digit.

    njcodewarrior

      No need to use a hash ...

      unless you are concerned with execution time. grep performs a linear search through the entire array each time through the loop so the search is O(n^2). A hash performs an essentially constant time lookup so the search is O(n).



        Thanks for the reply GrandFather. Not that I didn't believe you, but here's the proof:

        #! /usr/bin/perl
        
        use strict;
        use warnings;
        
        use File::Spec;
        use Data::Dumper;
        use Benchmark qw( timethese cmpthese );
        
        
        my ( undef, undef, $app ) = File::Spec->splitpath( $0 );
        
        open my $DATA, '<', './AXP_FACS.DAT' or die "Error opening file: $!";
        my @faclist = (<$DATA>);
        chomp @faclist;
        close $DATA;
        
        sub grep_by_array {
            my ( $ref ) = @_;
            my @faclist = @$ref;
            my @found;
            foreach my $integer ( 1..5000 ) {
                if ( grep { /\b$integer\b/ } @faclist ) {
                    unshift @found, $integer;
                }
            }
        
            return \@found;
        
        }
        
        # Convert the array to a hash with the numbers as keys
        my %list = map { $_ => 1 } @faclist;
        
        sub grep_by_hash {
            my ( $ref ) = @_;
            my %faclist = %$ref;
            my @found;
            foreach my $integer( 1..5000 ) {
                if ( exists $faclist{$integer} ) {
                    unshift @found, $integer;
                }
            }
        
            return \@found;
        
        }
        
        # Benchmark the 2 subs
        my $r = timethese( 1000, {
                'array' => sub{ grep_by_array(\@faclist) },
                'hash'  => sub{ grep_by_hash(\%list) },
            }
        );
        
        cmpthese( $r );
        

        RESULTS:

        Benchmark: timing 5000 iterations of array, hash...
             array: 339 wallclock secs (339.02 usr +  0.04 sys = 339.06 CPU) @ 14.75/s (n=5000)
              hash:  8 wallclock secs ( 8.27 usr +  0.00 sys =  8.27 CPU) @ 604.59/s (n=5000)
                Rate array  hash
        array 14.7/s    --  -98%
        hash   605/s 4000%    --
        

        That's quite an improvement using a hash!
        You learn something every day...

        njcodewarrior

Re: Hash checking
by Anonymous Monk on Apr 26, 2007 at 13:45 UTC
    My problem has been fixed. Perl was using the values read in as strings. I had to add a "$_+=0;" to my code to force it to be seen as an integer. Once that was done, the rest of the code worked as expected. Thank you all for your help.
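
The thread never shows the contents of the data file, so the following is an assumption rather than a diagnosis: adding 0 would make a difference if the lines carried characters that chomp leaves behind, such as a carriage return from Windows line endings or surrounding spaces. Hash keys are always strings, so "1090\r" and "1090" are different keys until the value is normalized. The sketch below makes that normalization explicit; the helper integer_key and the sample lines are hypothetical.

```perl
use strict;
use warnings;

# Normalize a raw input line to an integer key, or return undef
# if the line does not contain an integer (e.g. "wibble").
sub integer_key {
    my ($raw) = @_;
    return $raw =~ /^\s*(\d+)\s*$/ ? $1 + 0 : undef;
}

my %faclist;
for my $raw ( "1090\r\n", " 17 \n", "wibble\n" ) {   # hypothetical lines
    my $key = integer_key($raw);
    $faclist{$key} = "" if defined $key;
}

print "found 1090\n" if exists $faclist{1090};
print "found 17\n"   if exists $faclist{17};
```

Compared with a bare "$_ += 0", the pattern match skips non-numeric lines instead of turning them into a key of 0.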

Node Type: perlquestion [id://612075]
Approved by shigetsu