http://www.perlmonks.org?node_id=983108


in reply to Deleting undefined entries from DB_FILE hash

tie (my %hash, 'DB_File', $file, O_RDWR, 0600) || next; foreach my $key ( keys %hash ) { if (!defined $hash{$key}) {

The undef value is an internal perl meta-value.    When you store data on an external device (like a tied hash) the undef value is converted to a string, so trying to use defined on this value makes no sense.

Replies are listed 'Best First'.
Re^2: Deleting undefined entries from DB_FILE hash
by gossamer (Sexton) on Jul 23, 2012 at 19:07 UTC

    Hi,

    Thanks everyone for your offer to help. I've posted below a complete program that runs on my system. Where it encounters an error and quits, it displays the following, with a few of the preceding lines:

    bucket: aa type: spam 2012-07-02 20120702T132446 bucket: aa type: spam 2012-07-07 20120707T083754 bucket: aa type: spam 2012-07-02 20120702T203543 bucket: aa type: virus 2012-07-08 20120708T113543 bucket: aa type: spam 2012-07-16 20120716T151849 bucket: aa type: virus 2012-07-22 20120722T100147 bucket: aa type: spam 2012-07-03 20120703T042249 bucket: aa type: spam 2012-07-07 20120707T204829 Use of uninitialized value in split at ./list_hash.pl line 32. Use of uninitialized value $dt in pattern match (m//) at ./list_hash.p +l line 35. key 2 not defined. Deleting.

    It's reading from 255 hashes for each host, which you aren't going to have, so it probably won't execute for you.

    As I mentioned previously, the delete() isn't actually deleting for some reason. If I run the program again, it dies at the same spot, indicating to me that the record with the undef'd value or key still remained.

    Any ideas greatly appreciated.

    #!/usr/bin/perl -w # use perl; use DB_File; use DBI; use File::Basename qw(basename); use strict; use vars qw($verbose); my $me = basename($0); $me =~ s/\.pl$//; $verbose = shift || 1; sub DBG($); my $qdir = '/var/www/noc.mydomain.com-80/'; my %hashes = ( ); my $version = '1.9'; my @mailhosts = qw(); push @mailhosts, 'mail01'; push @mailhosts, 'mail02'; foreach my $mhosts (@mailhosts) { for (my $i = 0; $i < 256; $i++) { my $bucket = sprintf('%02x', $i); my $file = sprintf('%s/%s/%02x.db', $qdir, $mhosts, $i); tie (my %hash, 'DB_File', $file, O_RDWR, 0600, $DB_HASH) || next; foreach my $key ( keys %hash ) { my @tmp = split /\t/, $hash{$key}, 7; my $type = $tmp[0]; my $dt = $tmp[1]; my ($year, $month, $day) = $dt =~ m|(\d{4})(\d{2})(\d{2})T.*|; if(!defined($year)) { DBG("key 2 not defined. Deleting.\n"); de +lete($hash{$key}); untie %hash; exit 1; }; if(!defined($month)) { DBG("key 3 not. Deleting.\n"); delete($h +ash{$key}); untie %hash; exit 1; }; if(!defined($day)) { DBG("key 4 not defined. Deleting.\n"); del +ete($hash{$key}); untie %hash; exit 1; }; printf("bucket: %s\ttype: %s\t%s-%s-%s\t%s\n",$bucket,$type,$ye +ar,$month,$day,$dt) } untie %hash; } } # end foreach mailhost sub DBG($) { my $msg = shift; print $msg if ($verbose); }

    Version with line numbers:

    1 #!/usr/bin/perl -w 2 3 # use perl; 4 use DB_File; 5 use DBI; 6 use File::Basename qw(basename); 7 use strict; 8 use vars qw($verbose); 9 10 my $me = basename($0); $me =~ s/\.pl$//; 11 $verbose = shift || 1; 12 13 sub DBG($); 14 15 my $qdir = '/var/www/noc.mydomain.com-80/'; 16 my %hashes = ( ); 17 18 my $version = '1.9'; 19 my @mailhosts = qw(); 20 21 push @mailhosts, 'mail01'; 22 push @mailhosts, 'mail02'; 23 24 foreach my $mhosts (@mailhosts) { 25 for (my $i = 0; $i < 256; $i++) { 26 my $bucket = sprintf('%02x', $i); 27 my $file = sprintf('%s/%s/%02x.db', $qdir, $mhosts, $ +i); 28 29 tie (my %hash, 'DB_File', $file, O_RDWR, 0600, $DB_HASH) + || next; 30 foreach my $key ( keys %hash ) { 31 32 my @tmp = split /\t/, $hash{$key}, 7; 33 my $type = $tmp[0]; 34 my $dt = $tmp[1]; 35 my ($year, $month, $day) = $dt =~ m|(\d{4})(\d{2})(\d +{2})T.*|; 36 if(!defined($year)) { DBG("key 2 not defined. Deletin +g.\n"); delete($hash{$key}); untie %hash; exit 1; }; 37 if(!defined($month)) { DBG("key 3 not. Deleting.\n"); + delete($hash{$key}); untie %hash; exit 1; }; 38 if(!defined($day)) { DBG("key 4 not defined. Deleting +.\n"); delete($hash{$key}); untie %hash; exit 1; }; 39 printf("bucket: %s\ttype: %s\t%s-%s-%s\t%s\n",$bucket +,$type,$year,$month,$day,$dt) 40 41 } 42 untie %hash; 43 } 44 } # end foreach mailhost 45 46 sub DBG($) { my $msg = shift; print $msg if ($verbose); } 47

      Instead of this:

      foreach my $key ( keys %hash ) { my @tmp = split /\t/, $hash{$key}, 7; my $type = $tmp[0]; my $dt = $tmp[1]; my ($year, $month, $day) = $dt =~ m|(\d{4})(\d{2})(\d +{2})T.*|;
      Try this and see what you get:
      • Remove the LIMIT in the split function, in this case 7, and change "\t" to "\s+"
      • Test your match string, i.e $dt if it matches
      foreach my $key ( keys %hash ) { my @tmp = split /\s+/, $hash{$key}; my $type = $tmp[0]; my $dt = $tmp[1]; my ($year, $month, $day) = ('','',''); if($dt =~ m|(\d{4})(\d{2})(\d+{2})T.*$|){ ($year, $month, $day)=($1,$2,$3); } ...... ...... printf("bucket: %s\ttype: %s\t%s-%s-%s\t%s\n",$bucket +,$type,$year,$month,$day,$dt); ......

        I actually just a few minutes ago removed the LIMIT from the split, and it made no difference alone.

        I also added the following line directly before the split line:

        if(!defined($hash{$key})) { print "key $key ($hash{$key}) not defined\ +n"; delete($hash{$key}); next; }

        For values where $hash{$key} isn't defined, it still outputs this:

        Use of uninitialized value in concatenation (.) or string at ./list_ha +sh.pl line 33. key spam-de102f8ac36e56326312e8701b3b4520-20120419T161342-05203-09-2.g +z () not defined

        In other words, the key (the spam-de... filename) is defined, but the rest of the values of the record are not.

        The rest of the values include the email's subject, so switching \t for \s+ won't work, because email subjects have spaces in them.

        If I run this program again, it prints the same 'uninitialized value' line at the same spot. Why doesn't it delete() the record? Eventually the hash is "untied", so it's properly closed. That's the issue that really has me perplexed now.

        Thanks,
        Alex