Compare Values in HoH

AcidHawk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have 2 CSV files. File1 looks as follows

1141286452,ServerA,Disk Full,Arb data,other,stuff
1141286737,ServerB,Net Down,Arb data,other,stuff
1141286737,ServerC,Disk Full,Arb data,other,stuff

and File2 like so

1141286737,ServerB,Net Down
1141286780,ServerD,Bit Bucket Missing

I read them into 2 separate hashes. (In the example code I have just hardcoded the hashes.) I need each line of the file to appear in the hash, and none of the fields contain unique data, hence the index I create as the key of Ahash and Bhash (1, 2, 3, etc., which relates to line numbers). The real data is held in the value of Ahash and Bhash as another hash. Looking at the example code may help with the explanation.

What I want to do is check if a VALUE in Ahash is also in Bhash (or a line in File1 is also in File2). So here is what I have that works on small hashes...

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

my %Ahash = (
    1 => {     'Epoch' => 1234567,
            'Node'    => "ServerA",
            'Event'    => "Disk Full",
            'other' => "Arb data", },
    2 => {     'Epoch' => 1234570,
            'Node'    => "ServerB",
            'Event'    => "Net Down",
            'other' => "Arb data", },
    3 => {     'Epoch' => 1234580,
            'Node'    => "ServerC",
            'Event'    => "Screen Stolen",
            'other' => "Arb data", },
            );
            
my %Bhash = (
    1 => {     'Epoch' => 1234530,
            'Node'    => "ServerD",
            'Event'    => "Bit  Bucket Missing",},
    2 => {     'Epoch' => 1234567,
            'Node'    => "ServerA",
            'Event'    => "Disk Full",},
            );
            
#print Dumper (%Ahash);
#print "===================================\n";
#print Dumper (%Bhash);
            
while ( (my $Akey, my $Avalue) = each %Ahash) {
    while ( (my $Bkey, my $Bvalue) = each %Bhash) {
        if (
            $$Bvalue{'Epoch'} eq $$Avalue{'Epoch'} &&
            $$Bvalue{'Node'} eq $$Avalue{'Node'} &&
            $$Bvalue{'Event'} eq $$Avalue{'Event'}
        ) {
            print ">>>>>>>MATCHED\n";
            print "Ahash => " . Dumper($Avalue);
            print "Bhash => " . Dumper($Bvalue);
            print "MATCHED<<<<<<<<<<<<<<\n";
        }
        else {
            print "NOT MATCHED\n";
        }
    }
}

This does not scale well: as you can see, each value in Ahash is checked against each value in Bhash. Is there a better way to compare values in hashes that are themselves hashes?

-----
Of all the things I've lost in my life, its my mind I miss the most.


Replies are listed 'Best First'.
Re: Compare Values in HoH by GrandFather (Saint) on Apr 06, 2006 at 09:04 UTC
One way to do it is to generate a lookup hash keyed on your search data. Here's a starting point:

use warnings;
use strict;

my @file1 = (
    '1141286452,ServerA,Disk Full',
    '1141286737,ServerB,Net Down',
    '1141286737,ServerC,Disk Full'
);
my @file2 = (
    '1141286737,ServerB,Net Down',
    '1141286780,ServerD,Bit Bucket Missing'
);

my %file1LookUp;
push @{$file1LookUp{$file1[$_ - 1]}}, $_ for (1..@file1);

for (@file2) {
    print "Matched $_ at line(s) @{$file1LookUp{$_}}\n" if exists $file1LookUp{$_};
}

Prints:

Matched 1141286737,ServerB,Net Down at line(s) 2

DWIM is Perl's answer to Gödel
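The lookup idea above can also be applied directly to the hash-of-hashes from the original post. This is only a sketch, assuming that Epoch, Node and Event together identify a record. The helper name record_key and the "\x1F" separator are illustrative choices, not something from the thread.

```perl
use strict;
use warnings;

my %Ahash = (
    1 => { Epoch => 1234567, Node => 'ServerA', Event => 'Disk Full',     other => 'Arb data' },
    2 => { Epoch => 1234570, Node => 'ServerB', Event => 'Net Down',      other => 'Arb data' },
    3 => { Epoch => 1234580, Node => 'ServerC', Event => 'Screen Stolen', other => 'Arb data' },
);
my %Bhash = (
    1 => { Epoch => 1234530, Node => 'ServerD', Event => 'Bit Bucket Missing' },
    2 => { Epoch => 1234567, Node => 'ServerA', Event => 'Disk Full' },
);

# Hypothetical helper: join the compared fields into one string key.
# "\x1F" (ASCII unit separator) is assumed never to occur in the data.
sub record_key {
    my ($rec) = @_;
    return join "\x1F", @{$rec}{qw(Epoch Node Event)};
}

# One pass over %Bhash: composite key => list of %Bhash line numbers.
my %b_lookup;
for my $b_line (sort { $a <=> $b } keys %Bhash) {
    push @{ $b_lookup{ record_key($Bhash{$b_line}) } }, $b_line;
}

# One pass over %Ahash: each record costs a single hash lookup.
my @matches;
for my $a_line (sort { $a <=> $b } keys %Ahash) {
    my $b_lines = $b_lookup{ record_key($Ahash{$a_line}) } or next;
    push @matches, [ $a_line, @$b_lines ];
    print "Ahash line $a_line matches Bhash line(s) @$b_lines\n";
}
```

With this layout the work grows with the sum of the two hash sizes rather than their product, which is the scaling problem the original nested loops had.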
Re: Compare Values in HoH by Samy_rio (Vicar) on Apr 06, 2006 at 09:30 UTC
Hi AcidHawk,

Try this. It will help if your files have data in rows separated by ','.

#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;
use List::Compare::Functional qw(:originals :aliases);

my $file1 = '1141286452,ServerA,Disk Full,Arb data,other,stuff
1141286737,ServerB,Net Down,Arb data,other,stuff
1141286737,ServerC,Disk Full,Arb data,other,stuff';

my $file2 = '1141286452,ServerA,Disk Full,Arb data,other,stuff
1141286737,ServerB,Net Down
1141286780,ServerD,Bit Bucket Missing';

my @file1 = split/\n/, $file1;
my @file2 = split/\n/, $file2;

# Keep only the first three comma-separated fields of each line
map{s/^([^,]+\,[^,]+\,[^,]+)\,(.*)$/$1/}@file1;
map{s/^([^,]+\,[^,]+\,[^,]+)\,(.*)$/$1/}@file2;

my @Comm;
my $comm = @Comm = get_intersection( [ \@file1, \@file2 ] );

if ($comm <=> 0) {
    print "\n\nFollowing informations are present in both File1 & File2.\n";
    print "\t\t\t$_.\n" foreach (@Comm);
}

__END__
Following informations are present in both File1 & File2.
            1141286452,ServerA,Disk Full.
            1141286737,ServerB,Net Down.

Updated

Regards,
Velusamy R.

eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@\|6%,53!-9@2~j';
Re: Compare Values in HoH by Herkum (Parson) on Apr 06, 2006 at 14:48 UTC
I think that you are making this too complicated. Let's see if we can understand the real problem. You have two files that have some common fields between them, and you want to see which file has something the other file doesn't. Taking your file information from above,

File 1,
1141286452,ServerA,Disk Full,Arb data,other,stuff
1141286737,ServerB,Net Down,Arb data,other,stuff
1141286737,ServerC,Disk Full,Arb data,other,stuff

File2,
1141286737,ServerB,Net Down
1141286780,ServerD,Bit Bucket Missing

You can probably boil this down to the data that you want to compare as a single string, which would be,

File 1,
1141286452,ServerA,Disk Full
1141286737,ServerB,Net Down
1141286737,ServerC,Disk Full

File2,
1141286737,ServerB,Net Down
1141286780,ServerD,Bit Bucket Missing

Now let's take the data that you want to compare and stick it in a hash. The value assigned in the hash relates to which file it came from: 1 means it came from file 1, 2 from file 2, and 3 from both file 1 and file 2.

my %index;

foreach my $entry (@file1) {
    if (not exists $index{$entry}) { $index{$entry} = 1 }
}

foreach my $entry (@file2) {
    if (not exists $index{$entry}) { $index{$entry} = 2 }
    else { $index{$entry} += 2; }
}

foreach my $entry (keys %index) {
    if ($index{$entry} == 1) {
        print "Entry $entry is only in file one\n";
    }
    elsif ($index{$entry} == 2) {
        print "Entry $entry is only in file two\n";
    }
    elsif ($index{$entry} == 3) {
        print "Entry $entry is in both files\n";
    }
    else {
        print "Entry $entry is screwed up!\n";
    }
}

Now you have a hash with all the unique entries and which files they came from, which is what I assume you really want.
Re^2: Compare Values in HoH by MidLifeXis (Monsignor) on Apr 06, 2006 at 18:09 UTC
Small logic error, which you catch after the fact in your output loop. Assuming unique values in each file, your loops can be reduced to

foreach my $entry (@file1) { $index{$entry} = 1 }
foreach my $entry (@file2) { $index{$entry} += 2; }

Not assuming unique values, to avoid "screwed up" entries, it could be reduced to this

foreach my $entry (@file1) { $index{$entry} = 1 }
foreach my $entry (@file2) {
    # '||0' is to still the warnings under 'use warnings'
    $index{$entry} += 2 if (($index{$entry}||0) < 2);
}

Nothing big.

--MidLifeXis
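The file-membership index discussed in this sub-thread can be tried end to end with a few sample lines. The sketch below swaps the addition for a bitwise OR, which is not what either poster wrote but gives the same 1/2/3 values and cannot overshoot when a line repeats. The duplicated ServerB line in @file2 and the %label and %report names are added purely for illustration, and the // operator assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Sample lines from the thread, trimmed to the compared fields.
# The repeated ServerB line in @file2 is added here to exercise duplicates.
my @file1 = (
    '1141286452,ServerA,Disk Full',
    '1141286737,ServerB,Net Down',
    '1141286737,ServerC,Disk Full',
);
my @file2 = (
    '1141286737,ServerB,Net Down',
    '1141286780,ServerD,Bit Bucket Missing',
    '1141286737,ServerB,Net Down',
);

my %index;
$index{$_} = ($index{$_} // 0) | 1 for @file1;   # bit value 1: seen in file 1
$index{$_} = ($index{$_} // 0) | 2 for @file2;   # bit value 2: seen in file 2

# Translate the flag values into readable labels (names are illustrative).
my %label = (
    1 => 'only in file one',
    2 => 'only in file two',
    3 => 'in both files',
);
my %report = map { $_ => $label{ $index{$_} } } keys %index;

print "Entry $_ is $report{$_}\n" for sort keys %report;
```

Because OR-ing a bit that is already set changes nothing, repeated lines in either file leave the flag at 1, 2 or 3, so no "screwed up" branch is needed.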
Re: Compare Values in HoH by graff (Chancellor) on Apr 07, 2006 at 04:19 UTC
As Herkum pointed out above, you are making this too complicated. You said yourself:

    What I want to do is check if a VALUE in Ahash is also in Bhash (or a line in File1 is also in File2).

So just using whole lines from each file as the hash keys (like Herkum does) is the thing to do. In case the same string can occur multiple times in one file -- and in case it's important to keep track of how many times it occurs -- here's a simple variation on his approach to handle that:

my %strings;

while (<FILE1>) {
    chomp;
    $strings{$_} .= '1';  # if same data appears three times, hash value is "111";
}
while (<FILE2>) {
    chomp;
    $strings{$_} .= '2';  # same as above, but with "2" instead of "1"
}

# get hash keys (lines) that occur in both files:
my @common = grep { $strings{$_} =~ /12/ } sort keys %strings;

# report findings:
for my $key ( @common ) {
    my ( $n1, $n2 ) = ( $strings{$key} =~ /(1+)(2+)/ );
    printf("%s found %d times in file1, %d times in file2\n",
           $key, length($n1), length($n2));
}

# you can also pick out strings unique to file1 (/1$/)
# and/or strings unique to file2 (/^2/), along with their
# frequency of occurrence. This also scales fairly well to
# handling three or more files.
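The closing comments about /1$/ and /^2/ can be checked with a small in-memory run. This is a sketch only: the arrays stand in for the two filehandles, and the sample lines are made up for illustration rather than taken from the thread.

```perl
use strict;
use warnings;

# Made-up lines standing in for the contents of the two files.
my @file1 = ('a,1', 'b,2', 'b,2', 'c,3');
my @file2 = ('b,2', 'd,4', 'd,4');

# Append one marker character per occurrence, file 1 first, then file 2.
my %strings;
$strings{$_} .= '1' for @file1;
$strings{$_} .= '2' for @file2;

# A value ending in "1" never picked up a "2", so the line is file1-only.
my @only1  = grep { $strings{$_} =~ /1$/ } sort keys %strings;
# A value starting with "2" never had a "1", so the line is file2-only.
my @only2  = grep { $strings{$_} =~ /^2/ } sort keys %strings;
# A "12" boundary means the line was seen in both files.
my @common = grep { $strings{$_} =~ /12/ } sort keys %strings;

# For a single-file line, the length of the marker string is its count.
for my $key (@only1, @only2) {
    printf "%s occurs %d time(s) in one file only\n", $key, length $strings{$key};
}
print "Common: @common\n";
```

The patterns only work because every file1 marker is appended before any file2 marker, so the two passes have to run in that order.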
Re: Compare Values in HoH by SheridanCat (Pilgrim) on Apr 06, 2006 at 23:46 UTC
Perhaps Test::Deep will do what you want.

SheridanCat

