PerlMonks  

find differences between multiple hashes

by JYDawg (Novice)
on Apr 09, 2004 at 23:22 UTC ( #344057=perlquestion )
JYDawg has asked for the wisdom of the Perl Monks concerning the following question:

Fellow wizards and witches,

I often find myself comparing hashes of file information or options and taking steps according to the differences. Normally I would loop through one of the hashes, removing the items that are equal on both sides, and then combine what's left. However, after toiling with referenced arrays in hashes (e.g. for Template Toolkit), I'm curious: is there another (simpler) way of comparing hashes?

Example:

$given = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' }
    ],
    'name' => 'test1'
};

$retrieved = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
        { 'url' => 'http://www.google.de/', 'title' => 'testBE' }
    ],
    'name' => 'test2',
    'type' => 'test2'
};

The result should be:

$result = {
    'Subtype' => [
        { 'url' => 'http://www.google.de/', 'title' => 'testBE' }
    ],
    'name' => 'test2',
    'type' => 'test2'
};

Thanks,

John

--- Lead me not into temptation for I can find it myself...

Re: find differences between multiple hashes
by kvale (Monsignor) on Apr 10, 2004 at 02:43 UTC
    Comparing two general hierarchical data structures is, in general, a hard problem. First, you have to establish a criterion for equivalence. Do hash values have to be exactly the same, or is it only their contents that must match? Must arrays have exactly the same elements in the same order, or is it good enough that they form equivalent sets? Second, you have to come up with a search strategy.

    For instance, for a hash of hashes and assuming $retrieved is a superset of $given, the following can be used:

    my $result = {};
    foreach my $main_key (keys %$retrieved) {
        unless (exists $given->{$main_key}) {
            $result->{$main_key} = $retrieved->{$main_key};
            next;
        }
        # The key exists in both, so compare the subhashes
        foreach my $sub_key (keys %{ $retrieved->{$main_key} }) {
            $result->{$main_key}{$sub_key} = $retrieved->{$main_key}{$sub_key}
                unless exists $given->{$main_key}{$sub_key}
                    && $given->{$main_key}{$sub_key} eq $retrieved->{$main_key}{$sub_key};
        }
    }
    The idea is that given your data structure and equivalence criteria, you can drill down and simply do comparisons, rather than deletions. This should be quicker. For your particular application, I cannot discern your equivalence criterion, so I'll stop here.
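For the mixed hash/array data in the question, the same drill-down idea can be sketched recursively. This is only a sketch under assumed equivalence criteria, not the poster's method: arrays are treated as unordered sets, elements are considered equal when their sorted-key Data::Dumper serializations match, and keys that exist only in the first structure are ignored. The names `canon` and `diff_struct` are hypothetical.

```perl
use strict;
use warnings;
use Data::Dumper;

# Canonical string for a structure; sorted keys make equal hashes
# serialize identically. Used only to test deep equality.
sub canon {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 0;
    local $Data::Dumper::Terse    = 1;
    return Dumper($data);
}

# Returns a one-element list holding the parts of $new that are missing
# from, or different in, $old. Returns an empty list if nothing differs.
sub diff_struct {
    my ($old, $new) = @_;

    if (ref($old) eq 'HASH' && ref($new) eq 'HASH') {
        my %out;
        for my $key (keys %$new) {
            my @d = exists $old->{$key}
                ? diff_struct($old->{$key}, $new->{$key})
                : ($new->{$key});
            $out{$key} = $d[0] if @d;
        }
        return %out ? \%out : ();
    }

    if (ref($old) eq 'ARRAY' && ref($new) eq 'ARRAY') {
        # Assumed criterion: arrays are unordered sets of elements.
        my %seen;
        $seen{ canon($_) } = 1 for @$old;
        my @extra = grep { !$seen{ canon($_) } } @$new;
        return @extra ? \@extra : ();
    }

    # Scalars or mismatched types: keep the new value if it differs.
    return canon($old) eq canon($new) ? () : ($new);
}

my $given = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
    ],
    'name' => 'test1',
};
my $retrieved = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
        { 'url' => 'http://www.google.de/', 'title' => 'testBE' },
    ],
    'name' => 'test2',
    'type' => 'test2',
};

# Call in list context; an empty list means "no differences".
my ($result) = diff_struct($given, $retrieved);
print Dumper($result);
```

With a different criterion (for example, order-sensitive arrays) only the ARRAY branch would need to change.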

    -Mark

Re: find differences between multiple hashes
by tachyon (Chancellor) on Apr 10, 2004 at 06:42 UTC

    Data::Diff or Struct::Diff will tell you if the structures differ. Drilling down and comparing arbitrary Perl structures to generate a Perl structure of the diffs is a task similar to (but more complex than) that done by Data::Dumper, Data::Denter, or YAML.

    AFAIK there is no module that currently does this. I would suggest that writing a minimum case to deal with your data (as it sounds like you have) is the best solution short of rethinking the app logic. BTW, your desired result output is logically inconsistent (as noted by kvale) and should also contain 'name' => 'test1'.
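If a yes/no answer is all that is needed, a core-only sketch is to compare sorted-key serializations of the two structures. This assumes the data holds only plain hashes, arrays, and scalars (no code refs or objects), and `structures_differ` is a hypothetical helper name, not part of any module mentioned above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: true if the two structures serialize differently.
sub structures_differ {
    my ($left, $right) = @_;
    local $Data::Dumper::Sortkeys = 1;   # stable key order for hashes
    local $Data::Dumper::Indent   = 0;   # single-line output
    return Dumper($left) ne Dumper($right);
}

my $given     = { name => 'test1', Subtype => [ { title => 'testNL' } ] };
my $retrieved = { name => 'test2', Subtype => [ { title => 'testNL' } ] };

print structures_differ($given, $retrieved)   ? "differ\n" : "same\n";  # differ
print structures_differ($given, { %$given })  ? "differ\n" : "same\n";  # same
```

Inside a test script, Test::More's is_deeply performs a comparable deep comparison.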

    cheers

    tachyon

Re: find differences between multiple hashes
by BrowserUk (Pope) on Apr 10, 2004 at 07:46 UTC

    Here's a crude (but accurate) method of finding the differences. Reconstructing the required result is left as an exercise for the reader:)

    #! perl -slw
    use strict;
    use Data::Dumper;
    use Algorithm::Diff qw[ diff ];

    my $given = {
        'Subtype' => [
            { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
            { 'url' => 'http://www.google.be/', 'title' => 'testBE' }
        ],
        'name' => 'test1'
    };

    my $retrieved = {
        'Subtype' => [
            { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
            { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
            { 'url' => 'http://www.google.de/', 'title' => 'testBE' }
        ],
        'name' => 'test2',
        'type' => 'test2'
    };

    =pod

    The result should be:

    $result = {
        'Subtype' => [
            { 'url' => 'http://www.google.de/', 'title' => 'testBE' }
        ],
        'name' => 'test2',
        'type' => 'test2'
    };

    =cut

    print @$_ for map{ @$_ } diff(
        [ split "\n", Dumper( $retrieved ) ],
        [ split "\n", Dumper( $given ) ]
    );

    __END__
    8:37:10.53 P:\test>344057.pl
    -9 },
    -10 {
    -11 'url' => 'http://www.google.de/',
    -12 'title' => 'testBE'
    -15 'name' => 'test2',
    -16 'type' => 'test2'
    +11 'name' => 'test1'
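One caveat with any line-based comparison of dumps: Data::Dumper does not sort hash keys by default, so two equal hashes are not guaranteed to be dumped in the same key order. Setting `$Data::Dumper::Sortkeys = 1` makes the text stable. The sketch below is a core-only variation that swaps Algorithm::Diff for a plain set difference of dumped lines, so it loses line positions and ordering; `lines_only_in` is a hypothetical helper name.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;   # emit hash keys in sorted order
$Data::Dumper::Indent   = 1;   # fixed indentation per nesting level

# Hypothetical helper: dumped lines of $left that never occur in the
# dump of $right. A set difference, not a positional diff.
sub lines_only_in {
    my ($left, $right) = @_;
    my %in_right;
    $in_right{$_} = 1 for split /\n/, Dumper($right);
    return grep { !$in_right{$_} } split /\n/, Dumper($left);
}

my $given = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
    ],
    'name' => 'test1',
};
my $retrieved = {
    'Subtype' => [
        { 'url' => 'http://www.google.nl/', 'title' => 'testNL' },
        { 'url' => 'http://www.google.be/', 'title' => 'testBE' },
        { 'url' => 'http://www.google.de/', 'title' => 'testBE' },
    ],
    'name' => 'test2',
    'type' => 'test2',
};

print "- $_\n" for lines_only_in($retrieved, $given);
print "+ $_\n" for lines_only_in($given, $retrieved);
```

Because identical lines collapse, a repeated value such as 'title' => 'testBE' is not reported even when it belongs to a new element, which is one reason a positional diff or a structural walk is more precise.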

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
      OK, I made it... no idea why it didn't work in the first place. Thanks.

Node Type: perlquestion [id://344057]
Approved by Old_Gray_Bear