http://www.perlmonks.org?node_id=89879

acser has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

Try the following code for $HASH_SIZE = 5; and $HASH_SIZE = 10; and $HASH_SIZE = 100;. I cannot explain the results. Can anyone please help me understand this? I am trying to test equality of contents of two hash tables.

Your help is greatly appreciated,
Andras

Attached testcase:

    #!/usr/bin/perl
    $HASH_SIZE = 100;
    %hash1;
    %hash2;
    for ($i=0; $i < $HASH_SIZE; $i++) {
        $hash1{"$i"} = "i=$i";
    }
    %hash2 = %hash1;

    print "\n\n%hash1\n";
    foreach my $key (keys %hash1) {
        print "$key:$hash1{$key} ";
    }
    print "\n%hash2\n";
    foreach my $key (keys %hash2) {
        print "$key:$hash2{$key} ";
    }
    if (%hash1 == %hash2) {
        print "\nequal\n";
    } else {
        print "\nnon equal\n";
    }

    $hash2{"anotherone"} = "anotherONE";

    print "\n\n%hash1\n";
    foreach my $key (keys %hash1) {
        print "$key:$hash1{$key} ";
    }
    print "\n%hash2\n";
    foreach my $key (keys %hash2) {
        print "$key:$hash2{$key} ";
    }
    if (%hash1 == %hash2) {
        print "\nequal\n";
    } else {
        print "\nnon equal\n";
    }
    #End of testcase

Replies are listed 'Best First'.
Re: How to test equality of hashes?
by bikeNomad (Priest) on Jun 20, 2001 at 06:46 UTC
    Unfortunately, when you evaluate a hash in a scalar context it returns a fraction representing the hash table bucket usage, which is rarely what anyone is interested in. It doesn't even return the number of elements.
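A minimal sketch of the distinction, assuming the goal is only to compare sizes. What scalar(%hash) returns depends on the Perl version: older perls return a bucket-usage string such as "3/8", while Perl 5.26 and later return the key count. Asking for keys in scalar context gives the element count on any version. The hash names below are made up for the illustration.

```perl
use strict;
use warnings;

my %small = map { $_ => "i=$_" } 0 .. 4;
my %copy  = %small;
$copy{anotherone} = 'anotherONE';

# keys() in scalar (numeric) context yields the number of keys,
# regardless of how the hash is laid out internally.
my $n_small = keys %small;
my $n_copy  = keys %copy;

print "small has $n_small keys, copy has $n_copy keys\n";

# Equal key counts only mean "same size", not "same contents".
print "same size\n" if keys(%small) == keys(%copy);
```

Even when the counts match, the keys and values still have to be compared individually.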

    If you want to compare the contents of two hashes, you'll have to look at the keys and values of both. One way to compare them:

    my @k1 = keys(%hash1);
    my @k2 = keys(%hash2);

    # do they have the same number of elements?
    if (@k1 != @k2) {
        # they're different...
    }

    # are the keys the same?
    if (join($; , sort(@k1)) ne join($; , sort(@k2))) {
        # they're different
    }

    # are the values the same?
    if (join($; , @hash1{@k1}) ne join($; , @hash2{@k1})) {
        # they're different
    }

    All of this assumes that neither your keys nor values contain the $; character (by default \034)
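To see why that assumption matters, here is a small sketch (the hashes and variable names are invented for the illustration) in which two hashes with different values produce identical joined strings because the values themselves contain the separator:

```perl
use strict;
use warnings;

my $sep = $;;    # the subscript separator, "\034" by default

# Same keys, different values, but the values contain the separator.
my %h1 = (x => "a${sep}b", y => "c");
my %h2 = (x => "a",        y => "b${sep}c");

my @keys = sort keys %h1;

# Joining the value slices gives the same string for both hashes.
my $joined1 = join $sep, @h1{@keys};
my $joined2 = join $sep, @h2{@keys};
print "joined strings match\n" if $joined1 eq $joined2;

# A per-key comparison still sees the difference.
my @diff = grep { $h1{$_} ne $h2{$_} } @keys;
print "but values differ for: @diff\n";
```

Comparing element by element avoids the problem, at the cost of a loop.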

Re: How to test equality of hashes?
by jeroenes (Priest) on Jun 20, 2001 at 12:05 UTC
    bikeNomad's solution 1 does not take into account the possibility that the values may be the same but not tied to the same keys. So:
    my @k1 = keys(%hash1);
    my @k2 = keys(%hash2);

    # do they have the same number of elements?
    if (@k1 != @k2) {
        # they're different...
    }

    # are the keys the same?
    if ((join $; , sort @k1) ne (join $; , sort @k2)) {
        # they're different
    }

    # are the values the same?
    if ( scalar grep { $hash1{$_} ne $hash2{$_} } @k1 ) {
        # they're different
    }
    Number warning still applies.

    Jeroen

      Here's the subroutine I came up with. Thanks to everyone who responded. I could never get the array-flattening method to work. It is not very elegant and probably slow, but this is the only solution I could find that prints the right answer:

      equal
      non equal
      equal
      #!/usr/bin/perl
      $HASH_SIZE = 100;
      %hash1;
      %hash2;
      for ($i=0; $i < $HASH_SIZE; $i++) {
          $hash1{"i=$i"} = "j=$i";
      }
      %hash2 = %hash1;

      if (&hasheq(\%hash1, \%hash2)) {
          print "\nequal\n";
      } else {
          print "\nnon equal\n";
      }

      $hash2{"anotherone"} = "anotherONE";
      if (&hasheq(\%hash1, \%hash2)) {
          print "\nequal\n";
      } else {
          print "\nnon equal\n";
      }

      $hash1{"anotherone"} = "anotherONE";
      if (&hasheq(\%hash1, \%hash2)) {
          print "\nequal\n";
      } else {
          print "\nnon equal\n";
      }

      sub hasheq {
          my ($ha1, $ha2) = @_;
          my %h1 = %$ha1;
          my %h2 = %$ha2;
          my @k1 = keys(%h1);
          my @k2 = keys(%h2);

          # do they have the same number of elements?
          if (@k1 != @k2) {
              return 0;
          }

          # are the keys the same?
          if ((join '/' , sort @k1 ) ne (join '/' , sort @k2)) {
              return 0;
          }

          # are the values the same?
          if ( scalar grep { $h1{$_} ne $h2{$_} } @k1 ) {
              return 0;
          }
          return 1;
      }
        Nice. I can't see this kind of follow-up often enough!

        A few comments, naturally:

        1. use strict, warnings and diagnostics or die. These are your friends in the long run.
        2. You declare 2 vars without need:
          my %h1 = %{ $_[0] }; #h2 similar
        3. Check whether your refs are defined:
          my %h1 = (defined $_[0] and ref $_[0]) ? %{ $_[0] } : ();
          This will make the code more portable/general/cleaner. You may add an extra check for hashiness.
        4. I just realized that this code can't distinguish between undefined and empty values, so the grep becomes:
          scalar grep { my ($a, $b) = ( $h1{$_}, $h2{$_} ); ( not defined $a and defined $b ) or ( not defined $b and defined $a ) or $a ne $b } @k1
          Had to think extra carefully there!

        Hope this helps,

        Jeroen
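Putting the four comments together, one possible shape for the subroutine is sketched below. This is only an illustration under stated assumptions, not the code from the thread: the name hash_eq is hypothetical, the hashes are assumed to be flat (values are compared as strings, so nested references would compare by address), and blessed hashes are rejected by the ref check.

```perl
use strict;
use warnings;

# hash_eq (hypothetical name) takes two hash references and returns
# 1 if they hold the same keys with the same string values, else 0.
sub hash_eq {
    my ($h1, $h2) = @_;

    # Both arguments must be plain hash references.
    return 0 unless ref $h1 eq 'HASH' && ref $h2 eq 'HASH';

    # Different key counts can never be equal.
    return 0 unless keys(%$h1) == keys(%$h2);

    for my $key (keys %$h1) {
        # With equal counts, every key of the first must exist in the second.
        return 0 unless exists $h2->{$key};

        my ($v1, $v2) = ($h1->{$key}, $h2->{$key});
        next     if !defined $v1 && !defined $v2;   # both undef: same
        return 0 if !defined $v1 || !defined $v2;   # only one undef: differ
        return 0 if $v1 ne $v2;                     # string comparison
    }
    return 1;
}

my %first  = (k => undef, n => 1);
my %second = (k => '',    n => 1);

printf "%s\n", hash_eq(\%first, \%first)  ? 'equal' : 'non equal';
printf "%s\n", hash_eq(\%first, \%second) ? 'equal' : 'non equal';
```

Working on the references directly avoids copying both hashes, and the exists test replaces the sort-and-join comparison of the key lists.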

Re: How to test equality of hashes?
by bikeNomad (Priest) on Jun 20, 2001 at 06:55 UTC
    Another way to compare (possibly arbitrarily deep) hashes is to stringify them using Storable and then compare the strings:
    use Storable 'freeze';
    $Storable::canonical = 1;
    if (freeze(\%hash1) eq freeze(\%hash2)) {
        # they're equal
    }
    However, both this and the prior version I suggested suffer from a potential problem having to do with floating point accuracy and the stringification of numbers. Be warned. If you're not using numbers, both will work OK.
      But, see the following from perldoc Storable

      Setting "$Storable::canonical" may not yield frozen strings that compare equal due to possible stringification of numbers. When the string version of a scalar exists, it is the form stored, therefore if you happen to use your numbers as strings between two freezing operations on the same data structures, you will get different results.

      --
      Brovnik
      There is no way Storable can guarantee the order of your hashes. You could use slices to assure an order, but that doesn't work for nested structures. Moreover, only references to objects are allowed, destroying the order of the slices.

      Update: Missed a part of the Storable manpage. Ignore my comment please

        I'm not sure what you're saying. I'm not expecting Storable to guarantee anything other than that it'll serialize hashes by sorted key order, which is what setting $Storable::canonical to true is supposed to do. From the Storable manpage:

        If you set $Storable::canonical to some TRUE value, Storable will store hashes with the elements sorted by their key. This allows you to compare data structures by comparing their frozen representations (or even the compressed frozen representations), which can be useful for creating lookup tables for complicated queries.
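As a rough check of the behaviour the manpage describes, the sketch below fills two hashes in different insertion orders and compares their frozen forms. The helper name frozen_equal is hypothetical, and the values are deliberately plain strings so that the number-stringification caveat discussed above does not come into play.

```perl
use strict;
use warnings;
use Storable qw(freeze);

# Ask Storable to serialize hash elements sorted by key.
$Storable::canonical = 1;

# frozen_equal (hypothetical helper): compare two structures by their
# canonical frozen representations.
sub frozen_equal {
    my ($ref1, $ref2) = @_;
    return freeze($ref1) eq freeze($ref2) ? 1 : 0;
}

# Same contents, different insertion order.
my (%h1, %h2);
$h1{"value$_"} = "v$_" for 1 .. 11;
$h2{"value$_"} = "v$_" for 5 .. 11;
$h2{"value$_"} = "v$_" for 1 .. 4;

printf "%s\n", frozen_equal(\%h1, \%h2) ? 'equal' : 'not equal';

# Adding a key to one side makes the frozen forms differ.
$h2{anotherone} = 'anotherONE';
printf "%s\n", frozen_equal(\%h1, \%h2) ? 'equal' : 'not equal';
```

Because freeze walks the whole structure, the same helper can be tried on nested hashes, subject to the stringification warning quoted from perldoc Storable.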

Re: How to test equality of hashes?
by Zaxo (Archbishop) on Jun 20, 2001 at 06:00 UTC

    The expression (%hash1 == %hash2) puts the hashes in scalar context. You're testing whether they have the same number of elements.

    How strong an equality do you want to test?

    To check that all elements are present and equal in each, you need to recurse through them. The &&= operator may be of help.

    To check that %hash1 and %hash2 are the same object in memory, test (\%hash1 == \%hash2).
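A short sketch of that identity test, with made-up variable names. Two references compare numerically equal only when they point at the same hash, so a copy with identical contents does not pass:

```perl
use strict;
use warnings;

my %hash1 = (a => 1);
my %hash2 = %hash1;     # a copy: same contents, separate storage
my $alias = \%hash1;    # a second reference to the very same hash

# Numeric comparison of references compares their addresses.
my $alias_is_same = (\%hash1 == $alias)  ? 1 : 0;
my $copy_is_same  = (\%hash1 == \%hash2) ? 1 : 0;

print "alias refers to the same hash: $alias_is_same\n";
print "copy refers to the same hash:  $copy_is_same\n";
```

This answers "is it the same object?", which is a stronger question than "do they have equal contents?".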

Re: How to test equality of hashes?
by andye (Curate) on Jun 20, 2001 at 14:17 UTC
    How about this?
    sub flatten {return "@_"}; if ( flatten(%hash1) eq flatten(%hash2) ) { print "equal\n" } else { print "not\n" }
    This will compare the keys as well as the values, so if the same values are fixed to different keys then it should still return false. The 'stringification warning' above applies, i.e. this might have a problem with fractional numbers. Should be ok with strings and integers though.

    andy.

    PS If anyone can tell me how to get this hash-flattened-into-an-array behaviour without a sub, I'd be interested. I've been fiddling around with the syntax trying to get that.

    PPS Can two hashes with identical keys and values ever store the keys in a different order? I've been assuming not, but if they can, then the above won't work reliably. Again, would be interested in info... or a pointer to the correct RTFM... on this.

      PS If anyone can tell me how to get this hash-flattened-into-an-array behaviour without a sub, I'd be interested. I've been fiddling around with the syntax trying to get that.
      if ( "@{[%hash1]}" eq "@{[%hash2]}" ) {
          # foo ...
      }


      [ ar0n ]

        None of these responses sort the array. I would have thought that this solution would be flaky unsorted, since hashes can come out in any order they please...

        I do flattening for comparison by shoving a hashref into an array ( @foo = %{$foo} ) and then sorting the array ( @foo = sort @foo ) and flattening that into a string.

        The obvious problem with this is that if you have a hash where the keys/values were transposed then it would still appear equal. Similar situations will occur with other data anomalies.

        Not a problem in my case, but worth remembering.
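The transposition problem can be shown with two one-element hashes. This is only an illustrative sketch with invented data: sorting the flattened list mixes keys and values together, so the pairing between them is lost.

```perl
use strict;
use warnings;

my %h1 = (apple => 'fruit');
my %h2 = (fruit => 'apple');    # key and value swapped

my $sep = $;;                   # "\034" by default

# Flatten each hash to a list, sort it, and join it into a string.
my @sorted1 = sort %h1;
my @sorted2 = sort %h2;
my $flat1 = join $sep, @sorted1;
my $flat2 = join $sep, @sorted2;

print "flattened strings are equal\n" if $flat1 eq $flat2;
print "yet the hashes differ\n" unless exists $h2{apple};
```

Sorting only the keys and then looking up each value keeps the pairs intact.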

      Neat solution.

      I was going to say that they always come out in the same order because the hashing key does not change. Unfortunately that did not stand up to testing. The order the elements are inserted into the hash makes a difference:

      my ( %hash1, %hash2 );
      $hash1{"value$_"} = $_ for ( 1 .. 11 );
      $hash2{"value$_"} = $_ for ( 5 .. 11 );
      $hash2{"value$_"} = $_ for ( 1 .. 4 );

      sub flatten { return "@_" }

      if ( flatten(%hash1) eq flatten(%hash2) ) {
          print "equal\n". flatten( %hash1 ) . "\n" . flatten(%hash2)
      } else {
          print "not\n" . flatten( %hash1 ) . "\n" . flatten(%hash2);
      }

      not
      value10 10 value11 11 value1 1 value2 2 value3 3 value4 4 value5 5 value6 6 value7 7 value8 8 value9 9
      value10 10 value1 1 value11 11 value2 2 value3 3 value4 4 value5 5 value6 6 value7 7 value8 8 value9 9

      So flatten needs to be:
      sub flatten { my %hash = @_; return join '', map "$hash{$_}$_", sort keys %hash; }

      The answer to your question on how not to use a sub would be to use the join, map in the if clause. This saves passing the hash by value, but duplicates the logic on either side of the eq

      -- iakobski
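The sub-free form described above might look like the sketch below. It is a variation rather than iakobski's exact code: it emits key/value pairs joined with the $; separator instead of concatenating with an empty string, which is an assumption made here to keep adjacent fields from running together. It still relies on neither keys nor values containing that separator.

```perl
use strict;
use warnings;

# Same contents, inserted in a different order.
my ( %hash1, %hash2 );
$hash1{"value$_"} = $_ for 1 .. 11;
$hash2{"value$_"} = $_ for 5 .. 11;
$hash2{"value$_"} = $_ for 1 .. 4;

my $sep = $;;    # "\034" by default
my $result;

# Build "key, value, key, value, ..." in sorted key order on both
# sides of the eq, without a helper sub.
if ( join( $sep, map { ( $_, $hash1{$_} ) } sort keys %hash1 )
  eq join( $sep, map { ( $_, $hash2{$_} ) } sort keys %hash2 ) ) {
    $result = 'equal';
} else {
    $result = 'not';
}
print "$result\n";
```

The duplicated expression is the price of avoiding the sub; the hashes themselves are never copied.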