compare two hashes inefficiently.

by deprecated (Priest)
on Sep 08, 2001 at 02:32 UTC ( #111051=snippet )
Description: Yeah, I know this is the wrong way to do it, and it could be done by iterating over keys and values, but I didn't want to do it that way.

I also could have done this with Storable, but Storable's freeze and thaw are broken on UltraSPARC/Solaris, so I used Data::Dumper. You get the idea, and could switch over to Storable to make it faster if you were so inclined.

Suggested usage:

my @newarray = grep { hcmp( \%goodhash, $_ ) } @somebadhashes;
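sub hcmp {
  use Digest::MD5 qw{ md5_hex };
  use Data::Dumper;
  # Avoid lexical $a and $b: those names are reserved for sort.
  my ($left, $right) = @_;
  # Fingerprint each structure by hashing its Data::Dumper text.
  my $lstr = md5_hex( Dumper( $left ) );
  my $rstr = md5_hex( Dumper( $right ) );
  return ($lstr eq $rstr) ? 1 : 0;
}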
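
For reference, the key-and-value iteration the description alludes to might look roughly like the sketch below. It is not the author's code: `hcmp_flat` is a hypothetical name, and the sketch assumes flat hashes whose values are plain scalars (nested data would need recursion).

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): compares two hash
# references key by key. Assumes flat hashes with scalar or undef values.
sub hcmp_flat {
    my ($left, $right) = @_;

    # Different key counts means the hashes cannot be equal.
    return 0 unless scalar(keys %$left) == scalar(keys %$right);

    for my $key (keys %$left) {
        # Every key on the left must also exist on the right.
        return 0 unless exists $right->{$key};

        my ($lv, $rv) = ($left->{$key}, $right->{$key});

        # Treat undef as equal only to undef.
        return 0 if (defined($lv) xor defined($rv));
        next unless defined $lv;

        # Compare the values as strings.
        return 0 unless $lv eq $rv;
    }
    return 1;
}

my %goodhash      = (1007 => 'Hello', 1195 => 'World');
my @somebadhashes = (
    { 1195 => 'World', 1007 => 'Hello' },   # same pairs, different insertion order
    { 1007 => 'Hello', 1195 => 'Perl'  },   # one value differs
    { 1007 => 'Hello' },                    # a key is missing
);

# Same calling convention as the suggested usage above.
my @matches = grep { hcmp_flat(\%goodhash, $_) } @somebadhashes;
print scalar(@matches), "\n";
```

Because each key is looked up directly, the result does not depend on the order in which either hash returns its keys.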
Replies are listed 'Best First'.
Re: compare two hashes inefficiently.
by MrNobo1024 (Hermit) on Sep 08, 2001 at 03:17 UTC
    You don't need to use MD5 just to compare two strings. :-) I would have written it like this:
    sub hcmp { use Data::Dumper; return(Dumper(shift) eq Dumper(shift)); }

    -- MrNobo1024


Re (tilly) 1: compare two hashes inefficiently.
by tilly (Archbishop) on Sep 08, 2001 at 19:05 UTC
    This approach is unreliable. On my machine the script:
    use Data::Dumper;
    print Dumper({
      1007 => "Hello",
      1195 => "World",
    });
    print Dumper({
      1195 => "World",
      1007 => "Hello",
    });
    produces the output:
    $VAR1 = {
      1195 => 'World',
      1007 => 'Hello'
    };
    $VAR1 = {
      1007 => 'Hello',
      1195 => 'World'
    };
    The issue is that Data::Dumper just emits keys in the order that keys returns them, and keys returns entries that share a bucket in insertion order. With a 2-key hash your odds of hitting this are 1/8. But even when the hashing algorithm works right, a fixed portion of the keys wind up in a bucket with a neighbour. Therefore for larger hashes you are virtually guaranteed that the set of keys alone does not determine what order they come back in.

    If this explanation confuses, then Re (tilly) 4: Flip Flop III - Musical Buckets may help.
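
One way to sidestep the ordering problem described above, assuming a Data::Dumper recent enough to support `$Data::Dumper::Sortkeys`, is to ask Dumper to sort the keys before the strings are compared. This is only a sketch: `hcmp_sorted` is a hypothetical name, not something from the thread.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical variant of the string-comparison approach: same idea,
# but hash keys are dumped in sorted order.
sub hcmp_sorted {
    my ($left, $right) = @_;

    # Sortkeys makes Dumper emit hash keys in sorted order at every level.
    # Indent = 0 only keeps the dumped strings compact.
    # `local` restores both settings when the sub returns.
    local $Data::Dumper::Sortkeys = 1;
    local $Data::Dumper::Indent   = 0;

    return Dumper($left) eq Dumper($right) ? 1 : 0;
}

# The two hashes from the script above: same pairs, different insertion order.
my $first  = { 1007 => 'Hello', 1195 => 'World' };
my $second = { 1195 => 'World', 1007 => 'Hello' };

print hcmp_sorted($first, $second) ? "same\n" : "different\n";
```

This still compares dumped text, so any difference in how Dumper renders two values counts as a mismatch.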

Re: compare two hashes inefficiently.
by jmcnamara (Monsignor) on Sep 08, 2001 at 15:13 UTC
Re: compare two hashes inefficiently.
by OfficeLinebacker (Chaplain) on Jan 11, 2007 at 13:21 UTC
    Hmm, I think I will find this useful in a specific program I wrote, but it turns out my algorithm isn't quite right. Thanks. ++

    I like computer programming because it's like Legos for the mind.
      Hi, please don't use this code. I wrote it ages and ages ago. The above replies were correct in that there are better ways to do this. Thinking back to when I wrote some of the code I've posted to PerlMonks, I realize that I had a lot more Perl knowledge than I had common sense. In this case, I had a solution for the problem in front of me, but I didn't understand that I was solving the wrong problem. This may be one of the most important things I've learned over the years. I'm sure somebody here (or even I) could help you come up with a better, and certainly more elegant, solution to the problem you're trying to solve with the above hack(s).


      Tilly is my hero.

        That's OK. I don't even remember what I was talking about in my post. I think I was also writing a file comparison routine (basically handrolling my own stripped-down rsync). Since then I discovered FreezeThaw, which has worked very nicely for me for hash comparisons.

        I like computer programming because it's like Legos for the mind.