PerlMonks  

Merge 2 hashes which contains duplicate Keys

by slayedbylucifer (Scribe)
on Sep 25, 2012 at 18:40 UTC
slayedbylucifer has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

This question is about merging two hashes with duplicate keys. I found a few solutions on the internet and they all work, but I could not understand any of them, so I am working it out on my own.

Also, I tried Hash::Merge and it works too, but I feel that this is not really that difficult and hence want to do it myself.

Let's consider the 2 hashes below:

    my %h1 = (
        "one"   => 1,
        "two"   => 2,
        "three" => 3,
    );

    my %h2 = (
        "four" => 4,
        "five" => 5,
        "six"  => 6,
        "one"  => 2222,
    );

I want to merge these two hashes so that the resulting hash looks like this:

    $VAR1 = {
        'one'   => [ 1, 2222 ],
        'two'   => 2,
        'three' => 3,
        'four'  => 4,
        'five'  => 5,
        'six'   => 6,
    };

So when a duplicate key is found, the values for that key should be collected into an anonymous array.

I referred to http://perldoc.perl.org/perlreftut.html and wrote the code below:

    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my %h1 = (
        "one"   => 1,
        "two"   => 2,
        "three" => 3,
    );

    my %h2 = (
        "four" => 4,
        "five" => 5,
        "six"  => 6,
        "one"  => 11111,
    );

    foreach my $x ( keys %h2 ){
        push @{ $h1{$x} }, $h2{$x};
    }

    print Dumper (\%h1);

The output is:

    Can't use string ("1") as an ARRAY ref while "strict refs" in use at slice_hash.pl line 48

So I removed "use strict" and got the output below:

    $VAR1 = {
        'three' => 3,
        'five'  => [ 5 ],
        'six'   => [ 6 ],
        'one'   => 1,
        'two'   => 2,
        'four'  => [ 4 ]
    };

I am not able to figure out where I am going wrong. Any suggestions would be of great help.

Thanks.

Re: Merge 2 hashes which contains duplicate Keys
by kennethk (Monsignor) on Sep 25, 2012 at 19:17 UTC
    What's going on in your initial code is that when a duplicate key is encountered, you attempt to dereference a string. In the case of key one, the value 1 is treated as a symbolic reference, so you are accessing the otherwise non-existent variable @1. Your value then gets pushed onto @1, and not into the hash element. The appropriate action is to test whether each key already exists in the hash, and act accordingly, a la:
        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my %h1 = (
            "one"   => 1,
            "two"   => 2,
            "three" => 3,
        );

        my %h2 = (
            "four" => 4,
            "five" => 5,
            "six"  => 6,
            "one"  => 11111,
        );

        foreach my $x ( keys %h2 ){
            if (exists $h1{$x}) {
                $h1{$x} = [$h1{$x}, $h2{$x}]
            } else {
                $h1{$x} = $h2{$x};
            }
        }

        print Dumper (\%h1);
    In general, I find accessing data structures with inconsistent structure irritating and way too bug prone. Rather than varying the structure based upon whether there is one value or more, I would make every hash entry an array reference, a la:
        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my %h1 = (
            "one"   => 1,
            "two"   => 2,
            "three" => 3,
        );

        my %h2 = (
            "four" => 4,
            "five" => 5,
            "six"  => 6,
            "one"  => 11111,
        );

        foreach my $x ( keys %h1 ){
            $h1{$x} = [$h1{$x}];
        }

        foreach my $x ( keys %h2 ){
            push @{$h1{$x}}, $h2{$x};
        }

        print Dumper (\%h1);
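
    A minimal sketch of why the original loop dies only on the duplicate key; the variable names here are illustrative and not from the original post. Pushing through a hash slot that does not exist yet autovivifies an array reference, whereas pushing through a slot that already holds a plain scalar is a symbolic reference, which strict refs rejects:

```perl
use strict;
use warnings;

my %h = ( one => 1 );

# 'four' does not exist yet, so Perl autovivifies an array ref for it.
push @{ $h{four} }, 4;

# 'one' already holds the plain scalar 1, so @{ $h{one} } is a symbolic
# reference to @1; under "use strict" this dies at run time.
my $ok    = eval { push @{ $h{one} }, 11111; 1 };
my $error = $ok ? '' : $@;

print "four is now a ", ref( $h{four} ), " ref\n";
print "push onto 'one' failed: $error" unless $ok;
```

    This also matches the output seen without strict: the new keys became one-element arrays, while 'one' stayed at 1 because the value went to @1 instead.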

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Thanks kennethk. Your explanation clarifies a lot of things for me. Thanks for your time.
Re: Merge 2 hashes which contains duplicate Keys
by BrowserUk (Pope) on Sep 25, 2012 at 19:26 UTC

    You cannot push a second value onto a key that already exists and has a scalar value, because the first value is a scalar, not an array (ref).

    So you need to remove that value, replace it with an array ref, and then push the original value + the second value into that array.

        #!/usr/bin/perl -w
        use strict;
        use Data::Dumper;

        my %h1 = (
            "one"   => 1,
            "two"   => 2,
            "three" => 3,
        );

        my %h2 = (
            "four" => 4,
            "five" => 5,
            "six"  => 6,
            "one"  => 11111,
        );

        foreach my $x ( keys %h2 ){
            if( exists $h1{ $x } ) {
                my $temp = delete $h1{ $x };
                push @{ $h1{$x} }, $temp , $h2{ $x };
            }
            else {
                $h1{ $x } = $h2{ $x };
            }
        }

        print Dumper (\%h1);

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

      Thanks BrowserUk. This is a different approach. Good learning for me. Thanks for your time.

        Do be aware that this only caters for 2 values per key. If a third value with the same key needed to be added, then the code would produce something like:

        { one => [ [ 1, 2 ], 3 ]; ... }

        And for 4 values with the same key:

        { one => [ [ [ 1, 2 ], 3 ], 4 ]; ... }

        Which is almost certainly not what you want. The purpose was to answer your question "I am not able to figure out where I am going wrong.", rather than solve the problem per se.

        If you were only ever going to merge 2 hashes it would probably be okay as is, but otherwise it would need a second check to test whether the existing value was a scalar or an array ref and act accordingly.
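
        A minimal sketch of that second check, assuming the input values are plain scalars and never references; the helper name merge_keeping_duplicates is hypothetical. A key seen once keeps its scalar, a second value promotes it to an array ref, and later values are pushed onto that same array rather than nested:

```perl
use strict;
use warnings;

# Hypothetical helper: merge any number of hash refs into a new hash.
# Assumes the input values are plain scalars, not references.
sub merge_keeping_duplicates {
    my @hashes = @_;
    my %merged;
    for my $h (@hashes) {
        for my $key ( keys %$h ) {
            if ( !exists $merged{$key} ) {
                # First value for this key: keep it as a scalar.
                $merged{$key} = $h->{$key};
            }
            elsif ( ref( $merged{$key} ) eq 'ARRAY' ) {
                # Third or later value: append to the existing array.
                push @{ $merged{$key} }, $h->{$key};
            }
            else {
                # Second value: promote the scalar to an array ref.
                $merged{$key} = [ $merged{$key}, $h->{$key} ];
            }
        }
    }
    return \%merged;
}

my $merged = merge_keeping_duplicates(
    { one => 1, two => 2 },
    { one => 2 },
    { one => 3 },
);
```

        With the three inputs shown, 'one' ends up as a flat three-element array and 'two' stays a plain scalar.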

        In general, I agree with kennethk: you would be better off making all your values array refs, even when there is only one value contained. It greatly simplifies not only the construction, but also subsequent code that iterates over the combined hash.

        Another alternative that I use frequently when dealing with large volumes of data is to build the values up as a concatenated scalar:
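
        To illustrate the point about subsequent code, here is a small sketch comparing a reader of the mixed structure with a reader of the uniform one. The hashes and the two summing subs are made up for the comparison:

```perl
use strict;
use warnings;

# Mixed structure: scalars for single values, array refs for duplicates.
my %mixed = ( one => [ 1, 11111 ], two => 2 );

# Uniform structure: every value is an array ref.
my %uniform = ( one => [ 1, 11111 ], two => [2] );

# Hypothetical consumer: sum all values stored in the hash.
sub total_mixed {
    my ($h) = @_;
    my $sum = 0;
    for my $v ( values %$h ) {
        # Every reader has to repeat this scalar-or-array check.
        $sum += $_ for ref($v) eq 'ARRAY' ? @$v : ($v);
    }
    return $sum;
}

sub total_uniform {
    my ($h) = @_;
    my $sum = 0;
    $sum += $_ for map { @$_ } values %$h;    # no special cases
    return $sum;
}

printf "mixed: %d, uniform: %d\n",
    total_mixed( \%mixed ), total_uniform( \%uniform );
```

        Both subs return the same total; the difference is that the ref check has to be repeated in every piece of code that reads the mixed hash.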

        { one => '1 2 3 4', ... }

        And when I need to iterate the values, I use my @vals = split ' ', $hash{ $key }; to separate the elements.

        The advantage of this (hash of composite scalars) over a hash of arrays is that it trades slightly slower access for a considerably reduced memory requirement. A HoA with 1e6 keys x 4 values per key requires around 600MB, whereas a HoCS (composite scalars) with the same data only requires 110MB.

        The code for this then becomes:

            #!/usr/bin/perl -w
            use strict;
            use Data::Dumper;

            my %h1 = (
                "one"   => 1,
                "two"   => 2,
                "three" => 3,
            );

            my %h2 = (
                "four" => 4,
                "five" => 5,
                "six"  => 6,
                "one"  => 11111,
            );

            foreach my $x ( keys %h2 ){
                $h1{ $x } .= ' ' . $h2{ $x };
            }

            print Dumper (\%h1);

        Or if your values can contain spaces -- or you wish to accommodate that future possibility:

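            #!/usr/bin/perl -w
            use strict;
            use Data::Dumper;

            my %h1 = (
                "one"   => 1,
                "two"   => 2,
                "three" => 3,
            );

            my %h2 = (
                "four" => 4,
                "five" => 5,
                "six"  => 6,
                "one"  => 11111,
            );

            foreach my $x ( keys %h2 ){
                # Add the separator only when the key already has a value, so
                # that a later split on $; yields no leading empty element.
                $h1{ $x } = exists $h1{ $x } ? $h1{ $x } . $; . $h2{ $x } : $h2{ $x };
            }

            print Dumper (\%h1);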

        In the latter case, you would use my @vals = split $;, $hash{ $key } to retrieve the values.
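
        A rough sketch of how the store and retrieve steps could be wrapped up, using $; (the subscript separator, "\034" by default) as the delimiter. The helper names add_value and get_values are hypothetical, and the sketch assumes that no value contains the $; character itself:

```perl
use strict;
use warnings;

my %h;

# Hypothetical helpers for a hash of composite scalars.
sub add_value {
    my ( $hash, $key, $value ) = @_;
    if ( exists $hash->{$key} ) {
        $hash->{$key} .= $; . $value;    # append to the existing values
    }
    else {
        $hash->{$key} = $value;          # first value: no leading separator
    }
    return;
}

sub get_values {
    my ( $hash, $key ) = @_;
    return unless exists $hash->{$key};

    # quotemeta guards against $; having been set to a regex metacharacter.
    return split quotemeta($;), $hash->{$key};
}

add_value( \%h, 'one', $_ ) for 1, 'two words', 3;
my @vals = get_values( \%h, 'one' );
print scalar(@vals), " values: ", join( ' | ', @vals ), "\n";
```

        Because the separator is not a space, the value 'two words' survives the round trip as a single element.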


Re: Merge 2 hashes which contains duplicate Keys
by 2teez (Priest) on Sep 25, 2012 at 22:51 UTC

    You might also find the following useful:

        use warnings;
        use strict;
        use Data::Dumper;

        my %h1 = (
            "one"   => 1,
            "two"   => 2,
            "three" => 3,
        );

        my %h2 = (
            "four" => 4,
            "five" => 5,
            "six"  => 6,
            "one"  => 11111,
        );

        my $grand_hash_ref = {};

        foreach my $hash_ref ( \( %h1, %h2 ) ) {
            while ( my ( $key, $value ) = each %{$hash_ref} ) {
                push @{ $grand_hash_ref->{$key} }, $value;
            }
        }

        print Dumper $grand_hash_ref;

    Please, check perldsc for detailed information.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Merge 2 hashes which contains duplicate Keys
by davido (Archbishop) on Sep 25, 2012 at 23:01 UTC

    It's been done.

        use strict;
        use warnings;
        use Hash::Merge;
        use Data::Dumper;

        my %hash1 = ( this => 1, that => 2, the => 3, other => 4 );
        my %hash2 = ( those => 1, them => 2, thine => 3, other => 42 );

        my %merged = %{
            Hash::Merge->new('RETAINMENT_PRECEDENT')->merge( \%hash1, \%hash2 )
        };

        print Dumper \%merged;

    Hash::Merge can merge arbitrarily deep hashes, following pre-defined rules for key conflict resolution.


    Dave

      Thanks Dave. Yes, I had gone through Hash::Merge and it is a lifesaver. I want to learn more about Perl data structures, and hence I am trying things on my own without taking help from any module. Thanks for your time.
