http://www.perlmonks.org?node_id=995706


in reply to Re^2: Merge 2 hashes which contains duplicate Keys
in thread Merge 2 hashes which contains duplicate Keys

Do be aware that this only caters for 2 values per key. If a third value with the same key needed to be added, the code would produce something like:

{ one => [ [ 1, 2 ], 3 ], ... }

And for 4 values with the same key:

{ one => [ [ [ 1, 2 ], 3 ], 4 ], ... }

Which is almost certainly not what you want. The purpose was to answer your question "I am not able to figure out where I am going wrong.", rather than solve the problem per se.

If you were only ever going to merge 2 hashes it would probably be okay as is, but otherwise it would need a second check to test whether the existing value is a scalar or an array ref, and act accordingly.
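A minimal sketch of what that second check might look like. The helper name merge_value and the sample data are mine, purely for illustration; this is not the code from the earlier reply:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: fold one key/value pair into the target hash,
# promoting a plain scalar to an array ref on the first collision.
sub merge_value {
    my ( $target, $key, $value ) = @_;

    if ( !exists $target->{ $key } ) {
        # First value seen for this key: keep it as a plain scalar.
        $target->{ $key } = $value;
    }
    elsif ( ref( $target->{ $key } ) eq 'ARRAY' ) {
        # Already an array ref: append, so the list stays flat.
        push @{ $target->{ $key } }, $value;
    }
    else {
        # Existing plain scalar: promote it to a two-element array ref.
        $target->{ $key } = [ $target->{ $key }, $value ];
    }
    return;
}

my %merged = ( one => 1, two => 2 );
merge_value( \%merged, 'one', $_ ) for 2, 3, 4;

print Dumper( \%merged );    # 'one' now holds a flat [ 1, 2, 3, 4 ]
```

Note that every consumer of the hash still has to make the same scalar-or-array-ref distinction when reading the values back.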

In general, I agree with kennethk: you would be better off making all your values array refs, even when there is only one value contained. It greatly simplifies not only the construction, but also subsequent code that iterates over the combined hash.
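For illustration, a rough sketch of the all-array-refs approach. The hash contents are sample data modelled on this thread, and %combined is just a name I picked:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %h1 = ( one => 1, two => 2, three => 3 );
my %h2 = ( four => 4, five => 5, six => 6, one => 11111 );

my %combined;
for my $source ( \%h1, \%h2 ) {
    # Autovivification creates the array ref the first time a key is seen,
    # so there is no scalar-versus-array-ref special case.
    push @{ $combined{ $_ } }, $source->{ $_ } for keys %$source;
}

# Every value is an array ref, so the consuming code is uniform too.
for my $key ( sort keys %combined ) {
    print "$key: @{ $combined{ $key } }\n";
}
```

The same loop works unchanged for three or more source hashes; just add them to the list.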

Another alternative that I use frequently when dealing with large volumes of data is to build the values up as a concatenated scalar:

{ one => '1 2 3 4', ... }

And when I need to iterate the values, I use my @vals = split ' ', $hash{ $key }; to separate the elements.

The advantage of this (hash of composite scalars) over a hash of arrays is that it trades slightly slower access for a considerably reduced memory requirement. A HoA with 1e6 keys x 4 values per key requires around 600MB, whereas a HoCS (hash of composite scalars) with the same data only requires 110MB.

The code for this then becomes:

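    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my %h1 = (
        "one"   => 1,
        "two"   => 2,
        "three" => 3,
    );
    my %h2 = (
        "four" => 4,
        "five" => 5,
        "six"  => 6,
        "one"  => 11111,
    );

    # Append each %h2 value to whatever %h1 already holds, space-separated.
    # Keys new to %h1 pick up a leading space, which split ' ' ignores.
    foreach my $x ( keys %h2 ){
        $h1{ $x } .= ' ' . $h2{ $x };
    }

    print Dumper (\%h1);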

Or if your values can contain spaces -- or you wish to accommodate that future possibility:

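    #!/usr/bin/perl -w
    use strict;
    use Data::Dumper;

    my %h1 = (
        "one"   => 1,
        "two"   => 2,
        "three" => 3,
    );
    my %h2 = (
        "four" => 4,
        "five" => 5,
        "six"  => 6,
        "one"  => 11111,
    );

    # Join values with $; (the subscript separator, "\034" by default).
    # Only add the separator when the key already exists; otherwise a
    # leading $; would give split an empty first field.
    foreach my $x ( keys %h2 ){
        $h1{ $x } = exists $h1{ $x }
            ? $h1{ $x } . $; . $h2{ $x }
            : $h2{ $x };
    }

    print Dumper (\%h1);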

In the latter case, you would use my @vals = split $;, $hash{ $key } to retrieve the values.
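To show the round trip with values that contain spaces, here is a small sketch. The helper append_value and the sample strings are hypothetical, and it assumes $; still has its default value and never appears inside the data:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: append a value to a composite scalar, adding the
# $; separator only when the key already holds something.
sub append_value {
    my ( $hash, $key, $value ) = @_;
    if ( exists $hash->{ $key } ) {
        $hash->{ $key } .= $; . $value;
    }
    else {
        $hash->{ $key } = $value;
    }
    return;
}

my %h;
append_value( \%h, 'one', $_ ) for 'first value', 'second value', 3;

# Splitting on $; recovers the values with their embedded spaces intact.
my @vals = split $;, $h{one};
print scalar( @vals ), " values: ", join( ' | ', @vals ), "\n";
```

One caveat: split drops trailing empty fields by default, so if empty strings are legitimate values, pass a limit of -1 as the third argument.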

