http://www.perlmonks.org?node_id=244265

jens has asked for the wisdom of the Perl Monks concerning the following question:

I'm doing a bit of data munging for a client and
I want to suck around 6,000 records from a flat file
into a large hash of hashes. I've RTFM'd and I'm still
struggling--your help would be much appreciated.

Here's what I have so far:
my %hash_of_hashes = ( my $record_no => my %unitfiles_hash );

#suck all the unit files into a big hash to make searching easier
while (<UNITFILES>) {
    my @unitfiles_field = split /,/;

    #this is just an autogen number
    $record = $unitfiles_field[0];

    %hash_of_hashes{$record_no} = (
        %unitfiles_hash = (
            lastname         => $unitfiles_field[1],
            firstname        => $unitfiles_field[2],
            DOB              => $unitfiles_field[7],
            funding          => $unitfiles_field[18],
            URNo             => $unitfiles_field[15],
            Photo_permission => $unitfiles_field[14],
        );
    );
} #end while
I've also considered using an array of hashes, but that's also confused me.
Please help!
--
Microsoft delendum est.

Replies are listed 'Best First'.
Re: Hash of hashes syntax
by Enlil (Parson) on Mar 19, 2003 at 06:53 UTC
    I think you are looking for something more along these lines:
    my %hash_of_hashes;
    while (<UNITFILES>) {
        my @unitfiles_field = split /,/;

        #this is just an autogen number
        my $record_no = $unitfiles_field[0];

        $hash_of_hashes{$record_no} = {
            lastname         => $unitfiles_field[1],
            firstname        => $unitfiles_field[2],
            DOB              => $unitfiles_field[7],
            funding          => $unitfiles_field[18],
            URNo             => $unitfiles_field[15],
            Photo_permission => $unitfiles_field[14],
        };
    } #end while
    or as an array of hashes:
    my @Array_of_Hashes;
    while (<UNITFILES>) {
        #this is just an autogen number
        my @unitfiles_field = split /,/;

        push @Array_of_Hashes, {
            record_number    => $unitfiles_field[0],
            lastname         => $unitfiles_field[1],
            firstname        => $unitfiles_field[2],
            DOB              => $unitfiles_field[7],
            funding          => $unitfiles_field[18],
            URNo             => $unitfiles_field[15],
            Photo_permission => $unitfiles_field[14],
        };
    } #end while

    hope this helps

    -enlil
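
Since the stated goal was to make searching easier, here is a minimal sketch of how a hash of hashes built this way might be queried. The sample lines and the reduced set of fields are invented for illustration; the real data would come from the UNITFILES handle.

```perl
use strict;
use warnings;

# Invented sample lines standing in for the UNITFILES handle; only three
# fields are used here to keep the sketch short.
my @lines = (
    "101,Smith,Anna\n",
    "102,Jones,Bob\n",
);

my %hash_of_hashes;
for my $line (@lines) {
    chomp $line;
    my @f = split /,/, $line;
    $hash_of_hashes{ $f[0] } = { lastname => $f[1], firstname => $f[2] };
}

# Direct lookup by record number.
my $lastname = $hash_of_hashes{102}{lastname};

# Scan every record for a matching field value.
my @smiths = grep { $hash_of_hashes{$_}{lastname} eq 'Smith' }
             sort keys %hash_of_hashes;

print "$lastname\n";    # Jones
print "@smiths\n";      # 101
```

The direct lookup is the main advantage of keying on the record number; searching on any other field still means walking all the records.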

Re: Hash of hashes syntax
by jdporter (Paladin) on Mar 19, 2003 at 07:03 UTC
    O.k. Here's your code, quick-fixed:
    my %hash_of_hashes;
    while (<UNITFILES>) {
        my @unitfiles_field = split /,/;

        # this is just an autogen number
        my $record_no = $unitfiles_field[0];

        my %unitfiles_hash = (
            lastname         => $unitfiles_field[1],
            firstname        => $unitfiles_field[2],
            DOB              => $unitfiles_field[7],
            funding          => $unitfiles_field[18],
            URNo             => $unitfiles_field[15],
            Photo_permission => $unitfiles_field[14],
        );

        # add this hash to the hash-of-hashes:
        $hash_of_hashes{$record_no} = \%unitfiles_hash;
    }
    I'd probably make it somewhat cleaner, like so:
    my %hash_of_hashes;
    while (<UNITFILES>) {
        my @f = split /,/;
        $hash_of_hashes{$f[0]} = {
            lastname         => $f[1],
            firstname        => $f[2],
            DOB              => $f[7],
            funding          => $f[18],
            URNo             => $f[15],
            Photo_permission => $f[14],
        };
    }
    Or I might even consider the following. (TIMTOWTDI!)
    my %hash_of_hashes;
    while (<UNITFILES>) {
        my %h;
        (my $rn, @h{ qw(lastname firstname DOB funding URNo Photo_permission) }) =
            (split /,/)[0,1,2,7,18,15,14];
        $hash_of_hashes{$rn} = \%h;
    }

    jdporter
    The 6th Rule of Perl Club is -- There is no Rule #6.

      I'm afraid your last solution, as concise as it might be, simply does not work. \%h always refers to the same hash. You may well change its contents on each pass through the loop, but you're always modifying the same hash, and all members of %hash_of_hashes refer to it.

      You really must create a new hash at each loop. This will do it:

      $hash_of_hashes{$rn} = { %h };
      (the hash keys and values are flattened into a list, which in turn is used to create an anonymous hash).

      --bwana147

        I'm afraid your last solution, as concise as it might be, simply does not work. \%h always refers to the same hash

        Not true.

        In this case, it would refer to a different hash. This is because the variable my %h is lexically scoped, and thus it is recreated each time through the loop, and so \%h will point to a different hash each time through the loop. To illustrate what I mean, here is some code:
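
A minimal sketch of that point (not the snippet originally posted; the two sample records are invented, and only one field is stored to keep it short):

```perl
use strict;
use warnings;

my %hash_of_hashes;
for my $line ("1,Smith", "2,Jones") {    # invented sample records
    my %h;                               # a fresh lexical on every pass
    (my $rn, $h{lastname}) = split /,/, $line;
    $hash_of_hashes{$rn} = \%h;          # the stored reference keeps this pass's hash alive
}

# References compare numerically by address, so this checks that the
# two entries point at different hashes.
my $distinct = $hash_of_hashes{1} != $hash_of_hashes{2};

print "distinct\n" if $distinct;
print "$hash_of_hashes{1}{lastname} $hash_of_hashes{2}{lastname}\n";   # Smith Jones
```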

        The differences are subtle, but jdporter's code works.

        -enlil

        Good eye, but this actually isn't a problem in this case. Because of the my %h inside of the loop, %h is a different (and new) lexical through each iteration -- thus there is no problem just sticking a reference to it into the hash or array. The problem comes from code like this (nearly directly copied from perldsc):

        my %hash;
        for $i (1..10) {
            %hash = somefunc($i);
            $hash_of_hashes{$rn} = \%hash;    # WRONG!
        }

        Note where the "my" is. See perldsc or perlreftut or perllol for more.
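
To make the contrast concrete, here is a small sketch of the failing pattern next to the { %h } copy suggested earlier in the thread. The %shared and %copied names and the square values are made up for illustration, standing in for somefunc() and the real data.

```perl
use strict;
use warnings;

my (%shared, %copied);
my %hash;                                 # declared once, outside the loop
for my $i (1 .. 3) {
    %hash = (square => $i * $i);          # overwrites the same hash each pass
    $shared{$i} = \%hash;                 # every entry points at %hash
    $copied{$i} = { %hash };              # a new anonymous hash per pass
}

print "$shared{1}{square}\n";    # 9: all entries see the last assignment
print "$copied{1}{square}\n";    # 1: the copy kept its own value
```

Moving the my inside the loop, as in jdporter's version, avoids the problem without the extra copy.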

        perl -pe '"I lo*`+$^X$\"$]!$/"=~m%(.*)%s;$_=$1;y^`+*^e v^#$&V"+@( NO CARRIER'

Re: Hash of hashes syntax
by thor (Priest) on Mar 19, 2003 at 13:13 UTC
    The other monks who have responded in this thread have already answered your question. Just my 2 cents...arrays and hashes are collections of scalars. Therefore, to create a multi-tiered data structure, you need to put either an anonymous (hash|array), or a reference to a (hash|array). Both of the aforementioned "things" can be contained in scalars, and thus are candidates for entry into a hash or an array. Hope this helps.

    thor

      Therefore, to create a multi-tiered data structure, you need to put either an anonymous (hash|array), or a reference to a (hash|array).

      This may be a bit pedantic of me, but there's no sense in confusing the issue.

      You can't put an anonymous array (or an anonymous hash) into another structure. References are required in order to build nested data structures. It doesn't matter whether a reference refers to a named thingie or an anonymous one.

      This ['anon', 'array'] is not an "anonymous array." It is a reference to an anonymous array. Similarly with this reference to an {anonymous => 'hash'}. It is best to understand that these constructs actually are references and to avoid useless distinctions between them and other references.
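
A short sketch of that distinction, with invented variable names: the square-bracket constructor and the backslash operator both yield ordinary references, and either one fits into the single scalar slot of a hash value.

```perl
use strict;
use warnings;

my @named = ('named', 'array');

my $anon_ref  = ['anon', 'array'];   # reference to an anonymous array
my $named_ref = \@named;             # reference to a named array

# ref() reports the same kind of thing for both.
print ref($anon_ref), " ", ref($named_ref), "\n";     # ARRAY ARRAY

# Either reference can be stored as a hash value, which holds one scalar.
my %structure = (anon => $anon_ref, named => $named_ref);
print "$structure{anon}[0] $structure{named}[0]\n";   # anon named
```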

      -sauoq
      "My two cents aren't worth a dime.";
      
        You're absolutely right. However, the OP didn't know how to construct a multi-level data structure, so I'm pretty sure that he didn't know about anonymous arrays. I was just presenting more of the lingo so that next time someone says something about either an anonymous array or an array ref, s/he will at least have had exposure. Also, FWIW, the Camel seems to use "anonymous array" synonymously with "array reference" (similarly for hashes).

        thor
