http://www.perlmonks.org?node_id=978001


in reply to Re: (another) HoH question
in thread (another) HoH question

Thanks for your answers!

I had a vague sense that something was blocking either the level-1 assignments or level-3 assignments. Data::Dumper certainly cleared that up (I had never used this before, and this will become a favorite.)

What I'm trying to do is use a hash to merge two sets:

Set 1: I have a text file establishing the universe of accepted values:
    abc 123
    abc 456
    abc 789
    xyz 456

    Hash:
    123->abc
    456->abc
    456->xyz
    789->abc
Set 2: A set of values pointing to the hash. There may be multiple records pointing to one hash record (that's OK, as I just need to know that at least one exists). There may also be hash records that no one uses.
    xxx ==> 456->abc
    xxx ==> 789->abc
    yyy ==> 456->abc
    yyy ==> 456->abc (a dup)
    zzz ==> 123->abc

    Resulting hash:
    123->abc->zzz
    456->abc->xxx
    456->abc->yyy
    456->xyz
    789->abc->xxx
I could just hash the second set, but then I wouldn't get the unused entries such as 456->xyz.

I've tried building the first hash and using arrays for the latter hash, but it runs rather slowly from all the repeated scans.

Suggestions appreciated.

Re^3: (another) HoH question
by muba (Priest) on Jun 23, 2012 at 19:38 UTC

    Data::Dumper certainly cleared that up (I had never used this before, and this will become a favorite.)
    It works both ways: if you can write down your data structure in a manner that looks like Data::Dumper output, then you can be relatively sure your data is at least syntactically correct.
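
    As a rough illustration of that "both ways" point, here is a minimal sketch that dumps a structure to text and reads the text back in. The variable names and the sample data are my own assumptions, not something from the original post; `Sortkeys` is set only so that the two dumps can be compared as strings.

```perl
use strict;
use warnings;
use Data::Dumper;

# Sort keys so that two dumps of equal structures produce equal text.
$Data::Dumper::Sortkeys = 1;

# A structure written the way Data::Dumper would print it (sample data).
my %hash = (
    '123' => { 'abc' => {} },
    '456' => { 'abc' => {}, 'xyz' => {} },
);

# Structure -> Perl source text of the form "$copy = { ... };"
my $text = Data::Dumper->Dump( [ \%hash ], ['copy'] );

# Perl source text -> structure. The string eval assigns to the
# lexical $copy declared here.
my $copy;
eval "$text; 1" or die "could not read the dump back: $@";

# Dumping the copy again should reproduce the same text.
my $again = Data::Dumper->Dump( [$copy], ['copy'] );
print $again eq $text ? "round trip ok\n" : "round trip differs\n";
```

    Only use string eval like this on text you produced yourself; evaluating a dump from an untrusted source runs whatever code it contains.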

    Anyway. If I get you right, you're reading Set One, which would result in the hash as defined in Listing One, and then you're reading Set Two, which would alter the original hash so that it becomes the one defined in Listing Two.

    # Listing One:
    %hash = (
        "123" => "abc",
        "456" => {              # I'm choosing a hashref here because
            "abc" => undef,     # eventually we'll possibly be replacing those
            "xyz" => undef,     # undefs with something else.
        },
        "789" => "abc"
    );

    # Listing Two:
    %hash = (
        "123" => {"abc" => "zzz"},
        "456" => {
            "abc" => ["xxx", "yyy"],
            "xyz" => undef      # Or [], or 0, or "", or whatever
        },
        "789" => {"abc" => "xxx"}
    );

    But really, this is such a mess that I doubt this is what you want.

      Actually, this does come close to what I was looking for. Understanding it, however, is another matter. LOL.

      I will give it a try.

      Thanks for your help, everyone.

        That's okay. Which part(s) do you find hard to understand? I'd love to explain things in more depth. I could write an elaborate essay on the whole routine, but I'd rather just clarify the parts that are unclear to you. And by "part" I don't necessarily mean pieces of the code - you might just as well wonder how or why certain things were done.

Re^3: (another) HoH question
by johngg (Canon) on Jun 24, 2012 at 16:24 UTC

    By confining the structure to hashes and avoiding arrays you don't have to worry about duplicates and, by always pre-assigning an anonymous hash, you can also dispense with the bother of turning a scalar value into a hash or an array. I think this is a little simpler and clearer than muba's solution, if the structure produced is what you are after.

    use strict;
    use warnings;

    use Data::Dumper;

    open my $univFH, q{<}, \ <<EOD or die qq{open: < HEREDOC: $!\n};
    abc 123
    abc 456
    abc 789
    xyz 456
    EOD

    my %univ;
    while ( <$univFH> )
    {
        my( $lev2Key, $lev1Key ) = split;
        $univ{ $lev1Key }->{ $lev2Key } = {};
    }
    close $univFH or die qq{close: < HEREDOC: $!\n};

    print Data::Dumper->Dumpxs( [ \ %univ ], [ qw{ *univ } ] );

    open my $valuesFH, q{<}, \ <<EOD or die qq{open: < HEREDOC: $!\n};
    xxx ==> 456->abc
    xxx ==> 789->abc
    yyy ==> 456->abc
    yyy ==> 456->abc
    zzz ==> 123->abc
    EOD

    while ( <$valuesFH> )
    {
        chomp;
        my( $lev3Key, $upperKeys ) = split m{\s*==>\s*};
        my( $lev1Key, $lev2Key) = split m{->}, $upperKeys;
        $univ{ $lev1Key }->{ $lev2Key }->{ $lev3Key } = 1;
    }
    close $valuesFH or die qq{close: < HEREDOC: $!\n};

    print Data::Dumper->Dumpxs( [ \ %univ ], [ qw{ *univ } ] );

    The Data::Dumper output follows: first the structure after reading the universe file, then the structure after reading the values file and adding its data.

    %univ = (
              '456' => {
                         'abc' => {},
                         'xyz' => {}
                       },
              '123' => {
                         'abc' => {}
                       },
              '789' => {
                         'abc' => {}
                       }
            );
    %univ = (
              '456' => {
                         'abc' => {
                                    'xxx' => 1,
                                    'yyy' => 1
                                  },
                         'xyz' => {}
                       },
              '123' => {
                         'abc' => {
                                    'zzz' => 1
                                  }
                       },
              '789' => {
                         'abc' => {
                                    'xxx' => 1
                                  }
                       }
            );
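
    Since the original question was also about spotting universe records that no one uses, here is a short sketch of one way to pull them out of a structure shaped like the second dump. The `@unused` array and the hard-coded copy of `%univ` are assumptions made for illustration; in practice you would walk the hash built by the script.

```perl
use strict;
use warnings;

# Same shape as the final %univ: level 1 => level 2 => { users }.
my %univ = (
    '123' => { 'abc' => { 'zzz' => 1 } },
    '456' => { 'abc' => { 'xxx' => 1, 'yyy' => 1 }, 'xyz' => {} },
    '789' => { 'abc' => { 'xxx' => 1 } },
);

# Collect every level1->level2 pair whose innermost hash is still empty,
# i.e. a universe record that no value record pointed at.
my @unused;
for my $lev1Key ( sort keys %univ ) {
    for my $lev2Key ( sort keys %{ $univ{$lev1Key} } ) {
        # An empty hash is false in boolean context.
        push @unused, join( '->', $lev1Key, $lev2Key )
            unless %{ $univ{$lev1Key}{$lev2Key} };
    }
}

print "unused: $_\n" for @unused;    # prints "unused: 456->xyz"
```

    Each lookup is a direct hash access, so there are no repeated scans. One caveat: the assignment in the values loop will autovivify a pair that is missing from the universe file, so if that matters, test with `exists` at both levels before assigning.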

    I hope this is helpful.

    Cheers,

    JohnGG