PerlMonks  

A Hash that is giving me the ####s

by hoffy (Acolyte)
on Sep 29, 2010 at 07:25 UTC ( [id://862555]=perlquestion )

hoffy has asked for the wisdom of the Perl Monks concerning the following question:

Hello oh wise monks,

I am trying to construct a Hash of an Array and frankly I am making a Hash of it. This is what I am trying to achieve

  1. I am creating the hash with the keys only and no values. This is done by reading in a file and splitting out values based on the same position. This also includes removing tabs and white space, creating an array to be used later and using a temp hash to weed out any duplicates. The file I am reading in looks like this (there are multiple lines to the file):
    # Comments - To be ignored
    field 1 ;field 2 ;field 3 ;field 4 ;field 5; field 6;
    Fields 5 onwards are not needed for this process. Field 4 is what I want to use for the key, but is not unique. Field 1 is then going to be used to populate the array in the hash and is unique for each line. The first part of the code, to create the empty hash is done like this:
    while (<FILE>) {
        chomp;
        if ($_ !~ m/#/) {
            s/\t//g;
            s/\s//g;
            ($f1,$f2,$f3,$f4,$f5) = split (/;/, $_,5);
            push @file,$_;
            unless ($seen{$f4}) {
                $seen{$f4}=1;
                $mailhouse{$f4} = "";
            }
        }
    }
    This seems to work
  2. What I am doing next is to search @file, the array that I have made, using the hash keys to match against field 4. If field 4 is a match, I then want to push field 1 into the array stored in the hash under that key:
    for $mh ( keys %mailhouse ) {
        foreach (@file) {
            ($g1,$g2,$g3,$g4,$g5) = split (/;/, $_,5);
            if ($mh eq $g4) {
                push @{ $mailhouse{$mh} }, $g1;
            }
        }
    }
    This is the part that does not work.

What seems to happen is that once field 1 is pushed into the array, it shows up under every key of the hash.

Can anyone point me in the right direction? Can anyone suggest an alternative?

Cheers, the hoff

Replies are listed 'Best First'.
Re: A Hash that is giving me the ####s
by jwkrahn (Abbot) on Sep 29, 2010 at 09:27 UTC
    I am creating the hash with the keys only and no values.

    That is not possible. In Perl's hashes, all keys have a value, and conversely, all values have a key.

    $mailhouse{$f4} = "";

    Actually, you are assigning a string as a hash value, and then later on trying to use that string as an array reference, which is not possible! If you had strict enabled, Perl would have informed you of this mistake.

    What you should be doing is assigning an anonymous array as your hash value:

    $mailhouse{$f4} = [];
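
As a minimal sketch of the difference (the sample records and the helper names `build_fixed` and `build_broken` are made up for illustration, not taken from the original script): with strict off, Perl appears to treat the `""` value as a symbolic reference, so every key pushes onto the same package array, which is consistent with the symptom described. With strict on, the same push dies, while the `[]` version gives each key its own array.

```perl
use strict;
use warnings;

# Hypothetical sample records: field 1 is unique, field 4 is the key.
my @file = ('a1;x;y;north;z', 'a2;x;y;south;z', 'a3;x;y;north;z');

# Fixed version: each key starts with its own anonymous array.
sub build_fixed {
    my @records = @_;
    my %mailhouse;
    for my $line (@records) {
        my ($f1, $f4) = (split /;/, $line)[0, 3];
        $mailhouse{$f4} = [] unless exists $mailhouse{$f4};
        push @{ $mailhouse{$f4} }, $f1;
    }
    return \%mailhouse;
}

# Original approach: values start as "", which "strict refs"
# refuses to dereference as an array.
sub build_broken {
    my @records = @_;
    my %mailhouse;
    for my $line (@records) {
        my ($f1, $f4) = (split /;/, $line)[0, 3];
        $mailhouse{$f4} = "" unless exists $mailhouse{$f4};
        push @{ $mailhouse{$f4} }, $f1;   # dies at run time under strict
    }
    return \%mailhouse;
}

my $fixed = build_fixed(@file);
print join(',', @{ $fixed->{north} }), "\n";   # a1,a3

my $ok = eval { build_broken(@file); 1 };
print "strict caught it: $@" unless $ok;
```

The `eval` is only there so the demonstration can report the error and keep running; in a real script you would simply let strict stop you.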
Re: A Hash that is giving me the ####s
by suhailck (Friar) on Sep 29, 2010 at 07:53 UTC
    Is this what you are looking for?
    cat test
    1;2; 3; 4; 5 ; 5
    2;2;4; 4; 6; 7
    3;1;2; 3;4;5
    4;1;1;1;1; 1


    perl -MData::Dumper -le 'my %hash; while(<>) { my ($field1,$field4)=(split /\s*;\s*/)[0,3]; push @{$hash{$field4}},$field1 } print Dumper(\%hash) ' test
    $VAR1 = {
              '1' => [ '4' ],
              '4' => [ '1', '2' ],
              '3' => [ '3' ]
            };
    Update: as for the actual problem in your posted code, I think you will need to read up on autovivification in Perl.
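
A rough script-sized version of the same idea, for anyone who prefers it to a one-liner. The helper name `group_by_field4` is hypothetical, and skipping only lines that start with `#` is an assumption (the original post skips any line containing `#`). The point is that pushing onto `@{ $hash{$key} }` for a key that does not exist yet creates the array automatically, so no separate pass to seed the keys is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: collect field 1 values under their field 4 key.
sub group_by_field4 {
    my @lines = @_;
    my %hash;
    for my $line (@lines) {
        next if $line =~ /^\s*#/;    # assumption: comments start the line
        my ($field1, $field4) = (split /\s*;\s*/, $line)[0, 3];
        # The first push for a new key autovivifies the anonymous array.
        push @{ $hash{$field4} }, $field1;
    }
    return \%hash;
}

my $grouped = group_by_field4(
    '1;2; 3; 4; 5 ; 5',
    '2;2;4; 4; 6; 7',
    '# a comment line',
    '3;1;2; 3;4;5',
);
print "$_ => @{ $grouped->{$_} }\n" for sort keys %$grouped;
```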
Re: A Hash that is giving me the ####s
by halfcountplus (Hermit) on Sep 29, 2010 at 11:21 UTC

    One little thing. You can do this: (CORRECTED)

    s/[\s\t]//g;

    To remove any character from a set, in this case whitespace or a tab. However, neither that nor what you have is actually necessary, as \s matches any kind of whitespace, which in ASCII includes the space and the tab (as well as newline, carriage return, and form feed). So just \s is fine.

    The other thing is that what jwkrahn recommends:

    $mailhouse{$f4} = [];

    Is correct, but I would call it an anonymous array reference just to be completely clear (although, of course, the only thing that can contain an anonymous array is a reference).

      s/[\s|\t]//g;

      If the \t in the [\s|\t] character set had been some non-redundant, non-whitespace character and thus not subject to removal, a potential problem would have remained. The | (pipe) character in the set is a literal '|', not the regex alternation metacharacter, so the set would have contained an extraneous character – perhaps the basis of a subtle bug. A regex character set implies alternation.

        Egads!! My bad AnomalousMonk, and much thanks; I had not noticed that about [] -- looks like I have a few scripts to grep thru and correct. Heh-heh. I've also changed the above post. Point being:

        s/\s|\t//g;

        Is more or less equivalent to:

        s/[\s\t]//g;

        Whereas:

        s/[\s|\t]//g;

        Contains that subtle flaw with the pipe. (But again, just \s will do in this case anyway.)
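
A small sketch of that flaw in action (the sample string is made up for illustration): inside a bracketed class the pipe is just another character to match, so it gets stripped along with the whitespace.

```perl
use strict;
use warnings;

my $input = "a |\tb";   # letter, space, pipe, tab, letter

(my $with_pipe = $input) =~ s/[\s|\t]//g;   # class also holds a literal |
(my $plain     = $input) =~ s/\s//g;        # whitespace only

print "$with_pipe\n";   # ab
print "$plain\n";       # a|b
```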

Re: A Hash that is giving me the ####s
by hoffy (Acolyte) on Sep 29, 2010 at 23:03 UTC

    "An operator with Perl knowledge is just like a Donkey with a Piano. Now one knows how he got it and he sure doesn't know what to do with it......"

    Thank you for your quick responses! I knew it would be something easy.

    cheers

      Now one knows how he got it

      How does one now know how he got it? You didn't tell us.
