http://www.perlmonks.org?node_id=752860

Angharad has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that looks like this:
>34513
-------------------------------MVAIIFDMDGVLYRG
-----N-RAIPGVRELIEF-------LKE-R--------G------
>22476
------------------------------ALKAVLVDLNGTLHI-
--------AVPGAQEALKR---------------------------
>56832
------MARCERLRGA-----ALRDVLG--RAQGVLFDCDGVLWNG-
----E-RAVPGAPELLER-------LAR-------------------
>12543
---------------------------E--QFDILLLDLDGVVYVG-
----D-RLLPGARRALRR----------------------------G
>29078
---------------------------------AVLFDIDGVLVLS-
----W-RAIPGAAETVRQ-------LTH-R--------G--------
For now, I'm just interested in the 'headers' (that is, the line starting with '>'). I would like to place each of these in a hash and increment a count. So, for example, the header '>34513' would have a count value of '1', the header '>12543' count value 4 and so forth. This is what I've done so far.
#!/usr/local/bin/perl
use strict;
use English;
use Data::Dumper;
use UNIVERSAL qw(isa);
use FileHandle;
use Exception;

my $alignment = shift;
if (!$alignment || ! -e $alignment) {
    die new Exception("couldnt open names file $alignment $!");
}
warn "# Reading alignment data";
my $alignData = getAlignData($alignment);
warn "# Got data: ".scalar (keys %$alignData);

#################################################
sub getAlignData {
    my ($fIn) = @ARG;
    my $fh = new FileHandle($fIn) or die "";
    my $count = 0;
    my $hData = {};
    while (my $line = $fh->getline) {
        my @cols = split /\s+/, $line;
        # search only for lines with identifier
        my $field = $cols[0];
        my $test = substr($field, 0, 1);
        if("$test" eq ">") {
            $count++;
            my $hEntry = {
                'identifier' => $cols[0],
                'line' => $count,
            };
            my ($record) = sort ($hEntry->{identifier});
            $hData->{$record} = $hEntry;
        }
    }
    foreach my $k ( keys %{$hData} ) {
        printf "%s -> %s\n", $k, $hData->{$k};
    }
    return $hData;
}
However, when I try to print out the hash I get the following.
>34513 -> HASH(0x87a3a40)
>22476 -> HASH(0x87a3980)
>56832 -> HASH(0x8762380)
>12543 -> HASH(0x87a3940)
>29078 -> HASH(0x8892b30)
Can anyone please tell me what I may be doing wrong? Thanks in advance.

Re: Not able to print out hash contents correctly
by Erez (Priest) on Mar 24, 2009 at 13:44 UTC

    Each value in $hData is itself a reference to a hash, which is why printing it with %s shows HASH(0x...). You need to either dereference each one and iterate over its keys (as you did in foreach my $k ( keys %{$hData} )) or call Dumper($hData->{$k}) and let it print the data structure for you.
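
Both options can be sketched roughly as follows. The sample data and the helper name format_entry are made up for illustration; only the 'identifier' and 'line' keys come from the original code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample data, shaped like the $hData built in the question.
my $hData = {
    '>34513' => { identifier => '>34513', line => 1 },
    '>12543' => { identifier => '>12543', line => 4 },
};

# Format one entry by dereferencing the inner hash reference
# (format_entry is an illustrative helper, not part of the original code).
sub format_entry {
    my ($key, $entry) = @_;
    return sprintf "%s -> identifier: %s, line: %d",
        $key, $entry->{identifier}, $entry->{line};
}

# Option 1: dereference each value; sort by line number for a stable order.
for my $k ( sort { $hData->{$a}{line} <=> $hData->{$b}{line} } keys %{$hData} ) {
    print format_entry($k, $hData->{$k}), "\n";
}

# Option 2: let Data::Dumper print the nested structure.
$Data::Dumper::Sortkeys = 1;
print Dumper($hData->{'>12543'});
```

Printing $hData->{$k}{line} directly in the original printf would also work if only the count is wanted.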

    "A core tenant of the greater Perl philosophy is to trust that the developer knows enough to solve the problem" - Jay Shirley, A case for Catalyst.

Re: Not able to print out hash contents correctly
by poolpi (Hermit) on Mar 24, 2009 at 15:50 UTC

    I would like to place each of these in a hash and increment a count

    Your code could be simplified and still fit your needs:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my ( %h, $count );
    while (<DATA>) {
        next unless /^>(\d+)/;
        $count++;
        defined $1 and $h{$1} = $count;
    }
    print Dumper \%h;

    __DATA__
    >34513
    -------------------------------MVAIIFDMDGVLYRG
    -----N-RAIPGVRELIEF-------LKE-R--------G------
    >22476
    ------------------------------ALKAVLVDLNGTLHI-
    --------AVPGAQEALKR---------------------------
    >56832
    ------MARCERLRGA-----ALRDVLG--RAQGVLFDCDGVLWNG-
    ----E-RAVPGAPELLER-------LAR-------------------
    >12543
    ---------------------------E--QFDILLLDLDGVVYVG-
    ----D-RLLPGARRALRR----------------------------G
    >29078
    ---------------------------------AVLFDIDGVLVLS-
    ----W-RAIPGAAETVRQ-------LTH-R--------G--------
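
Since the original script takes the alignment file as an argument, the same loop might be adapted to read from a filehandle instead of DATA. This is only a sketch: the sub name count_headers is invented here, and it assumes each header sits at the start of its own line.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Map each header ID to its position among the headers read from $fh.
# (count_headers is a hypothetical name used for this sketch.)
sub count_headers {
    my ($fh) = @_;
    my ( %h, $count );
    while ( my $line = <$fh> ) {
        next unless $line =~ /^>(\d+)/;   # skip sequence lines
        $h{$1} = ++$count;
    }
    return \%h;
}

# Only runs when a file name is given on the command line.
if (@ARGV) {
    my $file = shift @ARGV;
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    my $h = count_headers($fh);
    close $fh;
    # Print the IDs in the order they appeared in the file.
    printf "%s -> %d\n", $_, $h->{$_}
        for sort { $h->{$a} <=> $h->{$b} } keys %$h;
}
```

Note that, as in the code above, the keys are stored without the leading '>'; if an ID appears more than once, the later position overwrites the earlier one.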


    hth,
    PooLpi

    'Ebry haffa hoe hab im tik a bush'. Jamaican proverb