I have a file that looks like this:
>34513
-------------------------------MVAIIFDMDGVLYRG
-----N-RAIPGVRELIEF-------LKE-R--------G------
>22476
------------------------------ALKAVLVDLNGTLHI-
--------AVPGAQEALKR---------------------------
>56832
------MARCERLRGA-----ALRDVLG--RAQGVLFDCDGVLWNG-
----E-RAVPGAPELLER-------LAR-------------------
>12543
---------------------------E--QFDILLLDLDGVVYVG-
----D-RLLPGARRALRR----------------------------G
>29078
---------------------------------AVLFDIDGVLVLS-
----W-RAIPGAAETVRQ-------LTH-R--------G--------
For now, I'm only interested in the 'headers' (that is, the lines starting with '>'). I would like to store each header as a key in a hash, with an incrementing count as its value.
So, for example, the header '>34513' would have a count of 1, the header '>12543' a count of 4, and so on.
This is what I've done so far.
#!/usr/local/bin/perl
use strict;
use English;
use Data::Dumper;
use UNIVERSAL qw(isa);
use FileHandle;
use Exception;

my $alignment = shift;
if (!$alignment || ! -e $alignment) {
    die new Exception("couldnt open names file $alignment $!");
}
warn "# Reading alignment data";
my $alignData = getAlignData($alignment);
warn "# Got data: ".scalar (keys %$alignData);

#################################################
sub getAlignData {
    my ($fIn) = @ARG;
    my $fh = new FileHandle($fIn)
        or die "";
    my $count = 0;
    my $hData = {};
    while (my $line = $fh->getline)
    {
        my @cols = split /\s+/, $line;
        # search only for lines with identifier
        my $field = $cols[0];
        my $test = substr($field, 0, 1);
        if("$test" eq ">")
        {
            $count++;
            my $hEntry = {
                'identifier' => $cols[0],
                'line' => $count,
            };
            my ($record) = sort ($hEntry->{identifier});
            $hData->{$record} = $hEntry;
        }
    }
    foreach my $k ( keys %{$hData} )
    {
        printf "%s -> %s\n", $k, $hData->{$k};
    }
    return $hData;
}
However, when I print the hash, I get the following:
>34513 -> HASH(0x87a3a40)
>22476 -> HASH(0x87a3980)
>56832 -> HASH(0x8762380)
>12543 -> HASH(0x87a3940)
>29078 -> HASH(0x8892b30)
Can anyone please tell me what I may be doing wrong? Thanks in advance.
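A note on the output above: each value stored in the hash is itself a reference to an anonymous hash (the one holding 'identifier' and 'line'), and printing a reference with %s shows its string form, HASH(0x...). Printing $hData->{$k}{line} instead would show the count. If only the count is needed, it can be stored directly as the value. The following is a minimal sketch of that idea, not a drop-in replacement for the script: the helper name count_headers and the in-memory sample are assumptions made for illustration, and the sample lines are copied from the first two records of the data shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): map each header
# line (a line starting with '>') to its 1-based position among the headers.
sub count_headers {
    my ($fh) = @_;
    my %position;
    my $count = 0;
    while ( my $line = <$fh> ) {
        chomp $line;
        next unless $line =~ /^>/;      # skip the sequence lines
        $position{$line} = ++$count;    # store the plain number, not a hash ref
    }
    return \%position;
}

# In-memory sample: the first two records from the file shown earlier.
my $sample = <<'END';
>34513
-------------------------------MVAIIFDMDGVLYRG
-----N-RAIPGVRELIEF-------LKE-R--------G------
>22476
------------------------------ALKAVLVDLNGTLHI-
--------AVPGAQEALKR---------------------------
END

open my $fh, '<', \$sample or die "cannot open in-memory sample: $!";
my $position = count_headers($fh);
close $fh;

# Hash keys come back in no particular order, so sort by the stored count
# to print the headers in file order.
for my $header ( sort { $position->{$a} <=> $position->{$b} } keys %$position ) {
    printf "%s -> %d\n", $header, $position->{$header};
}
```

To run this against a real alignment file, the in-memory handle could be replaced with a handle opened on the file name passed on the command line.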