I can see that some answers have already been supplied for the problem of entering the values into a hash via a mapped grep. I have looked at the initial problem of identifying multiple values within each line. The aligned families can be retrieved with a loop over an array. However, I can imagine that further data manipulation would make having the families in a hash more helpful for future reference. Here is how I would initially extract the multiple values via an array loop.
use strict;
use warnings;

my $gendata = './gen.dat';
open my $gendat, '<', $gendata or die "can't open $gendata: $!";
my @rows = <$gendat>;
close $gendat;

# The first line holds the family names, aligned with the data columns.
my @families = split /\s+/, shift @rows;

foreach my $row (@rows) {
    my @state = split /\s+/, $row;
    print $state[0] . ' ';    # the OG identifier
    # Print the name of every family whose column is flagged with 1.
    for my $i (0 .. $#state) {
        print $families[$i] . ' ' if $state[$i] eq '1';
    }
    print "\n";
}
exit(0);
This gives:
OG_1 lacM mori
OG_2 taba glyB
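If the families are wanted in a hash for later manipulation, something along these lines might do. It is only a sketch: the inline sample rows, the 'ID' header label and the name %families_for are my own assumptions, since I have not seen the real gen.dat.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real gen.dat layout is assumed, not known.
my @lines = (
    "ID   lacM taba mori glyB",
    "OG_1 1    0    1    0",
    "OG_2 0    1    0    1",
);

# Header line; index 0 is the label of the ID column.
my @families = split ' ', shift @lines;

my %families_for;    # OG name => array ref of family names flagged with 1
for my $line (@lines) {
    my ($og, @flags) = split ' ', $line;
    # @flags is offset by one against @families because $og came off the front.
    $families_for{$og} =
        [ map { $families[ $_ + 1 ] } grep { $flags[$_] eq '1' } 0 .. $#flags ];
}

for my $og (sort keys %families_for) {
    print "$og @{ $families_for{$og} }\n";
}
```

With the sample rows this prints the same two lines as the array loop, and the hash can then be queried directly, e.g. @{ $families_for{OG_1} }.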
A different approach might be to assign a value to each of the families, extract the 01 strings and convert them to binary numbers, then return the relevant family pair as determined by the binary value. Of course, this would only work if there were definitely two unique families per 'OG'.
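A rough sketch of that binary idea follows. The family order, the two 01 strings and the names %row and %pair_for are hypothetical, chosen only to illustrate the lookup; it dies on any row that does not have exactly two families set.

```perl
use strict;
use warnings;

# Hypothetical family order and 01 strings; not taken from the real data.
my @families = qw(lacM taba mori glyB);
my %row      = (OG_1 => '1010', OG_2 => '0101');

# Precompute: binary value of every two-family combination => that pair.
my %pair_for;
for my $i (0 .. $#families - 1) {
    for my $j ($i + 1 .. $#families) {
        my $bits = '0' x scalar @families;
        substr($bits, $_, 1) = '1' for $i, $j;
        $pair_for{ oct("0b$bits") } = [ @families[ $i, $j ] ];
    }
}

for my $og (sort keys %row) {
    my $value = oct("0b$row{$og}");    # e.g. '1010' becomes 10
    my $pair  = $pair_for{$value}
        or die "$og does not have exactly two families set\n";
    print "$og @$pair\n";
}
```

The lookup table grows with the number of possible pairs, so for many families the plain array loop is probably the simpler choice.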