http://www.perlmonks.org?node_id=1052671


in reply to control DNA file with perl

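Something like the code below works, for some value of 'works'. If the input contains Us, Ns, spaces, blank lines, or lowercase bases, it won't, so I would suggest you add some sanity checking. It also assumes every sequence has the same length, since each column is divided by the total number of sequences.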

However, you've not really explained what a position frequency matrix is (I'm taking an informed guess). It would also have been helpful to see what code you had already attempted so we could comment on that, rather than having the sneaking suspicion we're doing your homework for you. See How (Not) To Ask A Question.

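use strict;
use warnings;

my @pmf;          # $pmf[$i]{$base} = number of sequences with $base at position $i
my $count = 0;    # number of sequences read

while ( my $seq = <DATA> ) {
    $count++;
    chomp $seq;
    my @bases = split //, $seq;
    for my $i ( 0 .. $#bases ) {
        $pmf[$i]{ $bases[$i] }++;
    }
}

# Print each position with the percentage of sequences carrying each base.
for my $i ( 0 .. $#pmf ) {
    printf "%3u: ", $i;
    for my $base (qw{ A C G T }) {
        # A base never seen at this position has no hash entry, so default to 0.
        printf "$base %3.0f, ", 100 * ( $pmf[$i]{$base} // 0 ) / $count;
    }
    print "\n";
}

__DATA__
ATTCATCTCTCGG
ATTGTGAGATAGA
AAGATGATCGCTC
AGATAGATCGCTG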
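
As for the sanity checking mentioned above, here is a minimal sketch of one way to do it. The name clean_seq is made up for illustration, and the policy is only an assumption about what you want: whitespace is stripped, lowercase is accepted, U is treated as T, and any line containing N or another unexpected character is skipped rather than counted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a cleaned-up sequence, or undef if the
# line should be skipped.
sub clean_seq {
    my ($line) = @_;
    $line =~ s/\s+//g;                  # drop spaces, tabs and the newline
    return if $line eq '';              # blank line
    $line = uc $line;                   # accept lowercase bases
    $line =~ tr/U/T/;                   # assumption: treat RNA U as T
    return if $line =~ /[^ACGT]/;       # reject N and anything else unexpected
    return $line;
}

# Try it on a few awkward inputs.
for my $raw ( "attc atct\n", "AUGC\n", "ANNT\n", "\n" ) {
    my $seq = clean_seq($raw);
    print defined $seq ? "ok: $seq\n" : "skipped\n";
}
```

In the main loop you would call this in place of the chomp, and only increment $count for lines where it returns a defined value, so that skipped lines don't dilute the percentages.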