in reply to Lower-casing Substrings and Iterating Two Files together
If you bitwise-OR (|) an uppercase letter with a space (assuming ASCII/Latin-1 files), it will be lowercased:
    print 'ACGT' | '    ';;
    acgt
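The reason this works is that, in ASCII, an uppercase letter and its lowercase form differ only in the 0x20 bit, and 0x20 is the code of the space character. Here is a minimal sketch of that arithmetic. The helper name `lower_by_or` is made up for illustration, and it assumes single-byte ASCII text. Note that Perl pads the shorter operand of a string OR with NUL bytes, so the run of spaces has to be as long as the text.

```perl
use strict;
use warnings;

# 'A' is 0x41 and ' ' is 0x20; 0x41 | 0x20 is 0x61, which is 'a'.
printf "%#x | %#x = %#x\n", ord('A'), ord(' '), ord('A') | ord(' ');

# Hypothetical helper: OR the text with an equally long run of spaces.
sub lower_by_or {
    my( $text ) = @_;
    return $text | ( ' ' x length $text );
}

print lower_by_or( 'ACGT' ), "\n";   # acgt

# With a single space only the first letter changes, because the
# shorter operand is padded with NUL bytes.
print 'ACGT' | ' ', "\n";            # aCGT
```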
So, if you translate all the 'N's in your mask to spaces and then bitwise or the sequence and the mask, it will achieve your goal very efficiently:
    $s = 'GGTACACAGAAGCCAAAGCAGGCTCCAGGCTCTGAGCTGTCAGCACAGAGACCGAT';;
    $m = 'GGTNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNT';;
    ( $mm = $m ) =~ tr[N][\x20];;
    print $mm;;
    GGT                                                    T
    print $s | $mm;;
    GGTacacagaagccaaagcaggctccaggctctgagctgtcagcacagagaccgaT
That makes your entire program (ignoring the possibility, which you did not mention, that your files are in FASTA format):
    #! perl -slw
    use strict;

    open SEQ,  '<', 'data1.dat' or die $!;
    open MASK, '<', 'data2.dat' or die $!;

    while( my $seq = <SEQ> ) {      ## Read a sequence
        my $mask = <MASK>;          ## And the corresponding mask
        chomp( $seq, $mask );       ## -l only auto-chomps under -n/-p
        $mask =~ tr[N][ ];          ## Ns => spaces
        print $seq | $mask;         ## bitwise-OR them and print the result
    }

    close SEQ;
    close MASK;
Redirect the output to a third file and you're done.
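If the files do turn out to be FASTA, the header lines (those starting with '>') should pass through untouched. The following is only a sketch under stated assumptions: both files have the same layout line for line, each sequence line and its mask line have the same length, and the data is ASCII. The name `lowercase_masked` is hypothetical. As a variation on the approach above, it turns every non-N mask character into a NUL byte, so the result does not depend on the unmasked mask characters matching the sequence.

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase each base of $seq whose mask position is 'N'.
# Assumes ASCII data and that $seq and $mask have the same length.
sub lowercase_masked {
    my( $seq, $mask ) = @_;
    return $seq if $seq =~ /^>/;        # FASTA header: leave as is
    ( my $bits = $mask ) =~ tr/N/\0/c;  # everything except N => NUL (no effect under OR)
    $bits =~ tr/N/ /;                   # N => space (0x20)
    return $seq | $bits;                # string bitwise-OR
}

# Made-up sample records, standing in for lines read from the two files.
my @seq  = ( '>seq1', 'GGTACACAGAAG', 'CCAAAG' );
my @mask = ( '>seq1', 'GGTNNNNNNNNG', 'NNAAAG' );

print lowercase_masked( $seq[$_], $mask[$_] ), "\n" for 0 .. $#seq;
# >seq1
# GGTacacagaaG
# ccAAAG
```

In the file-reading loop, the call would replace the tr and the bitwise-OR, after both lines have been chomped.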