splitting and coloring

sanku has asked for the wisdom of the Perl Monks concerning the following question:

hai monks, i have one sequence as follows,

MAIFNSLVFLPASFTLSYMWNFGSMLGIMLMSQILTGFFLTFYYTAGEAFSSVQYIMFEVNLGWLLRIMH
+SNGASMFFLFIYLHIFKGLIYGSYRLIGVWLSGIFIYFLLMGIAFTGYVLIWGQMSYWAAVVITSLMTS
+VPYLGKYLVWWGSFSVCENTLKFFYSVHFILPWSLMVLVVFHLFFLHFTGSSSSLYCHGDYDKIHFFPS
+FWLKDGFDIFFYFFLILFSLYFSFDLSDPMIFVESDSMAS
[download]

output of the above code should be like in this format;

MAIFNSLVFL PASFTLSYMW NFGSMLGIML MSQILTGFFL TFYYTAGEAF 
SSVQYIMFEV NLGWLLRIMH SNGASMFFLF IYLHIFKGLI YGSYRLIGVW 
LSGIFIYFLL MGIAFTGYVL IWGQMSYWAA VVITSLMTSV PYLGKYLVWW 
IWGSFSVCEN TLKFFYSVHF ILPWSLMVLV VFHLFFLHFT G<font color=red>SSS</font
+>LYCHG
[download]

In the above sequence the search patter is 'SSS' if they their are 4 'S' continousely then it should be colored as SSSS In case if the search patter is found between two gaps of the sequences like as follows the it should be colored as follows MAIFNSVFSS SSALKM i tryed it by as following codes

My code is
@y=q{MAIFNSLVFLPASFTLSYMWNFGSMLGIMLMSQILTGFFLTFYYTAGEAFSSVQYIMFEVNLGWL
+LRIMHSNGASMFFLFIYLHIFKGLIYGSYRLIGVWLSGIFIYFLLMGIAFTGYVLIWGQMSYWAAVVIT
+SLMTSVPYLGKYLVWWGSFSVCENTLKFFYSVHFILPWSLMVLVVFHLFFLHFTGSSSSLYCHGDYDKI
+HFFPSFWLKDGFDIFFYFFLILFSLYFSFDLSDPMIFVESDSMAS};

while($j<$n-$m)
    {
    if($lastch==$y[$j+$m-1] && $firstch == $y[$j]){
                                $data="";$pat="";
                for($i=$m-1;$i>=0;$i--){
                       $data=$y[$j+$i].$data;
                        $pat=$x[$i].$pat;
                       
                    }

                    if($data eq $pat){
                    @start=($j);
                    @end=$j+$m;

print "<b><font color=brown  face=britanic,helvetica,sans-serif,times,
+gill,courier size=4>Match Position : ",$j+1; #print "\n$entry\n$title
+\n$acc \n SEQUENCES=>\n@y\n\n";    
                    print " - ", $j +$m,"</b></font>";


print "<font color=blue  size=4 face=\"courier\"><b>";$ss=0;
                for(my $j=0;$j<=$#start;$j++)
                {
                    for(my $i=$ss;$i<=$#y;$i++)
                    {
                        if($start[$j] == -1)
                        {
                        next;
                        }
                        if($i % 10 == 0)
                        { print " ";}
                        if($i % 50 == 0)
                        {print "<br>";print OUTFILE"\n";}

                        if($start[$j] != $i && $i<$start[$j])
                        {
                            print "$y[$i]";
                            print OUTFILE "$y[$i]";    
                        }
                        if($start[$j] == $i)
                        {
                        line:    $k=$i;
                            do
                            {
                                print "</b></font><font color=red  siz
+e=4 face=\"courier\"><b><blink>";
                                if($k % 10 == 0 && $k != $start[$j])
                                { 
                                print " ";
                                print OUTFILE " ";
                                }
                                if($k % 50 == 0 && $k != $start[$j])
                                {        
                                    print "<br>";print OUTFILE "\n";
                                }
                                print "$y[$k]</b></blink></font><font 
+color=blue size=4 face=\"courier\"><b>";
                                print OUTFILE "$y[$k]";
                                $k++;
                            }until($end[$j] == $k || $end[$j+1] == $k 
+);$ss=$k;
                            if($k-1 == $start[$j+1])
                            {
                                $i=($start[$j+1])+1;
                                $j=$j+1;goto line;
                            }
                            elsif($k-1>=$start[$j+1] && $j <$#start) 
                            {
                                $i=$k;
                                $j=$j+1;goto line;
                            }
                            else
                            {last;}
                        }
                    }
                }
                for(my $i=$k;$i<$#y;$i++)
                {
                    if($i % 10 == 0)
                    { print " ";
                    print OUTFILE " ";}
                    if($i % 50 == 0 )
                    {print "<br>";print OUTFILE "\n";}
                    print "</b></font><font color=blue size=4 face=\"c
+ourier\"><b>$y[$i]";
                    print OUTFILE "$y[$i]";
                }
                print "</b></font><br> <hr color=navy>";
[download]

can i able to reduce this code so, Please, give me any suggession on it, because it take more time to execute this program as i colored the pattern by using arrays, if their is any way to reduce the code so, i am eager to know the answer, form you monk, please can any body give me suggession so for it. by sanku.

Comment on splitting and coloring Select or Download Code

Replies are listed 'Best First'.
Re: spliting a lengthy text by BrowserUk (Patriarch) on Nov 27, 2006 at 10:26 UTC
I'd do it this way: (This requires 5.8.x) `$str = 'MKTVEQTSPTMTSEKARWIY ... ELLTDVYVSYDHEGRNG'; print join ' ', unpack '(a10)', $_ for unpack '(a50)', $str;; MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG` [download] Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l]
Re^2: spliting a lengthy text by reasonablekeith (Deacon) on Nov 27, 2006 at 10:35 UTC
Very nice (++) but aren't you missing the new line? ... `print join ' ', unpack '(a10)', "$_\n" for unpack '(a50)', $str;` [download] update that newline is in the wrong place... `print "" . join ' ', unpack '(a10)', $_ . "\n" for unpack '(a50)', $ +str;` [download] update A regex solution based on your plan of attack (if only because (un)?pack still scares me :) `for ($str =~ /.{1,50}/g) {print join " ", /.{1,10}/g; print "\n"}` [download] --- my name's not Keith, and I'm not reasonable.	[reply] [d/l] [select]
Re^3: spliting a lengthy text by BrowserUk (Patriarch) on Nov 27, 2006 at 11:02 UTC
I habitually run perl with `-l` enabled (see perlrun), which has the effect of setting `$/ = "\n"` (and `$\ = "\n"`), so print adds a newline automatically. Hence the output I posted is unmodifed. This makes print (roughly?) equivalent to `say` in Perl 6. I also seem to recall seeing a news item that suggested that this keyword would also be available in Perl 5 as of v5.10? Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re^4: spliting a lengthy text by grinder (Bishop) on Nov 27, 2006 at 14:17 UTC
Re: spliting a lengthy text by davorg (Chancellor) on Nov 27, 2006 at 10:06 UTC
What have you tried? What problems are you having? We're far more likely to help people who have tried to solve their problems for themselves first. Have you looked at split? Or substr? Personally I think I'd use a regular expression. -- <http://dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg	[reply]
Re: spliting a lengthy text by Samy_rio (Vicar) on Nov 27, 2006 at 10:15 UTC
Hi sanku, First thing is look into How (Not) To Ask A Question and How do I post a question effectively?, then you try like this, `use strict; use warnings; my $str = 'MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIAVGL +CAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVGGGF +TLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYASACE +QITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGREKE +EHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDHEGR +NG'; $str =~ s/(.{10})/$1 /gsi; $str =~ s/((?:[^ ]+ ){5})/$1\n/gsi; print $str;` [download] davorg++ Updated : Thanks BrowserUK Regards, Velusamy R. eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@\|6%,53!-9@2~j';	[reply] [d/l] [select]
Re^2: spliting a lengthy text by BrowserUk (Patriarch) on Nov 27, 2006 at 10:23 UTC
Using `$&` (and friends) can be nasty. See Devel::SawAmpersand for why. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]
Re: spliting a lengthy text by johngg (Canon) on Nov 27, 2006 at 11:02 UTC
Here's a solution using `substr` for the sake of variety. use strict; use warnings; my $string = q{MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAI +AVGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIV +GGGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYA +SACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTG +REKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYD +HEGRNG}; my $newString = q{}; my $count = 0; while (length $string >= 10) { $count += 10; $newString .= substr $string, 0, 10, q{}; $newString .= $count % 50 ? q{ } : qq{\n}; } $newString .= qq{$string\n}; print $newString; [download] and the output `MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG` [download] I hope this is of interest. Cheers, JohnGG Update: Amended code to remove `length` test in last `.=` assignment as it would fail to append a newline if the string length was a multiple of 10 but not of 50. Original line `$newString .= qq{$string\n} if length $string;` [download]	[reply] [d/l] [select]
Re: spliting a lengthy text (BioPerl) by bobf (Monsignor) on Nov 27, 2006 at 20:51 UTC
If your question is how to convert a sequence file in fasta format to a sequence file in genbank format, then I'd suggest looking at Bio::SeqIO from BioPerl. That module makes formatting changes trivial: `use Bio::SeqIO; my $in_obj = Bio::SeqIO->newFh( '-file' => $infile_name, '-format' => $infile_format ); my $out_obj = Bio::SeqIO->newFh( '-file' => '>' . $outfile_name, '-format' => $outfile_format ); while( <$in_obj> ) { print $out_obj $_; }` [download] If you want a custom format based on the genbank format, you could tweak the Bio::SeqIO::GenBank module. The relevant section of code (based on your original description) is in the `write_seq` function: `# print out the sequence my $nuc = 60; # Number of nucleotides per line my $whole_pat = 'a10' x 6; # Pattern for unpacking a whole line my $out_pat = 'A11' x 6; # Pattern for packing a line` [download]	[reply] [d/l] [select]
Re: spliting a lengthy text by themage (Friar) on Nov 27, 2006 at 10:26 UTC
Hi sanku, I think that this code does what you need: `$a=<>; while ($a=~s/(\s?)(\w{10})(\w)/$1$2 $3/){} $a=~s/((\w+\s){5})/$1\n/g; print $a,"\n";` [download] The code assumes that the original string comes from STDIN and the final strings goes to STDOUT. I presume you'll have no problem adapting it to your real needs. TheMage http://themage.bliker.com	[reply] [d/l]
Re: spliting a lengthy text by jwkrahn (Abbot) on Nov 27, 2006 at 10:33 UTC
$ perl -le' my $string = "MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIA +VGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVG +GGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYAS +ACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGR +EKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDH +EGRNG"; my $count; print map $_ . (++$count % 5 ? " " : "\n"), unpack "(A10)*", $string; ' MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG [download]	[reply] [d/l]
Re: spliting a lengthy text by jwkrahn (Abbot) on Nov 27, 2006 at 10:45 UTC
$ perl -le' my $string = "MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIA +VGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVG +GGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYAS +ACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGR +EKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDH +EGRNG"; my $count; $string =~ s/(?<=\G.{10})/ ++$count % 5 ? " " : "\n" /eg; print $string; ' MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG [download]	[reply] [d/l]


XP is just a number
	PerlMonks