Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

splitting and coloring

by sanku (Beadle)
on Nov 27, 2006 at 10:00 UTC ( [id://586214]=perlquestion: print w/replies, xml ) Need Help??

sanku has asked for the wisdom of the Perl Monks concerning the following question:

hai monks, i have one sequence as follows,
MAIFNSLVFLPASFTLSYMWNFGSMLGIMLMSQILTGFFLTFYYTAGEAFSSVQYIMFEVNLGWLLRIMH +SNGASMFFLFIYLHIFKGLIYGSYRLIGVWLSGIFIYFLLMGIAFTGYVLIWGQMSYWAAVVITSLMTS +VPYLGKYLVWWGSFSVCENTLKFFYSVHFILPWSLMVLVVFHLFFLHFTGSSSSLYCHGDYDKIHFFPS +FWLKDGFDIFFYFFLILFSLYFSFDLSDPMIFVESDSMAS
output of the above code should be like in this format;
MAIFNSLVFL PASFTLSYMW NFGSMLGIML MSQILTGFFL TFYYTAGEAF SSVQYIMFEV NLGWLLRIMH SNGASMFFLF IYLHIFKGLI YGSYRLIGVW LSGIFIYFLL MGIAFTGYVL IWGQMSYWAA VVITSLMTSV PYLGKYLVWW IWGSFSVCEN TLKFFYSVHF ILPWSLMVLV VFHLFFLHFT G<font color=red>SSS</font +>LYCHG
In the above sequence the search patter is 'SSS' if they their are 4 'S' continousely then it should be colored as SSSS In case if the search patter is found between two gaps of the sequences like as follows the it should be colored as follows MAIFNSVFSS SSALKM i tryed it by as following codes
My code is @y=q{MAIFNSLVFLPASFTLSYMWNFGSMLGIMLMSQILTGFFLTFYYTAGEAFSSVQYIMFEVNLGWL +LRIMHSNGASMFFLFIYLHIFKGLIYGSYRLIGVWLSGIFIYFLLMGIAFTGYVLIWGQMSYWAAVVIT +SLMTSVPYLGKYLVWWGSFSVCENTLKFFYSVHFILPWSLMVLVVFHLFFLHFTGSSSSLYCHGDYDKI +HFFPSFWLKDGFDIFFYFFLILFSLYFSFDLSDPMIFVESDSMAS}; while($j<$n-$m) { if($lastch==$y[$j+$m-1] && $firstch == $y[$j]){ $data="";$pat=""; for($i=$m-1;$i>=0;$i--){ $data=$y[$j+$i].$data; $pat=$x[$i].$pat; } if($data eq $pat){ @start=($j); @end=$j+$m; print "<b><font color=brown face=britanic,helvetica,sans-serif,times, +gill,courier size=4>Match Position : ",$j+1; #print "\n$entry\n$title +\n$acc \n SEQUENCES=>\n@y\n\n"; print " - ", $j +$m,"</b></font>"; print "<font color=blue size=4 face=\"courier\"><b>";$ss=0; for(my $j=0;$j<=$#start;$j++) { for(my $i=$ss;$i<=$#y;$i++) { if($start[$j] == -1) { next; } if($i % 10 == 0) { print " ";} if($i % 50 == 0) {print "<br>";print OUTFILE"\n";} if($start[$j] != $i && $i<$start[$j]) { print "$y[$i]"; print OUTFILE "$y[$i]"; } if($start[$j] == $i) { line: $k=$i; do { print "</b></font><font color=red siz +e=4 face=\"courier\"><b><blink>"; if($k % 10 == 0 && $k != $start[$j]) { print " "; print OUTFILE " "; } if($k % 50 == 0 && $k != $start[$j]) { print "<br>";print OUTFILE "\n"; } print "$y[$k]</b></blink></font><font +color=blue size=4 face=\"courier\"><b>"; print OUTFILE "$y[$k]"; $k++; }until($end[$j] == $k || $end[$j+1] == $k +);$ss=$k; if($k-1 == $start[$j+1]) { $i=($start[$j+1])+1; $j=$j+1;goto line; } elsif($k-1>=$start[$j+1] && $j <$#start) { $i=$k; $j=$j+1;goto line; } else {last;} } } } for(my $i=$k;$i<$#y;$i++) { if($i % 10 == 0) { print " "; print OUTFILE " ";} if($i % 50 == 0 ) {print "<br>";print OUTFILE "\n";} print "</b></font><font color=blue size=4 face=\"c +ourier\"><b>$y[$i]"; print OUTFILE "$y[$i]"; } print "</b></font><br> <hr color=navy>";
can i able to reduce this code so, Please, give me any suggession on it, because it take more time to execute this program as i colored the pattern by using arrays, if their is any way to reduce the code so, i am eager to know the answer, form you monk, please can any body give me suggession so for it. by sanku.

Replies are listed 'Best First'.
Re: spliting a lengthy text
by BrowserUk (Patriarch) on Nov 27, 2006 at 10:26 UTC

    I'd do it this way: (This requires 5.8.x)

    $str = 'MKTVEQTSPTMTSEKARWIY ... ELLTDVYVSYDHEGRNG'; print join ' ', unpack '(a10)*', $_ for unpack '(a50)*', $str;; MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Very nice (++) but aren't you missing the new line? ...
      print join ' ', unpack '(a10)*', "$_\n" for unpack '(a50)*', $str;
      update that newline is in the wrong place...
      print "" . join ' ', unpack '(a10)*', $_ . "\n" for unpack '(a50)*', $ +str;
      update A regex solution based on your plan of attack (if only because (un)?pack still scares me :)
      for ($str =~ /.{1,50}/g) {print join " ", /.{1,10}/g; print "\n"}
      ---
      my name's not Keith, and I'm not reasonable.

        I habitually run perl with -l enabled (see perlrun), which has the effect of setting $/ =  "\n" (and $\ = "\n"), so print adds a newline automatically. Hence the output I posted is unmodifed.

        This makes print (roughly?) equivalent to say in Perl 6.

        I also seem to recall seeing a news item that suggested that this keyword would also be available in Perl 5 as of v5.10?


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: spliting a lengthy text
by davorg (Chancellor) on Nov 27, 2006 at 10:06 UTC

    What have you tried? What problems are you having?

    We're far more likely to help people who have tried to solve their problems for themselves first.

    Have you looked at split? Or substr? Personally I think I'd use a regular expression.

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Re: spliting a lengthy text
by Samy_rio (Vicar) on Nov 27, 2006 at 10:15 UTC

    Hi sanku, First thing is look into How (Not) To Ask A Question and How do I post a question effectively?, then you try like this,

    use strict; use warnings; my $str = 'MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIAVGL +CAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVGGGF +TLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYASACE +QITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGREKE +EHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDHEGR +NG'; $str =~ s/(.{10})/$1 /gsi; $str =~ s/((?:[^ ]+ ){5})/$1\n/gsi; print $str;

    davorg++

    Updated : Thanks BrowserUK

    Regards,
    Velusamy R.


    eval"print uc\"\\c$_\""for split'','j)@,/6%@0%2,`e@3!-9v2)/@|6%,53!-9@2~j';

      Using $& (and friends) can be nasty. See Devel::SawAmpersand for why.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
Re: spliting a lengthy text
by johngg (Canon) on Nov 27, 2006 at 11:02 UTC
    Here's a solution using substr for the sake of variety.

    use strict; use warnings; my $string = q{MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAI +AVGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIV +GGGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYA +SACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTG +REKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYD +HEGRNG}; my $newString = q{}; my $count = 0; while (length $string >= 10) { $count += 10; $newString .= substr $string, 0, 10, q{}; $newString .= $count % 50 ? q{ } : qq{\n}; } $newString .= qq{$string\n}; print $newString;

    and the output

    MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG

    I hope this is of interest.

    Cheers,

    JohnGG

    Update: Amended code to remove length test in last .= assignment as it would fail to append a newline if the string length was a multiple of 10 but not of 50. Original line

    $newString .= qq{$string\n} if length $string;

Re: spliting a lengthy text (BioPerl)
by bobf (Monsignor) on Nov 27, 2006 at 20:51 UTC

    If your question is how to convert a sequence file in fasta format to a sequence file in genbank format, then I'd suggest looking at Bio::SeqIO from BioPerl. That module makes formatting changes trivial:

    use Bio::SeqIO; my $in_obj = Bio::SeqIO->newFh( '-file' => $infile_name, '-format' => $infile_format ); my $out_obj = Bio::SeqIO->newFh( '-file' => '>' . $outfile_name, '-format' => $outfile_format ); while( <$in_obj> ) { print $out_obj $_; }

    If you want a custom format based on the genbank format, you could tweak the Bio::SeqIO::GenBank module. The relevant section of code (based on your original description) is in the write_seq function:

    # print out the sequence my $nuc = 60; # Number of nucleotides per line my $whole_pat = 'a10' x 6; # Pattern for unpacking a whole line my $out_pat = 'A11' x 6; # Pattern for packing a line

Re: spliting a lengthy text
by themage (Friar) on Nov 27, 2006 at 10:26 UTC
    Hi sanku,

    I think that this code does what you need:
    $a=<>; while ($a=~s/(\s?)(\w{10})(\w)/$1$2 $3/){} $a=~s/((\w+\s){5})/$1\n/g; print $a,"\n";
    The code assumes that the original string comes from STDIN and the final strings goes to STDOUT. I presume you'll have no problem adapting it to your real needs.

Re: spliting a lengthy text
by jwkrahn (Abbot) on Nov 27, 2006 at 10:33 UTC
    $ perl -le' my $string = "MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIA +VGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVG +GGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYAS +ACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGR +EKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDH +EGRNG"; my $count; print map $_ . (++$count % 5 ? " " : "\n"), unpack "(A10)*", $string; ' MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG
Re: spliting a lengthy text
by jwkrahn (Abbot) on Nov 27, 2006 at 10:45 UTC
    $ perl -le' my $string = "MKTVEQTSPTMTSEKARWIYQKMVEIRMFEDRVHDIFSKGEIPGFVHLYAGEEAIA +VGLCAHLDHNDYITSTHRGHGHCIAKGCELDGMMAEIYGKSTGLCKGKGGSMHIADLDRGMLGANGIVG +GGFTLAAGAALTAKFKQTGGVAVCFFGDGANNQGTFHEGINLAAIWDLPVVFVAENNGYGEATPFHYAS +ACEQITDRAKGYNIPGVKVDGKDVVAVYEVAREAVERARRGEGPTLIECITYRNYGHFEGDAQTYKTGR +EKEEHTEERDAITLFEKYALSNNLLTEEAIQTVRHEVEQSVDRAVDFANASDYPQPEELLTDVYVSYDH +EGRNG"; my $count; $string =~ s/(?<=\G.{10})/ ++$count % 5 ? " " : "\n" /eg; print $string; ' MKTVEQTSPT MTSEKARWIY QKMVEIRMFE DRVHDIFSKG EIPGFVHLYA GEEAIAVGLC AHLDHNDYIT STHRGHGHCI AKGCELDGMM AEIYGKSTGL CKGKGGSMHI ADLDRGMLGA NGIVGGGFTL AAGAALTAKF KQTGGVAVCF FGDGANNQGT FHEGINLAAI WDLPVVFVAE NNGYGEATPF HYASACEQIT DRAKGYNIPG VKVDGKDVVA VYEVAREAVE RARRGEGPTL IECITYRNYG HFEGDAQTYK TGREKEEHTE ERDAITLFEK YALSNNLLTE EAIQTVRHEV EQSVDRAVDF ANASDYPQPE ELLTDVYVSY DHEGRNG

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://586214]
Approved by marto
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others exploiting the Monastery: (4)
As of 2024-04-25 23:55 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found