http://www.perlmonks.org?node_id=213729

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I'm trying to wrap my output so that it prints 60 chars per line, which I have done with this regexp:
$genome1 =~ s/(.{60})/$1\n/g;
However, I want to wrap 3 strings in this way, so that each wrapped block prints one line from each string, e.g.
111111111111111111111
222222222222222222222
333333333333333333333
111111111111111111111
222222222222222222222
333333333333333333333
etc.
Can anyone help me? Thanks ;-)

Replies are listed 'Best First'.
Re: trivial wrapping
by BrowserUk (Patriarch) on Nov 18, 2002 at 13:10 UTC

    If you don't mind destroying the strings as you output them, then this will do the job. If you need them for further processing, you could copy the array first. However, judging from the use of $genome as a variable name, your strings may well be extremely long, in which case copying them may be prohibitively expensive on memory. If so, say so and someone will suggest a non-destructive way. It's not much harder.

    #! perl -sw
    use strict;

    my @strings = map{"$_" x 200} 1 .. 3;

    while( length "@strings" > 2 ) {
        print substr( $strings[$_], 0, 60, '') . "\n" for 0 .. $#strings;
        print "\n";
    }
    __END__

    c:\test>213729
    111111111111111111111111111111111111111111111111111111111111
    222222222222222222222222222222222222222222222222222222222222
    333333333333333333333333333333333333333333333333333333333333

    111111111111111111111111111111111111111111111111111111111111
    222222222222222222222222222222222222222222222222222222222222
    333333333333333333333333333333333333333333333333333333333333

    111111111111111111111111111111111111111111111111111111111111
    222222222222222222222222222222222222222222222222222222222222
    333333333333333333333333333333333333333333333333333333333333

    11111111111111111111
    22222222222222222222
    33333333333333333333

    c:\test>

    Update: I noticed the mention of $genome1 after my first attempt at this, and then disliked both the destructive nature of the code and the fact that the loop condition would be expensive on memory for large strings and/or arrays. So here's a better version without those caveats. The output is the same.

    #! perl -sw
    use strict;

    my @strings = map{"$_" x 200} 1 .. 3;

    my ($total, $p) = (0, 0);
    do {
        $total = 0;
        print substr( $strings[$_], $p, 60 ) . "\n" for 0 .. $#strings;
        print "\n";
        $p+=60;
        $total += length($_) - $p for @strings;
    } while( $total > 0 );
    __END__
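The loop above stops once the summed remaining length is no longer positive, which suits strings of equal length. As a hedged variation (not part of the original reply), the sketch below walks to the end of the longest string and skips any string that has already run out. The sub name `interleave_wrapped` and its argument order are assumptions made for illustration.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical helper: returns the interleaved, wrapped text as one string.
# Takes the width first, then references to the strings, so nothing is copied.
sub interleave_wrapped {
    my ($width, @refs) = @_;
    my $longest = max map { length $$_ } @refs;
    my $out = '';
    for (my $pos = 0; $pos < $longest; $pos += $width) {
        for my $ref (@refs) {
            # Skip strings that are already exhausted; this avoids
            # "substr outside of string" warnings.
            next if $pos >= length $$ref;
            $out .= substr($$ref, $pos, $width) . "\n";
        }
        $out .= "\n";    # blank line between groups
    }
    return $out;
}

my @strings = map { "$_" x 200 } 1 .. 3;
print interleave_wrapped(60, \(@strings));
```

Passing references keeps the originals untouched and avoids copying long sequences; returning a string rather than printing directly makes the result easy to redirect or check.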

    Okay you lot, get your wings on the left, halos on the right. It's one size fits all, and "No!", you can't have a different color.
    Pick up your cloud down the end and "Yes" if you get allocated a grey one they are a bit damp under foot, but someone has to get them.
    Get used to the wings fast cos its an 8 hour day...unless the Govenor calls for a cyclone or hurricane, in which case 16 hour shifts are mandatory.
    Just be grateful that you arrived just as the tornado season finished. Them buggers are real work.

           That way is much too complicated. A format will serve the purpose very well in this case...

      format FH =
      ^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
      $var1
      .

      $var1 = "hahhahahahaThis is my great big long string of more " .
              "than 60 characters hahahahahahahahahahahahahaahahahhahah" ;
      open (FH, "> somefile.txt");
      write FH;
      close FH;

      Summary:
      1. The circumflex (^) indicates that it is a variable length record
      2. The less-than signs (<) left-justify the printed value
      3. The squiggles (tildes (~)) suppress blank lines and keep printing lines until the variable is empty.
      4. This also destroys the variable
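The four points in the summary can be seen in a small, hedged sketch. It uses a 10-column field instead of 60 so the wrapping is easy to follow, and it writes to an in-memory filehandle so the result can be inspected. The names `WRAPPED`, `$text` and `$buffer` are illustrative assumptions, not taken from the post.

```perl
use strict;
use warnings;

# Package variable so the format can see it; the name is illustrative.
our $text = 'ACGT' x 6;    # 24 characters, no spaces

# ^ plus nine < gives a 10-column, left-justified field;
# ~~ repeats the line until $text has been used up.
format WRAPPED =
^<<<<<<<<<~~
$text
.

# Write to an in-memory handle so the output can be examined afterwards.
my $buffer = '';
open(WRAPPED, '>', \$buffer) or die "Cannot open in-memory handle: $!";
write WRAPPED;
close WRAPPED;

print $buffer;
print "\$text is now empty\n" if $text eq '';
```

Because the field consumes its variable as it prints, work on a copy if the original string is still needed afterwards.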

           I got this directly from the Camel's mouth... All hail the Camel (O'Reilly Programming Perl).

      Invulnerable. Unlimited XP. Unlimited Votes. I must be...
              GhodMode

        Now show me a FORMAT that will wrap two strings to 60 wide and interleave them as requested by the OP and I'll buy you a coffee next time I see you:)



      Hi BrowserUk, thanks for your help. Unfortunately I am having problems getting this to work. I don't completely understand your code and can't see where to plug in my two strings ($genome1 and $genome2). Can I do:
      my @strings = map {$genome1, $genome2};
      or something like? thanks again ;-)

        You don't need the map; a simple array assignment will do.

        my @strings = ($genome1, $genome2);

        Update: That said, if you don't have your strings in an array, and it's difficult to put them into one to start with, a small modification to the routine will prevent needless duplication.

        #! perl -sw
        use strict;

        # ... your existing code....

        my @strings = \($genome1, $genome2);    # an array of references, no copies

        my ($total, $p) = (0, 0);
        do {
            $total = 0;
            # NOTE: each element is now a reference, so it is dereferenced with ${ ... }
            print substr( ${ $strings[$_] }, $p, 60 ) . "\n" for 0 .. $#strings;
            print "\n";
            $p+=60;
            $total += length($$_) - $p for @strings;    # And here!
        } while( $total > 0 );
        __END__


Re: trivial wrapping
by UnderMine (Friar) on Nov 18, 2002 at 13:41 UTC
    #! perl -sw
    use strict;

    my @strings = map{"$_" x 200} 1 .. 3;

    for (my $x=0; $x<length($strings[0]); $x+=60) {
        print join("\n",map(substr( $strings[$_],$x,60),0..2))."\n\n";
    }
    The non destructive method ;)

    Hope it helps
    UnderMine

      Thanks for this non-destructive method! ;) I am now trying something more complicated and have written a subroutine to calculate the mismatches between the two sequences. Ideally I want it to print out like this:
      AGACACTACTGCTG
      ***** ***
      ACTACTTACCATCG
      If you see what I mean. I tried to plug $mismatches into the code you gave me but it's not working. It thinks $mismatches is another string!
      my $mismatches = get_mismatches ($genome1, $genome2);
      my @strings = ($genome1, $mismatches, $genome2);

      for (my $x = 0; $x < length ($strings[0]); $x += 60) {
          print join ("\n", map (substr ($strings[$_], $x, 60), 0 .. 2)). "\n\n";
      }
      The subroutine get_mismatches works perfectly well. Can you see why it's going wrong? E.g. the above code prints this:
      TGTATCTACTGACTGAC
      TGTATCTACTGACTGAC
      TGTATCTACTGACTGAC

      ATCTG
      ATCTG
      ATCTG
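The wrapping loop slices every element of @strings the same way, so it only produces the desired picture if the middle element is itself a marker string of the same length as the sequences. The body of get_mismatches is not shown in the thread, so the following is only a sketch of one way to build such a string, assuming both sequences are already aligned and of equal length. The sub name `mismatch_line` and the choice of '*' for a mismatch are assumptions.

```perl
use strict;
use warnings;

# Hypothetical stand-in for get_mismatches(): '*' under each differing
# position, ' ' under each matching one. Assumes pre-aligned sequences.
sub mismatch_line {
    my ($seq1, $seq2) = @_;
    my $diff = $seq1 ^ $seq2;    # bytewise XOR: "\0" wherever the bytes agree
    $diff =~ tr/\0/*/c;          # every non-NUL byte marks a mismatch
    $diff =~ tr/\0/ /;           # matching positions become spaces
    return $diff;
}

my ($genome1, $genome2) = ('AGACACTACT', 'AGTCACTTCT');
print "$_\n" for $genome1, mismatch_line($genome1, $genome2), $genome2;
```

A string built this way can be placed between the two sequences in @strings and wrapped by the same loop, since all three elements then have the same length.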
        I think it would be prudent for you to refrain from writing mismatch tools etc. unless you REALLY know what you are about. Simply placing a * between mismatched nucleotides only works if the sequences are of equal length and have no better alignment, and even then it is of limited usefulness. Are you prepared to write a fully-fledged, correct alignment tool just so you can find mismatches? If you are, you are either a glutton for punishment or absolutely silly, or both, and hopefully brilliant regardless. ;) To align two sequences just use pairwise BLAST or the EMBOSS Smith-Waterman tool.

        Whatever you do, I recommend that you check out bioperl (http://www.bioperl.org or find it on CPAN in the Miscellaneous->Bio namespace) v1.02 first. You can save loads of development time and be using stronger code that has been tested and improved by lots of people. There is a lot of useful functionality. Granted, some of it is at the bleeding edge of development, so there are a lot of bugs and it is improved constantly. But the basics of bioinformatics, like converting between formats (EMBL to FASTA for example), BLASTing (and more importantly, parsing BLAST output), and calling/parsing EMBOSS tools, have all been around for a while and so are (dare I say it) trustworthy in bioperl. The code is also readable and easy to program with. To convert from raw to FASTA, for example:
        use Bio::SeqIO;

        my $instream  = Bio::SeqIO->new( -file   => 'your_in_file',
                                         -format => 'raw' );
        my $outstream = Bio::SeqIO->new( -file   => '>>your_out_file',
                                         -format => 'Fasta' );

        # Copy each sequence from the input stream to the output stream.
        while ( my $sequence = $instream->next_seq() ) {
            $outstream->write_seq( $sequence );
        }
        Take it from someone who has been down this road: it's hard to worm your way into understanding how to use bioperl's many interlocking modules but once you define the (I would bet) small subset that you need, which is the hard part, any time spent is time saved in the long run. Don't reinvent the wheel. Good luck. :)
Re: trivial wrapping
by Spudnuts (Pilgrim) on Nov 18, 2002 at 15:08 UTC
Re: trivial wrapping
by mce (Curate) on Nov 18, 2002 at 15:06 UTC
    Hi,
    This is also non destructive, and uses a regexp:
    ( $_ = $genome1 ) =~ s/(.{60})/print $1."\n"/eg;
    Note the print statement in the regex using the e flag.
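One thing to keep in mind with the pattern above is that .{60} only matches complete 60-character chunks, so a shorter final chunk is not printed. As a hedged alternative (not mce's code), a list-context match with {1,60} also leaves $genome1 unmodified and picks up the tail. The sample data is made up, and the sketch assumes the sequence contains no newlines, since . does not match them by default.

```perl
use strict;
use warnings;

my $genome1 = 'A' x 130;    # illustrative data only

# A list-context match returns every chunk; $genome1 itself is never modified.
my @chunks = $genome1 =~ /(.{1,60})/g;
print "$_\n" for @chunks;
```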

    I don't know which is the most performant, but I do think that substr is more performant than a plain regex.
    But regexes are more fun :-)
    ---------------------------
    Dr. Mark Ceulemans
    Senior Consultant
    IT Masters, Belgium

Re: trivial wrapping
by snafu (Chaplain) on Nov 18, 2002 at 19:36 UTC
    How about this?

    use strict;

    my $width = 60;

    while ( <DATA> ) {
        chomp;
        my ($len,$more);
        my $char;

        next if ( /^[^\w\d]/ );

        $char = substr($_,0,1);

        if ( ($len = length()) < $width ) {
            $more = $width - $len;
            print $_ . $char x $more,"\n";
        } else {
            print "$_\n";
        }
    }

    ##
    __DATA__
    111111111111111111111
    222222222222222222222
    333333333333333333333
    111111111111111111111
    222222222222222222222
    333333333333333333333
    output was:
    perl monks1.pl
    111111111111111111111111111111111111111111111111111111111111
    222222222222222222222222222222222222222222222222222222222222
    333333333333333333333333333333333333333333333333333333333333
    111111111111111111111111111111111111111111111111111111111111
    222222222222222222222222222222222222222222222222222222222222
    333333333333333333333333333333333333333333333333333333333333

    _ _ _ _ _ _ _ _ _ _
    - Jim
    Insert clever comment here...