
File::Wrap and Fasta format

by clairudjinn (Beadle)
on Mar 12, 2003 at 07:20 UTC ( [id://242272]=CUFP )

Wrap sequences for the widely used FASTA format without the hassle of installing BioPerl and learning to use Bio::SeqIO, or rolling your own sub. Not difficult, just so obvious that it is easy to overlook; a true kick-yourself.
    use Text::Wrap;
    $Text::Wrap::columns = 51;   # includes terminal \n
    ...
    print $fh ">$description\n";
    print $fh wrap('', '', $sequence_string."\n");
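A self-contained sketch of the same idea, for trying it out without an open filehandle. The helper name fasta_record and the demo sequence are made up for illustration. It relies on the same Text::Wrap behaviour as the snippet above: with the default $Text::Wrap::huge = 'wrap', a whitespace-free string longer than the line is cut at $Text::Wrap::columns - 1 characters.

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Hypothetical helper (not part of Text::Wrap): returns one FASTA record as a string.
sub fasta_record {
    my ($description, $sequence, $width) = @_;
    $width ||= 50;                            # residues per line
    # Text::Wrap emits lines of at most $columns - 1 characters,
    # so ask for one more column than the desired line width.
    local $Text::Wrap::columns = $width + 1;
    # The trailing "\n" makes sure the last line is newline-terminated.
    return ">$description\n" . wrap('', '', "$sequence\n");
}

# 120 residues: expect lines of 50, 50 and 20 characters after the header.
print fasta_record('demo_seq', 'ACGT' x 30);
```

Using local keeps the column setting from leaking into other code that also calls wrap().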

Re: File::Wrap and Fasta format
by aging acolyte (Pilgrim) on Mar 13, 2003 at 12:21 UTC
    I have always liked a simple regex to format for FASTA.

    For 80 nt per line:

    $seq =~ s/([^\n]{80}|[^\n]{1,79}$)/$1\n/g;
    print $out ">$description\n$seq";
    close $out;

    A.A.
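The substitution above can be wrapped in a small sub with the line width as a parameter. This is only a sketch: the name wrap_seq_regex is invented here, and it assumes a width of at least 2 and a sequence with no embedded or trailing newlines.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps a copy of the sequence, so the caller's string is untouched.
sub wrap_seq_regex {
    my ($seq, $width) = @_;
    $width ||= 80;
    my $short = $width - 1;
    # Either a full line of $width characters, or a shorter final run
    # anchored at the end of the string; append "\n" to each match.
    $seq =~ s/([^\n]{$width}|[^\n]{1,$short}$)/$1\n/g;
    return $seq;
}

# 200 residues: lines of 80, 80 and 40 characters after the header.
print ">demo_seq\n", wrap_seq_regex('ACGT' x 50);
```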

Re: File::Wrap and Fasta format
by robsv (Curate) on Mar 25, 2003 at 21:41 UTC
    Here's another shorter regex (although it's not doing any substitution):
    ...
    print $fh ">$description\n";
    print $fh "$_\n" foreach ($seq =~ /.{1,50}/g);
    ...

    The fastest way I've found of dumping a FASTA sequence (with very large sequences, anyway) is with substr:
    ...
    print $fh ">$description\n";
    print $fh substr($seq,0,50,'') . "\n" while ($seq);
    ...
    The thing to watch here is that it's destructive - $seq will be empty when it finishes.

    - robsv
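If $seq is still needed afterwards, one possible non-destructive variation is to walk the string by offset instead of chopping pieces off the front. This is a sketch rather than a benchmarked replacement: the name print_fasta and the choice to pass the sequence by reference (to avoid copying a very large string) are assumptions made for the example. Testing length instead of truth also avoids stopping early on a remainder of "0", which is not a concern for real sequence data.

```perl
use strict;
use warnings;

# Hypothetical helper: prints one FASTA record, reading the sequence by offset.
# $seq_ref is a reference to the sequence string, which is left unmodified.
sub print_fasta {
    my ($fh, $description, $seq_ref, $width) = @_;
    $width ||= 50;
    print {$fh} ">$description\n";
    my $len = length($$seq_ref);
    for (my $pos = 0; $pos < $len; $pos += $width) {
        # Three-argument substr only reads; the last chunk may be shorter.
        print {$fh} substr($$seq_ref, $pos, $width), "\n";
    }
    return;
}

my $seq = 'ACGT' x 30;                     # 120 residues
print_fasta(\*STDOUT, 'demo_seq', \$seq);  # $seq is unchanged afterwards
```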
