PerlMonks  

Write to multiple files

by julio_514 (Acolyte)
on Feb 06, 2012 at 20:23 UTC ( [id://952152]=perlquestion )

julio_514 has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, I wrote a script that loops through a reference file in which each line may contain a specific keyword. I have a list of these keywords, and each time there is a match I want to print that line of the reference file to a separate text output file for that keyword.
my $ref_db = FastaDb->new($fasta);
while( my $seq = $ref_db->next_seq() ) {
    foreach(@ko_array){
        if($seq->header() =~ m/($_)/){
            open(OUT, ">>".$outdir.$_.".fasta");
            print OUT $seq->header()."\n".$seq->seq()."\n";
            close(OUT);
        }
    }
}
What I'd like to do is avoid opening a filehandle each time I have a hit, and instead open them all before the while loop.

Replies are listed 'Best First'.
Re: Write to multiple files
by kcott (Archbishop) on Feb 06, 2012 at 21:17 UTC

    Trying to guess which files you'll need beforehand is not a good idea. Here's a solution that opens the files you need on demand and closes them all at the end of the script.

    #!/usr/local/bin/perl
    # multifileout.pl

    use strict;
    use warnings;

    my @file_ends = qw{./ .fasta};
    my %handle = ();
    my @ko_array = qw(abc def ghi);

    while (<DATA>) {
        chomp;
        my ($head, $seq) = split / /;
        for my $ko (@ko_array) {
            if ($head =~ m/$ko/) {
                # One handle per keyword, opened the first time it is needed
                if (! exists $handle{$ko}) {
                    open my $fh, q{>>}, join($ko, @file_ends);
                    $handle{$ko} = $fh;
                }
                print { $handle{$ko} } qq{$head\n$seq\n};
            }
        }
    }

    map { close $handle{$_} } keys %handle;

    __DATA__
    abc a123
    def d123
    abc a456
    ghi g123
    ghi g456
    def d456

    Here are the results:

    ken@ganymede: ~/tmp $ ls -l *.fasta
    ls: *.fasta: No such file or directory
    ken@ganymede: ~/tmp $ multifileout.pl
    ken@ganymede: ~/tmp $ ls -l *.fasta
    -rw-r--r-- 1 ken staff 18 7 Feb 08:29 abc.fasta
    -rw-r--r-- 1 ken staff 18 7 Feb 08:29 def.fasta
    -rw-r--r-- 1 ken staff 18 7 Feb 08:29 ghi.fasta
    ken@ganymede: ~/tmp $ cat abc.fasta
    abc
    a123
    abc
    a456
    ken@ganymede: ~/tmp $ cat def.fasta
    def
    d123
    def
    d456
    ken@ganymede: ~/tmp $ cat ghi.fasta
    ghi
    g123
    ghi
    g456
    ken@ganymede: ~/tmp $

    -- Ken

      You can simplify this:

      map { close $handle{$_} } keys %handle;

      to:

      close $_ for values %handle;



      while (<DATA>) {
          chomp;
          my ($head, $seq) = split / /;

      That would be better as:

      while ( <DATA> ) {
          my ($head, $seq ) = split;


      my @file_ends = qw{./ .fasta};
      ...
      open my $fh, q{>>}, join($ko, @file_ends);

      With three-argument open you don't need to prepend "./" to the file name, and you should always verify that the file opened successfully before trying to use a possibly invalid filehandle:

      my $file_end = '.fasta';
      ...
      open my $fh, '>>', "$ko$file_end"
          or die "Cannot open '$ko$file_end' because: $!";

      Many thanks! :D It works fine now. I realized I could speed things up by modifying some code and using a hash. Here's what I came up with:

      my %handle = ();
      my $ref_db = FastaDb->new($fasta) or die("Unable to open Fasta file, $fasta\n");
      while( my $seq = $ref_db->next_seq() ) {
          if($seq->header() =~ m/(K[0-9]{5})/){
              my $ko = $1;
              if(!exists $handle{$ko}){
                  open my $fh, ">>".$outdir.$ko.".fasta";
                  $handle{$ko} = $fh;
              }
              print {$handle{$ko}} $seq->header()."\n".$seq->seq()."\n";
          }
      }
      map { close $handle{$_} } keys %handle;

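The suggestions made in this thread (lexical handles cached in a hash, three-argument open with an error check, and closing via values) can be combined into one self-contained sketch. FastaDb is not shown in the thread, so this version reads made-up records from an in-memory list instead. The K[0-9]{5} pattern comes from the post above; the write_by_keyword helper, the record layout, and the temporary output directory are assumptions made purely for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Hypothetical helper: append each record to "<keyword>.fasta" in $outdir,
# opening each output file at most once. Returns the number of files opened.
sub write_by_keyword {
    my ($outdir, @records) = @_;
    my %handle;    # keyword => open filehandle

    for my $rec (@records) {
        my ($header, $seq) = @$rec;

        # Skip records whose header carries no keyword.
        next unless $header =~ m/(K[0-9]{5})/;
        my $ko = $1;

        # Open on first use only, and check that the open succeeded.
        if (!exists $handle{$ko}) {
            my $path = File::Spec->catfile($outdir, "$ko.fasta");
            open my $fh, '>>', $path
                or die "Cannot open '$path' because: $!";
            $handle{$ko} = $fh;
        }
        print { $handle{$ko} } "$header\n$seq\n";
    }

    # Close every handle; a failed close can mean lost buffered output.
    for my $ko (keys %handle) {
        close $handle{$ko} or die "Cannot close handle for '$ko': $!";
    }
    return scalar keys %handle;
}

# Made-up sample records: [header, sequence]
my @records = (
    [ '>seq1 K00001',     'ACGT' ],
    [ '>seq2 K00002',     'GGCC' ],
    [ '>seq3 K00001',     'TTAA' ],
    [ '>seq4 no keyword', 'AAAA' ],
);

my $outdir = tempdir(CLEANUP => 1);
my $opened = write_by_keyword($outdir, @records);
print "opened $opened output files\n";
```

Checking the result of close as well as open is a design choice here: with buffered output, a write error may only surface when the handle is closed.
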
Re: Write to multiple files
by kielstirling (Scribe) on Feb 06, 2012 at 21:14 UTC
    Hi,

    Open the file before you enter the while loop:

    #!/usr/bin/perl -w
    use strict;
    use IO::File;

    my $out_fh = IO::File->new(">>/tmp/output.log");
    die "failed to open /tmp/output.log" unless defined $out_fh;

    my $ref_db = FastaDb->new($fasta);
    while( my $seq = $ref_db->next_seq() ) {
        foreach(@ko_array) {
            print $out_fh $seq->header()."\n".$seq->seq()."\n"
                if $seq->header() =~ m/($_)/;
        }
    }
    $out_fh->close;

    Update: I failed to notice that you are generating the file name within the loop.

    Ken's example of storing open file handles is the way to go.

    Back to zzzzzzzzzzzzzzzzzz for me!!

Re: Write to multiple files
by MidLifeXis (Monsignor) on Feb 07, 2012 at 13:50 UTC

    Depending on the number of output files you have, you may need to limit the number of open file handles that you have. This, however, depends on your OS, configuration, and other factors outside of Perl. See FileCache if this becomes an issue.

    --MidLifeXis
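
A minimal sketch of the FileCache approach mentioned above, assuming the core FileCache module and the one-argument form of cacheout, which per its documentation opens a path for writing on first use and for appending when the file has to be reopened. FileCache uses the path string itself as the filehandle, hence the `no strict 'refs'` block. The maxopen value of 2, the sample records, and the temporary directory are illustrative assumptions, not values from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use FileCache maxopen => 2;    # deliberately small so handles get recycled
use File::Temp qw(tempdir);
use File::Spec;

# Made-up sample records: [keyword, sequence]
my @records = (
    [ 'abc', 'a123' ],
    [ 'def', 'd123' ],
    [ 'ghi', 'g123' ],
    [ 'abc', 'a456' ],
);

my $outdir = tempdir(CLEANUP => 1);
my %seen_path;    # every output path written to at least once

{
    # FileCache names each filehandle after its path string,
    # which requires symbolic references.
    no strict 'refs';

    for my $rec (@records) {
        my ($ko, $seq) = @$rec;
        my $path = File::Spec->catfile($outdir, "$ko.fasta");

        # cacheout opens the file if needed, closing other cached
        # handles first when the maxopen limit would be exceeded.
        my $fh = cacheout($path);
        print $fh "$ko\n$seq\n";
        $seen_path{$path} = 1;
    }

    # Close whatever is still open so buffered output reaches disk.
    for my $path (keys %seen_path) {
        close $path if defined fileno($path);
    }
}
```

With three distinct output files and a limit of two, at least one handle is closed and reopened along the way, which is the situation FileCache is meant to handle when the number of keywords exceeds the operating system's limit on open files.
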
