In simple terms, the single > opens a file for output and starts writing from the beginning, discarding anything already in the file, while the double >> opens a file for output but appends whatever you write to the end of anything that already exists. Write-append mode will also create a nonexistent file and start writing from the beginning.
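If it helps to see the difference, here is a minimal sketch. The file name demo.txt and the helper subs write_line and slurp are made up for the illustration, and it writes into a temporary directory so nothing of yours gets touched.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);    # scratch directory, removed on exit
my $file = "$dir/demo.txt";          # hypothetical file name

# Write one line to $file using the given open mode ('>' or '>>')
sub write_line {
    my ($mode, $text) = @_;
    open my $fh, $mode, $file or die "Cannot open $file: $!";
    print $fh "$text\n";
    close $fh or die "Cannot close $file: $!";
}

# Return the whole file as a single string
sub slurp {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    local $/;                        # read everything in one go
    my $content = <$fh>;
    close $fh;
    return $content;
}

write_line('>>', 'first');     # file does not exist yet, so it is created
write_line('>>', 'second');    # appended after 'first'
my $after_append = slurp();

write_line('>', 'third');      # file is emptied, then written from the start
my $after_write = slurp();

print "after two >> writes:\n$after_append";
print "after one > write:\n$after_write";
```

After the two >> writes the file holds both lines; after the > write only the last line is left.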
In your case I would think that write-append mode is dangerous. The files could fill up with repeated identical sequences, either because the id existed in more than one input file or because you simply ran the program a second time; in write-append mode each run just appends the new data to the end of the existing file.
There are ways to tell whether you have already encountered that id and written it to a file. I might do it like this:
foreach my $id (keys %id2seq) {
    # warn before clobbering a file left over from an earlier run
    if (-f $id) { print $id . " already exists. about to overwrite it\n"; }
    open my $out_fh, '>', $id or die "Cannot open $id: $!";    ## Amendment here
    print $out_fh ($id . "\n", $id2seq{$id}, "\n");
    close $out_fh;    ## moved into the foreach loop
}
Note the change to write-from-beginning, and the test to see if the file already exists. This way there would only be the last sequence found in any given file.
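Another way to tell that an id has already been handled during the current run is to keep a hash of the ids seen so far. This is only a sketch under assumptions: @records is a made-up stand-in for whatever your parser produces, %seen and @skipped are names I picked, and the files go into a temporary directory. Unlike the loop above, it keeps the first sequence for an id and skips later ones.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical records standing in for the parsed input
my @records = (
    [ 'seqA', 'ACGT' ],
    [ 'seqB', 'GGCC' ],
    [ 'seqA', 'ACGT' ],    # same id turning up a second time
);

my $out_dir = tempdir(CLEANUP => 1);    # scratch directory for the demo
my %seen;                               # id => number of times encountered
my @skipped;                            # ids that turned up again

for my $rec (@records) {
    my ($id, $seq) = @$rec;
    if ($seen{$id}++) {
        # already written during this run, so leave the file alone
        push @skipped, $id;
        next;
    }
    my $path = "$out_dir/$id";
    open my $out_fh, '>', $path or die "Cannot open $path: $!";
    print $out_fh "$id\n$seq\n";
    close $out_fh or die "Cannot close $path: $!";
}

my @written = sort keys %seen;
print "wrote: @written\n";
print "skipped duplicates: @skipped\n";
```

The hash only remembers ids within a single run, so the -f test is still what catches files left behind by an earlier run.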