PerlMonks  

Renaming headers works for some files but fails for some files

by MVRS (Acolyte)
on May 09, 2013 at 10:14 UTC ( #1032752=perlquestion )
MVRS has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, the following code changes the headers correctly for some files but fails for others.
use strict;
use warnings;
#my $count = 1;
#my $file;
my @files = <xxxx/*>;
foreach my $file(@files){
    open (FILE,"$file") or die "$!";
    my $outfile = $file."out";
    open (OFILE,'>',"$outfile") or die "$!";
    my $count = 1;
    while (<FILE>) {
        chomp;
        if( s/^>\S+/>contig_$count/) {
            $count++;
        }
        if( /^(\S+)$/) {print OFILE"$1\n\n"}
    }
}

input file

>ATCC33693_scaffold0

AAAAAAGAGAGAAACACTAGCTTCTCTCTTGTTATGAGCTTGGCAAATCCATACTCTCCC AGGCCGCTTCCAGCCAAGTACCATCAGCGTATATGGGCTTAACTTCTAGGTTCGGAATGT AACTAGGTGTACCCCCATAGCTATACTCACCAAGCATATATATTGTATCACATAAAGTTA

>ATCC33693_scaffold1

AAAAAAGAGAGAAACACTAGCTTCTCTCTTGTTATGAGCTTGGCAAATCCATACTCTCCC AGGCCGCTTCCAGCCAAGTACCATCAGCGTATATGGGCTTAACTTCTAGGTTCGGAATGT GTTATGAGCTTGGCAAATCCATACTCTCCCAGGCCGCTTCCAGCCAAGTA

output file

>contig_1

AAAAAAGAGAGAAACACTAGCTTCTCTCTTGTTATGAGCTTGGCAAATCCATACTCTCCC AGGCCGCTTCCAGCCAAGTACCATCAGCGTATATGGGCTTAACTTCTAGGTTCGGAATGT AACTAGGTGTACCCCCATAGCTATACTCACCAAGCATATATATTGTATCACATAAAGTTA

>contig_2

AAAAAAGAGAGAAACACTAGCTTCTCTCTTGTTATGAGCTTGGCAAATCCATACTCTCCC AGGCCGCTTCCAGCCAAGTACCATCAGCGTATATGGGCTTAACTTCTAGGTTCGGAATGT GTTATGAGCTTGGCAAATCCATACTCTCCCAGGCCGCTTCCAGCCAAGTA

I need to change the headers in a large number of files, but some files fail and the output is an empty file. Please help me find where I am going wrong.

Thanks

Replies are listed 'Best First'.
Re: Renaming headers works for some files but fails for some files
by Random_Walk (Prior) on May 09, 2013 at 10:21 UTC

    You only print lines to your output file when they contain no whitespace.

    if( /^(\S+)$/) {print OFILE"$1\n\n"}

    From your input that may not be what you want. Are you after the first contiguous sequence on each line AFTER the first text with the scaffold<num> or just the entire line with the first part altered?

    It would also be good practice to explicitly close your files when you are finished reading/writing them.

    Update

    # this may help
    while (my $line = <FILE>) {
        # The entire line with a substitution
        # if( $line =~ s/^>\S+/>contig_$count/) {
        # Substitution and 1st contiguous sequence
        if( $line =~ s/^>\S+\s+(\S+)/>contig_$count $1\n/) {
            print OFILE $line;
            $count++;
        }
    }
    close FILE;
    close OFILE;

    Cheers,
    R.

    Pereant, qui ante nos nostra dixerunt!
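
The point about whitespace can be checked in isolation. The following is a minimal sketch, not code from the thread: `passes_original_filter` is a hypothetical helper that applies the same `/^(\S+)$/` test to a few made-up lines. It also covers one case the thread does not confirm, a Windows-style line ending, where `chomp` leaves a trailing carriage return that counts as whitespace. If the failing files were written with such line endings (an assumption), every line would be dropped and the output would be empty.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns 1 when a line would be
# printed by the original test, i.e. after chomp it consists solely of
# non-whitespace characters, and 0 otherwise.
sub passes_original_filter {
    my ($line) = @_;
    chomp $line;
    return $line =~ /^(\S+)$/ ? 1 : 0;
}

# Made-up sample lines: header, plain sequence, sequence with a space,
# sequence with a CRLF ending, and an empty line.
for my $line ( ">contig_1\n", "ACGTACGT\n", "ACGT ACGT\n", "ACGTACGT\r\n", "\n" ) {
    ( my $visible = $line ) =~ s/\r/\\r/g;    # show carriage returns
    $visible =~ s/\n/\\n/g;                   # show newlines
    printf "%-16s %s\n", qq{"$visible"},
        passes_original_filter($line) ? "printed" : "dropped";
}
```

Only lines without any whitespace survive the test, so a header that carries a description after the identifier is dropped as well.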
Re: Renaming headers works for some files but fails for some files
by choroba (Bishop) on May 09, 2013 at 10:21 UTC
    What does an input file that produces the empty output look like?
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
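
One way to answer that question without posting the whole file is to count the kinds of whitespace each failing file contains. This is only a sketch under stated assumptions: `whitespace_report` is a hypothetical helper, and the sample text (including the word "draft" in the header) is made up for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): read every line from an open
# filehandle and count how many lines contain a carriage return, a tab,
# or a space. Returns a hash reference with the counts.
sub whitespace_report {
    my ($fh) = @_;
    my %count = ( lines => 0, cr => 0, tab => 0, space => 0 );
    while ( my $line = <$fh> ) {
        chomp $line;
        $count{lines}++;
        $count{cr}++    if $line =~ /\r/;
        $count{tab}++   if $line =~ /\t/;
        $count{space}++ if $line =~ / /;
    }
    return \%count;
}

# Demo on an in-memory file; in practice, open one of the failing files.
my $sample = ">ATCC33693_scaffold0 draft\r\nACGT\r\nAC\tGT\r\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my $report = whitespace_report($fh);
close $fh;
printf "%-5s %d\n", $_, $report->{$_} for qw(lines cr tab space);
```

A file where the `cr` count equals the `lines` count would point to Windows-style line endings; nonzero `tab` or `space` counts would point to whitespace inside headers or sequence lines.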
Re: Renaming headers works for some files but fails for some files
by bioinformatics (Friar) on May 09, 2013 at 20:16 UTC

    To save yourself some grief, I would simply look for > at the start of a line and then replace the whole line, regardless of what else is there (which is what you are doing anyway). You can also put print $variable and sleep(3); in parts of the code to see what is present at certain points, particularly if you don't get a match. We don't have the input files, so we can't tell whether there are spaces or tabs in the FASTA headers.

    use strict;
    use warnings;

    my @files = glob("*.fa");

    for my $file ( @files ) {
        # open the input file
        open my $in, "<", $file or die "Cannot open '$file': $!\n";
        # open the output file
        open my $out, ">", "$file.out" or die "Cannot open '$file.out': $!\n";
        # reset contig number
        my $contig_number = 1;
        while ( <$in> ) {
            chomp;
            if ( $_ =~ m/^>/ ) {
                # it's a header
                print $out ">contig_$contig_number\n\n";
                $contig_number++;
            }
            else {
                # it's sequence
                print $out "$_\n\n";
            }
        }
        # close both files once the whole input has been read
        close $in;
        close $out;
    }

    my $file_count = @files;
    print "Successfully processed $file_count files!\n";
    exit;
    Bioinformatics
      Thank you very much. I made the changes to my script accordingly and it works. May I know your email ID, please?
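
The approach in the reply above can also be written as a small subroutine that works on filehandles, which makes it easy to try on in-memory text before running it over many files. This is a sketch rather than the script the original poster ended up with: `rename_headers` is a hypothetical name, the sample input is made up, and two choices are assumptions not taken from the thread: a trailing carriage return is stripped from each line, and each line is written with a single newline.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): copy FASTA lines from $in to
# $out, replacing every header line with ">contig_N". Returns the number
# of headers renamed.
sub rename_headers {
    my ( $in, $out ) = @_;
    my $count = 0;
    while ( my $line = <$in> ) {
        $line =~ s/\r?\n\z//;    # strip an LF or CRLF line ending
        if ( $line =~ /^>/ ) {
            # header: discard the old text entirely
            $count++;
            print {$out} ">contig_$count\n";
        }
        else {
            # sequence (or blank) line: pass through unchanged
            print {$out} "$line\n";
        }
    }
    return $count;
}

# Demo with in-memory filehandles; the input text is made up.
my $input  = ">ATCC33693_scaffold0 extra text\r\nACGT\r\n>ATCC33693_scaffold1\nGGCC\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open in-memory input: $!";
open my $out, '>', \$output or die "Cannot open in-memory output: $!";
my $renamed = rename_headers( $in, $out );
close $in;
close $out;
print "Renamed $renamed headers\n", $output;
```

Because the subroutine only sees filehandles, the same code can be called inside a loop over glob results with real files opened for reading and writing.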
