http://www.perlmonks.org?node_id=983347

mariannapac has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,I'm a new user for linux and perl....I have a problem with a perl scrip to parse blast results.This is the script:

############################################## ############ BLAST PARSING ################### ############################################## use strict; use warnings; use Bio::SearchIO; #Ferma il processo se gli argomenti allo script sono diversi da 3 die "Usage: $0 <BLAST report file> <number-of-top-hits> <output-file> +\n", if (@ARGV!=3 ); my ($infile,$numHits,$outfile) = @ARGV; print "Parsing BLAST results \n"; my $in = Bio::SearchIO -> new ( -format=>'blast', -file=>$infile ); open (OUT, ">$outfile" ) or die "Cannot open $outfile :$!" #print the header info for tab-delimitated columns print OUT "query_name\tquery_length\taccession_number\tlength\tdescrip +tion\tE value\tbit score\tframe\tquery_start\t"; print OUT "query_end\thit_start\thit_end\tidentical\tHSP_length\tHSP_E + value\n"; ### Estrazione dei risultati in maniera ricorsiva: while( my $result = $in -> next_result ) { #nome della sequence QUERY: print OUT $result-> query_name . "\t"; #lunghezza della QUERY: print OUT $result-> query_length; #"No Hits found" case: if ($result->num_hits == 0) { print OUT "\n No hits found \n"; } else { my $count = 0; #Processing heach HIT recursively... while(my $hit = $result -> next_hit) { print OUTFILE "\t" if ($count > 0); #get the accession number of the hits: print OUTFILE "\t" . $hit->accession . "\t"; #get the sequence length of the hit: print OUT $hit -> length . "\t"; #Get the description of the HIT sequence: print OUT $hit -> description . "\t"; #GEt the Evalue of the HIT: print OUT $hit -> significance . "\t"; #Get the score of the HIT: print OUT $hit -> bits . "\t"; my $hspcount = 0; #Process the top HSP from the top list: while( my $hsp = $hit -> next_hsp) { #Metto i separatori perche per raccogliere i dati di ogni singolo HSP +(livello più basso) devo dirgli di iniziare scrivere direttamente nel +la colonna numero 8 lasciando libere le precedenti) print OUT "\t\t\t\t\t\t\t", if ($hspcount > 0); #Get the frame of the query sequence: print OUT $hsp->query->frame . "\t"; #Get the start and the end of the QUERY in the ali +gnment: print OUT $hsp-> start('query') . "\t" . $hsp->end +('query'). "\t"; #Get the start and the end of the HIT in the align +ment: print OUT $hsp-> start('hit') . "\t" . $hsp->end(' +hit'). "\t"; #Get the similarity value: NOTA BENE: UTILE SOLAM +ENTE PER BLAST SU SEQUENZE AMINOACIDICHE!!!! #printf OUT "%.1f" , ($hsp->frac_conserved * 100); #print OUT "%\t"; #Get the similarity value: printf OUT "%.1f" , ($hsp->frac_identical * 100); print OUT "%\t"; #Get the lenght of the HSP and the E-value: print OUT $hsp-> length('total') . "\t"; print OUT $hsp-> expect . "\t"; print OUT "\n"; # questo ci dice che deve andare + a capo! $hspcount++; } $count++; #flow control for the number of hits needed: @In +pratica significa che il secondo argomento dato allo script ($numHits +) è il numero di Hit richieste, # una volta raggiunto il numero richiesto di Hits +per una certa query, lo script passa al 'Result' successivo. last if ($count == $numHits); } } } close OUT; print "DONE!!!\n"; #NOTE; # Un singolo risultato di BLAST è composto da più HIT # Può accadere che una certa HIT contenga piu di un allineamento: cioè + più HSPs ## RESULT-->HIT--->HSPS ### Ogni pacchetto di BIOPERL (Bio::SearchIO) come in questo caso, è c +aratterizzato da una serie di funzioni proprie che lavorano sugli ogg +etti creati nel pacchetto stesso #Ogni oggetto (HSP, Result, HIT) ha le sue proprie funzioni! (vedi www +.bioper.org/wiki/HOWTO::SearchIO/) ##USAGE: # >Perl.exe blast.parsing.pl blast.output.txt 10 blast.output.top10.tx +t

I launched it with /usr/bin$ perl blast.parsing.pl blastoutput 20 output This is the error message: syntax error at blast.parsing.pl line 27, near "print" Execution of blast.parsing.pl aborted due to compilation errors. What happened? Please help me. Thanks in advance

Replies are listed 'Best First'.
Re: syntax error near print
by moritz (Cardinal) on Jul 24, 2012 at 10:05 UTC
Re: syntax error near print
by marto (Cardinal) on Jul 24, 2012 at 10:07 UTC
    open (OUT, ">$outfile" ) or die "Cannot open $outfile :$!"

    Is missing a semi colon:

    open (OUT, ">$outfile" ) or die "Cannot open $outfile :$!";

      I added semicolon at prevoius line as you suggested but now there is a new error message: Parsing BLAST results Cannot open 429F04: Permesso negato at blast.parsing.pl line 24. 429F04 is the output name, just as example... thanks again for your suggestion

        I guess this is a 'permission denied' message, in Italian. Ensure that the value you have for $outfile is what you think it should be and that you can create this file manually. If you have permissions issues check with your sysadmin, perhaps you don't have access to write files to the directory in question.

Re: syntax error near print
by GrandFather (Saint) on Jul 25, 2012 at 21:17 UTC

    A couple of style comments may help:

    Use three parameter open and lexical file handles (open my $out, '>', "$outfile") or die "Cannot open $outfile :$!";). The lexical file handle would have caught your OUT/OUTFILE issue at compile time.

    Use "early exits" to make code clearer and to avoid extra levels of nesting. The case in point is your "No hits found" test. By rewriting it as:

    if ($result->num_hits == 0) { #"No Hits found" case: print OUT "\n No hits found \n"; next; }

    you don't require the else part of the statement - that code simply follows the if and is executed only when the if condition is false (the next following the print statement effectively skips the following code).

    True laziness is hard work