TJCooper has asked for the wisdom of the Perl Monks concerning the following question:
I have a script that essentially extracts DNA sequences at specific locations and piles them up to calculate the frequency of each letter (A/G/C/T):
use strict; use warnings; use Bio::SeqIO; use Getopt::Long; use File::Basename; use List::Util qw(all); my $scriptname = basename($0); #Obtain script-name my $outext = '.txt'; #Output .file-extension my ($peaks, $fasta, $width, $mode, $exclusion, %list, $chr, $min, $max +, %sequences); my $usage = "Usage: $scriptname -i <HistogramFile> -r <ReferenceFASTA> + -w <Width> -m <Mode: Pos/Freq> -e <ExclusionList (Optional)>"; + #Error/usage message GetOptions('i=s' => \$peaks, 'r=s' => \$fasta, 'e=s' => \$exclusion, 'w=i' => \$width, 'm=s' => \$mode) or die("\n$usage\n"); die("\nError: Arguments or -flags are missing and/or incorrectly speci +fied.\n\n$usage\n\n") unless all {defined} $peaks, $fasta, $width, $m +ode; my @files = glob($peaks."/*.txt"); my $chk = scalar(@files); print "\nFailed to detect any valid files within the current directory +.\n\n" if $chk == 0; exit if $chk == 0; if (defined $exclusion) { open my $IN, '<', $exclusion or die "$!"; while (<$IN>) { chomp $_; ($chr, $min, $max) = split('\s+', $_); push @{$list{$chr}}, [$min,$max]; } } my $seqio = Bio::SeqIO->new(-file => $fasta); while(my $seqobj = $seqio->next_seq) { my $id = $seqobj->display_id; my $seq = $seqobj->seq; $sequences{$id} = $seq; } for my $file (@files) { open my $IN2, '<', $file or die "$!"; (my $sample = basename($file)) =~ s/.txt//; my $dir = dirname($file); my $outfile = $dir."/".$sample."_SeqBias_"."$mode".$outext; open my $OUT, '>', $outfile or die "$!"; my (@flankseq, $seqEX); sub seqstore { my @rcd = @_; if ($mode eq "Pos") { push (@flankseq, $rcd[0]); } elsif ($mode eq "Freq") { push (@flankseq, $rcd[0]) for (1..$rcd[1]); } return; } <$IN2> for (1..1); while (<$IN2>) { chomp $_; my(@F) = split("\t", $_); next if grep {$F[1] >= $_->[0] && $F[1] <= $_->[1]} @{ $list{$ +F[0]}}; next if $F[1]-$width < 1; if ($F[3] > 0) { $seqEX = substr($sequences{$F[0]},$F[1]-1-$width,$width) # +Extracts Xbp upstream (not incl. 5' end) .substr($sequences{$F[0]},$F[1]-1,$width+1); $seqEX =~ tr/GATC/CTAG/; $seqEX = reverse($seqEX); seqstore($seqEX,$F[3]); } if ($F[2] > 0) { $seqEX = substr($sequences{$F[0]},$F[1]-1-$width,$width) # +Extracts Xbp upstream (not incl. 5' end) .substr($sequences{$F[0]},$F[1]-1,$width+1); seqstore($seqEX,$F[2]); } } my %freq; my $inc = 1 / @flankseq; for my $FS (@flankseq) { $freq{substr $FS, $_, 1}[$_] += $inc for 0..length($FS)-1; } printf($OUT "%s\t"."%s\t" x ((keys %freq)-1)."%s"."\n", "Pos",sort + keys %freq); my $label = -$width-1; for my $pos (1..length $flankseq[0]) { $label++; printf($OUT "%d\t"."%7.4f\t" x ((keys %freq)-1)."%7.4f"."\n", +$label,map{$freq{$_}[$pos-1] // 0} sort keys %freq); } } my $run_time = time() - $^T; print "\n-------------------------------------"; print "\nAnalysis Complete\n"; print "Execution Time: $run_time Seconds\n"; print "-------------------------------------\n\n";
I've recently tried to add batch processing into the script so it can handle more than 1 input file at a time. However, i've run into a problem that I cannot figure out:
Illegal division by zero at line 80, <$IN2>Line 80 is:
my $inc = 1 / @flankseq;When handling the first file, @flankseq is populated as expected (I push the extracted sequences into an array).
However when it comes to opening the second file - which is an identical copy of the first - and repeating the process, @flankseq remains empty leading to the error. Everything else seems to work up until that point and all variables such as $seqEX look as expected.
I have no idea why! Any suggestions?
|
---|
Replies are listed 'Best First'. | |
---|---|
Re: Array ending up empty when processing more than a single file
by Eily (Monsignor) on Oct 05, 2017 at 13:02 UTC | |
by Your Mother (Archbishop) on Oct 05, 2017 at 16:12 UTC | |
by Eily (Monsignor) on Oct 05, 2017 at 16:44 UTC | |
by LanX (Saint) on Oct 05, 2017 at 17:35 UTC | |
by Your Mother (Archbishop) on Oct 05, 2017 at 18:01 UTC | |
by Eily (Monsignor) on Oct 06, 2017 at 01:36 UTC | |
| |
Re: Array ending up empty when processing more than a single file
by choroba (Cardinal) on Oct 05, 2017 at 13:02 UTC |
Back to
Seekers of Perl Wisdom