Hi Grand Father and other Perl Monks,
Thank you, Grand Father. Your program worked for the base count on a 299 MB file. However, when I added a do-until loop to the end of the program to count the motifs and find the lengths between them, taking the motif (2 to 100 letters) from <STDIN>, I do not get correct results in either cmd or the output text file. I have initialized several variables, but I am confused about how to correct the cmd warning, i.e. “"my" variable $motif masks earlier declaration in same scope at C:\Documents and Settings\user\Desktop\basecount.pl line 58”, so that the inter-motif lengths are printed correctly, one per line, in the text output file.
I have given the program below, along with the cmd and text output. I would appreciate your help in this matter.
#!/usr/bin/perl
use strict;
use warnings;
if (! @ARGV) {
print <<HELP; # Line 5
Usage:
> basecount.pl <bases file>
HELP
exit;
} # Line 10
open my $dnaIn, '<', $ARGV[0] or die "Can't open bases file $ARGV[0]: $!\n";
my %counts;
my @baseList = qw(A T G C);
while (defined (my $line = <$dnaIn>)) {
chomp $line; # Line 15
++$counts{$_} for grep {/\S/} split '', $line;
}
my $bases;
my $errors;
$bases += $_ for @counts{@baseList}; # Line 20
$errors += $_ for map {$counts{$_}} grep {! /[ATGC]/} keys %counts;
print "\n\n Total bases: $bases\n\n";
print join (', ', map {"$_= $counts{$_}"} @baseList), "\n";
print "Errors (N)= $errors\n" if $errors; # Line 24
# In a loop, ask the user for a motif, search for the motif, and report if it was found. Exit if no motif is entered. Line 25
my $DNA=join('',@ARGV); my $motif=''; # Line 26
do {
print "\n\nEnter a motif to count its number and lengths between motifs:\n"; # Line 28
$motif = <STDIN>;# Line 29
# Remove the newline at the end of $motif
chomp $motif;
# Look for the motif Line 32
if ( $DNA=~ / $motif/ ) {
print "I found the motif!\n\n"; # Line 34
} else {
print"I couldn\'t find it.\n\n";
} #Line 37
# Count number of motifs and Count number of nt between two motifs
use 5.010; # Line 39
my $string ="@ARGV";
# Remove whitespace Line 41
$string=~ s/\s//g;
my $count= () =$string=~ /$motif/g; # Line 43
print "Number of motifs: $count.\n\n";
say "The inter-motif nt Lengths are:\n"; # Line 45
say length for split/$motif/,$string;
my @a=map length,split/$motif/,$string;
# exit on an empty user input Line 48
my $output="result .txt";
unless (open(RESULT,"> $output")){
print"Cannot open file\"$output\".\n\n";# Line 51
exit;
} # Line 53
print RESULT"\n\n Number of bases: $bases. Errors(N)=$errors.\n
Motif: $motif. Number of motifs: $count.\n\n The inter-motif nt Lengths are:\n\n @a";
close(RESULT);# Line 56
}
until (my $motif =~ /^\s*$/ );# Line 58
exit;
Cmd output as follows:
Microsoft Windows XP [Version 5.1.2600]
(C) Copyright 1985-2001 Microsoft Corp.
C:\Documents and Settings\user>cd d*
C:\Documents and Settings\user\Desktop>basecount.pl t.txt
"my" variable $motif masks earlier declaration in same scope at C:\Documents and Settings\user\Desktop\basecount.pl line 58.
Total bases: 36
A= 9, T= 11, G= 6, C= 10
Errors (N)= 3
Enter a motif to count its number and lengths between motifs:
AT
I couldn't find it.
Number of motifs: 0.
The inter-motif nt Lengths are:
5
Use of uninitialized value $motif in pattern match (m//) at C:\Documents and Settings\user\Desktop\basecount.pl line 27, <STDIN> line 1.
C:\Documents and Settings\user\Desktop>
Text Output as follows:
Number of bases: 36. Errors(N)=3.
Motif: AT. Number of motifs: 0.
The inter-motif nt Lengths are:
5
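
For comparison, here is a minimal sketch of the counting step on its own, assuming the sequence has already been read from the file into a single string. The script above builds $DNA and $string from @ARGV, which holds the file name t.txt (5 characters), and that may be where the reported 5 comes from. The helper name motif_gaps and the sample sequence are hypothetical, used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): count the motif
# occurrences in a sequence and return the lengths of the stretches
# before, between and after them.
sub motif_gaps {
    my ($dna, $motif) = @_;

    # \Q...\E makes the motif match as literal text rather than as a regex.
    my $count = () = $dna =~ /\Q$motif\E/g;

    # A limit of -1 keeps trailing empty fields, so a motif at the very
    # end of the sequence still yields a final gap of length 0.
    my @gaps = map { length } split /\Q$motif\E/, $dna, -1;

    return ($count, @gaps);
}

# Made-up sample sequence, assumed to stand in for the file contents.
my ($count, @gaps) = motif_gaps('GGATCCCATTT', 'AT');

print "Number of motifs: $count\n";
print "$_\n" for @gaps;    # one length per line
```

In the loop itself, the until condition would presumably test the existing $motif rather than `my $motif`, since the second `my` declares a fresh, undefined variable, which is consistent with both warnings shown above.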