Here is what I've found out so far.
First, my original, smaller script had been modified (it was referring to a different input file than before), so I switched it back as accurately as I could. I also noticed that in the part of the error that refers to <GEN0>, the line number given is always the last line of the input file (I tested this by deleting lines).
Just focusing on my original (smaller) program for now, the code for this is as follows:
#!/usr/bin/perl
use warnings;
use strict;
use v5.14;
use Bio::PopGen::IO;
use Bio::PopGen::Statistics;
open my $out_file, ">", "chr22_exome_snps_processed_AMR_TRUNCATED_STATS"
    or die "Can't open output file: $!\n";
my $io = new Bio::PopGen::IO(
    -format => 'csv',
    -file   => "chr22_exome_snps_processed_AMR_TRUNCATED"
);
my @markers;
my @samples;
while ( my $ind = $io->next_individual ) {
    if ( $ind =~ /^SAMPLE/ ) {
        push @markers, $ind;
    }
    else {
        push @samples, [$ind];
    }
}
my $segsites   = Bio::PopGen::Statistics->segregating_sites_count( \@samples );
my $singletons = Bio::PopGen::Statistics->singleton_count( \@samples );
my $pi         = Bio::PopGen::Statistics->pi( \@samples );
my $theta      = Bio::PopGen::Statistics->theta( \@samples );
my $tajima_D   = Bio::PopGen::Statistics->tajima_D( \@samples );
my $D_star     = Bio::PopGen::Statistics->fu_and_li_D_star( \@samples );
my $F_star     = Bio::PopGen::Statistics->fu_and_li_F_star( \@samples );
say $out_file "Population: AMR\tChromosome: 22_TRUNCATED";
say $out_file
    "Seg sites\tSingletons\tPi\tTheta\tTajima's D\tFu & Li F*\tFu & Li D*";
say $out_file
    "$segsites\t$singletons\t$pi\t$theta\t$tajima_D\t$F_star\t$D_star";
And the output is now:
Can't call method "isa" on unblessed reference at /usr/local/share/perl/5.14.2/Bio/PopGen/Statistics.pm line 901, <GEN0> line 3
I am using a truncated input file here to test the general approach. It has a header line and two lines of CSV genetic data from separate individuals, i.e. a 3-line input file, and the error references <GEN0> line 3.
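As far as I can tell, the "<GEN0> line 3" part only says which filehandle was read last and how many lines had been read from it, not where the actual problem is. Here is a minimal sketch of that behaviour that has nothing to do with BioPerl (the in-memory input and the error message are made up for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical 3-line input, held in memory instead of a real file.
my $data = "header\nrow1\nrow2\n";
open my $fh, '<', \$data or die "Can't open in-memory file: $!\n";

# Read every line so that the handle's line counter ($.) ends up at 3.
1 while <$fh>;

# A die message without a trailing newline gets the script location
# appended and, because a filehandle has been read, also the name of
# that handle and its last line number.
eval { die "something went wrong" };
my $message = $@;
print $message;
```

The printed message ends with the handle name and "line 3", even though nothing is wrong with line 3 of the data. If the same applies to my script, the line number just reflects that the whole input file had been read by the time the Statistics call failed.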