Help De-bugging

GeneGeek has asked for the wisdom of the Perl Monks concerning the following question:

Hi! I'm using a script that is not my own and I've run into a bug that is beyond my de-bugging capabilities. Here is a trucated example of a data entry in my data file, notice how there is no 'Sample'

 
HAC10188
Name    HAC1018
Extension    .fsa

Dye Green1
Allele Count 298
Size    100    100.5    101    102.1    102.6    103.3    104.1
NChar    297
Allele    0$0    0$0    1$1503    0$0    0$0    0$0
[download]

The script I'm using to analyze this data is giving me the following error:

Use of uninitialized value $nchar in subtraction (-) at AFLP_replicate_difference_v1.1.pl line 192.

Here's the bit of code in the script that is giving me trouble, line 192 is the last line:

my $tax_sample = $data->{$taxon_label}{'Samples'}{$dye};
            my $rep_sample = $data->{$rep_label}  {'Samples'}{$dye};
            my $nchar = ${$tax_sample->{'NChar'}}[0];
            # I don't set these row sum variables to zero, because I w
+ant them undefined ifthe row is empty.
            my ($row_sum, $row_sum_abs, $row_sum_sq);
            my ($cnt00, $cnt01, $cnt10, $cnt11) = (0) x 4;
            my ($first_site, $last_site) = $use_filtered ? @{$threshol
+ds->{$dye}{'indices'}} : (0, $nchar-1);
[download]

$tax_sample and $rep_sample aren't initialized because 'Samples' doesn't exist. I think its the result of an error in the script (authored by the same person that authored the script I'm using here) I used to generate my data file. Its leaving the word 'Samples' out somewhere. I'm just not sure where because I'm not entirely sure what

my $rep_sample = $data->{$rep_label} {'Samples'}{$dye};

is doing. Where should the text 'Samples' be in my data file for this to work?

I'll post the entire script below if anyone needs that

#!/usr/bin/perl

# AFLP Replicate Difference Calculator
# Version 1.1
# Prints a table of differences between AFLP replicates for a given pa
+rameter.
# Forms part of the "Genotyping Utilities Package"
# Copyright (C) 2007  Warwick P M Allen
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110
+-1301, USA.
#
#
# For author's details see http://awcmee.massey.ac.nz/aflp/aflp.html

##############
# CHANGE LOG #
##############
#
# 2007-04-23 WPMA: Created.

use warnings FATAL => 'all';
use strict;
use genotyping_utilities qw( &canonpath &openForWriting &readGTR &getR
+epsTable &transpose &isReal &getReplicateGroups );

sub safeDiv($$) {defined( $_[0] ) && defined( $_[1] ) && $_[1] ? $_[0]
+/$_[1] : undef}

our $USAGE = "$0 [options] <replicates table file> [<gtr file>]\n";

my $HELP = <<HELP;
$USAGE
  AFLP Replicate Difference Calculator  is a utility calculates  the d
+ifference
  of some parameter (peak height, for example) between replicates in a
+ table of
  AFLP data.  This script takes two inputs:  the AFLP data and a table
+ that de-
  clares which  samples are replicates  of which other samples.  The A
+FLP  data
  must be  in the  "native"  format as  exported by the  "Genotyper Re
+arranger"
  script (which can be found at http://awcmee.massey.ac.nz/aflp/aflp.h
+tml). The
  table of  replicates is a tab-delimited  text file  in which the sam
+ple names
  (that is the "Sample File" column for GeneMapper data) that are repl
+icates of
  each other appear on the same row.  If no file name is specified for
+ the AFLP
  data, then it is read from standard input.

  The output of this script  is a tab-delimited table  that has a row 
+ for each
  pair of replicates that have been compared. The first two columns co
+ntain the
  replicate names, the remaining columns contain the values of the sam
+ple named
  in  the second column subtracted from  the values of the sample name
+d  in the
  first column.  The script writes  the output to  the terminal (stand
+ard out);
  this can be redirected to a file using the redirection symbol ">".

  A typical command line might look something like:
./AFLP_replicate_difference.pl -p "Peak Area" -r 0.01 rep_table.txt Ge
+notypes_Table.gtr >Size_differences.txt

  You can even feed the output of Genotyper Rearranger straight into t
+he Repli-
  cate Difference Calculator:
./genotyper_rearranger.pl gtr -o - 'Genotypes Table.txt' | ./AFLP_repl
+icate_difference.pl -p Size rep_table.txt >Size_differences.txt


  Note that the file 'genotyper_rearranger.pm' must be in the path (ju
+st put it
  in the same directory as this script).

Options:
    -a          Display differences  for all sites.  The default  is t
+o display
    --all-sites only the differences where the site has been called di
+fferently
                for each of the two replicates under test.
    -f         Only consider the sites that weren't filtered out by Ge
+neMapper.
    --filter   Note that columns will still exist for filtered sites, 
+ but they
               will be empty.
    -h
    --help     Display this help.
    -p <difference parameter>
    --parameter <difference parameter>
               Use  <difference parameter>  as the parameter to compar
+e between
               the replicates.  Note that this  is case-sensitive and,
+  to keep
               generality,  there is  no checking to  ensure it is  a 
+canonical
               field name.  Check in the input file  to make sure that
+ you have
               exactly  the same field name.  Several difference param
+eters may
               be requested.  If no difference parameter is given,  "H
+eight" is
               used.
    -o <base output file name>
    --base-output-name <base output file name>
               Selects the first part of the name of the output files.
    -Q
    --remove-quotes
               Strip single quotes from around sample labels in  the r
+eplicates
               table file.  If this  is selected,  then tab  character
+s  inside
               single quote marks  will be treated as part of  the sam
+ple label
               (but doing this is strongly discouraged  because it wil
+l confuse
               other programs). The default is to treat every characte
+r that is
               is a tab as part of the sample label.
    -r <real>
    --rounding <real>
               Rounds  the differences to  the nearest <real>.  The ro
+unding is
               done last, so round-off errors are not propagated throu
+gh to the
                 sums at the end of each row and column.
    --strip-extensions
               Removes the extension (that is, the last dot and every 
+character
               that follows it) from the sample labels given in the re
+petitions
               table file.  This can be  useful because,  for GeneMapp
+er input,
               the  GenoTyper Rearranger script uses the  "Sample File
+"  field,
               stripped of its extension, as the sample's label.
    -T
    --transpose Transpose the output.

HELP

# Option defaults.
my $all_sites = 0;                    # If false, only print differenc
+es where the site has been called
                                      # differently in each of the two
+ replicates under test.
my $indicate_reps_using = undef;      # If this is defined, it is the 
+string that delimits the sample name from
                                      # the replicate identifier.  # T
+ODO
my @difference_parameters = ();
my $rounding = undef;
my %reps_table_options = (
    'remove_quotes' => 1,
    'strip_extensions'=>0
);
my $use_filtered = 0;
my $transpose = 0;
my $base_fname;

# Process options.
while (@ARGV && $ARGV[0] =~ /^-/) {
    $_ = shift;
    # split single-letter options that are joined together
    /^-([^-]\S+)/            and do {unshift @ARGV, map "-$_", split /
+/, $1}      or
    /^-(?:a|-all-sites)/     and do {$all_sites = 1}                  
+            or
    /^-(?:f|-filter)/        and do {$use_filtered = 1}               
+            or
    /^-(?:h|-help)/          and do {print( $HELP ), exit}            
+            or
    /^-(?:p|-parameter)/     and do {push @difference_parameters, shif
+t; 1}       or
    /^-(?:o|-base-output-name)/ and do {$base_fname = canonpath shift;
+ 1}         or
    /^-(?:Q|-remove-quotes)/ and do {$reps_table_options{'strip_extens
+ions'} = 1} or
    /^-(?:r|-rounding)/      and do {$rounding = shift; 1}            
+            or
    /^--strip-extensions/    and do {$reps_table_options{'strip_extens
+ions'} = 1; 1} or
    /^-(?:T|-transpose)/     and do {$transpose = 1}                  
+            or
    /^-/                     and die "Unknown option `$_'.\nType `$0 -
+-help' for available options.";
}

# Height is the default parameter for which to calculate the differenc
+es.
push @difference_parameters, 'Height' if !@difference_parameters;


# Get hash table of replicates: each key in the hash table is a samle 
+name, and points to an array of all of the
# other sample names that are replicates of each other. The array incl
+udes the sample name that is the hash key.
# Many hash keys can point to the same array.
my $reps_table = getRepsTable \%reps_table_options, $indicate_reps_usi
+ng, @ARGV ? shift() : undef;

# Read in the data.
my ($thresholds, $data, $comments);
{
    if (@ARGV) {
        my $gtr_fname = canonpath $ARGV[0];
        open GTR_IN, $gtr_fname or die "Cannot open `$gtr_fname' for r
+eading: $!";
    } else {
        print STDERR "Reading data from standard input ...\n";
        *GTR_IN = *STDIN;
    }
    ($thresholds, $data, $comments) = readGTR *GTR_IN;
    close GTR_IN;
}

# Find the maximum number of sites over all the dyes.
my $global_nchar = 0;
foreach my $dye (keys %$thresholds) {
#    my $nchar = ${$thresholds->{$dye}{$use_filtered ? 'nchar (filtere
+d)': 'nchar'}}[0];
    my $nchar = ${$thresholds->{$dye}{'nchar'}}[0];
    $global_nchar = $nchar if $nchar > $global_nchar;
}

# Generate the difference table.
my (%differences, %totals, %grand_totals);
foreach my $difference_parameter (@difference_parameters) {
    foreach my $dye (keys %$thresholds) {
        foreach my $replicate_group (getReplicateGroups( $data, $reps_
+table )) {
            next if @$replicate_group < 2;  # Only one replicate in th
+is group.
            my ($taxon_label, $rep_label) = @$replicate_group;
            my $tax_sample = $data->{$taxon_label}{'Samples'}{$dye};
            my $rep_sample = $data->{$rep_label}  {'Samples'}{$dye};
            my $nchar = ${$tax_sample->{'NChar'}}[0];
            # I don't set these row sum variables to zero, because I w
+ant them undefined ifthe row is empty.
            my ($row_sum, $row_sum_abs, $row_sum_sq);
            my ($cnt00, $cnt01, $cnt10, $cnt11) = (0) x 4;
            my ($first_site, $last_site) = $use_filtered ? @{$threshol
+ds->{$dye}{'indices'}} : (0, $nchar-1);
            my @differences = (undef) x $nchar;
            for (my $i = $first_site; $i <= $last_site; $i++) {
                !${$tax_sample->{'Allele'}}[$i] && !${$rep_sample->{'A
+llele'}}[$i] and $cnt00++ or
                !${$tax_sample->{'Allele'}}[$i] &&  ${$rep_sample->{'A
+llele'}}[$i] and $cnt01++ or
                 ${$tax_sample->{'Allele'}}[$i] && !${$rep_sample->{'A
+llele'}}[$i] and $cnt10++ or
                 ${$tax_sample->{'Allele'}}[$i] &&  ${$rep_sample->{'A
+llele'}}[$i] and $cnt11++ ;
                next unless $all_sites || (${$tax_sample->{'Allele'}}[
+$i] xor ${$rep_sample->{'Allele'}}[$i]);
                my $tax_val = ${$tax_sample->{$difference_parameter}}[
+$i]; next if !defined $tax_val || $tax_val =~ /^\s*$/;
                my $rep_val = ${$rep_sample->{$difference_parameter}}[
+$i]; next if !defined $rep_val || $rep_val =~ /^\s*$/;
                $differences[$i] = $tax_val - $rep_val;
                $row_sum                                         +=   
+   $differences[$i]   ;
                $row_sum_abs                                     += ab
+s( $differences[$i] ) ;
                $row_sum_sq                                      +=   
+   $differences[$i]**2;
            }
            # Sanity check.
            die "Count mismatch: $cnt00 + $cnt01 + $cnt10 + $cnt11 != 
+$last_site - $first_site + 1"
                if $cnt00 + $cnt01 + $cnt10 + $cnt11 != $last_site - $
+first_site + 1;
            $differences{$difference_parameter}{$dye}{$taxon_label}{$r
+ep_label} = {
                'Differences'      => \@differences,
                'Sum'              => $row_sum,
                'Sum of Absolutes' => $row_sum_abs,
                'Sum of Squares'   => $row_sum_sq,
                'No. of Sites'     => $nchar,
                'No. of Sites without Trailing All-Zero Sites'   => ($
+{$tax_sample->{'Allele Count'}}[0]), # HACK
                'No. of Sites after Filtering'                   => ($
+last_site - $first_site + 1), # HACK
                'First Site'       => $first_site,
                'Last Site'        => $last_site,
                'No. of Internal All-Zero Sites'                 => ${
+$thresholds->{$dye}{'all-zero site count'}}[0],
                'No. of Internal All-Zero Sites after Filtering' => ${
+$thresholds->{$dye}{'all-zero site count (filtered)'}}[0],
                '(00)'             => $cnt00,
                '(01)'             =>          $cnt01,
                '(10)'             =>                   $cnt10,
                '(11)'             =>                            $cnt1
+1,
                '(01+10)'          => (         $cnt01 + $cnt10       
+  ),
                '(01+10+11)'       => (         $cnt01 + $cnt10 + $cnt
+11),
                '(00+01+10+11)'    => ($cnt00 + $cnt01 + $cnt10 + $cnt
+11),
                '(Sum of Abs)/(01+10)'      => safeDiv( $row_sum_abs, 
+         $cnt01 + $cnt10          ),
                '(Sum of Abs)/(01+10+11)'   => safeDiv( $row_sum_abs, 
+         $cnt01 + $cnt10 + $cnt11 ),
                '(Sum of Abs)/(00+01+10+11)'=> safeDiv( $row_sum_abs, 
+$cnt00 + $cnt01 + $cnt10 + $cnt11 )
                }
        }

        foreach my $first_label (keys %{$differences{$difference_param
+eter}{$dye}}) {
            foreach my $second_label (keys %{$differences{$difference_
+parameter}{$dye}{$first_label}}) {
                my $x = $differences{$difference_parameter}{$dye}{$fir
+st_label}{$second_label};
                foreach (keys %$x) {
                    if (ref $x->{$_} eq 'ARRAY') {
                        if (!exists $totals{$difference_parameter}{$dy
+e}{$_}) {
                            $totals{$difference_parameter}{$dye}{$_} =
+ [(undef) x @{$x->{$_}}];
                        }
                        for (my $i = 0; $i < @{$x->{$_}}; $i++) {
                            defined( ${$x->{$_}}[$i] ) && isReal( ${$x
+->{$_}}[$i] ) and
                            ${$totals{$difference_parameter}{$dye}{$_}
+}[$i] += ${$x->{$_}}[$i];
                        }
                    } else {
                        defined( $x->{$_} ) && isReal( $x->{$_} ) and
                        $totals{$difference_parameter}{$dye}{$_} += $x
+->{$_};
                    }
                }
            }
        }
        foreach (keys %{$totals{$difference_parameter}{$dye}}) {
            defined( $totals{$difference_parameter}{$dye}{$_} ) and
            $grand_totals{$difference_parameter}{$_} += $totals{$diffe
+rence_parameter}{$dye}{$_};
        }
    }
}

my %error_rate;
foreach (@difference_parameters) {
    $error_rate{$_} = [
        safeDiv( $grand_totals{$_}{'(01+10)'}, $grand_totals{$_}{   '(
+01+10+11)'} ),
        safeDiv( $grand_totals{$_}{'(01+10)'}, $grand_totals{$_}{'(00+
+01+10+11)'} ),
        safeDiv( $grand_totals{$_}{'(01+10)'},
            (defined $grand_totals{$_}{'No. of Internal All-Zero Sites
+ after Filtering'} ?
                $grand_totals{$_}{   '(01+10+11)'} - $grand_totals{$_}
+{'No. of Internal All-Zero Sites after Filtering'} : undef) ),
        safeDiv( $grand_totals{$_}{'(01+10)'},
            (defined $grand_totals{$_}{'No. of Internal All-Zero Sites
+ after Filtering'} ?
                $grand_totals{$_}{'(00+01+10+11)'} - $grand_totals{$_}
+{'No. of Internal All-Zero Sites after Filtering'} : undef) )
        ]
}

# Round.
if (defined $rounding) {
    my $round = sub (;$) {
        local $_ = @_ ? $_[0] : $_;
        isReal and sprintf( '%d', $_/$rounding )*$rounding;
    };
    foreach my $difference_parameter (@difference_parameters) {
        foreach my $dye (keys %{$differences{$difference_parameter}}) 
+{
            foreach my $first_rep (keys %{$differences{$difference_par
+ameter}{$dye}}) {
                foreach my $second_rep (keys %{$differences{$differenc
+e_parameter}{$dye}{$first_rep}}) {
                    my $diff = $differences{$difference_parameter}{$dy
+e}{$first_rep}{$second_rep};
                    foreach (keys %$diff) {
                        if (ref $diff->{$_} eq 'ARRAY') {
                            $_ = &$round foreach @{$diff->{$_}}
                        } else {
                            $diff->{$_} = &$round( $diff->{$_} );
                        }
                    }
                }
            }
            foreach (keys %{$totals{$difference_parameter}{$dye}}) {
                if (ref $totals{$difference_parameter}{$dye}{$_} eq 'A
+RRAY') {
                    map {&$round()} @{$totals{$difference_parameter}{$
+dye}{$_}};
                } else {
                    $totals{$difference_parameter}{$dye}{$_} = &$round
+( $totals{$difference_parameter}{$dye}{$_} );
                }
            }
            foreach (keys %{$totals{$difference_parameter}{$dye}}) {
                $grand_totals{$difference_parameter}{$_} = &$round( $g
+rand_totals{$difference_parameter}{$_} );
            }
        }
        map {$_ = &$round()} @{$error_rate{$difference_parameter}};
    }
}

# Generate the output matrices.
my %output;
foreach my $difference_parameter (@difference_parameters) {
    push @{$output{$difference_parameter}}, ["Difference table for $di
+fference_parameter"], [];
    push @{$output{$difference_parameter}}, [ map {"'$_'"}
            'First replicate',
            'Second replicate',
            'Dye',
            (0..$global_nchar+1), #HACK
            'Sum',
            'Sum of Absolutes',
            'Sum of Squares',
            'No. of Sites',
            'No. of Sites without Trailing All-Zero Sites',
            'No. of Sites after Filtering',
            'First Site',
            'Last Site',
            'No. of Internal All-Zero Sites',
            'No. of Internal All-Zero Sites after Filtering',
            '(00)',
            '(01)',
            '(10)',
            '(11)',
            '(01+10)',
            '(01+10+11)',
            '(00+01+10+11)',
            '(Sum of Abs)/(01+10)',
            '(Sum of Abs)/(01+10+11)',
            '(Sum of Abs)/(00+01+10+11)'
    ];
    foreach my $dye (keys %{$differences{$difference_parameter}}) {
        foreach my $first_rep (keys %{$differences{$difference_paramet
+er}{$dye}}) {
            foreach my $second_rep (keys %{$differences{$difference_pa
+rameter}{$dye}{$first_rep}}) {
                my $diff = $differences{$difference_parameter}{$dye}{$
+first_rep}{$second_rep};
                push @{$output{$difference_parameter}}, [
                    $first_rep,
                    $second_rep,
                    $dye,
                    map( {defined() ? $_ : ''} @{$diff->{'Differences'
+}} ),
                    map( {defined() ? $_ : ''} map( {$diff->{$_}}
                            'Sum',
                            'Sum of Absolutes',
                            'Sum of Squares',
                            'No. of Sites',
                            'No. of Sites without Trailing All-Zero Si
+tes',
                            'No. of Sites after Filtering',
                            'First Site',
                            'Last Site',
                            'No. of Internal All-Zero Sites',
                            'No. of Internal All-Zero Sites after Filt
+ering',
                            '(00)',
                            '(01)',
                            '(10)',
                            '(11)',
                            '(01+10)',
                            '(01+10+11)',
                            '(00+01+10+11)',
                            '(Sum of Abs)/(01+10)',
                            '(Sum of Abs)/(01+10+11)',
                            '(Sum of Abs)/(00+01+10+11)'
                        ) )
                ];
            }
        }
        my @totals_row = ('Sum', undef, $dye);
        foreach (
                    'Differences',
                    'Sum',
                    'Sum of Absolutes',
                    'Sum of Squares',
                    'No. of Sites',
                    'No. of Sites without Trailing All-Zero Sites',
                    'No. of Sites after Filtering',
                    'First Site',
                    'Last Site',
                    'No. of Internal All-Zero Sites',
                    'No. of Internal All-Zero Sites after Filtering',
                    '(00)',
                    '(01)',
                    '(10)',
                    '(11)',
                    '(01+10)',
                    '(01+10+11)',
                    '(00+01+10+11)',
                    '(Sum of Abs)/(01+10)',
                    '(Sum of Abs)/(01+10+11)',
                    '(Sum of Abs)/(00+01+10+11)'
        ) {
            if (ref $totals{$difference_parameter}{$dye}{$_} eq 'ARRAY
+') {
                push @totals_row, @{$totals{$difference_parameter}{$dy
+e}{$_}}
            } else {
                push @totals_row, $totals{$difference_parameter}{$dye}
+{$_}
            }
        }
        push @{$output{$difference_parameter}}, [], \@totals_row, [], 
+[];
    }

    # Transpose the table.
    $output{$difference_parameter} = transpose $output{$difference_par
+ameter} if $transpose;

    push @{$output{$difference_parameter}},
        [],    ["Totals Over All Dyes"], [],
        map( {["'$_'", '', $grand_totals{$difference_parameter}{$_}]}
            'Sum',
            'Sum of Absolutes',
            'Sum of Squares',
            'No. of Sites',
            'No. of Sites without Trailing All-Zero Sites',
            'No. of Sites after Filtering',
            'First Site',
            'Last Site',
            'No. of Internal All-Zero Sites',
            'No. of Internal All-Zero Sites after Filtering',
            '(00)',
            '(01)',
            '(10)',
            '(11)',
            '(01+10)',
            '(01+10+11)',
            '(00+01+10+11)',
        ),
        ["'(01+10)/(01+10+11)'",    '', ${$error_rate{$difference_para
+meter}}[0]],
        ["'(01+10)/(00+01+10+11)'", '', ${$error_rate{$difference_para
+meter}}[1]],
        ["'(01+10)/((01+10+11) - (No. of Internal All-Zero Sites after
+ Filtering))'",    '', ${$error_rate{$difference_parameter}}[2]],
        ["'(01+10)/((00+01+10+11) - (No. of Internal All-Zero Sites af
+ter Filtering))'", '', ${$error_rate{$difference_parameter}}[3]],
        [];
}

# Get the base file name.
$base_fname = getBaseFName( @ARGV && $ARGV[0], $comments ) unless defi
+ned $base_fname;

# Print.
{
    my $output_fname = $base_fname . '.rdc';
    *OUT = defined $output_fname ? openForWriting $output_fname : *STD
+OUT;
    local ($\,$,) = ("\n","\t");
    foreach my $difference_parameter (@difference_parameters) {
        print OUT map {defined() ? $_ : ''} @$_ foreach @{$output{$dif
+ference_parameter}};
        print OUT "";
    }
    close( OUT ),
    print( STDERR "Wrote `$output_fname'." ) if defined $output_fname;
}

#my %heights;
## Create "reps-in-row.heights" table.
#{
#    local ($,, $/) = ("\t", "\n");
#    my $file_number = 1;
#    foreach my $replicate_group (@rep_groups) {
#        next if @$replicate_group < 2;
#        my $number_of_site_for_this_group = 0;
#        $number_of_site_for_this_group += ${ $data->{(keys %$data)[0]
+}{'Samples'}{$_}{'NChar'} }[0] foreach keys %$threshold_indices;
#        foreach my $dye (keys %$threshold_indices) {
#            my $nchar = ${ $data->{(keys %$data)[0]}{'Samples'}{$dye}
+{'NChar'} }[0];
#            for (my $i = 1; $i <= $nchar; $i++) {
#                my $height_is_registered =
#                    defined( ${$replicate_group}[0] ) && isReal( ${$d
+ata->{${$replicate_group}[0]}{'Samples'}{$dye}{'Height'}}[$i] ) &&
#                    defined( ${$replicate_group}[1] ) && isReal( ${$d
+ata->{${$replicate_group}[1]}{'Samples'}{$dye}{'Height'}}[$i] );
#                my $height_meets_threshold = $height_is_registered &&
#                    ${$data->{${$replicate_group}[0]}{'Samples'}{$dye
+}{'Height'}}[$i] >= $peak_height_threshold &&
#                    ${$data->{${$replicate_group}[1]}{'Samples'}{$dye
+}{'Height'}}[$i] >= $peak_height_threshold;
#                if ($height_meets_threshold) {
#                    push @{$heights{$dye}}, [
#                        $dye,
#                        $i,
#                        @$replicate_group,
#                        (defined ${$replicate_group}[0] ? ${$data->{$
+{$replicate_group}[0]}{'Samples'}{$dye}{'Height'}}[$i] : ''),
#                        (defined ${$replicate_group}[1] ? ${$data->{$
+{$replicate_group}[1]}{'Samples'}{$dye}{'Height'}}[$i] : ''),
#                        (defined ${$replicate_group}[0] && defined ${
+$replicate_group}[1] ?
#                        abs( ${$data->{${$replicate_group}[0]}{'Sampl
+es'}{$dye}{'Height'}}[$i] -
#                             ${$data->{${$replicate_group}[1]}{'Sampl
+es'}{$dye}{'Height'}}[$i] ) : '' )
#                    ];
#                }
#            }
#        }
#    }
#    
#    $file_number = 1;
#    foreach my $dye (keys %$threshold_indices) {
#        *OUT = openForWriting $base_fname.'.reps-in-row.heights'.$fil
+e_number.'.tab.txt';
#        print OUT 'Dye', 'Site', 'Replicate 1', 'Replicate 2', 'Heigh
+t 1', 'Height 2', 'Height Difference';
#        $heights{$dye} = [sort {${$a}[6] <=> ${$b}[6]} @{$heights{$dy
+e}}];
#        map {print OUT @$_} @{$heights{$dye}};
#    close OUT;
#    print STDERR "Wrote `".$base_fname.'.reps-in-row.heights'.($file_
+number++).'.tab.txt'."'.";
#    }
#}
[download]

This is the subroutine readGTR() from the genotyping_utilities.pm script

# Reads in GTR file.
# Takes a filehandle opened for reading.  Returns two hash references.
+  One contains information specific to
# particular dyes: the threshold values and indices (the threshold det
+ermine which sites get filtered out) and
# the number of character in the filtered and unfiltered ...
sub readGTR($) {
    *GTR_IN = shift;
    local $/ = getLineEnding *GTR_IN;

    my (%thresholds, %data, %comments, @labels);
    local $_;

    while (<GTR_IN>) {
        chomp;
        $comments{$.} = $1,
            next if /^#(.*)/;
        next if /^$/;
        last if /^\%DATA%$/;
        my $dye = $_;
        while (<GTR_IN>) {
            chomp;
            $comments{$.} = $1,
                next if /^#(.*)/;
            last if /^$/;
            my ($key, $value) = /^\s*(\w.*?):\t(.*)$/ or die "Error re
+ading line $.:\n$_\n";
            # Have to append a dummy character to $value to force spli
+t() to return trailing empty fields.
            @{$thresholds{$dye}{$key}} = map {$_ eq 'undef' ? undef : 
+$_} split /\t/, $value.chr(0);
            substr( ${$thresholds{$dye}{$key}}[-1], -1 ) = ""; # Remov
+e the dummy character.
        }
    }

    while (<GTR_IN>) {
        chomp;
        next if /^$/ || /^#/;
        my $taxon_label = $_;
        push @labels, $taxon_label;    # To record the order of the ta
+xa.
        while (<GTR_IN> =~ /\s*(.*?)\t(.*)$/) {
            $data{$taxon_label}{$1} = $2;
        }
        my $dye;
        while (<GTR_IN>) {
            chomp;
            next if /^#/;
            last if /^$/;
            ($dye) = /^\s*Dye\s+(.*)/;
            while (<GTR_IN>) {
                chomp;
                next if /^#/;
                last if /^$/;
                my @values = split /\t/;
                my $key = shift @values;
                $data{$taxon_label}{'Samples'}{$dye}{$key} = \@values;
            }
        }
    }
    return \%thresholds, \%data, \%comments, \@labels;
}
[download]

Comment on Help De-bugging Select or Download Code

Replies are listed 'Best First'.
Re: Help De-bugging by 1nickt (Canon) on Sep 04, 2015 at 20:36 UTC
Hi GeneGeek, Working backwards ... `192 my ($first_site, $last_site) = $use_filtered ? @{$thresholds +->{$dye}{'indices'}} : (0, $nchar-1);` [download] `188 my $nchar = ${$tax_sample->{'NChar'}}[0];` [download] `186 my $tax_sample = $data->{$taxon_label}{'Samples'}{$dye};` [download] 158 my ($thresholds, $data, $comments); 159 { 160 if (@ARGV) { 161 my $gtr_fname = canonpath $ARGV[0]; 162 open GTR_IN, $gtr_fname or die "Cannot open `$gtr_fname' for r +eading: $!"; 163 } else { 164 print STDERR "Reading data from standard input ...\n"; 165 GTR_IN = STDIN; 166 } 167 ($thresholds, $data, $comments) = readGTR GTR_IN; 168 close GTR_IN; 169 } [download] `34 use genotyping_utilities qw( &canonpath &openForWriting &readGTR &g +etRepsTable &transpose &isReal &getReplicateGroups );` [download] So it looks like the hash is created by the `readGTR()` sub in `genotyping_utilities` . . . The way forward always starts with a minimal test.*	[reply] [d/l] [select]
Re^2: Help De-bugging by GeneGeek (Initiate) on Sep 04, 2015 at 21:00 UTC
I also have genotyping_utilities.pm. I have it in the directory with this script per the instructions in the script. Do I need to load the module or something?	[reply]
Re^3: Help De-bugging by 1nickt (Canon) on Sep 04, 2015 at 21:05 UTC
no, it is loaded by the use statement. It builds the hash, but there is no key in the hash corresponding to the one used in the code. You'd have to look at the other code to figure out why. Sorry,i am no longer at my computer. The way forward always starts with a minimal test.	[reply]
Re: Help De-bugging by 1nickt (Canon) on Sep 04, 2015 at 22:35 UTC
Basically what Perl is telling you is that there is data missing where you expect there to be some. Tracing down the code is one way to find out which bit of data it is, but a first step would be to dump the instance of the hash that you do have, using Data::Dumper right before the line that the warning refers to. This should show you which record in your dataset has the missing data, and by examining the record you'll see the problem. Then you can decide whether to ignore the warning or add code to handle the situation. Add to the top of your script: `use Data::Dumper; $Data::Dumper::Sortkeys = 1;` [download] Then add right before the line that produces the error (should be line 194 after the change above): `my ($cnt00, $cnt01, $cnt10, $cnt11) = (0) x 4; # add two lines print Dumper $data; die "missing \$nchar\n" if not $nchar; my ($first_site, $last_site) = $use_filtered ? @{$thresholds->{$dye}{' +indices'}} : (0, $nchar-1);` [download] This will print the data structure for each record and then die when you encounter the missing information, so you can compare the most recent record with the ones before it that didn't cause a warning message. Hope this helps! PS Please edit your OP and use the readmore tags! Update: Added Dumper code example The way forward always starts with a minimal test.	[reply] [d/l] [select]
Re: Help De-bugging by AnomalousMonk (Archbishop) on Sep 04, 2015 at 21:13 UTC
`<readmore> ... </readmore>` tags are always appreciated by the Monks when one posts "wall of code" listings like the AFLP Replicate Difference Calculator script. Please see Markup in the Monastery and, for more detail, Writeup Formatting Tips. Give a man a fish: `<%-{-{-{-<`	[reply] [d/l] [select]
Re^2: Help De-bugging by GeneGeek (Initiate) on Sep 04, 2015 at 21:16 UTC
Will do, thanks!	[reply]
Re: Help De-bugging by stevieb (Canon) on Sep 04, 2015 at 20:23 UTC
Welcome to PerlMonks, GeneGeek! This: `my $rep_sample = $data->{$rep_label} {'Samples'}{$dye};` [download] ...shouldn't have the whitespace in it. perl ignores whitespace, but it should be: `my $rep_sample = $data->{$rep_label}{'Samples'}{$dye};` [download] That piece of code is looking deep into a hash (reference) of hashes, of hashes. Parent has been updated with full code... ~~What we need to see is the code that processes the data file and extracts the information into the hash.~~	[reply] [d/l] [select]
Re^2: Help De-bugging by GeneGeek (Initiate) on Sep 04, 2015 at 20:57 UTC
Thanks! I had an inkling it was something like that.	[reply]


No such thing as a small change
	PerlMonks