http://www.perlmonks.org?node_id=181291

dimmesdale has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I have a problem. I trying to write a program to do some stuff with excel, and I'm using SpreadSheet::WriteExcel. The problem, however, lies with the data structures somewhere, I think.

I have data that looks like this:

TS21CCCFN30NT 64.77758146 63.02739802 92.64926038 90.68286 +212 0.043905373 0.061903769 TS1BL 53.5822136 62.62041357 71.03715952 71.16929716 5. +76499E-05 0.000314962 TS2BL 71.59567647 73.35932057 94.41989575 91.04652518 6 +.57045E-05 0.000587082 TS3BL 58.14568074 66.51773348 83.63006331 81.97901606 7 +.04488E-05 0.000387897 TS5BL 59.19892796 67.52430356 94.26526376 93.67353814 7 +.75011E-05 0.000316832 TS7BL 66.55656211 68.29674759 93.10938298 90.82023765 6 +.69424E-05 0.000200692 TS9BL 58.95471459 68.27327459 73.35588307 75.75919131 7 +.10676E-05 0.000211444 TS13BL 71.58954203 78.41099584 92.95413775 93.01392565 +6.37719E-05 0.000246732 TS15BL 54.77861261 57.36971063 93.07097848 90.77079334 +6.88732E-05 0.000205062
The number that follows TS is the test subject (identified by that number). Following that is the condition (one of 13). The data needs to be sorted by groups. There are three: low-hour vfr, low-hour instruments, and high-hour instruments (corresponding to certain test subjects; specifics in code). For these, I want to find the average for each condition, and display it in Excel.

This is the chunk that's giving me some problems. (the error in the tite)

for(0..3) { #here ($sums[$i])->[$_][$j] += $vals[$_+1]; if ($ts >= 21) { $sums[$i]->[4][$j] += $vals[5]; $sums[$i]->[5][$j] += $vals[6]; } } #and here $totals[$i]->[$j]->[0]++;
Here's the complete code to see how the ADTs evolve. (This is just a very preliminary 'test' to see what will work before I start doing some other things, but suggestions for better data types would be appreciated).
#!/usr/bin/perl -w use strict; use Spreadsheet::WriteExcel; my @groups = ([2,3,7,8,9,15,18,21,22,23,24,25,26,27], # Low hour vfr, low hour instruments [4,10,12,16,17,19], [1,5,6,11,13,14,20]); # high hour instruments my @cats = ('BL','BLSTDGG','PR1','PRFN1','PRFN3','PRFN30','EBG1', 'EBGFN1','EBGFN3','EBGFN30','CCFN1','CCFN30','CCFN30NT'); my $file = 'C:\WINDOWS\Profiles\chemphysio\Desktop\Test data\TEST\comp +lete_ordered2.txt'; open (TABFILE,$file) or die "$file: $!"; # Create a new Excel workbook my $workbook = Spreadsheet::WriteExcel->new('TEST.xls'); my $worksheet = $workbook->addworksheet(); my (@avgs,@sums,@totals); my @data = <TABFILE>; my $row = 0; for my $r (0..2) { for my $c (0..3) { for my $cat (0..$#cats) { ($sums[$r])->[$c][$cat] = 0; if ($r == 0) { ($sums[$r])->[4][$cat] = 0; ($sums[$r])->[5][$cat] = 0; } } } } for my $i (0..2) { for my $cat (0..$#cats) { $totals[$i]->[$cat]->[0] = 0; if ($i == 0) { $totals[$i]->[$cat]->[1] = 0; $totals[$i]->[$cat]->[1] = 0; } } } for (@data) { chomp; my @vals = split('\t', $_); $vals[0] =~ s/.txt//; my ($ts,$cat) = $vals[0] =~ /TS(\d+)(.+)$/; my ($i,$j); # Some of the data has slight errors in it (these were # the filenames, though they represent the condition) $cat =~ s/CCCFN30NT/CCFN30NT/; $cat =~ s/BLSTDGS/BLSTDGG/; $cat =~ s/BLSTDGG?/BLSTDGG/; for(0..$#cats) { if ($cats[$_] =~ /$cat/) { $j = $_ } } for(@{$groups[0]}) { if ($ts == $_) { $i = 0 } } for(@{$groups[1]}) { if ($ts == $_) { $i = 1 } } for(@{$groups[2]}) { if ($ts == $_) { $i = 2 } } # A basic representation of @sums (@avgs too) # REST|IMC ... represents one of six values (the sums) # condition is the array index for @cats # $sums[GROUP] -> [(REST|IMC)(HRT|SKT|(EMG)?) -> # [CONDITION] for(0..3) { ($sums[$i])->[$_][$j] += $vals[$_+1]; if ($ts >= 21) { $sums[$i]->[4][$j] += $vals[5]; $sums[$i]->[5][$j] += $vals[6]; } } $totals[$i]->[$j]->[0]++; if ($ts >= 21) { $totals[$i]->[$j]->[1]++ } my $col = 0; for (@vals) { $worksheet->write($row, $col, $_); $col++; } $row++; } for my $r (0..2) { for my $c (0..3) { for my $cat (0..$#cats) { $avgs[$r]->[$c][$cat] = $sums[$r]->[$c][$cat]/$totals[$r]->[$cat +]->[0]; if ($r == 0) { $avgs[$r]->[4][$cat] = $sums[$r]->[4][$cat]/$totals[$r]->[$cat +]->[1]; $avgs[$r]->[5][$cat] = $sums[$r]->[5][$cat]/$totals[$r]->[$cat +]->[1]; } } } } my $format = [ [@cats], $avgs[0]->[0], $avgs[0]->[1], $avgs[0]->[2], $avgs[0]->[3], $avgs[0]->[4], $avgs[0]->[5], ]; $worksheet->write_row('J6', $format);

I've worked with data dumper EXTENSIVELY, to a point where I'm pretty sure it has to do with the data STRUCTURE and not the data VALUES.

2002-07-12 Edit by zdog: Added READMORE tag