Pathologically Eclectic Rubbish Lister PerlMonks

to calculate mean and variance

by cdfd123 (Initiate) on Jan 11, 2008 at 02:29 UTC
cdfd123 has asked for the wisdom of the Perl Monks concerning the following question:

Suppose you have a file:
```
A)1.8  2.5  3.8  1.9  -3.5  -3.5  3.2  -3.9  4.2  4.5  2.8
B)-1.3  -0.9  -0.7  -0.4  -0.8  -3.5  -3.5  -1.6  -4.5
```
and so on. Each line of the file holds a separate set of input data points, and I have to calculate the mean and variance for each line. The labels A, B can be ignored; they only indicate that each set of data points sits on its own line of the one file.

My program is:
<code>
#!/usr/bin/perl -w
open( FILE, "< file_1" ) or die "Can't open file_1 : $!";
while( <FILE> ){
    @fields = split / /;
    for($i=0;$i < scalar(@fields); $i++ ){
        $sum[$i]+=$fields[$i];
        $sumsq[$i]+=$fields[$i]*$fields[$i];
    }
    $n++;
}

for($i=0;$i < scalar(@sum); $i++ ){
    $sum[$i] /= $n; $sumsq[$i] /= $n;
    $stddev = sqrt( $sumsq[$i] - $sum[$i]*$sum[$i] );
    print( $sum[$i]." ".$stddev." " );
}
close FILE
</code>
But when I run the program, these errors occur:
<c>
Argument "" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 1.
Argument "" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 1.
Argument "" isn't numeric in addition (+) at meanStddev.pl
Argument "\n" isn't numeric in addition (+) at meanStddev.pl line 6, <FILE> line 3.
0.573333333333333 0.867998975933856 0 0 0.833333333333333 1.17851130197758 0 0 1.26666666666667 1.79133717900592 0 0 0.633333333333333 0.89566858950296 0 0 -1.16666666666667 1.64991582276861 0 0 -1.16666666666667 1.64991582276861 0 0 1.06666666666667 1.5084944665313 0 0 -1.3 1.83847763108502 0 0 1.4 1.97989898732233 0 0 1.5 2.12132034355964 0 0 0.933333333333333 1.31993265821489 0 0 -0.433333333333333 0.612825877028341 0 0 -0.3 0.424264068711929 0 0 -0.233333333333333 0.329983164553722 0 0 -0.133333333333333 0.188561808316413 0 0 -0.266666666666667 0.377123616632825 0 0 -1.16666666666667 1.64991582276861 0 0 -1.16666666666667 1.64991582276861 0 0 -0.533333333333333 0.754247233265651 0 0 -1.5 2.12132034355964
</c>
Any suggestions?

20080114 Janitored by Corion: Changed bold tags to code tags

Replies are listed 'Best First'.
Re: to calculate mean and variance
by davidrw (Prior) on Jan 11, 2008 at 03:01 UTC
A couple of general suggestions:
• use strict; At first, it'll generate a bunch of errors for you about undeclared variables, but it will help identify & reduce errors.
• The warnings indicate that one of the values it's trying to add to the sum is a string .. so in the first for loop put something like print Dumper [$i, $fields[$i]] if $fields[$i] =~ /s/; (Dumper comes from Data::Dumper, so add use Data::Dumper; first.)
Hmm .. actually, it might be your split -- try split(' ') instead -- see the split() docs for full info, including that "split(/ /)" will give you as many null initial fields as there are leading spaces. So it could be that your data file has leading spaces in it somewhere.
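To make the difference between the two split forms concrete, here is a minimal sketch. The sample line is hypothetical (not taken from the poster's file); it simply has a leading space and double spaces between the values, as the posted data appears to.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample line: one leading space, two spaces between values.
my $line = " 1.8  2.5  3.8\n";

# A literal single-space pattern yields an empty field for the leading
# space and another between each pair of adjacent spaces.
my @by_regex = split / /, $line;

# The special string ' ' skips leading whitespace and splits on runs of
# whitespace, so only the numbers remain (the newline is dropped too).
my @by_awk = split ' ', $line;

print scalar(@by_regex), " fields from split / /\n";
print scalar(@by_awk),   " fields from split ' '\n";
print join( '|', @by_awk ), "\n";
```

The empty fields from the first form are what trigger the `Argument "" isn't numeric` warnings once they are added to a sum.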

Here's a little refactoring example, too, to demo several more "perlish" constructs:
```
#!/usr/bin/perl -w
use strict;
my @sum;
my @sumsq;
my $n = 0;
while( <DATA> ){
    my @fields = split / /;
    foreach my $i ( 0..$#fields ){
        $sum[$i] += $fields[$i];
        $sumsq[$i] += $fields[$i]**2;
    }
    $n++;
}
$_ /= $n for @sum, @sumsq;
my @stddev = map { sqrt( $sumsq[$_] - $sum[$_]**2 ) } 0 .. $#sum;

print join(" ", @stddev) . "\n";

#foreach my $i ( 0 .. $#sum ){
#  printf "%5s %10s %10s\n", $i, $sum[$i], $stddev[$i];
#}

__DATA__
1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8
-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5
```
Thanks. There may be a misunderstanding; actually I want to calculate along the rows:
__DATA__
1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8 -----> calculate mean and variance
-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5 ---> calculate mean and variance
Regards
First, can you use <code></code> tags to display your data? It's hard to see exactly what you're using when it's normal text ...

Ah -- along rows is even easier .. here's an example:
```
#!/usr/bin/perl -w
use strict;
my @stddev;
while( <DATA> ){
    my @fields = split ' ', $_;
    my $N = scalar(@fields);
    my $sum = 0;
    $sum += $_ for @fields;
    my $mean = $sum / $N;
    $sum = 0;
    $sum += ($_ - $mean)**2 for @fields;
    push @stddev, sqrt($sum/$N);
}
print map {"$_\n"} @stddev;

__DATA__
1.8 2.5 3.8 1.9 -3.5 -3.5 3.2 -3.9 4.2 4.5 2.8
-1.3 -0.9 -0.7 -0.4 -0.8 -3.5 -3.5 -1.6 -4.5
```
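The question asks for the mean and variance of each row, while the example above collects only the standard deviation. A minimal sketch of the same per-row approach that reports both might look like this. The helper name mean_and_variance, the in-script @lines array standing in for the file, and the regex that strips an "A)" style label are all assumptions made for illustration; the variance is the population variance (divide by N), matching the division used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns (mean, population variance) for a list of
# numbers, or an empty list if no numbers were given.
sub mean_and_variance {
    my @x = @_;
    my $n = @x;
    return unless $n;
    my $sum = 0;
    $sum += $_ for @x;
    my $mean = $sum / $n;
    my $ss = 0;
    $ss += ( $_ - $mean )**2 for @x;    # sum of squared deviations
    return ( $mean, $ss / $n );
}

# Stand-in for the lines of the input file, using the rows from the question.
my @lines = (
    "A)1.8  2.5  3.8  1.9  -3.5  -3.5  3.2  -3.9  4.2  4.5  2.8\n",
    "B)-1.3  -0.9  -0.7  -0.4  -0.8  -3.5  -3.5  -1.6  -4.5\n",
);

for my $line (@lines) {
    $line =~ s/^\s*\w+\)//;             # drop a leading "A)" label (assumed format)
    my @fields = split ' ', $line;      # whitespace-tolerant split
    next unless @fields;                # skip blank lines
    my ( $mean, $var ) = mean_and_variance(@fields);
    printf "mean=%.4f variance=%.4f\n", $mean, $var;
}
```

For the sample variance, divide by N - 1 instead of N, provided the row has at least two values.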
Re: to calculate mean and variance
by graff (Chancellor) on Jan 11, 2008 at 03:20 UTC
What davidrw said about changing the split statement is bound to be the solution. I'd also point out this part of the error message:
```
... at meanStddev.pl line 6, <FILE> line 1
```
The message tells you not only where in your script you had a problem (line 6 of the perl code: $sum[$i]+=$fields[$i];), but also which line of data in your input data file had just been read when the error occurred. That is, the initial space is in line 1 of the data.
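The input line number that appears in the warning is also available to the script as the special variable $. after each read. As a rough sketch, the following records which input line and field would trigger the warning. The in-memory input is hypothetical and stands in for file_1; looks_like_number comes from the core Scalar::Util module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical input standing in for file_1; line 1 starts with a space.
my $input = " 1.8 2.5\n-1.3 -0.9\n";
open my $fh, '<', \$input or die "Can't open in-memory file: $!";

my @bad;    # [input line number, field index] for each non-numeric field
while ( my $line = <$fh> ) {
    my @fields = split / /, $line;    # same split as the original program
    for my $i ( 0 .. $#fields ) {
        # $. holds the number of the line most recently read from $fh
        push @bad, [ $., $i ] unless looks_like_number( $fields[$i] );
    }
}
close $fh;

print "line $_->[0], field $_->[1] is not numeric\n" for @bad;
```

Here the only offending field is the empty string produced by the leading space on line 1.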
Re: to calculate mean and variance
by CountZero (Bishop) on Jan 11, 2008 at 06:26 UTC
Or if you do not want to reinvent a wheel: Statistics::Lite, Statistics::Basic or Statistics::Descriptive.

CountZero

"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Node Type: perlquestion [id://661777]
Approved by jettero