wickedjester has asked for the
wisdom of the Perl Monks concerning the following question:
Hello!
I've got an array with 40 elements with each element having a value of '0.001'. If I add them all together and divide by 40 to get the average, I get something like:
$Avg = 0.001025
rather than 0.001, which is what it should really return.
Now, the script I'm writing is a chemical diffusion model dealing with very small numbers and this kind of inaccuracy is causing me problems. If this is a floating point issue, can anyone give me a recommendation on how to deal with this?
Many thanks!
Re: Is this odd behavior a floating point problem? by Anonymous Monk on Mar 23, 2012 at 17:02 UTC 

Yes, I've seen and skimmed this document but, not to be rude, I'm not interested in becoming a computer scientist in order to write a script to do basic math. Adding together 0.001 40 times is pretty basic, and if my calculator can do it, I'm not sure I understand why Perl won't.

It's not a Perl problem, it's the problem of representing a nonterminating series for base 2 using a finite number of binary digits. You are accustomed to seeing it in base ten when you try to represent 1/3rd, yet I hear no complaints that your ten counting fingers are malfunctioning. Your calculator gets it right by rounding to the eight or ten significant digits that you see on its little LCD display. In other words, it really doesn't get it right; it just covers up the ugliness. And, in fact, I see a nearly identical question every day in reference to C, C++, PHP, and myriad other programming languages over on StackOverflow. It's not a problem unique to Perl.
I understand that the "What every computer scientist should know..." article is a little beyond what someone who just wants to get the job done might want to digest. That's fine, the article goes into painful details. Try this response (shameless plug, I wrote it), which tries to spell it out in less technical terms: Re: shocking imprecision.
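For anyone who would rather see this than take it on faith, here is a minimal sketch. The helper name show_exact is made up for illustration; it only asks sprintf for more decimal places than print normally shows.

```perl
use strict;
use warnings;

# Hypothetical helper: format a number with 25 decimal places so that
# the digits normally hidden by print become visible.
sub show_exact {
    my ($value) = @_;
    return sprintf '%.25f', $value;
}

print show_exact(0.001), "\n";   # close to 0.001, but with non-zero trailing digits
print show_exact(0.5),   "\n";   # 0.5 is a power of two, so it is stored exactly
print 0.001, "\n";               # prints 0.001: plain print rounds for display
```

The stored value of 0.001 is the nearest binary fraction, not 0.001 itself, and plain print hides the difference in much the same way a calculator display does.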

Yes, I've seen and skimmed this document but, not to be rude, I'm not interested in becoming a computer scientist in order to write a script to do basic math. Adding together 0.001 40 times is pretty basic, and if my calculator can do it, I'm not sure I understand why Perl won't.
:) Try searching for site:perlmonks.org What Every Computer Scientist Should Know About Floating-Point Arithmetic and you can learn from others who weren't satisfied with that document.
Now you say you're studying chemical diffusion so I assume you would have heard of significant figures? Surely your professor, when discussing significant figures, would have explained the basic limitations of adding machines (calculators/computers)?
I was hoping, after reading that document, you would ask explicitly how to round numbers for display purposes in perl.
While you can create a calculator using perl (like Tk::Calculator::RPN::HP ), and expect it to do rounding like your pocket calculator, perl itself, not being a calculator, won't hide the details of floating point arithmetic from you, so it is good knowledge to have.
Any scientist using computers for calculations needs to know the limits of his tools.
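As a rough illustration of rounding for display only, here is a minimal sketch using sprintf. The helper name to_sig_figs and the choice of six significant figures are assumptions for the example, not something from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper: round a value to $digits significant figures
# for display. The number itself keeps its full stored precision.
sub to_sig_figs {
    my ($value, $digits) = @_;
    return sprintf '%.*g', $digits, $value;
}

my $sum = 0;
$sum += 0.001 for 1 .. 40;       # exactly 40 additions
my $avg = $sum / 40;

printf "%.20f\n", $avg;              # full stored value, noise in the last digits
print to_sig_figs($avg, 6), "\n";    # prints 0.001
```

Keep the unrounded value for further arithmetic and round only when printing; rounding intermediate results throws away precision you may need later.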


say scalar(@arr); # 40
say $#arr; # 39
Re: Is this odd behavior a floating point problem? by Eliya (Vicar) on Mar 23, 2012 at 17:14 UTC 
For reasons described in excruciating detail in the document already cited, there will be errors, but according to a quick test on my system, they are nowhere near as large as you claim:
$ perl -e '$x=0.001; $sum += $x for 1..40; printf "%.20f", $sum/40'
0.00100000000000000067
Looks more like a "one off" error to me (i.e. summing over one more than you divide by):
$ perl -e '$x=0.001; $sum += $x for 0..40; printf "%.20f", $sum/40'
0.00102500000000000074

:D Looks like two separate off-by-one errors (OBOEs) to me :)
First you start with nonzero and add 40 times (one too many), then you start with nonzero and add 41 times (one too many twice).
If you start with nonzero you need to add only 39 times, or start with zero and add 40 times :)
In short
perl -MData::Dump -e " @f = map { 0.001 } 1 .. 40; dd\@f; $o = 0; for(@f){ dd $o+=$_; } dd int @f; dd $o/int(@f); "
perl -MData::Dump -e " $o = 0; for(1 .. 40){ dd $o+= 0.001; } dd $o/40; "
It didn't dawn on me to check wickedjester's (or your) math until ww raised the question.

$ perl -le '$sum += 1 for 1..40; print $sum'
40
So where is the problem? I think you overlooked that $sum is initially undef/zero.
Re: Is this odd behavior a floating point problem? by roboticus (Canon) on Mar 23, 2012 at 17:25 UTC 
wickedjester:
You don't show your code, but I'm pretty sure you're not doing what you think you're doing. Specifically, I believe you're adding 41 copies of 0.001, as that's the only way I can reproduce your results:
$ cat t.pl
#!/usr/bin/perl
my @a = (0.001) x 41;
my $sum=0;
$sum += $_ for @a;
print "Avg: ", $sum/40, "\n";
$ perl t.pl
Avg: 0.001025
...roboticus
When your only tool is a hammer, all problems look like your thumb.
Re: Is this odd behavior a floating point problem? by toolic (Chancellor) on Mar 23, 2012 at 17:29 UTC 
 sprintf
 perlfaq4 Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)?
Re: Is this odd behavior a floating point problem? by Khen1950fx (Canon) on Mar 23, 2012 at 19:18 UTC 
As I see it, you're performing addition, division, and averaging. You can dispense with the explicit addition and division and,
since you have an array, just take the average of its elements.
#!/usr/bin/perl -l
use strict;
use warnings;
use Array::Average;
print average(
0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001,
0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001,
0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001,
0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001, 0.001,
);
Returns: 0.001

if (@data) {
    my $sum=0;
    $sum+=$_ foreach @data;
    return $sum/scalar(@data);
} else {
    return undef;
}
Anyhow, as has already been pointed out, the OP's problem likely has nothing whatsoever to do with those general floating point issues, but is presumably simply the result of having computed the sum incorrectly.
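One way to rule out that kind of counting mistake is to take the divisor from the list itself rather than hard-coding 40. A minimal sketch, assuming a hypothetical helper called mean (not from any module mentioned in this thread):

```perl
use strict;
use warnings;

# Hypothetical helper: arithmetic mean of a list, or undef for an
# empty list. The divisor comes from the list itself, so it cannot
# get out of step with the number of values that were summed.
sub mean {
    my @values = @_;
    return undef unless @values;
    my $sum = 0;
    $sum += $_ for @values;
    return $sum / @values;       # @values in numeric context is the element count
}

my @conc = (0.001) x 40;         # 40 elements, each 0.001
printf "%.6f\n", mean(@conc);    # prints 0.001000
```

With 41 elements this still returns roughly 0.001, because the count and the sum always come from the same list; the 0.001025 result only appears when 41 values are summed and then divided by a hand-written 40.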

