oko1 has asked for the
wisdom of the Perl Monks concerning the following question:
Hi, all: I'm getting a disturbinglylarge discrepancy in running an identical script on different machines, and hoping that someone can suggest a cause and perhaps a fix. I'm suspecting a possible bug in my perl version.
Code being executed (@ARGV = qw/data.txt 636/):
#!/usr/bin/perl w
use strict;
# Least square fit to quadratic function; input up to ~1k data pairs a
+s spaceseparated columns.
#
# $Revision: 1.1 $ $Date: 20100628 17:31:2304 $
# $Source: /home/vogelke/notebook/2010/0628/polynomialfit/RCS/pfit,v
+$
# $UUID: 631e046df07038bbb287bdd10b1d0efa $
#
# John Lapeyre Tue Jan 28 18:45:19 MST 2003
# lapeyre@physics.arizona.edu
# http://www.johnlapeyre.com/computers.html
# http://www.johnlapeyre.com/pfit
#
# $Revision: 1.2 $ $Date: 20101004 17:08:4005 $
# Revised by Ben Okopnik
# Added usage message, code strictures, input filtering/validation
#
# fit to A x^2 + B x + C
# Works by solving matrix equation M.c = v.
# c = (A,B,C)
# v = (v1,v2,v3), where vn = \sum_i y_i x_i^(n1)
# m = ( (m4,m3,m2), (m3,m2,m1), (m2,m1,m0))
# where mn = \sum_i x_i^n
die "Usage: ", $0 =~ /([^\/]+)$/, " <input_file> <target_number>\n"
unless @ARGV == 2;
my $tgt = pop;
my ($m0, $v1, $v2, $v3, $m1, $m2, $m3, $m4, $d, @x, @y) = 0;
while (<>) {
next unless /^\s*(\d+)(?:$\s+(\d+)\s*$)/;
push @y, $+;
push @x, defined $2 ? $1 : $m0;
$m0++;
}
die "Badlyformatted data (one or two numerical columns required)\n"
unless $m0;
for (0 .. $#x) {
my $x = $x[$_];
my $y = $y[$_];
$v1 += $y * $x**2;
$v2 += $y * $x;
$v3 += $y;
$m1 += $x;
$m2 += $x * $x;
$m3 += $x**3;
$m4 += $x**4;
}
# Used mathematica to invert the matrix and translated to perl
$d = $m2**3 + $m0 * $m3**2 + $m1**2 * $m4  $m2 * (2 * $m1 * $m3 + $m0
+ * $m4);
my $A = ($m1**2 * $v1  $m0 * $m2 * $v1 + $m0 * $m3 * $v2 + $m2**2 * $
+v3 
$m1 * ($m2 * $v2 + $m3 * $v3)) / $d;
my $B = (($m1 * $m2 * $v1) + $m0 * $m3 * $v1 + $m2**2 * $v2  $m0 * $
+m4 * $v2 
$m2 * $m3 * $v3 + $m1 * $m4 * $v3) / $d;
my $C = ($m2**2 * $v1  $m1 * $m3 * $v1 + $m1 * $m4 * $v2 + $m3**2 * $
+v3 
$m2 * ($m3 * $v2 + $m4 * $v3)) / $d;
# A x^2 + B x + C
printf "%f\t%f\t%f\n", $A, $B, $C;
# Correct $C for final value
$C = $tgt;
# PE solver
my $y1 = ($B + sqrt($B**2  4 * $A * $C)) / (2 * $A);
my $y2 = ($B  sqrt($B**2  4 * $A * $C)) / (2 * $A);
printf "Roots: %.3f or %.3f\n\n", $y1, $y2;
The data being processed (CO2 measurements at Mauna Loa):
1962 318
1982 341
2002 373
The results:
# i686 running Linux with perl5.10.0
0.011256 43.244877 41833.442623
Roots: 2093.870 or 1747.905
# IBM x3400 running Linux with perl5.10.1
0.011250 43.220086 41809.478563
Roots: 2083.912 or 1757.866
# Sparc running Solaris10 with perl5.10.1
0.011250 43.220086 41809.478563
Roots: 2083.912 or 1757.866
# After adding "use bignum;" on either of the last two
0.011250 43.220000 41809.395000
Roots: 2083.912 or 1757.866
Any help or suggestions would be appreciated.

"Language shapes the way we think, and determines what we can think about."
 B. L. Whorf
Re: Calculation discrepancy between Perl versions by ig (Vicar) on Oct 05, 2010 at 05:22 UTC 
I suspect the differences result from differences in how floating point is handled on the different systems affecting the cumulated error from the sequence of calculations.
You could try reimplementing using Math::BigFloat and setting a sufficiently large (or is that small?) precision so that the cumulated error is sufficiently small.
If my guess is correct, you might find What Every Computer Scientist Should Know About FloatingPoint Arithmetic helpful.
 [reply] 
Re: Calculation discrepancy between Perl versions by GrandFather (Sage) on Oct 05, 2010 at 06:34 UTC 
Following on from ig's comment a couple of alarm bell stimulating items in relation to your code, data and results should be mentioned:
 sums and differences of floats can easily lead to large errors if not managed correctly, especially in the context of exponentiation
 output values with more significant digits than seem reasonable given the input data is a cause for concern
This sort of code can be very tricky to get right with careful attention needing to be paid to expected ranges of values at each step and consideration given to changing calculation order to minimise errors propagating through the calculation chain. Using higher precision numerical representation is a quick way around what can otherwise often be a difficult problem.
True laziness is hard work
 [reply] 
Re: Calculation discrepancy between Perl versions by Anonymous Monk on Oct 05, 2010 at 11:51 UTC 
An alternative to running with more significant digits, is to rewrite your code to make use of Number::WithError. This module knows how to propagate error through most common kinds of operations. This module will work with BigFloat numbers as well.
 [reply] 
Re: Calculation discrepancy between Perl versions by BrowserUk (Pope) on Oct 05, 2010 at 13:22 UTC 
FWIW: I get the same results on I686 Windows, as you get on I686 linux, which suggests to me that the difference if down to differences in the floating point hardware rather either the way Perl is built, or the underlying CRT math functions.
I thought for a while that it might be down to the IEEE rounding mode is use, but I tried it with all 4 modes and whilst the results do vary, the differences are far less than you are seeing:
C:\test>junk54 MaunaLoa 636 1
0.011256 43.244877 41833.442623
Roots: 2093.870 or 1747.905
C:\test>junk54 MaunaLoa 636 2
0.011256 43.245005 41833.967213
Roots: 2093.804 or 1747.982
C:\test>junk54 MaunaLoa 636 3
0.011256 43.245005 41833.442623
Roots: 2093.939 or 1747.847
C:\test>junk54 MaunaLoa 636 4
0.011256 43.245005 41833.967213
Roots: 2093.804 or 1747.982
If I add use bignum; I get the same result regardless of rounding mode (as you'd expect with infinite precision):
C:\test>junk54 MaunaLoa 636 0
0.011250 43.220000 41809.395000
Roots: 2093.969 or 1747.809
But the result is still not far from the standard double precision results, and far away from your results on those other platforms.
So then I decided to "check the math".
I fed your sample data into wolfram/alpha's quadratic fit and got a = 0.01125 b=43.33 c=41809.39500009608 (your results are close enough!)
I then walked manually through the calculation:
The small results are done using calc.exe, the extended numbers in brackets are produced using wolfram alpha.
All of which suggests that your I686 results are correct, and the "bug" is on the other platforms.
Perhaps they're using single precision? Or at least, perhaps lookup tables for sqrt() & pow() that are only single precision?
Examine what is said, not who speaks  Silence betokens consent  Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
 [reply] [d/l] [select] 

Thanks, all, for the very useful responses (I'm responding to BrowserUk's reply specifically because it's so detailed and helpful in so many aspects)  this was very useful for both confirmation and more direction in deciding where to look for the error. I'm not all that familiar with the guts of Perl, but it's looking likely that there's a major compilation difference responsible here: either singleprecision lookup tables, or perhaps a radically different math lib. Annoying that something like that could affect a Perl program... but on the other hand, good to learn that it can.
Again, thank you all very much.

"Language shapes the way we think, and determines what we can think about."
 B. L. Whorf
 [reply] 

either singleprecision lookup tables, or perhaps a radically different math lib. Annoying that something like that could affect a Perl program...
I'm very unsure of my ground here as I'm not familiar with the other platforms, but it might not be softwarePerl or the libsbut the floating point hardware. If their FPU's are only single precision, that might account for the results.
That said, my best efforts to perform the calculation in single precision doesn't get close to the inaccuracies you;re seeing:
#include <stdio.h>
float sqrtf( float n ) {
float g = n / (float)2.0;
while( ( ( g*g )  n ) > 0.000001 ) {
g = ( g + n/g ) / (float)2.0;
}
return g;
}
void main( void ) {
float a = (float)0.01125, b = (float)43.22, c = (float)41173.3950
+0009608;
float two = 2.0, four = 4.0;
float b2 = b * b;
float ac4 = four * a * c;
float sqrtbit = sqrtf( b2  ac4 );
float a2 = two * a;
float r1 = ( b + sqrtbit ) / a2;
float r2 = ( b  sqrtbit ) / a2;
printf( "b2:%f ac4:%f sqrt:%f a2:%f\n", b2, ac4, sqrtbit, a2 );
printf( "roots: %f , %f\n", r1, r2 );
}
Produces: C:\test>quads
b2:1867.968506 ac4:1852.802856 sqrt:3.894310 a2:0.022500
roots: 2093.969238 , 1747.808472
Examine what is said, not who speaks  Silence betokens consent  Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
 [reply] [d/l] [select] 
Re: Calculation discrepancy between Perl versions by JavaFan (Canon) on Oct 05, 2010 at 13:26 UTC 
It looks to me all of your @x, @y, $v?s, $m?s and $d are integers. $A, $B, and $C are all integers divided by the same number ($d). You should be able to rewrite your calculations so you factor out dividing by $d  if I'm not mistaken, skipping the three divisions by $d, and replacing the adjustment of $C by doing $C = $tgt * $d; should do this.
This reduces the amount of floating point calculations, and hence the rounding errors. Of course, there still may be floating point calculations if any of intermediate integers becomes "too large". (Print out $A, $B, $C and $d to make sure).  [reply] [d/l] 
Re: Calculation discrepancy between Perl versions by sundialsvc4 (Abbot) on Oct 06, 2010 at 17:50 UTC 
The adage that I heard was: “Floating point numbers like piles of dirt on a beach. Every time you pick one up and move it around, you lose a little dirt and you pick up a little sand.”
Every implementation ought to produce the same answer, within the useful number of significantdigits, for most calculations. But, the more calculations you do (and depending on exactly how you do it), the more the results will “drift” toward utter nonsense.
And I truly think that you should expect this from any binary floatingpoint implementation. There are two classic ways that applications (such as, accounting applications in particular) counter this:

BinaryCoded Decimal (BCD): The calculations truly are performed in decimal mode, using 4 bits per decimal integer. (COBOL turned this into a science.)

Scaled Integers: The calculations are performed using integer arithmetic, and the result is understood to be “multiplied by (say...) 10,000,” giving you a fixed precision of (say...) 4 digits to the right of the decimal. (Microsoft Access uses this strategy in their “Currency” datatype.)
Even so, errors can accumulate. This can be further addressed by algorithms such as “banker’s rounding.” There is, of course, the (probably apocryphal) tale of an intrepid computerprogrammer who found a way to scoop all of those minuscule roundingerrors into his own bank account...
Floatbinary can never be a “pure” data representation. It is wellunderstood that the fraction 1/3 cannot be precisely expressed as a decimal number. Similar artifacts occur for other fractions in other bases, and, so they tell me, for base2 floats, one of those unfortunate numbers is 1/10. (“So I have been told.” I don’t have enough geekknowledge to actually know for sure...)
 [reply] 

