spikeheap has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to carry out a simple comparison of two numbers in Perl, but I have encountered some interesting results. If someone could point me to a foolish mistake somewhere I would be more than grateful!
I understand that eq carries out a string comparison, whereas == carries out a numerical comparison. However, when I compare numbers after a simple arithmetic calculation, == returns false when I would expect true, while eq correctly returns true.
$num1 = 523.20;
$num2 = 23.2;
$eq = 500;
print ("$num1, $num2, ($eq, ". ($num1 - $num2) ."), ". ($num1 - $num2 eq $eq) .", ". (($num1 - $num2) == $eq). ".\n");
prints
523.2, 23.2, (500, 500), 1, .
I have come to the conclusion that this is because of floating point arithmetic (correct me if I'm wrong), as printing $num1 - $num2 yields a value of 1.13686837721616e-13.
My confusion comes because if I alter the above script from 23.2 to 0.2, the result for == is true (correctly), and I can't understand where the difference lies that should cause == to deviate in such a manner.
Of course the ultimate question is: is there something I'm doing (or not doing) which can account for this? Is there a better way of doing this simple comparison which will yield consistent results?
Re: eq vs ==
by LanX (Sage) on Nov 10, 2009 at 12:06 UTC

The binary floating point representation produces this asymmetry. It's not only the digits you see after the decimal point that determine how the binary mantissa is calculated; the digits before the point also go into the mantissa.
UPDATE:
From 23.2 to 0.2 you already skipped 2 decimal steps in exponentiation.
These precision errors resulting from base transformations are periodic: if you shift the exponent far enough you get the same mantissa at the lower end, so that the difference is again 0 at the "tail".
EXAMPLE:
0.2 = 1/5 can't be represented as a binary fraction without infinite repetition, the same way you can hardly represent 1/7 in a decimal fraction. (NOTE: 10 and 7 are coprime, like 2 and 5 are!)
Now try to calculate (10 + 1/7) - 1/7 in base ten, but with a finite mantissa!
You'll get
10 + 1/7 = 1.01428571428571 * 10^1
1/7 = 1.42857142857143 * 10^-1
Clearly, the lower ends of the mantissa to the base ten are not symmetric ...
... but to the base 7 both numbers would be highly symmetric again, no matter what you add before the point. Hence no error!
13.1 - 0.1 (base 7)
Hope this calculation helps to visualize what happens with "obvious calculations" when switching bases!
PS: Further reading: Humans have too many fingers!
Re: eq vs ==
by moritz (Cardinal) on Nov 10, 2009 at 12:35 UTC

I think the difference that confuses you is that some numbers can be represented as floating point values without any error, while others can't. For example 0.1 is an infinite fraction in the binary system.
So while you can safely compare 0.5 * 2.0 == 1.0, you can't safely compare 0.1 * 3.0 == 0.3 (the former is true, the latter false).
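A minimal sketch of why eq and == disagree on the numbers from the question, assuming a Perl built with ordinary 64-bit IEEE 754 doubles. When a number is turned into a string (for print or eq), Perl keeps roughly 15 significant digits, so a tiny error in the last bits is rounded away; == compares the stored binary values directly. The variable $diff is introduced here for illustration only.

```perl
use strict;
use warnings;

my $num1 = 523.20;
my $num2 = 23.2;
my $diff = $num1 - $num2;

# Default stringification (what print and eq see) keeps about 15 significant digits.
print "as string:  $diff\n";

# Asking for 17 significant digits exposes the value that is actually stored.
printf "17 digits:  %.17g\n", $diff;

# eq compares the rounded strings; == compares the stored doubles.
print "eq 500: ", ($diff eq 500 ? "true" : "false"), "\n";
print "== 500: ", ($diff == 500 ? "true" : "false"), "\n";
```

On a build with a wider floating point type (for example long doubles) the digits shown may differ.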
The correct way to compare floating point numbers is
if (abs($x - $y) < $epsilon) {...}
Where $epsilon is a small number.
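The comparison above could be wrapped in a small helper, sketched here under assumptions: the name nearly_equal and the default tolerance of 1e-9 are hypothetical, and a fixed absolute tolerance only makes sense when you know the rough magnitude of your values.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $x and $y differ by less than $epsilon.
sub nearly_equal {
    my ($x, $y, $epsilon) = @_;
    $epsilon = 1e-9 unless defined $epsilon;   # assumed default tolerance
    return abs($x - $y) < $epsilon ? 1 : 0;
}

print "close enough\n" if nearly_equal(523.20 - 23.2, 500);
print "not equal\n" unless nearly_equal(1.0, 1.1);
```

For values that span many orders of magnitude, a tolerance scaled to the size of the operands is usually the safer choice.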
Perl 6 - links to (nearly) everything that is Perl 6.

The correct way to compare floating point numbers is ...
Moritz, in general that's problematic.
The calculation error doubles with every calculation step. No matter how big epsilon is, you can construct a counterexample where your rule breaks.
Consequently you have to limit the number of calculations before you can tell which epsilon is safe!
IMHO that's a ridiculously big effort compared to the usual option of staying with integers right away by multiplying the fractions away.
523.2 - 23.2
=>
5232 - 232 = 5000
=> 500.0
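A rough sketch of that integer approach, assuming every input has at most one decimal place so that a scale factor of 10 is enough. The helper name to_scaled_int and the variable $scale are made up for this illustration.

```perl
use strict;
use warnings;

# Assumption: inputs have at most one decimal place, so scaling by 10 suffices.
my $scale = 10;

# Hypothetical helper: scale, then round to the nearest integer.
# sprintf("%.0f") rounds; int() would truncate and could turn 231.999... into 231.
sub to_scaled_int {
    my ($value) = @_;
    return 0 + sprintf("%.0f", $value * $scale);
}

my $diff_scaled = to_scaled_int(523.2) - to_scaled_int(23.2);   # 5232 - 232

print "equal\n" if $diff_scaled == 500 * $scale;
print "difference: ", $diff_scaled / $scale, "\n";
```

The subtraction and the comparison happen on exact integers; the only floating point step left is the final division for display.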

The calculation error doubles with every calculation step. No matter how big epsilon is, you can construct a counterexample where your rule breaks.
The error only doubles with every step if your algorithm is numerically badly conditioned.
For usual non-numeric calculations such errors tend to stay very small, unless you do things like subtracting two numbers of nearly the same size and dividing by the result.
If you do numerics, you should inform yourself in more detail about floating point arithmetic and error propagation (a standard subject in applied mathematics).
IMHO that's a ridiculously big effort compared to the usual option of staying with integers right away by multiplying the fractions away.
That might be true for literal constants, but when you read several numbers from user input, it very quickly gets tedious to track the smallest possible multiplier, and it outright stops working if you need to divide by a user-supplied number.
Another solution is to use a data type which stores numerator and denominator separately as integers, as Perl 6 does.
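In Perl 5, something similar can be sketched with the core module Math::BigRat, which keeps numerator and denominator as arbitrary-precision integers. This is a stand-in for the Perl 6 rationals mentioned above, not a description of how Perl 6 implements them. Note that the inputs are passed as strings so they never pass through a binary float.

```perl
use strict;
use warnings;
use Math::BigRat;

# Build exact rationals from decimal strings (523.2 becomes 2616/5).
my $num1 = Math::BigRat->new('523.2');
my $num2 = Math::BigRat->new('23.2');

# Overloaded arithmetic stays exact.
my $diff = $num1 - $num2;

print "difference: $diff\n";
print "equal\n" if $diff == 500;
```

The price is speed: rational arithmetic on big integers is much slower than native doubles, so it suits bookkeeping-style code better than heavy numerics.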

Would you care to explain how you would do that, in the general case?
For example in one of my programs I have a few matrices in which I put parameters, solve some equation systems and in the end I calculate a trace over some of these matrices, and as a result I get another matrix. In that final matrix the sum over all rows and all columns should be equal, modulo numeric errors.
How would I compare (or even compute) these sums without using or comparing floating point numbers?

By not using floats, or by not comparing them with ==.
(Or by using abs($x - $y) < $epsilon as you have already shown, which is not an == comparison anyway.)
I know my position is a bit strong, but using == on floats is going to cause trouble that depends on the values, host, compiler, libc and so on, and is very hard to predict and replicate.
Maybe a nice gift would be to have a warning?
Update: oops! I've replied to my node instead of moritz's 806454.
Re: eq vs ==
by gmargo (Hermit) on Nov 10, 2009 at 14:12 UTC

What you need is a set of comparison functions that implement a "close enough" concept of equality,
like moritz says above, and then use them consistently.
Here are a few routines I wrote ages ago.
Modify the cmp_eq routine for your desired level of "close enough".
#
# Greater Than ($a > $b)
#
sub cmp_gt
{
    my ($a, $b) = @_;
    return 1 if !cmp_eq($a,$b) && $a > $b;
    return 0;
}
#
# Less Than ($a < $b)
#
sub cmp_lt
{
    my ($a, $b) = @_;
    return 1 if !cmp_eq($a,$b) && $a < $b;
    return 0;
}
#
# Equal To ($a == $b)
#
sub cmp_eq
{
    my ($a, $b) = @_;
    return 1 if $a == $b; # Short circuit if rounding is not a problem
    # One-thousandth of 1 percent of the bigger value's magnitude (10^-5).
    my ($ma,$mb) = (abs($a),abs($b));
    my $piddly = 0.00001 * ($ma > $mb ? $ma : $mb);
    return 1 if abs($a - $b) <= $piddly;
    return 0;
}
#
# Greater Than or Equal To ($a >= $b)
#
sub cmp_gteq
{
    my ($a, $b) = @_;
    return cmp_gt($a,$b) || cmp_eq($a,$b);
}
#
# Less Than or Equal To ($a <= $b)
#
sub cmp_lteq
{
    my ($a, $b) = @_;
    return cmp_lt($a,$b) || cmp_eq($a,$b);
}
Re: eq vs ==
by AnomalousMonk (Bishop) on Nov 10, 2009 at 20:07 UTC

Re: eq vs ==
by spikeheap (Novice) on Nov 10, 2009 at 17:03 UTC

Thanks everyone for the quick and complete replies. The binary fractions problem had not even occurred to me (I come from a Java background), but at least now I understand the irregularity of the errors and why I have not seen it before.
It makes sense that different users have differing opinions on what "acceptable precision" is, hence why there isn't a standard equality operator.
Thanks again

Please note that this is not a Perl problem; all languages suffer from these same problems. See Java double traps as an example.

This is a language implementation problem. Some languages are lucky enough in their specifications to be able to handle infinitely long numbers and use ratios rather than floats.

I'd forgotten about the need to flame on forums. Thanks for reminding me, and I'll be sure to make an effort to cut someone else down when I get the chance.


