When is a 2 not a 2?

RockM66 has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: When is a 2 not a 2? by Fletch (Bishop) on Jan 30, 2008 at 20:53 UTC
Yes, you're missing the fact that IEEE floating point numbers aren't exact representations.
Update: See for instance Equality checking for strings AND numbers for links and discussion.
Another Update: After actually running it and poking with Devel::Peek, this is strange because that's showing a 2 for the NV part.
`DB<8> x Dump( $x ) SV = PVNV(0x8e98f0) at 0x800e64 REFCNT = 2 FLAGS = (PADBUSY,PADMY,NOK,POK,pIOK,pNOK,pPOK) IV = 2 NV = 2 PV = 0x435c40 "2"\0 CUR = 1 LEN = 36 empty array`
More poking: If you look at the number, the representation's different under the hood, so it's definitely a float precision issue somewhere.
`DB<11> x unpack( "h*", pack( "F", 2 ) ) 0 0000000000000004 DB<12> x unpack( "h*", pack( "F", $x ) ) 0 2000000000000004`
The cake is a lie. The cake is a lie. The cake is a lie.
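The byte-level check above can be repeated outside the debugger. What follows is a minimal sketch, assuming a perl built with ordinary 64-bit IEEE doubles and version 5.10 or later for the `>` pack modifier; `double_bits` is a hypothetical helper name, not part of any module.

```perl
use strict;
use warnings;

# Hypothetical helper: hex dump of a number's double representation,
# most significant byte first ("d>" packs a big-endian double).
sub double_bits {
    my ($n) = @_;
    return unpack "H*", pack "d>", $n;
}

# Accumulate 0.1 ten times, as the loop in the question does.
my $x = 1;
$x += 0.1 for 1 .. 10;

print double_bits(2),  "\n";    # the literal 2
print double_bits($x), "\n";    # the accumulated value
print "bit patterns differ\n" if double_bits($x) ne double_bits(2);
```

Printing the most significant byte first puts the exponent on the left and the low-order mantissa bits on the right, so a difference of a few units in the last place shows up in the final hex digit.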
Re: When is a 2 not a 2? by ikegami (Patriarch) on Jan 30, 2008 at 21:07 UTC
Change your print to
`printf "%.16e\n", $x;`
and you get
`1.0000000000000000e+000 1.1000000000000001e+000 1.2000000000000002e+000 1.3000000000000003e+000 1.4000000000000004e+000 1.5000000000000004e+000 1.6000000000000005e+000 1.7000000000000006e+000 1.8000000000000007e+000 1.9000000000000008e+000 Stop: x=2.0000000000000009e+000 No, x is not 2.`
1/10 is a periodic number in binary, just like 1/3 is a periodic number in decimal. You'll get better results if you avoid accumulating the error:
`for ( 1*10 .. 2*10 ) { my $x = $_/10; printf "%.16e\n", $x; }`
`1.0000000000000000e+000 1.1000000000000001e+000 1.2000000000000000e+000 1.3000000000000000e+000 1.3999999999999999e+000 1.5000000000000000e+000 1.6000000000000001e+000 1.7000000000000000e+000 1.8000000000000000e+000 1.8999999999999999e+000 2.0000000000000000e+000`
Basically, introduce decimal numbers as late as possible. For example, work with cents instead of dollars.
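The "cents instead of dollars" advice can be sketched as follows. This is only an illustration under the assumption that all amounts fit in integer cents; `format_cents` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: format an integer number of cents as dollars.
# The decimal point is introduced only here, at output time.
sub format_cents {
    my ($cents) = @_;
    my $sign = $cents < 0 ? "-" : "";
    $cents = abs $cents;
    return sprintf "%s%d.%02d", $sign, int($cents / 100), $cents % 100;
}

# Add ten cents a hundred times; integer addition is exact.
my $total = 0;
$total += 10 for 1 .. 100;

print format_cents($total), "\n";
print "exactly ten dollars\n" if $total == 1000;
```

Because the running total is an integer, the equality test at the end is safe, which it would not be after a hundred additions of 0.10.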
Re^2: When is a 2 not a 2? by Argel (Prior) on Jan 31, 2008 at 00:50 UTC
That's a good explanation and sound advice. So, is there any harm in doing `"$x" ne "2"`? Or would it be better to use `sprintf "%.02f", $x` to convert it?
Re^3: When is a 2 not a 2? by ikegami (Patriarch) on Jan 31, 2008 at 02:14 UTC
"So, is there any harm in doing "$x" ne "2"?"
When? To end the loop? It'll work for this loop. What about longer loops? The loop counter would accumulate more and more error, potentially skewing your numbers and potentially turning the loop into an infinite loop. Seems to me to be plain bad practice to knowingly accumulate an error.
Re: When is a 2 not a 2? by grinder (Bishop) on Jan 30, 2008 at 22:08 UTC
One way out of the problem is to use rational numbers.
`#! /usr/local/bin/perl use strict; use warnings; use Math::BigRat; my $x = Math::BigRat->new; my $delta = Math::BigRat->new('1/10'); for ($x=1; $x < 2; $x += $delta) { print "$x\n"; } print "Stop: x=$x\n"; print "No, x is not 2.\n" if $x != 2; __PRODUCES__ 1 11/10 6/5 13/10 7/5 3/2 8/5 17/10 9/5 19/10 Stop: x=2`
Oddly, I had to change the less-than-or-equal comparator to less-than. I'm not sure why this is different to your code, but then again I never use C-style for loops, so there may be some subtlety I'm overlooking. I would have written it as follows:
`my $x = Math::BigRat->new(1); while ($x < 2) { $x += $delta; print "$x\n"; }`
• another intruder with the mooring in the heart of the Perl
Re: When is a 2 not a 2? by swampyankee (Parson) on Jan 31, 2008 at 01:03 UTC
OK; it's not weird; it's normal behavior for floating point numbers. Back when I started programming (on a Univac 1108, which used 48 bit reals, and could distinguish 0 from -0), one of the first comments by the instructors was "Do not test floating point numbers for equality." It's still good advice; see Goldberg, David, "What Every Computer Scientist Should Know About Floating-Point Arithmetic", Computing Surveys, March 1991.
As a (believe it or not) relevant aside: Fortran-77 permitted floating point numbers to be used for do loop counters (== Perl's `for ($x=1; $x <= 2; $x += 0.1)`); I believe that they're being removed from the latest standard because a) there was no way they could get them to work portably (a given loop may iterate 1000 times on one platform and 1003 on another) and b) they make it difficult to parallelize the do-loop.
emc
Information about American English usage here and here. Any Northeastern US area jobs? I'm currently unemployed.
Re: When is a 2 not a 2? (eq) by tye (Sage) on Jan 31, 2008 at 02:29 UTC
My Math::BigApprox is smarter than regular floating point numbers. It knows that $x == $y iff $x eq $y so you won't have that problem when using it:
`use Math::BigApprox 'c'; my $x= c(1); while( $x < 2 ) { print "$x\n"; $x += 0.1; } print "Stop: x=$x\n"; print "Not 2\n" if $x != 2;`
and such numbers have about the same precision as floating point when using such small numbers, they don't have a big performance impact (mostly just the cost of overload.pm), and you can compute 500,000! (factorial, about 1.0228e+2632341) without overflowing nor taking all day.
...which gives me another idea for a very simple module. Math::Eq ?
- tye
Re^2: When is a 2 not a 2? (eq) by halley (Prior) on Jan 31, 2008 at 13:43 UTC
tye, you may already know of this, but I'll reiterate that anyone who is interested in implementing a "comparing floating point numbers" feature should read this article. http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm
In the olden days, people would compare using a chosen EPSILON.
`if (abs($a - $b) < $EPSILON) { ... }`
If you know about the uneven resolution of floats, you learn that EPSILON must be chosen carefully for each comparison. Better to know something about the format of IEEE floats (the most common implementation on modern computers) and some fast and flexible ways to make a suitable `AlmostEqual()` function that lets you use a tolerance that is tied to the resolution, not the decimal position of the error.
Comparing floats is a huge gotcha for newcomers or writers of quick ad-hoc code, and easy to do wrong. I would rather that high-level programming languages melt such comparison features into the language, say, with an `A =~ B` floating point operator. (Whether regex or float, that can be read as "does A bind with B.") Then, the default tolerance can be set to a suitable "about one decimal place" and overridden through a pragma or language variable. But I digress.
-- `[ e d @ h a l l e y . c c ]`
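A tolerance tied to the magnitude of the operands, rather than a fixed EPSILON, might look like the sketch below. Note that this is a plain relative-tolerance comparison, a simpler technique than the representation-based `AlmostEqual()` the linked article describes; the name `almost_equal` and the default tolerance of 1e-9 are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $x and $y agree to within a relative
# tolerance. The default of 1e-9 is an arbitrary illustrative choice.
sub almost_equal {
    my ($x, $y, $rel_tol) = @_;
    $rel_tol = 1e-9 unless defined $rel_tol;
    return 1 if $x == $y;    # exact matches, including 0 == 0
    # Scale the tolerance by the larger magnitude of the two operands.
    my $scale = abs($x) > abs($y) ? abs($x) : abs($y);
    return abs($x - $y) <= $rel_tol * $scale ? 1 : 0;
}

# Accumulate 0.1 ten times, as the loop in the question does.
my $x = 1;
$x += 0.1 for 1 .. 10;

print "x == 2 is false\n"        if $x != 2;
print "x is almost equal to 2\n" if almost_equal($x, 2);
```

A relative tolerance breaks down when one operand is exactly zero and the other is merely tiny, so code that compares against zero usually needs an additional absolute tolerance.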
Re: When is a 2 not a 2? by cdarke (Prior) on Jan 31, 2008 at 09:31 UTC
This is real (pardon the pun) interesting. I thought: they have missed the obvious, use int. Wrong:
`my $x; for ($x=1; int($x) <= 2; $x += 0.1) { print "$x\n"; } print "Stop: x=$x\n"; print "No, x is not 2.\n" if $x != 2;`
Gives:
`1 1.1 1.2 1.3 1.4 1.5 1.6 1.7 1.8 1.9 2 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 Stop: x=3 No, x is not 2.`
So then I RTFM on int. After changing the comparisons to text (the string comparison operators stringify both sides), the result is correct, but of course this is not a solution for numbers greater than 9.
Re^2: When is a 2 not a 2? by ikegami (Patriarch) on Jan 31, 2008 at 10:24 UTC
int(2.1) is 2, which is <= 2, so the loop won't end then like it should. This is a completely different bug than the OP's. You were maybe thinking something more along the lines of `int($x*10)/10 <= 2`, but that would only work with some combinations of numbers. `int($x*10+0.5)/10 <= 2` might be ok, but you'd still be needlessly accumulating error in your variable.
As already mentioned (although not in so many words), if you want to loop a certain number of times, count the number of loop passes and derive the decimal number from the loop pass number.
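Counting passes and deriving the value from the pass number can be sketched like this. The helper name `stepped_values` is hypothetical, and rounding the pass count to the nearest integer is one possible design choice rather than anything prescribed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: values from $start to $end in steps of $step,
# each computed from an integer pass number instead of a running sum.
sub stepped_values {
    my ($start, $end, $step) = @_;
    # Round the pass count to the nearest integer so that a tiny
    # representation error in the division cannot add or drop a pass.
    my $passes = int( ($end - $start) / $step + 0.5 );
    return map { $start + $_ * $step } 0 .. $passes;
}

my @values = stepped_values(1, 2, 0.1);
print "$_\n" for @values;
print "passes: ", scalar(@values), "\n";
```

Each value still carries its own small rounding error, but the error no longer grows from one pass to the next, and the number of passes is fixed before the loop starts. Dividing an integer counter by 10, as in the earlier `$_/10` example, is more accurate still when the step is a simple fraction.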
Re: When is a 2 not a 2? by Anonymous Monk on Jan 31, 2008 at 10:29 UTC
I tried this script and realized that this is a serious error. I modified your script, and it seems to me that float == integer does not work properly.
`my $x; for ($x=1.0; $x <= 3.0; $x += 0.2) { print "$x\n"; } print "Stop: x=$x\n"; print "No, x is not 3.\n" if $x != 3.0;`
This is perl, v5.10.0 built for MSWin32-x86-multi-thread (with 1 registered patch, see perl -V for more detail) Binary build 1001 283495 provided by ActiveState http://www.ActiveState.com
Re^2: When is a 2 not a 2? by ikegami (Patriarch) on Jan 31, 2008 at 10:40 UTC
It has nothing to do with floats being compared to integers. In fact, you're comparing floats with floats.
`>perl -e"use Devel::Peek; Dump(3.0)" SV = NV(0x184968c) at 0x225318 <-- NV = float REFCNT = 1 FLAGS = (PADBUSY,PADTMP,NOK,READONLY,pNOK) NV = 3`
Also, it has nothing to do with Perl or Windows. Were you to write the same thing in C or BASIC, you'd get the same results.
`#include <stdio.h> int main() { double x; for (x=1.0; x <= 3.0; x += 0.2) { printf("%lf\n", x); } printf("Stop: x=%lf\n", x); if (x != 3.0) { printf("No, x is not 3.\n"); } return 0; }`
`1.000000 1.200000 1.400000 1.600000 1.800000 2.000000 2.200000 2.400000 2.600000 2.800000 Stop: x=3.000000 No, x is not 3.`
2/10 is also a periodic number in binary. It would take an infinite amount of storage to store it and an infinite amount of time to process it. There is some precision loss. If you were to print the entire number instead of rounding it (which numerical stringification does), you'd find that you're comparing 3.0000000000000004 to 3.0.
Update: Added C code.
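The difference between the rounded and the full printout can be seen directly in Perl. This is a small sketch, assuming a perl built with ordinary 64-bit IEEE doubles; the `%.17g` format is used because 17 significant digits are enough to tell any two distinct doubles apart.

```perl
use strict;
use warnings;

# Accumulate 0.2 ten times, as the modified loop does.
my $x = 1.0;
$x += 0.2 for 1 .. 10;

print  "default: $x\n";           # stringification rounds to about 15 digits
printf "%%.17g:   %.17g\n", $x;   # 17 significant digits expose the error
print  "x differs from 3\n" if $x != 3.0;
```

The default stringification hides the error, which is why the printed loop counter looks like an exact 3 while the numeric comparison still fails.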