Perl: the Markov chain saw  
PerlMonks 
Fun with Tiny Numbersby Porculus (Hermit) 
on Aug 26, 2010 at 00:59 UTC ( #857346=perlmeditation: print w/replies, xml )  Need Help?? 
We've all seen dozens of threads where someone expected two floatingpoint numbers calculated in different ways to be equal, and was shocked to find that they weren't. They generally assume there's a bug in the language. Here's something a little different. Consider a very small number: 1.234567e302. This is tiny, but can easily be approximated within the precision of the 64bit type Perl uses for numbers on i686 and x86_64.
Now let's make it just a little larger. Adding a digit to the end will have that effect: 1.2345678e302.
Wait, what? Perhaps it's just a formatting issue.
Okay, perhaps we're trying to produce a number Perl can't actually represent?
Nope, looks like this value can be approximated by a scalar just fine, when it's produced by a C function. I actually ran into this at work while I was fuzzing some numerical code. At the time, I was most interested in finding a workaround, so the strtod approach got me round the problem  but the oddness intrigued me, so I decided to try and figure it out. I quickly realised it had to be a parsing issue. After all, the problem is triggered by the number of decimal places, not the digits involved nor the size of the exponent  even adding a trailing zero can affect the way a literal is parsed:
I was about to jump into the Perl source code when I paused for a little googling, and found a nice article by brian_d_foy explaining how Perl parses scientific notation, which goes through the relevant functions line by line. And it's easy enough to identify where everything's going pearshaped. It is, of course, a floatingpoint rounding error  but this one is indirect. brian even identifies the code in question as a place where precision can be lost. It turns out that perl essentially parses 5.5000000000e298 by taking the two sides of the decimal point separately, and calculating 5 / 1e298 + 5,000,000,000 / 1e308. When we add the extra zero, Perl tries to calculate 50,000,000,000 / 1e309 instead  but that exponent does exceed the floatingpoint precision, the number overflows to infinity, and we end up with 5e298 + 0. So this might actually count as a bug in perl for once, though it's probably a known consequence of the algorithm and I certainly don't expect anyone to scramble to patch it. Either way, it's still interesting: unlike most floatingpoint gotchas it's not something inherent to working with inexact numbers, since other implementations of atof can and do parse these literals as expected  it's just Perl's that doesn't. Bonus observation: the above results were acquired on x86_64. Anyone who tries the examples above on a 32bit x86 platform may have found that the problem does not manifest itself there  unless you build Perl yourself with Doptimize=g, in which case the problem suddenly appears. I guess 32bit gcc is managing to perform the whole calculation in 80bit registers, which can handle e309 just fine, and debug compilation forces it to store intermediate values in 64bit variables.
Back to
Meditations

