Beefy Boxes and Bandwidth Generously Provided by pair Networks
more useful options
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

We've all seen dozens of threads where someone expected two floating-point numbers calculated in different ways to be equal, and was shocked to find that they weren't. They generally assume there's a bug in the language.

Here's something a little different.

Consider a very small number: 1.234567e-302. This is tiny, but can easily be approximated within the precision of the 64-bit type Perl uses for numbers on i686 and x86_64.

$ perl -E 'say 1.234567e-302' 1.234567e-302

Now let's make it just a little larger. Adding a digit to the end will have that effect: 1.2345678e-302.

$ perl -E 'say 1.2345678e-302' 1e-302

Wait, what?

Perhaps it's just a formatting issue.

$ perl -E 'say "help" if 1.2345678e-302 == 1e-302' help $ perl -E 'say "what" if 1.2345678e-302 < 1.234567e-302' what

Okay, perhaps we're trying to produce a number Perl can't actually represent?

$ perl -MPOSIX -E 'say scalar strtod "1.2345678e-302"' 1.2345678e-302

Nope, looks like this value can be approximated by a scalar just fine, when it's produced by a C function.

I actually ran into this at work while I was fuzzing some numerical code. At the time, I was most interested in finding a workaround, so the strtod approach got me round the problem -- but the oddness intrigued me, so I decided to try and figure it out.

I quickly realised it had to be a parsing issue. After all, the problem is triggered by the number of decimal places, not the digits involved nor the size of the exponent -- even adding a trailing zero can affect the way a literal is parsed:

$ perl -E 'say 5.5000000000e-298; say 5.50000000000e-298' 5.5e-298 5e-298

I was about to jump into the Perl source code when I paused for a little googling, and found a nice article by brian_d_foy explaining how Perl parses scientific notation, which goes through the relevant functions line by line. And it's easy enough to identify where everything's going pear-shaped. It is, of course, a floating-point rounding error -- but this one is indirect. brian even identifies the code in question as a place where precision can be lost.

It turns out that perl essentially parses 5.5000000000e-298 by taking the two sides of the decimal point separately, and calculating 5 / 1e298 + 5,000,000,000 / 1e308. When we add the extra zero, Perl tries to calculate 50,000,000,000 / 1e309 instead -- but that exponent does exceed the floating-point precision, the number overflows to infinity, and we end up with 5e-298 + 0.

So this might actually count as a bug in perl for once, though it's probably a known consequence of the algorithm and I certainly don't expect anyone to scramble to patch it. Either way, it's still interesting: unlike most floating-point gotchas it's not something inherent to working with inexact numbers, since other implementations of atof can and do parse these literals as expected -- it's just Perl's that doesn't.

Bonus observation: the above results were acquired on x86_64. Anyone who tries the examples above on a 32-bit x86 platform may have found that the problem does not manifest itself there -- unless you build Perl yourself with -Doptimize=-g, in which case the problem suddenly appears. I guess 32-bit gcc is managing to perform the whole calculation in 80-bit registers, which can handle e-309 just fine, and debug compilation forces it to store intermediate values in 64-bit variables.


In reply to Fun with Tiny Numbers by Porculus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others chanting in the Monastery: (8)
    As of 2014-11-29 00:16 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My preferred Perl binaries come from:














      Results (200 votes), past polls