PerlMonks  


We've all seen dozens of threads where someone expected two floating-point numbers calculated in different ways to be equal, and was shocked to find that they weren't. They generally assume there's a bug in the language.

Here's something a little different.

Consider a very small number: 1.234567e-302. This is tiny, but can easily be approximated within the precision of the 64-bit type Perl uses for numbers on i686 and x86_64.

$ perl -E 'say 1.234567e-302'
1.234567e-302

Now let's make it just a little larger. Adding a digit to the end will have that effect: 1.2345678e-302.

$ perl -E 'say 1.2345678e-302'
1e-302

Wait, what?

Perhaps it's just a formatting issue.

$ perl -E 'say "help" if 1.2345678e-302 == 1e-302'
help
$ perl -E 'say "what" if 1.2345678e-302 < 1.234567e-302'
what

Okay, perhaps we're trying to produce a number Perl can't actually represent?

$ perl -MPOSIX -E 'say scalar strtod "1.2345678e-302"'
1.2345678e-302

Nope, looks like this value can be approximated by a scalar just fine, when it's produced by a C function.
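For anyone who wants to reuse the strtod route, here is a minimal sketch of a wrapper. The name parse_with_strtod is made up for illustration; the only library calls are POSIX::strtod and the POSIX constant DBL_MIN. It assumes a perl whose NV is an ordinary 64-bit double, and note that strtod is locale-sensitive about the decimal separator.

```perl
use strict;
use warnings;
use POSIX qw(strtod DBL_MIN);

# Hypothetical helper: parse a decimal string with the C library's
# strtod instead of perl's own literal parser. In list context
# strtod returns the value and the count of unparsed characters.
sub parse_with_strtod {
    my ($text) = @_;
    my ($value, $unparsed) = strtod($text);
    die "could not parse '$text' as a number\n"
        if $unparsed != 0 || $text eq '';
    return $value;
}

my $tiny = parse_with_strtod('1.2345678e-302');

# DBL_MIN is the smallest normalised double (about 2.2e-308), so a
# value near 1e-302 is comfortably inside the representable range.
printf "%.8g\n", $tiny;
print "above DBL_MIN\n" if $tiny > DBL_MIN;
```

Passing the number around as a string and converting it at the last moment keeps perl's literal parser out of the picture entirely.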

I actually ran into this at work while I was fuzzing some numerical code. At the time, I was most interested in finding a workaround, so the strtod approach got me round the problem -- but the oddness intrigued me, so I decided to try and figure it out.

I quickly realised it had to be a parsing issue. After all, the problem is triggered by the number of decimal places, not the digits involved nor the size of the exponent -- even adding a trailing zero can affect the way a literal is parsed:

$ perl -E 'say 5.5000000000e-298; say 5.50000000000e-298'
5.5e-298
5e-298

I was about to jump into the Perl source code when I paused for a little googling, and found a nice article by brian_d_foy explaining how Perl parses scientific notation, which goes through the relevant functions line by line. And it's easy enough to identify where everything's going pear-shaped. It is, of course, a floating-point rounding error -- but this one is indirect. brian even identifies the code in question as a place where precision can be lost.

It turns out that perl essentially parses 5.5000000000e-298 by taking the two sides of the decimal point separately, and calculating 5 / 1e298 + 5,000,000,000 / 1e308. When we add the extra zero, perl tries to calculate 50,000,000,000 / 1e309 instead -- but 1e309 is beyond the range of a 64-bit double, so the divisor overflows to infinity, the quotient becomes zero, and we end up with 5e-298 + 0.
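The arithmetic described above can be reproduced in plain Perl. This is only a sketch of the same sums, not perl's actual parsing code: naive_parse is a hypothetical name, it accepts only literals of the form D.DDDe-NNN, and it assumes NVs are 64-bit doubles (a long-double build would not overflow at this point). The powers of ten are built with ** rather than written as literals, so the result does not depend on how the literal parser behaves.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics the split-at-the-decimal-point sums
# described in the post. It is NOT perl's real parser.
sub naive_parse {
    my ($literal) = @_;
    my ($int, $frac, $exp) = $literal =~ /\A(\d+)\.(\d+)e-(\d+)\z/
        or die "expected a literal of the form D.DDDe-NNN\n";

    # Integer part: scale down by 10**exp.
    my $whole = $int / 10**$exp;

    # Fractional part: treat the digits as one integer and divide by
    # 10**(exp + number of fractional digits). Once that power passes
    # the largest double (about 1.8e308) it becomes infinity, and
    # anything finite divided by infinity is zero.
    my $divisor = 10**($exp + length $frac);
    my $part    = $frac / $divisor;

    return $whole + $part;
}

printf "%-20s => %.3g\n", $_, naive_parse($_)
    for '5.5000000000e-298', '5.50000000000e-298';
```

On a build with 64-bit doubles the first literal should come out as 5.5e-298 and the second as 5e-298, matching the one-liner output earlier in the post: the eleventh fractional digit pushes the divisor from 1e308 to 1e309.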

So this might actually count as a bug in perl for once, though it's probably a known consequence of the algorithm and I certainly don't expect anyone to scramble to patch it. Either way, it's still interesting: unlike most floating-point gotchas it's not something inherent to working with inexact numbers, since other implementations of atof can and do parse these literals as expected -- it's just Perl's that doesn't.
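If you want to check how your own perl behaves, one rough approach is to compare perl's string-to-number conversion against the C library's strtod. The sketch below assumes that numifying a string goes through the same conversion routine as a literal in source code, which I believe is the case but have not verified for every version. The name parsers_agree and the 1e-12 tolerance are my own choices; the tolerance is there so that a last-bit rounding difference is not reported as a disagreement.

```perl
use strict;
use warnings;
use POSIX qw(strtod);

# Hypothetical helper: true if perl's own numification of $text and
# the C library's strtod agree to within a small relative tolerance.
sub parsers_agree {
    my ($text) = @_;
    my $perl_value = 0 + $text;        # perl's own conversion
    my ($c_value)  = strtod($text);    # C library conversion
    return 1 if $perl_value == $c_value;
    return 0 if $c_value == 0;
    return abs(($perl_value - $c_value) / $c_value) < 1e-12 ? 1 : 0;
}

for my $text ('5.5000000000e-298', '5.50000000000e-298', '1.2345678e-302') {
    printf "%-20s %s\n", $text,
        parsers_agree($text) ? 'agree' : 'DISAGREE';
}
```

Any line marked DISAGREE points to a digit string that perl and the C library convert to noticeably different values on that build.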

Bonus observation: the above results were acquired on x86_64. Anyone who tries the examples on a 32-bit x86 platform may find that the problem does not manifest itself there -- unless they build Perl themselves with -Doptimize=-g, in which case the problem suddenly appears. I guess 32-bit gcc is managing to perform the whole calculation in 80-bit registers, which can handle 1e309 just fine, and debug compilation forces it to store intermediate values in 64-bit variables.


In reply to Fun with Tiny Numbers by Porculus
