
Re: RFC: Large Floating Point Numbers - Rounding Errors

by salva (Canon)
on Sep 09, 2011 at 06:58 UTC ( [id://925024]=note )


in reply to RFC: Large Floating Point Numbers - Rounding Errors

This seems to work (on x86, at least):
printf("%.5f\n", sprintf("%.6f1", $_)) for (0.000005, 0.000015, 0.000025, 0.000035, 0.000045, 0.000055, 0.000065);

Re^2: RFC: Large Floating Point Numbers - Rounding Errors (expectations)
by tye (Sage) on Sep 09, 2011 at 18:06 UTC
    printf("%.5f\n", sprintf("%.6f1", $_))

    Sure. But that is worse in a lot of ways as well. What does it do to 0.0000346 ?
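A minimal sketch of what happens to 0.0000346, assuming a typical build where Perl's numbers are IEEE-754 doubles. The helper names `append_one_round` and `plain_round` are made up for illustration; the first simply wraps the quoted one-liner. The value is first rounded to six places (giving 0.000035), and the appended '1' then pushes it past the halfway point, so it ends up rounded twice.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the quoted one-liner: round to 6 places,
# append a literal '1', then round the resulting string to 5 places.
sub append_one_round {
    my ($x) = @_;
    return sprintf("%.5f", sprintf("%.6f1", $x));
}

# Plain single-step rounding to 5 places, for comparison.
sub plain_round {
    my ($x) = @_;
    return sprintf("%.5f", $x);
}

for my $x (0.0000346, 0.000035) {
    printf "%-10s  plain: %s  append-one: %s\n",
        $x, plain_round($x), append_one_round($x);
}
```

For 0.0000346 the plain call should give 0.00003 while the append-one version gives 0.00004, which is the double-rounding problem the question points at.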

    But you can use that approach to at least very closely match (some) human expectations. It is mostly just your choice of "6" (or $n+1) that is at fault. A fairly reasonable thing to do would be more like:

    printf( "%.5f\n", sprintf("%.12e",$_) )   # note the "%.12e"

    Find an example that violates your expectations for that. I'll wait...

    The obvious one is, of course, 0.000034999999999999. But try this:

    % perl -le "print 0.00003499999999999999"
    3.5e-05

    The above number is so very close to 0.000035 that Perl itself intentionally decides to display it as being exactly 0.000035. So, if that many '9's on the end doesn't violate your personal sense of perspective such that you don't mind that it gets rounded up to 0.00004, then I claim that it is your perspective when it comes to mundane computer performance that is to blame. It means that you expect your lowly computer to be able to weigh a large beach (one with 1000 volleyball courts) and not be off by a single grain of sand.

    Perl has this concept of "so close as to be displayed as equal" because without it you get things like sum( (0.01)x50 ) showing not as 0.5 but as 0.50000000000000022. So it is good, even important, to ignore the last several bits of floating point values when displaying them. (It is also good to do so when comparing them, but I'll just mention that can of worms without opening it.)
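A small sketch of that display behaviour, assuming NVs are ordinary doubles; the helper name `total_of_cents` is hypothetical. It sums fifty copies of 0.01, prints the result the default way, and then prints it again with 17 significant digits so any accumulated error becomes visible. The exact trailing digits depend on the platform, so none are claimed here.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Hypothetical helper: add up fifty copies of 0.01.
sub total_of_cents {
    return sum( (0.01) x 50 );
}

my $total = total_of_cents();

# Default stringification keeps roughly 15 significant digits on typical builds.
print "default: $total\n";

# Asking for 17 significant digits exposes the accumulated error, if any.
printf "%%.17g:   %.17g\n", $total;

# Compare with a tolerance instead of ==; the 1e-12 tolerance is an arbitrary choice.
my $close = abs($total - 0.5) < 1e-12 ? "yes" : "no";
print "within 1e-12 of 0.5: $close\n";
```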

    So, why don't we just fix Perl's sprintf so 0.000034999999999999 rounds to 0.00004 ? Wait! Why should exactly 0.000035 (not floating point) round to 0.00004 ? It is exactly halfway between 0.00003 and 0.00004. So either answer is equally appropriate. So always rounding such values up induces a small imbalance. That is why there is the superior "banker's rounding" where half of such values round up and half of such values round down. So, why don't we just fix Perl's sprintf so it uses banker's rounding?

    We don't need to:

    my %c;
    for my $i ( '00' .. '99' ) {
        my $f= 0 + ".${i}5";
        my $r= sprintf "%.2f", $f;
        my $d= $r < $f ? '-' : '+';
        $c{$d}++;
        print "$f $r $d\n";
    }
    print "$c{'-'} rounded down, $c{'+'} rounded up.\n";
    __END__
    0.005 0.01 +
    0.015 0.01 -
    0.025 0.03 +
    0.035 0.04 +
    0.045 0.04 -
    0.055 0.06 +
    ...
    0.905 0.91 +
    0.915 0.92 +
    0.925 0.93 +
    0.935 0.94 +
    0.945 0.94 -
    0.955 0.95 -
    0.965 0.96 -
    0.975 0.97 -
    0.985 0.98 -
    0.995 0.99 -
    50 rounded down, 50 rounded up.

    So the real problem here is some people's naive expectations left over from 4th-grade math class, not Perl's more accurate and practical floating point behavior that it inherits from C.

    Banker's rounding is fine when doing calculations with pencil and paper. Let's not slow down Perl to conform to naive or out-dated human expectations.

    P.S. Yes, I know that 0.000034999999999999 is not the same as 0.00003499999999999999. If you didn't immediately notice the discrepancy, then I'm not surprised. The difference was intentional and, in the end, the similarity between the two seems so obvious that I decided to not even comment on it beyond this postscript.

    Update: The printf( "%.5f\n", sprintf("%.12e",$_) ) code neglects to insert the '1' that compensates for numbers sometimes being converted to base-2 values ever so slightly smaller than the number that the string represents. s/e/1e/i would be a reasonable way to insert that '1' but that doesn't fit easily into a simple, single-expression example.
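One way the update could be spelled out as a multi-statement helper, offered as a sketch rather than as the code tye had in mind. The name `nudged_round` and its parameters are hypothetical; the default of 12 is the value from the post and may need adjusting on builds with a different floating point precision.

```perl
use strict;
use warnings;

# Hypothetical helper: format with $digits digits after the decimal point
# in %e notation, append a '1' to the mantissa, then round the resulting
# string to $places decimal places.
sub nudged_round {
    my ($x, $places, $digits) = @_;
    $digits = 12 unless defined $digits;   # 12 is the value used in the post
    my $s = sprintf("%.${digits}e", $x);
    $s =~ s/e/1e/i;    # e.g. "3.500000000000e-05" becomes "3.5000000000001e-05"
    return sprintf("%.${places}f", $s);
}

for my $x (0.000005, 0.000015, 0.000025, 0.000035, 0.0000346) {
    print nudged_round($x, 5), "\n";
}
```

Because the '1' is appended after 12 digits rather than after 6, 0.0000346 should still come out as 0.00003 while 0.000035 comes out as 0.00004.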

    - tye        

      I'll take your rant and raise you one...

      Flippant disregard for naïveté may be geek-chic, but I live in the real world. That is, while you really do make a valid point, those fourth grade expectations are common, and are consistent in an easily observable way. If I went and told my users that their expectations are merely naive, I would lose.

      ... shirt, respect, credibility and eventually job.

      The best answer, noted elsewhere in this thread, is a hardware limitation. The SPARC architecture does not have a way to set the rounding mode for floating point operations. However, x86 operating systems (including recent MacOS, but still not Solaris) seem to have a way to reach into the floating point hardware to ask for a NEAR rounding mode. I can sell broken architecture as an excuse (since it is demonstrably true) far better than I can sell broken user expectations (which are not demonstrable at all).

      Or, I can cheat, use string_round(), and look like I got a bug fixed where nobody else could do it efficiently. I don't mind cheating, as long as it doesn't cause any real harm. This is such a case, since as BrowserUk dismissively notes above, all rounding is cosmetic anyway.

        I'll take your rant and raise you one...

        Rant? Well, if you want to read it that way, then I can't stop you.

        The best answer, noted elsewhere in this thread, is a hardware limitation. The SPARC architecture does not have a way to set the rounding mode for floating point operations. However, x86 operating systems (including recent MacOS, but still not Solaris) seem to have a way to reach into the floating point hardware to ask for a NEAR rounding mode.

        Um, no. I already replied to that claim showing that a much simpler explanation is that the inconsistent rounding is simply a consequence of converting between base 10 and base 2 and that BrowserUk's results are more due to him computing 0.000035 through repeated additions.
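A rough way to look at the two effects mentioned here, assuming NVs are doubles; the helper name `by_addition` is hypothetical. It builds 0.000035 by adding 0.000005 seven times and prints that alongside the literal with far more digits than a double holds. How many of those digits are meaningful depends on the C library behind sprintf, so no particular output is claimed.

```perl
use strict;
use warnings;

# Hypothetical helper: build 0.000035 by adding 0.000005 seven times,
# as opposed to writing the literal directly.
sub by_addition {
    my $sum = 0;
    $sum += 0.000005 for 1 .. 7;
    return $sum;
}

my $literal = 0.000035;
my $added   = by_addition();

# %.25e asks for more digits than a double can carry, which makes the
# base-2 approximation of each value visible.
printf "literal: %.25e\n", $literal;
printf "added:   %.25e\n", $added;
printf "numerically equal: %s\n", ( $literal == $added ? "yes" : "no" );
```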

        I'll be adding a reply shortly that clearly shows that I was correct on that point. The rounding mode makes absolutely no difference in how sprintf rounds 0.000035 and the rounding mode was already at NEAR anyway!

        Flippant disregard for naïveté may be geek-chic

        Yeah, again, if you want to read it that way, then I can't stop you. I actually tried to give a useful solution for meeting naive 4th-grade expectations:

        printf( "%.5f\n", sprintf("%.12e",$_) )

        (except I left off inserting the '1' by mistake, which would complicate the code a bit, but nowhere near as much as your "nobody else could do it efficiently" code.) I decided not to go into it in much detail, such as noting that the value for 12 can be somewhat different on different builds. I'd hoped Config exposed the number of digits Perl considered significant but it doesn't seem to. Replace the 12 with a value perhaps equal to or slightly less than Perl's "significant digits" count.

        - tye        
