in reply to Re^3: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
in thread Integers sometimes turn into Reals after substraction

Liked your post! I've been looking at illustrated perlguts.

If my limited understanding is correct, Perl keeps integer values (IVs) as a separate type from floats (NVs).
This 4/25 stuff creates an NV. (1000*4/25) would still be an integer, an IV. So aside from the output precision issues, is the root problem creating an NV in the first place? If the math never used anything but integer values, this output precision issue wouldn't appear? Right or not?

Re^5: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
by davido (Cardinal) on May 14, 2016 at 23:00 UTC

    Multiplication and division have the same precedence, so the grouping of 1000*4/25 is decided by associativity rather than by precedence. Both operators are left-associative in Perl (as they are in C and C++), so the expression is evaluated as (1000*4)/25. Had it been 1000+4/25, it would be pretty obvious that the 4/25 has to be evaluated before 1000 can be added to it. On matters that aren't explicit in documentation or tests, Perl's implementation acts a little more like a functioning language specification.

    With C or C++ (or other statically typed languages) it would be reasonably trivial to declare all the types in question to be ints, and then there would be no risk of the unexpected, unless you expect int(4/25) to be a non-zero value.

    If you want to exercise a little more caution you could re-order the operations by using explicit precedence; (1000*4)/25 ought to get you a pretty sensible result.

    There is an article that becomes a bit of a dense read, but it is otherwise quite good and will shed light on the issue: What Every Computer Scientist Should Know About Floating-Point Arithmetic. But if that's a little too much to get through, consider this:

    We grow up, and in elementary school we are taught that 1/3rd is just about 0.33. We know that it isn't exactly 0.33, and in fact we know that no matter how many 3's we add to the end of the number we'll still never quite get to 1/3rd in our decimal representation. And yet when we see 33%, we naturally conclude that we're talking about 1/3rd of some entity. It comes pretty naturally to us, and we accept it, mostly because it's taught from an early age, and because with our ten fingers we tend to think in terms of decimal.

    As you're certainly aware, computers don't think in decimal. The fraction 1/3rd is impossible for us to represent exactly in decimal. Likewise, in binary, the fraction 1/10th is impossible to represent exactly. In fact, any number that cannot be written as n/(2^m) for integers n and m is as problematic for a computer to represent as 1/3rd is for us with our decimal system. 4/25ths could be 8/50ths, or 16/100ths, 32/200ths, 64/400ths, 128/800ths, 256/1600ths. All very easy to look at in fractional form, all a perfectly clean 0.16 in base 10, but in its binary representation (converted back to base 10), about the closest a 64-bit double can come is 0.1600000000000000033306690738754696212708950042724609375. Even if you remove division from it, and try the following:

    $ perl -E 'printf "%.56f\n", .16'

    You will see something like...

    0.16000000000000000333066907387546962127089500427246093750
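The n/(2^m) rule can be checked the same way. The following is only a minimal sketch: the helper name expand is made up for illustration, and it assumes a perl whose printf prints the exact stored digits (as the one-liner above also assumes).

```perl
use strict;
use warnings;

# Print a value with far more digits than it is usually shown with,
# so that any representation error becomes visible.
# (expand is a hypothetical helper name, not part of any module.)
sub expand {
    my ($x) = @_;
    return sprintf '%.40f', $x;
}

my @samples = (
    [ '1/2',  1 / 2  ],    # n/(2**m): stored exactly
    [ '1/8',  1 / 8  ],    # n/(2**m): stored exactly
    [ '3/16', 3 / 16 ],    # n/(2**m): stored exactly
    [ '1/10', 1 / 10 ],    # not of that form: rounded
    [ '4/25', 4 / 25 ],    # not of that form: rounded
);

printf "%-5s %s\n", $_->[0], expand($_->[1]) for @samples;
```

The first three lines should end in nothing but zeros, while the last two should show stray digits well past the sixteenth decimal place.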

    There are various strategies for overcoming this limitation in how computers represent floating point internally. One is to do all your math at a multiple of 100 or 1000, and then round results to integers. Another might be to simply forge ahead with floating point but sprintf "%.0f" with the result before outputting it. Or store all your floats as fractions (keep track of the numerator and denominator). Whatever strategy you need, just be consistent within your application.
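As a rough illustration of those three strategies applied to the numbers from this thread (a sketch only; the variable names are invented here, and Math::BigRat is just one convenient way to keep a numerator and denominator):

```perl
use strict;
use warnings;
use Math::BigRat;    # core module for exact rational arithmetic

# Strategy 1: scale first, so every intermediate value is a whole number.
# 1000*4 is 4000, and 4000/25 has no fractional part.
my $scaled = (1000 * 4) / 25;

# Strategy 2: forge ahead in floating point, then round once on output.
my $rounded = sprintf '%.0f', 1000 * (4 / 25);

# Strategy 3: keep the value as a fraction until the very end.
my $ratio = Math::BigRat->new('4/25') * 1000;

print "scaled:  $scaled\n";     # 160
print "rounded: $rounded\n";    # 160
print "ratio:   $ratio\n";      # 160
```

Which one fits best depends on the application; the point of the sketch is only that each of them avoids printing a long tail of float digits.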


    Dave

Re^5: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
by LanX (Sage) on May 14, 2016 at 13:42 UTC
    don't know much about perlguts, but

    > If the math never used anything but all integer values, then this output precision issue doesn't appear?

    Yes!

    (well you could still knock at integer limits and need to resort to bigint, but that's another problem)

    This phenomenon has already caused much sorrow in financial calculations where cents were inaccurate, and fiscal authorities can be very nasty (or nazi) about inaccurate cents.

    The rule of thumb is: if you want n decimal places of accuracy, shift the decimal point n places before starting (multiply by 10^n) and shift the result back by n places at the end.

    Roughly speaking: If you need accurate cents, calculate in cents and only convert the result into dollars. (I'd do 1/100th cents to be sure)

    This way, all relevant discrepancies between decimal and binary calculations stay below the error margin.

    Nota Bene: This won't solve the problem of error propagation if you are doing loads of calculations, but financial businesses normally define explicit rounding rules to normalize this.
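A minimal sketch of the "calculate in cents" idea follows. The helper names to_cents and to_dollars are hypothetical, and the parser only accepts plain amounts with at most two decimal digits; it is an illustration, not a complete money library.

```perl
use strict;
use warnings;

# Convert a decimal string such as "19.99" to integer cents without
# ever passing through a floating-point value.
sub to_cents {
    my ($amount) = @_;
    my ($sign, $whole, $frac) = $amount =~ /\A(-?)(\d+)(?:\.(\d{1,2}))?\z/
        or die "unsupported amount: $amount";
    $frac = '' unless defined $frac;
    $frac .= '0' x (2 - length $frac);    # "0.5" means 50 cents
    my $cents = $whole * 100 + $frac;
    return $sign ? -$cents : $cents;
}

# Convert integer cents back to a decimal string for display,
# again using only integer operations.
sub to_dollars {
    my ($cents) = @_;
    my $sign = $cents < 0 ? '-' : '';
    $cents = abs $cents;
    my $rest = $cents % 100;
    return sprintf '%s%d.%02d', $sign, ($cents - $rest) / 100, $rest;
}

my $total = 0;
$total += to_cents($_) for qw(0.10 0.20 19.99);
print to_dollars($total), "\n";    # 20.29
```

For the extra safety margin mentioned above (1/100th cents), the same approach would work with a scale of 10000 instead of 100.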

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

Re^5: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
by LanX (Sage) on May 14, 2016 at 14:00 UTC
    > (1000*4/25) would still be an integer, an IV

    Doesn't seem so:

    $ perl -MDevel::Peek -e '$x=(4000/25); Dump($x)'
    SV = NV(0x9f2f570) at 0x9f286d8
      REFCNT = 1
      FLAGS = (NOK,pNOK)
      NV = 160
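If the full dump is more than you need, the same flags can be read with the core B module. This is a sketch under assumptions: num_kind is a made-up helper, it looks only at the public IOK/NOK flags, and what it reports for 4000/25 may differ between perl builds, so no output is claimed for that line.

```perl
use strict;
use warnings;
use B ();

# Report which numeric slots of a scalar are currently flagged as valid.
# (num_kind is a hypothetical helper; it inspects only public flags.)
sub num_kind {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @kinds;
    push @kinds, 'IV' if $flags & B::SVf_IOK();
    push @kinds, 'NV' if $flags & B::SVf_NOK();
    return @kinds ? join('+', @kinds) : 'neither';
}

my $int  = 160;
my $frac = 4 / 25;
my $quot = 4000 / 25;

print "160     is ", num_kind($int),  "\n";    # IV
print "4/25    is ", num_kind($frac), "\n";    # NV
print "4000/25 is ", num_kind($quot), "\n";    # depends on the build
```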


Re^5: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
by ikegami (Pope) on May 14, 2016 at 15:07 UTC
    [Ignore. I just noticed that both expressions are completely constant.]

    > So aside from the output precision issues, a root problem is creating an NV in the first place?

    Not quite.

    The root problem is trying to store a number that's periodic in binary into a floating point number.

                     ____________________
    0.16 base 10 = 0.00101000111101011100 base 2

    That means that it can't be stored exactly as a float. (Obviously, it can't be stored exactly as an integer either.)
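The repeating pattern can be reproduced by doing the long division in base 2 with integers only. A small sketch (binary_digits is a name invented for this example):

```perl
use strict;
use warnings;

# Return the first $count binary digits after the point of $num/$den,
# for 0 <= $num < $den, using integer arithmetic only.
sub binary_digits {
    my ($num, $den, $count) = @_;
    my $digits = '';
    for (1 .. $count) {
        $num *= 2;                # shift one binary place
        if ($num >= $den) {
            $digits .= '1';
            $num -= $den;         # keep the remainder
        }
        else {
            $digits .= '0';
        }
    }
    return $digits;
}

# 4/25: the same 20 digits come around again and again.
print '4/25 = 0.', binary_digits(4, 25, 60), "...\n";

# 1/8: the remainder reaches zero, so the expansion terminates.
print '1/8  = 0.', binary_digits(1, 8, 12), "\n";
```

For 4/25 the remainder returns to 4 after 20 steps, which is why the digits never terminate and a float can only hold a truncated, rounded copy.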

    The compiler apparently uses an alternate means of calculating 1000*(4/25) such that it produces exactly 160. Whether that's stored as an IV, UV or NV is irrelevant.