in reply to Re: Integers sometimes turn into Reals after substraction
in thread Integers sometimes turn into Reals after substraction

Well, I understand that. The problem is that it is inconsistent. The result depends not only on the type of calculation which led to $x2:
perl -e '$x1=256080; $x2 = 1000 * ( 4*60 + 18 + 4/25 ); $diff=$x2-$x1; print "x1=$x1, x2=$x2, diff=$diff\n";'
x1=256080, x2=258160, diff=2080.00000000003
perl -e '$x1=256080; $x2 = 1000 * ( 4*60 + 18 ) + 1000*(4/25); $diff=$x2-$x1; print "x1=$x1, x2=$x2, diff=$diff\n";'
x1=256080, x2=258160, diff=2080

but also on the value of $x1, which is a plain literal and was never calculated:

perl -e '$x1=25608; $x2 = 1000 * ( 4*60 + 18 + 4/25 ); $diff=$x2-$x1; print "x1=$x1, x2=$x2, diff=$diff\n";'
x1=25608, x2=258160, diff=232552
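
One way to see where the two cases differ is to print $x2 with more digits than print shows by default. This is only a minimal sketch, assuming a perl whose NV is a 64-bit IEEE double; the names $grouped and $split are made up for illustration.

```perl
use strict;
use warnings;

# Same value on paper, two different orders of evaluation.
my $grouped = 1000 * ( 4*60 + 18 + 4/25 );          # 1000 * 258.16
my $split   = 1000 * ( 4*60 + 18 ) + 1000*(4/25);   # 258000 + 160

# 17 significant digits are enough to tell any two distinct doubles apart.
printf "grouped: %.17g\n", $grouped;
printf "split:   %.17g\n", $split;
printf "equal?   %s\n", ( $grouped == $split ? "yes" : "no" );
```

On such a build the two values should not compare equal, even though both print as 258160 with a plain print.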

Re^3: Integers sometimes turn into Reals after substraction (printed accuracy and float subtraction)
by LanX (Sage) on May 14, 2016 at 11:54 UTC
    Have you seen the update in my reply?

    Floats are automatically adjusted to avoid leading zeros in the mantissa.

    update

    Think float, this

    x1   = 256080
    x2   = 258160
    diff = 2080.00000000003

    really just means (in slightly inaccurate decimal interpretation)

    x1   = 2.56080 e5
    x2   = 2.5816000000000003 e5   (error ignored when printed)
    diff = 2.08000000000003 e3     (error visible when printed)

    As you can see, the error (here the decimal digit 3) is shifted to the left, out of the error margin and into printed "visibility".

    update

    After firing up my laptop, here is a proof of concept:

    DB<115> $x1 = 2.56080e5
     => 256080
    DB<116> $x2 = 2.5816000000000003e5
     => 258160                # within error margin, ignored in normal display
    DB<117> printf '%.11f', $x2
    258160.00000000003        # forced into visibility, hence $x2 NOT an integer
    DB<118> $x2-$x1
     => 2080.00000000003      # error can't be ignored any more
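
    The same effect can be written as a short script that puts numbers on the "shift". This is a sketch, assuming NVs are 64-bit doubles and that print formats them like %.15g (true for common builds, not guaranteed); $err is a name introduced here.

```perl
use strict;
use warnings;

my $x1   = 256080;
my $x2   = 1000 * ( 4*60 + 18 + 4/25 );
my $diff = $x2 - $x1;

# Absolute error carried by $x2; the subtraction passes it on unchanged.
my $err = $x2 - 258160;

# The absolute error is the same in both lines, but relative to the
# smaller number it is about a hundred times larger.
printf "x2   = %.15g (relative error %.1e)\n", $x2,   $err / $x2;
printf "diff = %.15g (relative error %.1e)\n", $diff, $err / $diff;
```

    The subtraction adds no new error; it only removes the large leading digits that were hiding the existing one.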

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

      Liked your post! I've been looking at illustrated perlguts,

      If my limited understanding is correct, then Perl keeps integer values (IVs) as a separate type from floats (NVs).
      This 4/25 stuff creates an NV. (1000*4/25) would still be an integer, an IV. So aside from the output precision issues, a root problem is creating an NV in the first place? If the math never used anything but all integer values, then this output precision issue doesn't appear? Right or not?

        There is no inherent difference in the precedence of the operators in 1000*4/25; Perl, like C and C++, resolves the tie by left associativity (see perlop), so the expression groups as (1000*4)/25. If it had been 1000+4/25, it would be pretty obvious that the 4/25 has to be evaluated before 1000 can be added to it. What is left to the implementation is how each intermediate result is stored, and here Perl's implementation is a little more like a functioning language specification on matters that aren't explicit in documentation or tests.

        With C or C++ (or other typed languages) it would be reasonably trivial to declare all the types in question to be ints, and then there would be no risk of the unexpected, unless you expect int(4/25) to be a non-zero value.

        If you want to exercise a little more caution you could make the grouping explicit with parentheses; (1000*4)/25 ought to get you a pretty sensible result.
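
        A quick way to compare the groupings is to run them side by side. This is a sketch only, assuming 64-bit double NVs; the variable names are invented for the example, and variables are used instead of literals so that nothing is folded at compile time.

```perl
use strict;
use warnings;

my ( $k, $n, $d ) = ( 1000, 4, 25 );

my $plain    = $k * $n / $d;       # * and / are left associative (perlop)
my $explicit = ( $k * $n ) / $d;   # 4000 / 25, no fraction is ever formed
my $fraction = $k * ( $n / $d );   # 1000 * 0.16..., relies on rounding

printf "plain:    %.17g\n", $plain;
printf "explicit: %.17g\n", $explicit;
printf "fraction: %.17g\n", $fraction;

# C-like integer division, for comparison with the int(4/25) remark.
my $truncated;
{
    use integer;
    $truncated = $n / $d;
}
print "use integer: $n / $d = $truncated\n";
```

        With these particular numbers the third form happens to round back to 160, but only the first two avoid the inexact intermediate 4/25 altogether.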

        There is an article that becomes a bit of a dense read, but is otherwise quite good and will serve to shed light on the issue: What Every Computer Scientist Should Know About Floating-Point Arithmetic. But if that's a little too much to get through, consider this:

        We grow up and in elementary school are taught that 1/3rd is just about 0.33. We know that it isn't exactly 0.33, and in fact we know that no matter how many 3's we add to the end of the number we'll still never quite get to 1/3rd in our decimal representation. And yet when we see 33%, we naturally conclude that we're talking about 1/3rd of some entity. It comes pretty naturally to us, and we accept it, mostly because it's taught from an early age, and because with our ten fingers we tend to think in terms of decimal.

        As you're certainly aware, computers don't think in decimal. The fraction 1/3rd is impossible for us to represent exactly in decimal. In binary, the fraction 1/10th is just as impossible to represent exactly. In fact, any number that cannot be written as n/(2^m) becomes as problematic for a computer to represent as 1/3rd is for us with our decimal system. 4/25ths could be 8/50ths, 16/100ths, 32/200ths, 64/400ths, 128/800ths, or 256/1600ths. All are very easy to look at in fractional form, and all are a perfectly clean 0.16 in base 10, but in binary representation (converted back to base 10) about the closest our computers can come is 0.1600000000000000033306690738754696212708950042724609375. Even if you remove the division and try the following:

        $ perl -E 'printf "%.56f\n", .16'

        You will see something like...

        0.16000000000000000333066907387546962127089500427246093750
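
        To contrast values of the form n/(2^m) with ones that are not, something like the following can be used. It is a small sketch assuming 64-bit double NVs; stored_as is a helper name made up for this example.

```perl
use strict;
use warnings;

# Show the value a literal is actually stored as, to 25 decimal places.
sub stored_as {
    my ($value) = @_;
    return sprintf "%.25f", $value;
}

# 0.5 = 1/2 and 0.375 = 3/8 fit the n/(2**m) pattern; 0.1 and 0.16 do not.
for my $value ( 0.5, 0.375, 0.1, 0.16 ) {
    printf "%-6s stored as %s\n", $value, stored_as($value);
}
```

        The first two lines should end in nothing but zeros, while the last two pick up stray digits far to the right.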

        There are various strategies for overcoming this limitation in how computers represent floating point internally. One is to do all your math at a multiple of 100 or 1000, and then round results to integers. Another might be to simply forge ahead with floating point but sprintf "%.0f" with the result before outputting it. Or store all your floats as fractions (keep track of the numerator and denominator). Whatever strategy you need, just be consistent within your application.
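
        As an illustration of the sprintf approach applied to the numbers from this thread, here is a rough sketch (assuming 64-bit double NVs; $x2_rounded is a name chosen for the example):

```perl
use strict;
use warnings;

my $x1 = 256080;
my $x2 = 1000 * ( 4*60 + 18 + 4/25 );   # slightly above 258160

# Round to the nearest whole number before using the value further.
my $x2_rounded = sprintf "%.0f", $x2;

print "raw diff:     ", $x2 - $x1,         "\n";
print "rounded diff: ", $x2_rounded - $x1, "\n";
```

        Rounding once, at a well-defined point, keeps the stray fraction from leaking into later subtractions.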


        Dave

        don't know much about perlguts, but

        > If the math never used anything but all integer values, then this output precision issue doesn't appear?

        Yes!

        (well you could still knock at integer limits and need to resort to bigint, but that's another problem)

        This phenomenon has already caused much sorrow in financial calculations where cents were inaccurate, and fiscal authorities can be very nasty about inaccurate cents.

        The rule of thumb is: if you want n decimal places of accuracy, shift everything n places to the left (multiply by 10^n) before starting, and shift the results n places back to the right (divide by 10^n) at the end.

        Roughly speaking: If you need accurate cents, calculate in cents and only convert the result into dollars. (I'd do 1/100th cents to be sure)

        This way, all relevant discrepancies between decimal and binary calculations stay below the error margin.

        Nota Bene: This won't solve the problem of error propagation if you are doing loads of calculations, but financial businesses normally define explicit rounding rules to normalize this.
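
        A tiny sketch of the "calculate in cents" idea, assuming 64-bit double NVs; the amounts are invented for the example.

```perl
use strict;
use warnings;

# Ten payments of 0.10, first as fractional dollars, then as whole cents.
my ( $dollars, $cents ) = ( 0, 0 );
for ( 1 .. 10 ) {
    $dollars += 0.10;   # 0.10 has no exact binary representation
    $cents   += 10;     # integers stay exact
}

printf "dollars: %.17g (equals 1? %s)\n",  $dollars, ( $dollars == 1   ? "yes" : "no" );
printf "cents:   %d (equals 100? %s)\n",   $cents,   ( $cents   == 100 ? "yes" : "no" );

# Convert to dollars only for display.
printf "total:   %d.%02d\n", int( $cents / 100 ), $cents % 100;
```

        The float total ends up a hair below 1, while the integer total is exactly 100 cents and is only turned into dollars when printed.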

        Cheers Rolf

        > (1000*4/25) would still be an integer, an IV

        Doesn't seem so:

        $ perl -MDevel::Peek -e '$x=(4000/25); Dump($x)'
        SV = NV(0x9f2f570) at 0x9f286d8
          REFCNT = 1
          FLAGS = (NOK,pNOK)
          NV = 160
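
        For checking this on another machine without reading a full dump, the core B module can report the flags directly. This is a sketch only: storage_kind is a helper invented here, and whether the quotient comes out as IV or NV may well depend on the perl version and build, so no particular answer is claimed.

```perl
use strict;
use warnings;
use B ();

# Report whether a scalar currently holds an integer (IV) or a float (NV).
sub storage_kind {
    my ($value) = @_;
    my $flags = B::svref_2object( \$value )->FLAGS;
    return 'IV' if $flags & B::SVp_IOK();
    return 'NV' if $flags & B::SVp_NOK();
    return 'other';
}

my ( $n, $d ) = ( 4000, 25 );    # variables, so the division happens at run time
my $quotient = $n / $d;
my $kind     = storage_kind($quotient);

print "4000/25 = $quotient, stored as $kind\n";
```

        Either way the value compares equal to 160, which is what matters for the subtraction problem discussed here.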

        Cheers Rolf

        [Ignore. I just noticed that both expressions are completely constant.]

        > So aside from the output precision issues, a root problem is creating an NV in the first place?

        Not quite.

        The root problem is trying to store a number that's periodic in binary into a floating point number.

                         ____________________
        0.16 base 10 = 0.00101000111101011100 base 2

        That means that it can't be stored exactly as a float. (Obviously, it can't be stored exactly as an integer either.)
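
        The repeating pattern can be reproduced with integer-only long division in base 2. This is just an illustrative sketch; binary_fraction_digits is a name made up for it.

```perl
use strict;
use warnings;

# Long division in base 2 using only integers: returns the first $count
# binary digits after the point of $num/$den (requires 0 <= $num < $den).
sub binary_fraction_digits {
    my ( $num, $den, $count ) = @_;
    my $bits = '';
    for ( 1 .. $count ) {
        $num *= 2;
        if ( $num >= $den ) { $bits .= '1'; $num -= $den }
        else                { $bits .= '0' }
    }
    return $bits;
}

# 4/25 = 0.16: print 60 digits in groups of 20 to make the period visible.
my $bits = binary_fraction_digits( 4, 25, 60 );
print join( ' ', $bits =~ /(.{20})/g ), "\n";
```

        Each group of 20 digits comes out identical, and since a double keeps only 53 significant bits, the pattern has to be cut off and rounded somewhere.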

        The compiler apparently uses an alternate means of calculating 1000*(4/25) such that it produces exactly 160. Whether that's stored as an IV, UV or NV is irrelevant.

      Thank you for the very detailed replies in this thread. I will now use sprintf "%$x.0f" much more often to not be bitten again...