Re: May I be bitten by floating point arithmetic in the following restricted case?

by kennethk (Abbot)
on Feb 23, 2010 at 18:46 UTC (#824918)

in reply to May I be bitten by floating point arithmetic in the following restricted case?

There are two limiting cases for failure here. Assuming LIMIT is a power of two, either SIZE = LIMIT could be binned too low or SIZE = LIMIT-1 could be binned too high. For the latter to happen, 1/LIMIT would have to be smaller than the relative precision of your floating-point numbers (you can show this with a Taylor series). Exactly where that falls depends on your machine and build, but on mine it starts failing at 2**48, and it will always precede the point where $size itself switches to a floating-point representation.

for my $n (1 .. 2**10) {
    my $size = 2**$n;
    # SIZE = LIMIT-1 should land strictly below $n; report when it does not
    print "Fail $n high\n" if log($size - 1) / log(2) == $n;
}
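The Taylor-series argument mentioned above can be written out as follows. This is a sketch assuming IEEE 754 doubles with a 53-bit significand, and n stands for the exponent of LIMIT:

```latex
\log_2\!\left(2^{n} - 1\right)
  = n + \log_2\!\left(1 - 2^{-n}\right)
  = n - \frac{1}{\ln 2}\left(2^{-n} + \frac{2^{-2n}}{2} + \cdots\right)
  \approx n - \frac{2^{-n}}{\ln 2}
```

The quotient collapses to exactly n once the deficit 2^{-n}/ln 2 drops below roughly half the spacing of doubles near n. For 32 <= n < 64 that spacing is 2^{-47}, so the half-spacing is 2^{-48}, while the deficit is about 1.44 * 2^{-48} at n = 48 and 0.72 * 2^{-48} at n = 49. Ideal arithmetic therefore puts the threshold within a bit or so of the 2**48 reported above; rounding inside log itself can shift the exact point.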

Regarding what I perceive as the main question, note that you are getting lucky. If you run the code printf("$_: %.16e\n", log(2**$_)/log(2)) for (1 .. 2**10); (assuming native doubles), you will see that the result of your division is not always exact: the last digit is high in a large fraction of the cases. This is a property of the logarithm as implemented. If I explore the powers of 3 instead, all inexact cases are low, not high. This suggests to me that the internal representation of log(2) is ever so slightly lower than the true value.
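One way to repeat that comparison without reading digits by eye is to tally how often the quotient lands high, low, or exact. This is only a sketch: the helper name tally is hypothetical, and dividing by log($base) for the powers-of-3 case is my assumption about what was compared. The counts depend on your platform's log, so no expected output is shown.

```perl
use strict;
use warnings;

# Hypothetical helper: classify log($base**$n)/log($base) against the exact $n
# for every exponent from 1 to $max_exp.
sub tally {
    my ($base, $max_exp) = @_;
    my %count = (high => 0, low => 0, exact => 0);
    for my $n (1 .. $max_exp) {
        my $got  = log($base**$n) / log($base);
        my $kind = $got > $n ? 'high' : $got < $n ? 'low' : 'exact';
        $count{$kind}++;
    }
    return \%count;
}

# 600 keeps 3**$n finite in a native double (assumption: IEEE 754 doubles).
for my $base (2, 3) {
    my $c = tally($base, 600);
    printf "base %d: %d high, %d low, %d exact\n",
        $base, @{$c}{qw(high low exact)};
}
```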

The better question is whether you should care about this inaccuracy. If you are just gathering file statistics, a slight error in the exact position of a bin boundary should not significantly skew your results, assuming a smooth probability density for file sizes. By the time your algorithm fails, you are close to the point where file sizes can no longer be represented exactly as integers. However, if it is mission critical to be literally correct, you could use a hash to build a look-up table.
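A minimal sketch of the look-up table idea, under several assumptions of mine: the bin is floor(log2(size)), perl was built with 64-bit integers, and sizes stay below 2**63. The names %pow2 and bin_of are hypothetical. The floating-point quotient is used only as a first guess, and the table of exact integer powers then corrects a guess that landed one bin off.

```perl
use strict;
use warnings;

# Exact powers of two built with integer shifts, keyed by exponent
# (assumes 64-bit integers, so 1 << 62 does not overflow).
my %pow2 = map { $_ => 1 << $_ } 0 .. 62;

# Hypothetical helper: return floor(log2($size)) for 1 <= $size < 2**63.
sub bin_of {
    my ($size) = @_;
    die "size must be a positive integer\n" unless $size >= 1;

    # Floating-point estimate; may be off by one near a power of two.
    my $guess = int( log($size) / log(2) );
    $guess = 62 if $guess > 62;
    $guess = 0  if $guess < 0;

    # Correct the estimate with exact integer comparisons.
    $guess-- while $guess > 0  && $pow2{$guess} > $size;
    $guess++ while $guess < 62 && $pow2{ $guess + 1 } <= $size;
    return $guess;
}

print bin_of( (1 << 48) - 1 ), "\n";    # one below a boundary
print bin_of( 1 << 48 ), "\n";          # exactly on the boundary
```

Because the final answer comes from integer comparisons, the result does not depend on how the platform's log rounds; the cost is a hash lookup or two per size.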
