
There are two limiting cases for failure here. Assuming LIMIT is a power of two, either SIZE = LIMIT could be binned too low or SIZE = LIMIT-1 could be binned too high. For the latter to happen, 1/LIMIT would have to be smaller than the relative precision of your floating-point numbers (you can show this with a Taylor series). Where that threshold falls depends on your machine and build, but on mine it starts failing at 2**48, and it will always precede the point where $size itself switches from an integer to a floating-point representation.

for (1 .. 2**10) { my $size = 2**$_; print "Fail $_ high\n" if log($size - 1) / log(2) == $_; }
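For reference, the Taylor-series argument can be written out as follows. This is only a sketch: it treats the logarithm and the division as exact, which the real computation is not.

```latex
\log_2(\mathrm{LIMIT} - 1)
  = \log_2 \mathrm{LIMIT} + \log_2\!\left(1 - \frac{1}{\mathrm{LIMIT}}\right)
  \approx \log_2 \mathrm{LIMIT} - \frac{1}{\mathrm{LIMIT}\,\ln 2}
```

Once the correction term 1/(LIMIT ln 2) drops below the spacing of representable numbers near log2(LIMIT), the computed value can round to log2(LIMIT) itself, and SIZE = LIMIT-1 lands in the higher bin.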

Regarding what I perceive as the main question, note that you are getting lucky. If you run the code printf("$_: %.16e\n", log(2**$_)/log(2)) for (1 .. 2**10); (assuming native doubles), you will see that the result of your division is not always exact: the last digit is high in a large fraction of the cases. This is a function of the logarithm as implemented. If I explore the powers of 3 instead, all inexact cases are low, not high. This suggests to me that the internal representation of log(2) is ever so slightly lower than the true value.
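If you would rather count the cases than read them off a long printout, a tally along these lines may help. It is a rough sketch: tally_log_ratio is a hypothetical helper, and the upper exponent of 600 is an arbitrary choice that keeps 3**$k below the overflow threshold of a native double. Large powers of 3 are themselves rounded, so the counts describe the whole computation and not the logarithm alone.

```perl
use strict;
use warnings;

# Hypothetical helper: compare log($base**$k)/log($base) with $k
# for $k = 1 .. $max_k and count exact, high and low results.
sub tally_log_ratio {
    my ($base, $max_k) = @_;
    my %count = (exact => 0, high => 0, low => 0);
    for my $k (1 .. $max_k) {
        my $ratio = log($base**$k) / log($base);
        my $kind  = $ratio == $k ? 'exact'
                  : $ratio >  $k ? 'high'
                  :                'low';
        $count{$kind}++;
    }
    return \%count;
}

for my $base (2, 3) {
    my $c = tally_log_ratio($base, 600);
    printf "base %d: exact=%d high=%d low=%d\n",
        $base, @{$c}{qw(exact high low)};
}
```

The counts will depend on your platform's math library, so treat them as a diagnostic for your own build.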

The better question is whether you should care about this inaccuracy. If you are just gathering file statistics, an error in the exact position of a bin boundary should not significantly skew your results, assuming a smooth probability density function for file sizes. By the time your algorithm fails, you are nearly at the point where file sizes can no longer be represented as exact integers anyway. However, if it is mission critical to be literally correct, you could use a hash to build a look-up table.
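One way the look-up table idea might look is sketched below. The names %boundary and bin_of are hypothetical, and I am assuming that a bin is the largest $n with 2**$n <= $size and that sizes stay below 2**53; adjust both to match your actual binning. The logarithm only supplies a first guess, and the table of exact boundaries settles the boundary cases by plain numeric comparison.

```perl
use strict;
use warnings;

# Assumption: sizes are positive integers below 2**53, so every size and
# every boundary is exactly representable, even as a native double.
my $MAX_EXP = 53;

# Hypothetical look-up table: exponent => exact boundary 2**exponent.
my %boundary = map { $_ => 2**$_ } 0 .. $MAX_EXP;

# bin_of($size) returns the largest $n with 2**$n <= $size.
sub bin_of {
    my ($size) = @_;
    die "size out of range\n"
        if $size < 1 || $size >= $boundary{$MAX_EXP};

    # Floating-point estimate; it may be off by one next to a boundary.
    my $n = int( log($size) / log(2) );
    $n = 0            if $n < 0;
    $n = $MAX_EXP - 1 if $n > $MAX_EXP - 1;

    # Correct the estimate with exact comparisons against the table.
    $n-- while $size <  $boundary{$n};
    $n++ while $size >= $boundary{ $n + 1 };
    return $n;
}

print bin_of(2**48 - 1), "\n";    # prints 47
print bin_of(2**48),     "\n";    # prints 48
```

The correction loops run at most a step or two in practice, so the cost over the bare logarithm is small.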

In reply to Re: May I be bitten by floating point arithmetic in the following restricted case? by kennethk
in thread May I be bitten by floating point arithmetic in the following restricted case? by rubasov
