PerlMonks  

Re^2: [OT] Stats problem

by BrowserUk (Patriarch)
on Feb 26, 2015 at 12:06 UTC ( [id://1117939]=note )


in reply to Re: [OT] Stats problem
in thread [OT] Stats problem

Hm. Not convinced.

If you write random bytes to the whole 4GB, then inspect them as 4-byte aligned U32s, then only 1/4 of the possible values or less will appear.

Only inspect every other 4-byte aligned U32 and only 1/8th or less of the possible values will appear.

And if you write the same value to every single slot, it could only match the offset at one position.

So 7/8ths or more of the possible values will not appear anywhere; and of the rest, the chances that a value will appear at an offset that matches that value have to be slim. Bordering on impossible.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I'm with torvalds on this
In the absence of evidence, opinion is indistinguishable from prejudice. Agile (and TDD) debunked

Replies are listed 'Best First'.
Re^3: [OT] Stats problem
by RichardK (Parson) on Feb 26, 2015 at 13:08 UTC
    If you're not convinced then why not build yourself a small monte carlo simulation and try it out?

      After 5000 runs of 1/2 billion checks per run, the result is:

      ...
      Run: 2000 buks:231 stds:273
      ...
      Run: 3000 buks:346 stds:393
      ...
      Run: 4000 buks:466 stds:520
      ...
      Run: 5000 buks:577 stds:663
      Run: 5001 buks:578 stds:664
      Run: 5002 buks:579 stds:664
      Run: 5003 buks:579 stds:664
      Run: 5004 buks:579 stds:665

      So the odds are:

      5000 * 536870912 = 2684354560000 total checks

      total checks / false hits = odds of a false hit
      2684354560000 / 579 = 4636190949   (offset)
      2684354560000 / 665 = 4036623398   (0xdeadbeef)

      Expected odds = 4294967296

      Which means you are right!

      However, as the statistics above reflect, and as I observed from several (very) short runs whilst sanity checking the code; the offset seems to beat the odds every time; whilst the fixed magic number seems to come a little shy of it every time.

      There are not enough observations and not a sufficiently big difference between them to conclude that this is anything other than expected variation. But it does seem consistent.
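As a rough check on the "expected variation" remark, here is a minimal Python sketch that recomputes the odds from the figures quoted above and compares the two false-hit counts with a Poisson model. The model is an assumption, not something stated in the post: it treats every check as an independent 1-in-2**32 event. The variable names are illustrative only.

```python
import math

runs = 5000
checks_per_run = 536870912            # 2**29, as quoted in the post
total_checks = runs * checks_per_run

# False-hit counts quoted in the post.
observed = {"offset": 579, "0xdeadbeef": 665}

# Observed odds: one false hit per this many checks (integer division).
odds = {name: total_checks // hits for name, hits in observed.items()}

# Assumption: each check independently has a 1-in-2**32 chance of a false hit,
# so the hit count is approximately Poisson with this mean.
expected_hits = total_checks / 2**32
sd = math.sqrt(expected_hits)         # Poisson standard deviation
z = {name: (hits - expected_hits) / sd for name, hits in observed.items()}

for name in observed:
    print(f"{name:>10}: 1 in {odds[name]}, z = {z[name]:+.2f}")
print(f"expected false hits: {expected_hits:.0f} +/- {sd:.0f}")
```

Under that assumption the expected count is 625 with a standard deviation of 25, so 579 and 665 sit roughly 1.8 and 1.6 standard deviations either side of it, which fits the view that the difference is not large enough to draw conclusions from.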

      I've started another (low priority) run with some sanity-check code enabled that counts the occurrences of each random value seen. The extra code means it runs much more slowly, and the limits of my physical memory mean I've had to restrict the counts to unsigned bytes. But by outputting when those counts roll over, it should give a clear indication of whether all values are being generated, as 96% of them should roll over within a few dozen runs of each other. If my calculations are correct, I should see the bulk of them at around the 1024-run mark.

      All of which goes to reinforce my long standing observation that -- for me -- statistics is the second most unintuitive thing -- after quantum mechanics -- that I know just-enough-to-be-dangerous about.

      At least with QM I'm in good company when it comes to finding it spooky :)



        Definitely unintuitive! Isn't quantum mechanics just statistics with extra difficulty and a physics PhD? ;)

        At least with QM I'm in good company when it comes to finding it spooky :)

        I resemble that remark!

        -QM
        --
        Quantum Mechanics: The dreams stuff is made of

      build yourself a small monte carlo simulation

      Well, it's running, but it'll need to run for a while to be statistically valid.

      In the meantime are you saying that the fact that the value at any given offset has to match both the value and the offset has no influence upon the chances of a false positive?

      The UK national lottery picks 6 balls from 49: 49!/(6!*(49-6)!) = 1:13,983,816 chance.

      And once all 6 balls are out of the machine; they reorder them by ascending value; so the result is always shown as B1 < B2 < B3 < B4 < B5 < B6.

      But if players had to match both the numbers and their draw order, it would be a lot harder. The odds would be 6! * 13,983,816 = 1:10,068,347,520.

      So, value and position: highly increased odds; but you're saying that's not a factor here?
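The lottery figures quoted above can be reproduced with Python's standard `math` module. This is only an arithmetic check of the two numbers in the post; it says nothing about whether the analogy carries over to the memory-corruption case.

```python
import math

# Unordered draw: choose 6 balls from 49, order ignored.
combos = math.comb(49, 6)

# Ordered draw: the same 6 balls must also match the draw order,
# so each combination splits into 6! orderings.
ordered = math.factorial(6) * combos

# Cross-check: this is the number of 6-permutations of 49 items.
assert ordered == math.perm(49, 6)

print(f"numbers only:       1 in {combos:,}")
print(f"numbers and order:  1 in {ordered:,}")
```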



        As I understand it, the two events are completely independent. At the time the corruption occurs there is a 100% chance that each cell has a fixed value, so it's just not relevant how that value was arrived at.
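The independence argument can be tried on a scaled-down model. The sketch below is not the simulation run in this thread; it is a hypothetical Python version that assumes an 8-bit word instead of 32 bits so that false hits are frequent enough to count quickly, and an arbitrary `MAGIC` byte standing in for 0xdeadbeef. Each trial picks a slot and a uniformly random value found in it, then asks whether that value would be mistaken for the expected one under each scheme.

```python
import random

BITS = 8                      # scaled-down word size (the thread uses 32 bits)
SPACE = 1 << BITS             # number of possible values and of offsets
MAGIC = 0xAD                  # arbitrary stand-in for the fixed magic number
TRIALS = 200_000

def simulate(trials, seed=1):
    """Count false hits for the offset scheme and the fixed-magic scheme."""
    rng = random.Random(seed)
    offset_hits = magic_hits = 0
    for _ in range(trials):
        offset = rng.randrange(SPACE)   # which slot is being checked
        value = rng.randrange(SPACE)    # random garbage found in that slot
        if value == offset:             # scheme 1: value must equal its offset
            offset_hits += 1
        if value == MAGIC:              # scheme 2: value must equal the magic
            magic_hits += 1
    return offset_hits, magic_hits

offset_hits, magic_hits = simulate(TRIALS)
expected = TRIALS / SPACE
print(f"expected about {expected:.0f} false hits per scheme")
print(f"offset scheme: {offset_hits}, fixed magic: {magic_hits}")
```

In this model both counts should land near `TRIALS / SPACE`, because for any given slot exactly one of the `SPACE` possible values counts as a hit, whether that value is the slot's own offset or the shared magic number.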
