
### Re: [OT] Statistics question.

by moritz (Cardinal) on Jan 30, 2013 at 09:18 UTC (#1016005)

in reply to [OT] Statistics question.

I'll make a small simplification so that I can use a much simpler model: I assume that we have one list (duplicates allowed) and one set (no duplicates allowed).

Then for each member of the list, the probability of having a match in the set is P(1) = 1e6/2**32.

Since we've assumed a list, all the probabilities of having matches are independent, and the expectation value is simply 1e6 * P(1) = 1e6 * 1e6/2**32 = 232.83.
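The arithmetic above can be checked in a few lines. This is only a sketch of the simplified list-versus-set model; the function name `expected_matches` and its parameters are illustrative, not something from the original question.

```python
def expected_matches(n_list, n_set, space):
    """Return (p, expected count) for the simplified list-vs-set model.

    p is the chance that one uniformly drawn list element lands in a
    duplicate-free set of n_set values out of `space` possible values.
    """
    p = n_set / space
    # Expectation is linear, so the expected count is simply n_list * p.
    return p, n_list * p


if __name__ == "__main__":
    p, mean = expected_matches(10**6, 10**6, 2**32)
    print(f"P(1) = {p:.6e}")
    print(f"expected matches = {mean:.2f}")  # 232.83
```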

If the number of matches follows a Poisson distribution (and I suspect it does, in this example), then the standard deviation is simply the square root of the expectation value, so about 15.3.

It is hard for me to estimate how big an error I've made by this simplification; I'll update the node if I get an idea of how to estimate it.

Re^2: [OT] Statistics question.
by BrowserUk (Pope) on Jan 30, 2013 at 11:55 UTC
the expectation value is simply 1e6 * P(1) = 1e6 * 1e6/2**32 = 232.83 ... the standard deviation is simply the square root of the expectation value, so about 15.3.

Based upon a run of 100 samples, that seems to match quite nicely:

```
C:\test>bitvec2 -N=100
100
Mean: 230.95 stddev:14.75
```

I'm doing a run of 1000 samples, and if that shows no surprises, I'll be taking your figures as read and basing my testing upon it.

Thank you.

Update: The 1000 sample run marries well (good enough):

```
C:\test>bitvec2 -N=1000
1000
Mean: 233.39 stddev:15.79
```
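The source of `bitvec2` isn't shown in this thread, so the following is not that program. It is a hypothetical Monte Carlo sketch of the simplified model (one list with duplicates allowed, one duplicate-free set), for comparing a sample mean and standard deviation against the Poisson prediction. All names here are made up for illustration, and the `scale` factor is an added assumption: it shrinks both collections and the value space so that the expected count stays near 232.83 while the script finishes quickly.

```python
import math
import random
import statistics


def count_matches(n_list, n_set, space, rng):
    """One trial: count list draws (duplicates allowed) that hit a random set."""
    # Duplicate-free set of n_set values drawn from range(space).
    targets = set(rng.sample(range(space), n_set))
    # Each list element is an independent uniform draw.
    return sum(1 for _ in range(n_list) if rng.randrange(space) in targets)


def run_trials(trials, n_list, n_set, space, seed=0):
    """Return (sample mean, sample standard deviation) of the match counts."""
    rng = random.Random(seed)
    counts = [count_matches(n_list, n_set, space, rng) for _ in range(trials)]
    return statistics.mean(counts), statistics.stdev(counts)


if __name__ == "__main__":
    # Assumption: shrink both collections by `scale` and the value space by
    # scale**2, which keeps the expected count near 232.83 but runs quickly.
    scale = 20
    n = 10**6 // scale
    space = 2**32 // scale**2
    expected = n * n / space
    mean, sd = run_trials(trials=30, n_list=n, n_set=n, space=space)
    print(f"predicted mean {expected:.2f}, Poisson stddev {math.sqrt(expected):.2f}")
    print(f"simulated mean {mean:.2f}, stddev {sd:.2f}")
```

Setting `scale = 1` gives the full-size problem (1e6 draws against 1e6 set members in a 2**32 space), at the cost of a much longer run in pure Python.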

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
