in reply to Re^2: [OT] Statistics question. in thread [OT] Statistics question.
BrowserUk:
I know what you mean. Prob & Stat are always confusing to me, and I always have to slog through some reading before I feel comfortable with writing anything about it.
I found how to compute the standard deviation, but either I'm doing it wrong (most likely), or it's pretty useless because of loss of precision. For the expected value, we subtract one huge number from another and magically get the result. The same seems to happen with the standard deviation, but the numbers are *much* larger, so for interesting problem sizes the interesting bits get pushed off the right end of the float.
I don't have anything useful for standard deviation yet...
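The precision loss described above is easy to reproduce in isolation. Here is a minimal sketch in Python; `big` and `small` are made-up values chosen only to mimic the magnitudes mentioned in this thread, not numbers from the actual calculation:

```python
from fractions import Fraction

# Hypothetical magnitudes: one huge term plus a small "interesting" part.
big = 1.6e30
small = 1000.0

# Near 1.6e30, neighbouring doubles are roughly 2.8e14 apart, so adding
# 1000 changes nothing and the subtraction returns 0.0.
float_diff = (big + small) - big

# The same arithmetic with exact rationals keeps the small part.
exact_diff = (Fraction(big) + Fraction(small)) - Fraction(big)

print(float_diff)   # 0.0
print(exact_diff)   # 1000
```

Exact rationals are far too slow for real problem sizes; they serve here only as a reference to show what the float version throws away.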
Update: I like the idea of knowing the standard deviation, though, as it may make it possible to estimate the number of "positive matches" from your document & bloom filter project. ("Hmmm, we expect 200 false positives with a standard deviation of 20, but we're seeing 400, so we've probably got about 180 to 220ish matches.")
...roboticus
When your only tool is a hammer, all problems look like your thumb.
Re^4: [OT] Statistics question. by BillKSmith (Curate) on Jan 30, 2013 at 15:25 UTC 
Are your *much* larger numbers perfect squares? If so, you can avoid the problem by factoring their difference into the product of the sum and the difference of their square roots: a^2 - b^2 = (a + b)(a - b).
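The factoring idea rests on the identity a^2 - b^2 = (a + b)(a - b), which avoids forming the two huge squares at all. A small sketch in Python, assuming the square roots are available directly; the inputs `a` and `b` are hypothetical and picked so that every intermediate value in the factored form is exactly representable:

```python
# Hypothetical inputs: every integer below 2**53 is an exact double,
# so a, b, a + b and a - b carry no rounding error here.
a = 1e15 + 1.0
b = 1e15

naive = a * a - b * b           # both squares are rounded before subtracting
factored = (a + b) * (a - b)    # no huge intermediate values

# Arbitrary-precision integer arithmetic as the reference answer.
exact = int(a) ** 2 - int(b) ** 2

print(exact)           # 2000000000000001
print(int(factored))   # 2000000000000001
print(int(naive))      # differs from the exact value
```

The factored form only helps when the two large terms really are squares (or can otherwise be factored); it does nothing for arbitrary large values.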

Bill:
I was getting ginormous values like 1.6...E30 and the like. Since I'm using exp(...) to generate them, I never have nice integers to play with, so I doubt they form perfect squares or such. But I'm not current on things like that, so I could be wrong.
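When the huge values come from exp(), a different rearrangement is sometimes possible: exp(a) - exp(b) = exp(b) * (exp(a - b) - 1), with the last factor computed by `expm1`. Whether it applies depends on the actual formula, which isn't shown in this thread, so the following Python sketch uses hypothetical exponents and is only an illustration of the general trick:

```python
import math
from decimal import Decimal, getcontext

def exp_diff_naive(a, b):
    """exp(a) - exp(b), computed directly."""
    return math.exp(a) - math.exp(b)

def exp_diff_factored(a, b):
    """Same quantity via exp(a) - exp(b) = exp(b) * (exp(a - b) - 1)."""
    return math.exp(b) * math.expm1(a - b)

def exp_diff_reference(a, b):
    """High-precision reference using 60 significant decimal digits."""
    getcontext().prec = 60
    return float(Decimal(a).exp() - Decimal(b).exp())

# Hypothetical exponents: exp(69.5) is roughly 1.5e30, and a is only
# slightly larger than b, so the two exponentials nearly cancel.
b = 69.5
a = b + 2.0 ** -40

ref = exp_diff_reference(a, b)
err_naive = abs(exp_diff_naive(a, b) - ref) / ref
err_factored = abs(exp_diff_factored(a, b) - ref) / ref

print(f"naive relative error:    {err_naive:.1e}")
print(f"factored relative error: {err_factored:.1e}")
```

The factored version subtracts the exponents, which are small, instead of the exponentials, which are not, so the cancellation happens before anything gets large.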
...roboticus

Re^4: [OT] Statistics question. by BrowserUk (Pope) on Jan 30, 2013 at 14:42 UTC 
Update: I like the idea of knowing the standard deviation, ...
For this problem (testing a sparse bitvector implementation), moritz' simplified calculation appears to be 'good enough' for my purpose. I'm not capable of assessing how applicable it would be to the math you came up with for my multivector hashing (i.e. weird bloom filter) project.
However, there is one way that these two projects may become connected. If my sparse bitvector implementation proves to be sufficiently speedy, I may recode the multivector hashing algorithm to use it because: a) it would greatly reduce the memory requirement; b) it would open up the possibility of increasing the discrimination by using more than 10 vectors. (But that's a project for another day :)
With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
