http://www.perlmonks.org?node_id=477227


in reply to Re: standard deviation accuracy question
in thread standard deviation accuracy question

Just to get this info in PM for future lookups in case that link dies:


1. The binomial standard deviation

The binomial standard deviation applies to events with two outcomes: win or lose. For example, betting on heads in coin tossing can lead to a win (the appearance of heads) or a loss (the appearance of the opposite; tails, in this case). The binomial standard deviation is calculated by the following formula: Standard deviation = SQR{N*p*(1-p)}

That is, the square root of the number of trials N, multiplied by the probability of success p, multiplied by the probability of failure (1 minus p).


Suppose we toss a coin 100 times (N=100). The probability of heads is p = 1/2 = 0.5. The standard deviation is SQR{100 * 0.5 * 0.5} = SQR(100 * 0.25) = SQR(25) = 5. The expected number of heads in 100 tosses is 0.5 * 100 = 50. The rule of normal probability says that in about 68.2% of cases, the number of heads will fall within one standard deviation of the expected number of successes (50). That is, if we repeat the event of tossing a coin 100 times 1000 times over, in roughly 682 cases we'll encounter a number of heads between 45 and 55. (This is an approximation: head counts are whole numbers, so the exact share depends on whether 45 and 55 themselves are counted.)
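For anyone who wants to check the coin-toss numbers, here is a minimal Python sketch. It is not from the quoted article; the function names binomial_sd and prob_between are made up for illustration. It computes the standard deviation from the formula, the normal-curve share within one standard deviation, and the exact binomial probability of 45 to 55 heads inclusive. The exact figure should come out somewhat higher than the normal-curve figure, because counts are discrete and both endpoints are included.

```python
import math

def binomial_sd(n, p):
    """Standard deviation of the number of successes in n trials: SQR{N*p*(1-p)}."""
    return math.sqrt(n * p * (1 - p))

def prob_between(n, p, lo, hi):
    """Exact binomial probability that lo <= successes <= hi (inclusive)."""
    return sum(
        math.comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(lo, hi + 1)
    )

n, p = 100, 0.5
sd = binomial_sd(n, p)                   # 5.0
mean = n * p                             # 50.0

# Share of a normal curve lying within one standard deviation of the mean.
normal_share = math.erf(1 / math.sqrt(2))

# Exact probability of seeing 45..55 heads in 100 tosses.
exact_share = prob_between(n, p, 45, 55)

print(f"sd={sd}, mean={mean}")
print(f"normal approximation: {normal_share:.4f}")
print(f"exact binomial 45..55: {exact_share:.4f}")
```

math.comb needs Python 3.8 or later.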


2. The statistical standard deviation

There is no formula to calculate the statistical standard deviation directly (?) That's what they told you in school. That's what they say in other public places with the self-proclaimed goal of education. Only an algorithm can lead to the standard deviation of a data series. Indeed, the algorithm is always available. The following are the steps of the algorithm implemented in my freeware Super Formula. Sum up the data; calculate the mean average (sum total divided by the number of elements); subtract the mean from each element of the collection; raise each difference to the power of 2; add up the squared differences; divide the new sum total by the number of elements in the data series; the result represents the variance; the square root of the variance represents the famous standard deviation. (Dividing by the number of elements gives the population variance; the sample variance divides by the number of elements minus 1 instead.)


A data series like 1, 2, 3, 6 has a mean average (mu) equal to: μ = (1+2+3+6)/4 = 12/4 = 3.

The differences from the mean are: -2, -1, 0, +3. The variance (sigma squared) is the measurement of the squared deviations. The variance is calculated as: σ^2 = {(-2)^2 + (-1)^2 + 0^2 + 3^2}/4 = 14/4 = 3.5. Finally, the standard deviation (sigma) is equal to the positive square root of the variance: σ = SQR(3.5) ≈ 1.87.
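The steps of the algorithm can be sketched in a few lines of Python. This is only an illustration, not the code of the Super Formula software mentioned above, and the function name population_sd is hypothetical. It divides by the number of elements, as the text does, so it gives the population standard deviation; the standard library's statistics.pstdev does the same, while statistics.stdev divides by N-1.

```python
import math
import statistics

def population_sd(data):
    """Return (mean, variance, standard deviation), dividing by N."""
    n = len(data)
    mean = sum(data) / n                           # mean average
    squared_diffs = [(x - mean) ** 2 for x in data]  # squared deviations
    variance = sum(squared_diffs) / n              # population variance
    return mean, variance, math.sqrt(variance)

data = [1, 2, 3, 6]
mean, variance, sd = population_sd(data)
print(mean, variance, round(sd, 2))   # 3.0 3.5 1.87

# Cross-check against the standard library (population version).
assert math.isclose(sd, statistics.pstdev(data))

# The sample version (divide by N-1) is larger for the same data.
print(round(statistics.stdev(data), 2))
```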


Nevertheless, there are formulae (plural, indeed) to calculate the statistical deviation in advance. There is a dominant deviation parameter in all the stochastic (probabilistic) events. In fact, all events are stochastic, since randomness is present in everything-there-is. Nothing-there-is can exist with absolute certainty (see the mathematics of the absurdity of absolute certainty: www.saliu.com/formula.htm). The elements of a stochastic phenomenon deviate from one another following mathematical rules. The difference is in the probability of the event (phenomenon). The probability then determines subsequent parameters, such as volatility, standard deviation, FFG deviation, etc.


In 2003 I announced that I had discovered a formula for a very important measure in the fluctuation of probability events: FFG deviation. See “New pairing research”. Soon thereafter I was bombarded with requests to present the formula for FFG deviation and the statistical standard deviation. Of course, I was also asked to release free software to accompany the formulae calculations. The requests were made in private and in public forums, sometimes in strongly worded terms.


At this time, I do not publish the formulae to calculate the FFG deviation and the statistical standard deviation. Such an act would serve people I do not want to serve. They belong to the following categories: gambling developers and high rollers; lottery systems and software developers; stock traders. I know exactly whom I am talking about. I have received many a message from them. They inundated my house with correspondence, including postal mail. They would be the ones who would make serious money from my effort. The vast majority of people do not really need to know all the formulas involved in standard deviation calculations. Suffice it to say that my software does incorporate standard deviation calculations. Also, the greatest random number/combination generator, IonSaliuGenerator, makes extraordinarily good use of the standard deviation.



-Waswas