http://www.perlmonks.org?node_id=466322

Commander Salamander has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone, I'm rather new to perl, and have been trying to impelement a calculation of hypergeometric distribution probabilities into some code. For those of you that aren't familiar, these statistics address the following type of question:

"If I have a box with 300 white balls and 700 black balls and I draw 100 balls from the box, what are the odds that I draw 40 or more white balls".

I've encountered two (related) problems:

1 These calculations require very large numbers, apparently necessitating the use of a module such as BigInt or BigFloat (though I don't fully understand at what size number these modules become necessary). By playing with someone else's publicly available code and clumsily implementing this functionality, I was able to get things to work... however:

2 Especially when I'm using very large samples, the process of calculating the probability of is excruciatingly slow due to the large number of factorial calculations necessary.

I would greatly appreciate advice on which "big number" module is the most efficient for factorials, and whether there is any conventional wisdom as to how I could best tackle this problem.

Thanks!
-ollie-