Problems? Is your data what you think it is? PerlMonks

### Multinomial Distribution Test in Perl

 on Aug 08, 2013 at 15:17 UTC Need Help??
ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Hey Monks,

I am trying to do a multinomial test in Perl --> http://en.wikipedia.org/wiki/Multinomial_distribution.

First I was wondering if anyone knows a good stats package for this kind of thing

If not ( or in addition) if anyone knows of a good way to deal with factorials (which seem to be the main issue). Calculating 35,000! isn't really ideal. Any recommendations on trick's to deal with large factorials?

The formula is (n!/(Xsub1!.....Xsubk!)) *p1^(x1)....pk^(xk)

Replies are listed 'Best First'.
Re: Multinomial Distribution Test in Perl
by Loops (Curate) on Aug 08, 2013 at 17:05 UTC

In my limited testing Math::BigInt is two orders of magnitude faster than the next module I tried. At least it was on my crummy ole laptop. Just make sure to install the multicore module Math::BigInt::GMP as well which it uses to good advantage.

```# Runtime:  0.15s user 0.02s system 41% cpu 0.418 total
use Math::BigInt lib => 'GMP';
my \$b = Math::BigInt->new(35000);
print \$b->bfac(),"\n";
Re: Multinomial Distribution Test in Perl
by Laurent_R (Abbot) on Aug 08, 2013 at 17:12 UTC

I guess you'll need to simplify the fraction in the first term before doing the calculations, as it is often the case in probability calculations. This makes the calculations much faster and more accurate.

I don't remember enough about multinomial distributions, but let me take a simple example: computing the number of permutations of 5 items in a population of 1000. The formula is: n!/(n-k)! or, in the example, 1000!/995! . You don't want to calculate 1000! (a number with 2570 digits) and then calculate 995! (only 2552 digits...) and then proceed with the division between the two huge numbers, whereas you really need only: 996 * 997 * 998 * 999 * 1000. In Perl, this could be expressed simply as follows:

```my \$result = 1;
\$result *= \$n-- while \$k--;

Some form of similar simplication should be possible in your case.

Otherwise, there are on the CPAN many probability and statistics modules, including some for calculating scores of different distributions.

Update 17:35 UTC: added the permutation example.
Re: Multinomial Distribution Test in Perl
by McA (Priest) on Aug 08, 2013 at 15:32 UTC

Probably an entry point

Best regards
McA

Re: Multinomial Distribution Test in Perl
by pachydermic (Beadle) on Aug 08, 2013 at 17:16 UTC

Do you really have to use perl for this? I mean, I love the language and everything, but there are much better ways to do this kind of thing. Like:

http://stat.ethz.ch/R-manual/R-patched/library/stats/html/Multinom.html

or: http://www.mathworks.com/help/stats/mnrnd.html

I don't know your situation, but I don't think mathematics and statistics is really perl's strong point. Maybe it makes sense to use something designed for that?

Of course Perl is per(l)fect for this task: I regularly use Perl to write R-scripts.

Or have a look at the following modules: R::YapRI, R::Writer or Statistics::R.

CountZero

A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

My blog: Imperial Deltronics

Create A New User
Node Status?
node history
Node Type: perlquestion [id://1048588]
Front-paged by Corion
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others having an uproarious good time at the Monastery: (4)
As of 2017-06-29 10:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
How many monitors do you use while coding?

Results (659 votes). Check out past polls.