Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method

by monkfan (Curate)
on Mar 23, 2005 at 10:03 UTC ( #441728=perlquestion: print w/ replies, xml ) Need Help??
monkfan has asked for the wisdom of the Perl Monks concerning the following question:

I am attempting to compute a binomial distribution based on formulation here.

Basically binomial distributions return the probability of an event occuring $k times. In $n attempts, where the probability of it occuring in a single attempt is $p.

I tried two methods that should give the identical result. One subroutine using Math::Pari via its 'binomial' function plus logarithmic function and the other one using brute combinatorial method. In the end it's the log-Math::Pari method which I intended to use, since it is able to handle large number.

But however the result given by log-Math::Pari function is different from the correct combinatorial Method.
Sub Binom Comb = 0.3125 #This is correct Sub Binom Log = 0.06868298952623189587 #wrong
What's wrong with my log-Math::Pari subroutine? It seems to me I have constructed log version mathematically in a right way. Or have I used the Math::Pari function wrongly?
#!/usr/bin/perl -w use strict; use Math::Pari qw(binomial); my $n = 6; my $k = 3; my $p = 0.5; my $sub_binom_comb = binomial_comb($k,$n,$p); my $sub_binom_log = binomial_log($k,$n,$p); print "Sub Binom Comb = $sub_binom_comb\n"; print "Sub Binom Log = $sub_binom_log\n"; #----My Subroutines ------------ sub binomial_log{ #Find binomial distributions using Log #and Math::Pari my ($k, $n, $p) = @_; my $first = log(binomial($n,$k)); #The above binomial function from Math::Pari my $second = $k * log($p); my $third = ($n-$k) * log (1-$p); my $log_Prob = $first + $second + $third; my $Prob = 10 ** ($log_Prob); return $Prob; } sub binomial_comb { #With combinatorial method #using brute factorial my ($x, $n, $p) = @_; return unless $x >= 0 && $x == int $x && $n > 0 && $n == int $n && $p > 0 && $p <1; return choose($n,$k) * ($p ** $x) * ((1-$p) ** ($n-$x)); } sub choose { my ($n,$k) = @_; my ($result,$j) = (1,1); return 0 if $k>$n||$k<0; $k = ($n - $k) if ($n - $k) <$k; while ($j <= $k ) { $result *= $n--; $result /= $j++; } return $result; }

Comment on Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
Select or Download Code
Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by tlm (Prior) on Mar 23, 2005 at 11:16 UTC
    Perl's log is natural log. Change
    my $Prob = 10 ** ($log_Prob);
    my $Prob = exp $log_Prob;

    the lowliest monk

Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by polettix (Vicar) on Mar 23, 2005 at 11:51 UTC
    As a side note to the previous comment, perldoc -f log also tells you that you have to implement the log10 by yourself if you ever need it:
    sub log10 { my $n = shift; return log($n)/log(10); }
    This isn't really useful in your case, anyway.


    -- Don't fool yourself.
Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by tall_man (Parson) on Mar 23, 2005 at 16:22 UTC
    You could get more accuracy from the Math::Pari version if you did the powers and multiplies on the Pari side and didn't use logs:
    sub binomial_pari { my ($k, $n, $p) = @_; my $first = binomial($n,$k); # gpui is imported from Math::Pari. It is x ** y. my $second = gpui($p,$k); my $third = gpui(1.0 - $p, $n - $k); my $Prob = $first * $second * $third; return $Prob; }
    Here are the different results I get:
    Sub Binom Comb = 0.3125 Sub Binom Log = 0.3125000000000001128 Sub Binom Pari = 0.3125000000000000000
      When speed matters (apart from ability to deal with *BIG* number), the performance are:
      Rate binomial_log binomial_pari binomial_comb binomial_log 6164/s -- -61% -87% binomial_pari 15709/s 155% -- -66% binomial_comb 45985/s 646% 193% --
        If you just want an approximation, then performance may be better still if you use the fact that from Stirling's formula the log of n! is approximately
        log(n**n * exp(-n) * sqrt(2*PI*n)) = n*log(n) - n + (log(2 + log(PI) + log(n))/2
        Plug that into the fact that n choose m is n!/(m!*(n-m)!) and you can come up with a good approximation that uses Perl's native arithmetic.

        This approximation may well turn out to be the fastest approach for large numbers.

        Update: From Mathworld I just found out that a significantly better approximation of n! is sqrt(2*n*PI + 1/3)*n**n/exp(n). Algebraically this is less convenient, but the improved accuracy may matter for whatever you're trying to do.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://441728]
Approved by davidj
Front-paged by metadoktor
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others exploiting the Monastery: (15)
As of 2014-07-14 14:43 GMT
Find Nodes?
    Voting Booth?

    When choosing user names for websites, I prefer to use:

    Results (262 votes), past polls