Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method

by monkfan (Curate)
on Mar 23, 2005 at 10:03 UTC ( #441728=perlquestion: print w/ replies, xml ) Need Help??
monkfan has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I am attempting to compute a binomial distribution based on formulation here.

Basically binomial distributions return the probability of an event occuring $k times. In $n attempts, where the probability of it occuring in a single attempt is $p.

I tried two methods that should give the identical result. One subroutine using Math::Pari via its 'binomial' function plus logarithmic function and the other one using brute combinatorial method. In the end it's the log-Math::Pari method which I intended to use, since it is able to handle large number.

But however the result given by log-Math::Pari function is different from the correct combinatorial Method.
Sub Binom Comb = 0.3125 #This is correct Sub Binom Log = 0.06868298952623189587 #wrong
What's wrong with my log-Math::Pari subroutine? It seems to me I have constructed log version mathematically in a right way. Or have I used the Math::Pari function wrongly?
#!/usr/bin/perl -w use strict; use Math::Pari qw(binomial); my $n = 6; my $k = 3; my $p = 0.5; my $sub_binom_comb = binomial_comb($k,$n,$p); my $sub_binom_log = binomial_log($k,$n,$p); print "Sub Binom Comb = $sub_binom_comb\n"; print "Sub Binom Log = $sub_binom_log\n"; #----My Subroutines ------------ sub binomial_log{ #Find binomial distributions using Log #and Math::Pari my ($k, $n, $p) = @_; my $first = log(binomial($n,$k)); #The above binomial function from Math::Pari my $second = $k * log($p); my $third = ($n-$k) * log (1-$p); my $log_Prob = $first + $second + $third; my $Prob = 10 ** ($log_Prob); return $Prob; } sub binomial_comb { #With combinatorial method #using brute factorial my ($x, $n, $p) = @_; return unless $x >= 0 && $x == int $x && $n > 0 && $n == int $n && $p > 0 && $p <1; return choose($n,$k) * ($p ** $x) * ((1-$p) ** ($n-$x)); } sub choose { my ($n,$k) = @_; my ($result,$j) = (1,1); return 0 if $k>$n||$k<0; $k = ($n - $k) if ($n - $k) <$k; while ($j <= $k ) { $result *= $n--; $result /= $j++; } return $result; }
Regards,
Edward

Comment on Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
Select or Download Code
Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by tlm (Prior) on Mar 23, 2005 at 11:16 UTC
    Perl's log is natural log. Change
    my $Prob = 10 ** ($log_Prob);
    to
    my $Prob = exp $log_Prob;

    the lowliest monk

Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by polettix (Vicar) on Mar 23, 2005 at 11:51 UTC
    As a side note to the previous comment, perldoc -f log also tells you that you have to implement the log10 by yourself if you ever need it:
    sub log10 { my $n = shift; return log($n)/log(10); }
    This isn't really useful in your case, anyway.

    Flavio

    -- Don't fool yourself.
Re: Getting Binomial Distribution under Math::Pari (log) and Combinatorial Method
by tall_man (Parson) on Mar 23, 2005 at 16:22 UTC
    You could get more accuracy from the Math::Pari version if you did the powers and multiplies on the Pari side and didn't use logs:
    sub binomial_pari { my ($k, $n, $p) = @_; my $first = binomial($n,$k); # gpui is imported from Math::Pari. It is x ** y. my $second = gpui($p,$k); my $third = gpui(1.0 - $p, $n - $k); my $Prob = $first * $second * $third; return $Prob; }
    Here are the different results I get:
    Sub Binom Comb = 0.3125 Sub Binom Log = 0.3125000000000001128 Sub Binom Pari = 0.3125000000000000000
      When speed matters (apart from ability to deal with *BIG* number), the performance are:
      Rate binomial_log binomial_pari binomial_comb binomial_log 6164/s -- -61% -87% binomial_pari 15709/s 155% -- -66% binomial_comb 45985/s 646% 193% --
      Regards,
      Edward
        If you just want an approximation, then performance may be better still if you use the fact that from Stirling's formula the log of n! is approximately
        log(n**n * exp(-n) * sqrt(2*PI*n)) = n*log(n) - n + (log(2 + log(PI) + log(n))/2
        Plug that into the fact that n choose m is n!/(m!*(n-m)!) and you can come up with a good approximation that uses Perl's native arithmetic.

        This approximation may well turn out to be the fastest approach for large numbers.

        Update: From Mathworld I just found out that a significantly better approximation of n! is sqrt(2*n*PI + 1/3)*n**n/exp(n). Algebraically this is less convenient, but the improved accuracy may matter for whatever you're trying to do.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://441728]
Approved by davidj
Front-paged by metadoktor
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others taking refuge in the Monastery: (5)
As of 2015-07-04 22:54 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (60 votes), past polls