Kullback–Leibler divergence Module?

by ZWcarp (Beadle)
on May 16, 2012 at 15:17 UTC (#970857)
ZWcarp has asked for the wisdom of the Perl Monks concerning the following question:

Is anyone aware of a Perl implementation or module for the (symmetrized) Kullback–Leibler divergence? I basically coded one myself that takes in a file and handles all the variables, but it's really not very efficient. Thanks for your time.

Replies are listed 'Best First'.
Re: Kullback–Leibler divergence Module?
by kennethk (Abbot) on May 16, 2012 at 15:43 UTC

    A quick search on CPAN yielded no useful results, though searching for "Kullback Leibler Perl" turned up a few.

    If you've already gone through the trouble of coding, it might be worthwhile to post the code here for some aid in optimization. Have you run it through Devel::NYTProf to see where your bottlenecks are?

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Kullback–Leibler divergence Module?
by snape (Pilgrim) on May 16, 2012 at 19:19 UTC

    You could use a shell script that calls Perl to do the computations and then calls R. R has an FNN package with a KL.dist function, which you can use for the K-L distance.

    Update: I have written a KL divergence calculation for discrete probability distributions. It is very naive but might be useful.

    #!/usr/bin/perl
    use strict;
    use warnings;

    ## For discrete probability distributions.
    ## Assumes the probability values in each array sum to 1.
    my $dist  = 0;
    my @terms = ('a', 'b', 'c', 'd');
    my @P     = (0,   0.3, 0.4, 0.3);
    my @Q     = (0.2, 0.2, 0.3, 0.3);

    # Either array differing in size from @terms is an error.
    if (@P != @terms || @Q != @terms) {
        print "The arrays must be the same size\n";
        exit;
    }

    for my $i (0 .. $#P) {
        # Skip terms where either probability is zero. A zero in P contributes
        # nothing; a zero in Q with nonzero P is strictly infinite, so skipping
        # it is a simplification.
        next if $P[$i] == 0 || $Q[$i] == 0;
        $dist += $P[$i] * log($P[$i] / $Q[$i]);
    }

    print "The Kullback-Leibler divergence D(P||Q) for the discrete distributions is: $dist\n";
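
The code above computes only the one-directional divergence D(P||Q), while the original question asked for a symmetrized version. A minimal sketch of one way to do that follows. The subroutine names `kl_divergence` and `symmetric_kl` are made up for illustration, the example distributions are arbitrary, and the symmetrization is assumed to be the plain sum D(P||Q) + D(Q||P) (some definitions halve it). Both inputs are assumed to be equal-length array references whose values sum to 1.

```perl
use strict;
use warnings;

# One-directional KL divergence D(P||Q) for two discrete distributions
# given as array references of equal length, each assumed to sum to 1.
sub kl_divergence {
    my ($p, $q) = @_;
    die "Distributions must have the same length\n" unless @$p == @$q;

    my $sum = 0;
    for my $i (0 .. $#$p) {
        next if $p->[$i] == 0;            # 0 * log(0/q) is taken as 0
        return 9**9**9 if $q->[$i] == 0;  # p > 0 where q == 0: infinite
        $sum += $p->[$i] * log($p->[$i] / $q->[$i]);
    }
    return $sum;
}

# Symmetrized divergence, assumed here to be D(P||Q) + D(Q||P).
sub symmetric_kl {
    my ($p, $q) = @_;
    return kl_divergence($p, $q) + kl_divergence($q, $p);
}

# Arbitrary example distributions (illustrative values only).
my @P = (0.1, 0.3, 0.4, 0.2);
my @Q = (0.2, 0.2, 0.3, 0.3);
printf "Symmetrized KL divergence: %.6f\n", symmetric_kl(\@P, \@Q);
```

Unlike the script above, this sketch returns infinity rather than silently skipping a term when Q is zero where P is not, so it may be worth smoothing the distributions first if zeros are expected in the input.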
