Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW

Fisher's Exact test (probability of 2x2 tables)

by jeroenes (Priest)
on Oct 03, 2001 at 12:19 UTC ( #116378=sourcecode: print w/replies, xml ) Need Help??
Category: Misc
Author/Contact Info Jeroen Elassaiss-Schaap, just /msg me at jeroenes

Authors of the original code: Kristopher J. Preacher and Nancy E. B

Description: This is a plain translation of this page into perl, kept close to the original code.
package Statistics::Fishersexact;

use strict;
use warnings;

use constant DECIMALS=>8;

sub round{
  my $frac= shift; my $dec = shift;

sub fac {
  my $z = shift;
  if ($z==0) { return 1; }
  if ($z>1) { return $z*fac($z-1); }
  else { return $z; }

sub sum {
  my $sum = 0;
  $sum += $_ for @_;

sub p_fisher {
  my $f11=shift;
  my $f12=shift;
  my $f21=shift;
  my $f22=shift;
  my $row1 = sum($f11,$f12);
  my $row2 = sum($f21,$f22);
  my $col1 = sum($f11,$f21);
  my $col2 = sum($f12,$f22);
  my $r1=sum($f11,$f12);
  my $r2=sum($f21,$f22);
  my $c1=sum($f11,$f21);
  my $c2=sum($f12,$f22);
  my $total = sum($r1,$r2);
  my ($a,$b,$c,$d,$z,$temp,$p,$ratio,$p1,$aaron,$flag);
  if ($r1<=$r2) { if ($c1<=$c2) { $a=$f11;$b=$f12;$c=$f21;$d=$f22; } }
  if ($r1<=$r2) { if ($c2<=$c1) { $a=$f12,$b=$f22;$c=$f11;$d=$f21; } }
  if ($r2<=$r1) { if ($c1<=$c2) { $a=$f21;$b=$f11;$c=$f22;$d=$f12; } }
  if ($r2<=$r1) { if ($c2<=$c1) { $a=$f22;$b=$f21;$c=$f12;$d=$f11; } }
  if ($b<=$a) { $z=$c;$c=$a;$a=$b;$b=$d;$d=$z; }
  elsif ($c<=$a) { $z=$b;$b=$a;$a=$c;$c=$d;$d=$z; }
  my $i;
  my ($ab,$ac,$bd,$cd,$abcd,$x1,$x2,$x3,$x4,$y1,$y2,$y3,$y4,$ra1,$ra2,
  for ($i=0;$i<10000;$i++) {
    $y1=fac($a); $y2=fac($b); $y3=fac($c); $y4=fac($d);
    for (my $i=0;$i<2;$i++) {
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x3>$x2) { $z=$x2;$x2=$x3;$x3=$z }
      if ($x4>$x3) { $z=$x3;$x3=$x4;$x4=$z }
    for (my $i=0;$i<2;$i++) {
      if ($y2>$y1) { $z=$y1;$y1=$y2;$y2=$z }
      if ($y3>$y2) { $z=$y2;$y2=$y3;$y3=$z }
      if ($y4>$y3) { $z=$y3;$y3=$y4;$y4=$z }
    $ra1=$x1/$y1;$ra2=$x2/$y2;$ra3=$x3/$y3; $ra4=$x4/$y4;
    if ($p1==0) {$aaron=$ratio }
    $a=sum($a,-1); $b=sum($b,1); $c=sum($c,1);$d=sum($d,-1);
    if ($a<0) { last }
  my $cp1=round($p1,DECIMALS);
  if ($p1<.00000001) { if ($p1>0) { $cp1=".00000001" } }
  if ($p1>.99999999) { if ($p1!=1) { $cp1=".99999999" } }
  if ($r1<=$r2) { if ($c1<=$c2) { $a=$f11;$b=$f12;$c=$f21;$d=$f22 } }
  if ($r1<=$r2) { if ($c2<=$c1) { $a=$f12,$b=$f22;$c=$f11;$d=$f21 } }
  if ($r2<=$r1) { if ($c1<=$c2) { $a=$f21;$b=$f11;$c=$f22;$d=$f12 } }
  if ($r2<=$r1) { if ($c2<=$c1) { $a=$f22;$b=$f21;$c=$f12;$d=$f11 } }
  if ($b<=$a) { $z=$c;$c=$a;$a=$b;$b=$d;$d=$z }
  elsif ($c<=$a) { $z=$b;$b=$a;$a=$c;$c=$d;$d=$z }
  my $p2=0;$ratio=0;$flag=0;
  for (my $i=0;$i<10000;$i++) {
    if ($b<0) { last }
    if ($c<0) { last }
    for ($i=0;$i<2;$i++) {
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x3>$x2) { $z=$x2;$x2=$x3;$x3=$z }
      if ($x4>$x3) { $z=$x3;$x3=$x4;$x4=$z }
    for ($i=0;$i<2;$i++) {
      if ($y2>$y1) { $z=$y1;$y1=$y2;$y2=$z }
      if ($y3>$y2) { $z=$y2;$y2=$y3;$y3=$z }
      if ($y4>$y3) { $z=$y3;$y3=$y4;$y4=$z }
    if ($ratio<=$aaron) { $flag=1 }
    if ($flag==1) { $p2=sum($p2,$ratio) }
  my $cp2=round(sum($p1,$p2),DECIMALS);
  if ($p2<.00000001) { if ($p2>0) { $cp2=".00000001" } }
  if ($p2>.99999999) { if ($p2!=1) { $cp2=".99999999" } }
  return ($cp1,$cp2);
This is perlization of It calculat
fisher's Exact test, one and two tailed, from a 2x2 table.
($p1, $p2) = p_fisher( $cell11, $cell12, $cell21, $cell22);

Replies are listed 'Best First'.
Re: Fisher's Exact test (probability of 2x2 tables)
by Zaxo (Archbishop) on Oct 03, 2001 at 12:57 UTC

    I'd like to recommend a non-recursive &fishersexact::fac, for both speed and memory usage. Something like

    fac { my ($arg, $res) = ($_[0], 1); die "Negative argument to factorial\n" if $arg < 0; $res *= $_ for 1..$arg; return $res; }

    Additionally factorials soon overflow perl int(). An implementation in terms of Math::BigInt, Math::GMP or Math::Pari might be in order. Overflow might be abated by rewriting the formulas in terms of binomial coefficients.

    After Compline,

      If I look at the code, there is a very lot more to improve. It seems to neglect an indecent number of software design rules. Right now, I'm just happy to have kept to my promise in Re: Fisher exact - one tailed

      "We are not alone"(FZ)

Re: Fisher's Exact test (probability of 2x2 tables)
by merlyn (Sage) on Oct 03, 2001 at 18:30 UTC
    The package name should follow the rules for module package names. You need an uppercase character (lowercase are reserved for pragmas), and you should probably get a name for this module from the module naming czars before distributing it (even posting on perlmonks is a form of "distribution").

    -- Randal L. Schwartz, Perl hacker

      I obviously didn't go that official. Even worse, I have never named any module that I wrote in the official way. Will do from now one, of course.

      Thanx for the tip.

      "We are not alone"(FZ)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://116378]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (5)
As of 2016-10-23 20:37 GMT
Find Nodes?
    Voting Booth?
    How many different varieties (color, size, etc) of socks do you have in your sock drawer?

    Results (302 votes). Check out past polls.