Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot

Fisher's Exact test (probability of 2x2 tables)

by jeroenes (Priest)
on Oct 03, 2001 at 12:19 UTC ( #116378=sourcecode: print w/replies, xml ) Need Help??
Category: Misc
Author/Contact Info Jeroen Elassaiss-Schaap, just /msg me at jeroenes

Authors of the original code: Kristopher J. Preacher and Nancy E. B

Description: This is a plain translation of this page into perl, kept close to the original code.
package Statistics::Fishersexact;

use strict;
use warnings;

use constant DECIMALS=>8;

sub round{
  my $frac= shift; my $dec = shift;

sub fac {
  my $z = shift;
  if ($z==0) { return 1; }
  if ($z>1) { return $z*fac($z-1); }
  else { return $z; }

sub sum {
  my $sum = 0;
  $sum += $_ for @_;

sub p_fisher {
  my $f11=shift;
  my $f12=shift;
  my $f21=shift;
  my $f22=shift;
  my $row1 = sum($f11,$f12);
  my $row2 = sum($f21,$f22);
  my $col1 = sum($f11,$f21);
  my $col2 = sum($f12,$f22);
  my $r1=sum($f11,$f12);
  my $r2=sum($f21,$f22);
  my $c1=sum($f11,$f21);
  my $c2=sum($f12,$f22);
  my $total = sum($r1,$r2);
  my ($a,$b,$c,$d,$z,$temp,$p,$ratio,$p1,$aaron,$flag);
  if ($r1<=$r2) { if ($c1<=$c2) { $a=$f11;$b=$f12;$c=$f21;$d=$f22; } }
  if ($r1<=$r2) { if ($c2<=$c1) { $a=$f12,$b=$f22;$c=$f11;$d=$f21; } }
  if ($r2<=$r1) { if ($c1<=$c2) { $a=$f21;$b=$f11;$c=$f22;$d=$f12; } }
  if ($r2<=$r1) { if ($c2<=$c1) { $a=$f22;$b=$f21;$c=$f12;$d=$f11; } }
  if ($b<=$a) { $z=$c;$c=$a;$a=$b;$b=$d;$d=$z; }
  elsif ($c<=$a) { $z=$b;$b=$a;$a=$c;$c=$d;$d=$z; }
  my $i;
  my ($ab,$ac,$bd,$cd,$abcd,$x1,$x2,$x3,$x4,$y1,$y2,$y3,$y4,$ra1,$ra2,
  for ($i=0;$i<10000;$i++) {
    $y1=fac($a); $y2=fac($b); $y3=fac($c); $y4=fac($d);
    for (my $i=0;$i<2;$i++) {
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x3>$x2) { $z=$x2;$x2=$x3;$x3=$z }
      if ($x4>$x3) { $z=$x3;$x3=$x4;$x4=$z }
    for (my $i=0;$i<2;$i++) {
      if ($y2>$y1) { $z=$y1;$y1=$y2;$y2=$z }
      if ($y3>$y2) { $z=$y2;$y2=$y3;$y3=$z }
      if ($y4>$y3) { $z=$y3;$y3=$y4;$y4=$z }
    $ra1=$x1/$y1;$ra2=$x2/$y2;$ra3=$x3/$y3; $ra4=$x4/$y4;
    if ($p1==0) {$aaron=$ratio }
    $a=sum($a,-1); $b=sum($b,1); $c=sum($c,1);$d=sum($d,-1);
    if ($a<0) { last }
  my $cp1=round($p1,DECIMALS);
  if ($p1<.00000001) { if ($p1>0) { $cp1=".00000001" } }
  if ($p1>.99999999) { if ($p1!=1) { $cp1=".99999999" } }
  if ($r1<=$r2) { if ($c1<=$c2) { $a=$f11;$b=$f12;$c=$f21;$d=$f22 } }
  if ($r1<=$r2) { if ($c2<=$c1) { $a=$f12,$b=$f22;$c=$f11;$d=$f21 } }
  if ($r2<=$r1) { if ($c1<=$c2) { $a=$f21;$b=$f11;$c=$f22;$d=$f12 } }
  if ($r2<=$r1) { if ($c2<=$c1) { $a=$f22;$b=$f21;$c=$f12;$d=$f11 } }
  if ($b<=$a) { $z=$c;$c=$a;$a=$b;$b=$d;$d=$z }
  elsif ($c<=$a) { $z=$b;$b=$a;$a=$c;$c=$d;$d=$z }
  my $p2=0;$ratio=0;$flag=0;
  for (my $i=0;$i<10000;$i++) {
    if ($b<0) { last }
    if ($c<0) { last }
    for ($i=0;$i<2;$i++) {
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x2>$x1) { $z=$x1;$x1=$x2;$x2=$z }
      if ($x3>$x2) { $z=$x2;$x2=$x3;$x3=$z }
      if ($x4>$x3) { $z=$x3;$x3=$x4;$x4=$z }
    for ($i=0;$i<2;$i++) {
      if ($y2>$y1) { $z=$y1;$y1=$y2;$y2=$z }
      if ($y3>$y2) { $z=$y2;$y2=$y3;$y3=$z }
      if ($y4>$y3) { $z=$y3;$y3=$y4;$y4=$z }
    if ($ratio<=$aaron) { $flag=1 }
    if ($flag==1) { $p2=sum($p2,$ratio) }
  my $cp2=round(sum($p1,$p2),DECIMALS);
  if ($p2<.00000001) { if ($p2>0) { $cp2=".00000001" } }
  if ($p2>.99999999) { if ($p2!=1) { $cp2=".99999999" } }
  return ($cp1,$cp2);
This is perlization of It calculat
fisher's Exact test, one and two tailed, from a 2x2 table.
($p1, $p2) = p_fisher( $cell11, $cell12, $cell21, $cell22);

Replies are listed 'Best First'.
Re: Fisher's Exact test (probability of 2x2 tables)
by Zaxo (Archbishop) on Oct 03, 2001 at 12:57 UTC

    I'd like to recommend a non-recursive &fishersexact::fac, for both speed and memory usage. Something like

    fac { my ($arg, $res) = ($_[0], 1); die "Negative argument to factorial\n" if $arg < 0; $res *= $_ for 1..$arg; return $res; }

    Additionally factorials soon overflow perl int(). An implementation in terms of Math::BigInt, Math::GMP or Math::Pari might be in order. Overflow might be abated by rewriting the formulas in terms of binomial coefficients.

    After Compline,

      If I look at the code, there is a very lot more to improve. It seems to neglect an indecent number of software design rules. Right now, I'm just happy to have kept to my promise in Re: Fisher exact - one tailed

      "We are not alone"(FZ)

Re: Fisher's Exact test (probability of 2x2 tables)
by merlyn (Sage) on Oct 03, 2001 at 18:30 UTC
    The package name should follow the rules for module package names. You need an uppercase character (lowercase are reserved for pragmas), and you should probably get a name for this module from the module naming czars before distributing it (even posting on perlmonks is a form of "distribution").

    -- Randal L. Schwartz, Perl hacker

      I obviously didn't go that official. Even worse, I have never named any module that I wrote in the official way. Will do from now one, of course.

      Thanx for the tip.

      "We are not alone"(FZ)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: sourcecode [id://116378]
[LanX]: The King is dead. .. :(

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (5)
As of 2017-08-20 18:28 GMT
Find Nodes?
    Voting Booth?
    Who is your favorite scientist and why?

    Results (317 votes). Check out past polls.