in reply to recursive formula.

Howdy!

#!/usr/bin/perl
use strict;
use warnings;

sub P {
    my @r = @_;
    my $n = $#r;
    return $r[1] if $n == 1;    # inferred end to recursion
    my $sum = 0;
    for my $i (1..$n) {
        my @r_prime = @r;
        splice(@r_prime, $n-$i+1, 1);
        $sum += ($r[$n-$i+1] - $r[$n-$i]) * P(@r_prime);
    }
    return $sum;
}

while (<DATA>) {
    chomp;
    my @r = split(/,/, $_);
    unshift @r, 0;    # sets r-naught
    print "P($_) = ", P(@r), "\n";
}
__DATA__
0.11, 0.07, 0.19
0.43, 0.31, 0.37
0.93, 0.78, 0.82
0.91, 0.12, 0.15
0.52, 0.18, 0.32

gave the following output:
P(0.11, 0.07, 0.19) = 0.001595
P(0.43, 0.31, 0.37) = 0.046225
P(0.93, 0.78, 0.82) = 0.548235
P(0.91, 0.12, 0.15) = 0.439894
P(0.52, 0.18, 0.32) = 0.010192

The expansion looks similar to computing a determinant, but not quite. This is a straight-up translation of the formula into Perl. No trickery; no premature optimization.

Update:

I elected to prepend a zero to the array of r values to simplify life. That also means that the indices in the formula now correspond to the array indices in the Perl code, making the Perl read more nearly like the formula. I elected to construct the array for the recursion explicitly, by directly striking the omitted r value.
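
Read back from the Perl, the recurrence that P implements appears to be the following. This is inferred from the code rather than copied from the original formula, so treat it as a sketch: r_0 = 0 is the prepended zero, the base case is the one marked "inferred" in the code comment, and each term strikes r_{n-i+1} from the argument list.

```latex
\begin{align*}
r_0 &= 0, \\
P(r_1) &= r_1, \\
P(r_1, \dots, r_n) &= \sum_{i=1}^{n} \bigl(r_{n-i+1} - r_{n-i}\bigr)\,
  P(r_1, \dots, r_{n-i},\, r_{n-i+2}, \dots, r_n).
\end{align*}
```

After a value is struck, the remaining values are renumbered 1..n-1 for the recursive call, which is what splice does to @r_prime.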

This is meant to make it easier to audit the code to verify that it says what it is meant to say...
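
As one such audit, here is the first data row expanded by hand, following what sub P does term by term (r_0 = 0 throughout). It is only a check of the code against itself, not of the code against the original formula.

```latex
\begin{align*}
P(0.11, 0.07) &= (0.07 - 0.11)(0.11) + (0.11 - 0)(0.07) = -0.0044 + 0.0077 = 0.0033 \\
P(0.11, 0.19) &= (0.19 - 0.11)(0.11) + (0.11 - 0)(0.19) = 0.0088 + 0.0209 = 0.0297 \\
P(0.07, 0.19) &= (0.19 - 0.07)(0.07) + (0.07 - 0)(0.19) = 0.0084 + 0.0133 = 0.0217 \\
P(0.11, 0.07, 0.19) &= (0.19 - 0.07)\,P(0.11, 0.07) + (0.07 - 0.11)\,P(0.11, 0.19) + (0.11 - 0)\,P(0.07, 0.19) \\
  &= 0.000396 - 0.001188 + 0.002387 = 0.001595
\end{align*}
```

That agrees with the first line of the output above.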

yours,
Michael

Re^2: recursive formula.
by BioGeek (Hermit) on Aug 05, 2004 at 15:37 UTC
    Your results look sound to me.

    The above formula is used in extreme value statistics, where outliers are important. You wouldn't catch them if you just took the mean of the values:

    M() = mean

    M(0.11, 0.07, 0.19) = 0.12
    M(0.43, 0.31, 0.37) = 0.37
    M(0.93, 0.78, 0.82) = 0.84
    M(0.91, 0.12, 0.15) = 0.39
    M(0.52, 0.18, 0.32) = 0.34

    Now, with our formula, all low scores also give a low P-value:
    P(0.11, 0.07, 0.19) = 0.001595 and
    P(0.52, 0.18, 0.32) = 0.010192
    All high scores give a high P-value:
    P(0.93, 0.78, 0.82) = 0.548235
    Both are as expected. But low scores with one high outlier are also important for us,
    and indeed that case also gives a high P-value:
    P(0.91, 0.12, 0.15) = 0.439894
    Only for P(0.43, 0.31, 0.37) = 0.046225 would I intuitively have thought of a lower value,
    but it's probably correct.

    Thanks for the effort you put into it.