http://www.perlmonks.org?node_id=68077


in reply to Binomial Expansion

The nCr function above is horribly inaccurate. While n!/r!(n-r)! is indeed a valid formula for binomial coefficients, it's hardly a good way to calculate them. Here's a better way:
sub nCr { my ($n, $r) = @_; my $res = 1; for my $i (1..$r) { $res *= $n--; $res /= $i; } $res; }
Why $res remains an integer (for "reasonable" parameter values) remains an exercise for the reader...

Replies are listed 'Best First'.
Re: Re: Binomial Expansion
by japhy (Canon) on Mar 29, 2001 at 21:39 UTC
    Ok, for fun, and because I like mathematical proofs (I made it to the USA Mathematic Olympiad twice), here's the proof I use. In pseudo-code, the n-choose-r function looks like:
    nCr (n,r) { res = 1; for i in 1 .. r { res *= n; n--; res /= i; } return res; }
    The question is, why does res remain an integer? It is not difficult to show that res /= 1 is an integer, but how can we prove that by the time we get to res /= 7, res is a multiple of 7?

    The proof is that by the time we have gotten to res /= X, we have multiplied res's original value, 1, by X continuous integer, and in every series of X continuous integers, there will be one number divisible by X:

    X = 2, series = (y, y+1); one is div. by 2 X = 3, series = (y, y+1, y+2); one is div. by 3 ... in nCr(17,7), n is of the series (17,16,15,14,13,12,11) i is of the series ( 1, 2, 3, 4, 5, 6, 7) (17/1) (17/1) (16/2) (17/1) (16/2) (15/3) (17/1) (16/4) (15/3) (14/2) (17/1) (16/4) (15/(3*5)) (14/2) (13/1) (17/1) (16/4) (15/(3*5)) (14/2) (13/1) (12/6) (17/1) (16/4) (15/(3*5)) (14/(2*7)) (13/1) (12/6) (11/1)
    We're allowed to move the denominators around like I did, because we've already showed the product will already be an integer. The product looks like:
    r ----- | | n - (i-1) | | --------- | | i i=1


    japhy -- Perl and Regex Hacker
(crazyinsomniac) Re: (2) Binomial Expansion
by crazyinsomniac (Prior) on Mar 31, 2001 at 04:41 UTC
    I take great offense in calling my nCr function horribly inaccurate.

    You do say right away that is a valid formula for binomial coefficients, but then you say it's hardly a good one?

    It is a formula for calculating r-combinations, not binomial coefficients.

    The formula for binomial expansion is:

              n
              _
    (x+y)^n = \ C(n,j) * x^(n-j) *y^j
              /
              ¯
              j=0
    

    And btw, it is the definitive formula for calculating r-combinations. It's like saying 4/2 is not the same as 2. While 2 is a better way to write 4/2, not every one recognizes 2 right away, and I feel, for the benefit of the reader of me 'craft', n!/(r!(n-r)! is a better way to go.

    As for your function, it is shorter, but a litle mysterious when it comes to the math.

    
                   9!         9*8*7* 6!      9*8*7   504
     nCr(9,3) = ---------   = ----------- = ------ = --- = 84
                3!*(9-3)!     3*2*1*(6!)     3*2*1    6
    
     which follows from the fact that nCr(n,r) = nCr(n,n-r)
    
     n!              n!               n!         _    n!
    ------- = ----------------- = -------------  = ---------
    r!(n-r)    (n-r)!(n-(n-r))!   (n-r)!(n-n+r)!   (n-r)!(r)!
    
    This however only works for n>r, as long as n and r are both non-negative integers, but shouldn't be a problem in this case.

    As for things being integer, since the input is an integer, it follows that any integer multiple of that integer, will also be an integer.

    update:
    I give, I give, theory v. practice, there is a difference.

     
    ___crazyinsomniac_______________________________________
    Disclaimer: Don't blame. It came from inside the void

    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

      Those formulas are all the same in mathematical terms. But we are talking floating point arithmatic here. Computing the factorials and then dividing is going to introduce errors into the computation for fairly small values of n here.

              - tye (but my friends call me "Tye")
      If you want to understand the math, I fully agree with your use of the formula. Furthermore for more complex combinatorial problems, understand how to get those formulas is extremely important.

      That said, tye is right as well. While in theory that calculation gets the right answer, in practice it starts to be inaccurate for fairly small numbers. You can get around this by using a big integer package. (Or a language that understands integers of arbitrary size.) But that will be somewhat slow.

      If you want to understand this, go with the formula. If you want to calculate it, go with the tricks. If you want to calculate answers to arbitrary problems, be prepared to manipulate big integers and hope that performance doesn't matter.

      And if you want an even better example some day, bug me about why you should always use row-reduction to find the inverses of matrices rather than Cramer's formula. (Hint: Think numerically unstable and exponential running time, vs n**3 and much better behaved.)

        OK, time to eat some crow.

        While Perl no longer display integers without using exponential notation at 18!, when I tried to get a mistake in the display, I couldn't. Except for the fact that Perl did not like displaying large numbers, the results are right.

        In retrospect I shouldn't be surprised at this. After all the internal calculation is carried out at a higher accuracy than what is displayed. Since the calculation is pretty short, roundoff errors are not readily visible. And testing this is pretty simple:

        #! /usr/bin/perl my ($n, $m) = @ARGV; my $ans = fact($n)/fact($m)/fact($n-$m); my $err = $ans - sprintf("%.0f", $ans); print "$n choose $m: $ans (error $err)\n"; sub fact { my $fact = 1; $fact *= $_ for 1..$_[0]; $fact; }
        To find a disagreement between theory and practice I had to go to 28 choose 14. Which was off from being an integer by about 1 part in a hundred million.

        Sorry tye, but I find that error tolerable. For a simple example the formula works in this case, even though theoretically in practice it should show some difference between theory and practice.

        OTOH I can guarantee you that trying to invert a 10x10 matrix using Cramer's formula is going to be a better example. For an nxn matrix you get n! terms being added and subtracted from each other. It doesn't take a large n to make that rather slow, and to get insane roundoff errors...