Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?

In theory, theory and practice are the same...

by tilly (Archbishop)
on Mar 31, 2001 at 23:32 UTC ( #68698=note: print w/replies, xml ) Need Help??

in reply to (crazyinsomniac) Re: (2) Binomial Expansion
in thread Binomial Expansion

If you want to understand the math, I fully agree with your use of the formula. Furthermore for more complex combinatorial problems, understand how to get those formulas is extremely important.

That said, tye is right as well. While in theory that calculation gets the right answer, in practice it starts to be inaccurate for fairly small numbers. You can get around this by using a big integer package. (Or a language that understands integers of arbitrary size.) But that will be somewhat slow.

If you want to understand this, go with the formula. If you want to calculate it, go with the tricks. If you want to calculate answers to arbitrary problems, be prepared to manipulate big integers and hope that performance doesn't matter.

And if you want an even better example some day, bug me about why you should always use row-reduction to find the inverses of matrices rather than Cramer's formula. (Hint: Think numerically unstable and exponential running time, vs n**3 and much better behaved.)

  • Comment on In theory, theory and practice are the same...

Replies are listed 'Best First'.
Re (tilly) 1: In theory, theory and practice are the same...
by tilly (Archbishop) on Apr 01, 2001 at 05:26 UTC
    OK, time to eat some crow.

    While Perl no longer display integers without using exponential notation at 18!, when I tried to get a mistake in the display, I couldn't. Except for the fact that Perl did not like displaying large numbers, the results are right.

    In retrospect I shouldn't be surprised at this. After all the internal calculation is carried out at a higher accuracy than what is displayed. Since the calculation is pretty short, roundoff errors are not readily visible. And testing this is pretty simple:

    #! /usr/bin/perl my ($n, $m) = @ARGV; my $ans = fact($n)/fact($m)/fact($n-$m); my $err = $ans - sprintf("%.0f", $ans); print "$n choose $m: $ans (error $err)\n"; sub fact { my $fact = 1; $fact *= $_ for 1..$_[0]; $fact; }
    To find a disagreement between theory and practice I had to go to 28 choose 14. Which was off from being an integer by about 1 part in a hundred million.

    Sorry tye, but I find that error tolerable. For a simple example the formula works in this case, even though theoretically in practice it should show some difference between theory and practice.

    OTOH I can guarantee you that trying to invert a 10x10 matrix using Cramer's formula is going to be a better example. For an nxn matrix you get n! terms being added and subtracted from each other. It doesn't take a large n to make that rather slow, and to get insane roundoff errors...

      I stand by my (quite conservative) statement. It does introduce errors for fairly small values of n. It is also at least twice as slow (it can be orders of magnitude slower).

      But the greatest weakness to my mind is the following output:

      200 choose 1: 200 vs. -1.#IND (1.#QNAN). Rate easy nice easy 1464/s -- -96% nice 34687/s 2270% --
      Now that is quite a large error term, don't you think?

      Yes, there is a useful space over which its accuracy is quite good (though not always as good), and thanks for exploring that.

      Here is the code I used:

      #!/usr/bin/perl -w use strict; use Benchmark qw( cmpthese ); sub fact { my $fact = 1; $fact *= $_ for 1..$_[0]; return $fact; } sub nice { my ($n, $r) = @_; my $res = 1; for my $i (1..$r) { $res *= $n--; $res /= $i; } return $res; } while( <> ) { my( $n, $m )= split ' '; my $nice= nice($n,$m); my $easy= fact($n)/fact($m)/fact($n-$m); print "$n choose $m: $nice vs. $easy (",abs($nice-$easy),").\n"; cmpthese( -3, { nice=>sub{nice($n,$m)}, easy=>sub{fact($n)/fact($m)/fact($n-$m)}, } ); }
              - tye (but my friends call me "Tye")
        It is a valid performance note. True But it isn't producing (visibly) wrong answers to take the naive approach.

        And the improvement is subtle. For instance consider the following slightly different version of your code:

        sub not_so_nice { my ($n, $r) = @_; my $res = 1; for my $i (1..$r) { $res *= ($n-- / $i); } return $res; }
        Someone was trying to simplify. But oops. In Perl you have just slowed down instead. In other languages you are going to get wrong answers. It takes a fair amount of knowledge to understand what changed. In fact it takes a fair amount of knowledge to understand why the original formula can be sped up by using your alternate approach.

        Now I appreciate that you have both pieces of knowledge. In fact the necessary analysis is probably almost a reflex for you, just like it is for me. But not everyone has the math background that we do...

      Perl computes floating-point using doubles, which typically have a 48 bit mantissa. 4-byte ints have just 32 bits of precision, so it's natural that doubles have a larger integer range than ints! So you can pretend Perl has 48-bit integers for most purposes.

      It's still not a good idea to compute huge numbers and try to cancel them out. Especially when not doing this takes fewer operations.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://68698]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others examining the Monastery: (2)
As of 2018-05-26 16:15 GMT
Find Nodes?
    Voting Booth?