http://www.perlmonks.org?node_id=1074563


in reply to Re: Karatsuba and Toom-cook multiplication
in thread Karatsuba and Toom-cook multiplication

No . . . this was for a pragmatic reason. The recent recursive function challenge on golf.shinh.org lacked any Perl solutions as of a week or two ago. I thought "WTF?" -- surely that would only be about 130 bytes if I used Math::BigInt, which is installed on the golf server.

Problem was that it took about 7 or 8 seconds to execute using Math::BigInt, and the execution timeout was only 3 seconds or so. Math::BigInt::GMP would have solved it almost instantly, but it wasn't installed on the golf server, and I lacked module-install permission -- and the patience to beg for it to be installed.

My first attempt was actually to use unpack() to print the solution from binary digits. But I found that there's a 20 KB size limit on solutions, which prevents using unpack() to print out a 53,000-digit number.

(Execution times might be misremembered.)
Karatsuba cut the time down from 7 or 8 seconds to about 6.5 seconds. A change whose value I initially didn't understand, from

    $blog10 = int($mindigits / 2)

to

    my $low  = $mindigits - 1;
    my $high = int($maxdigits / 2);
    my $blog10 = $low <= $high ? $low : $high;

cut the execution time from about 6.5 seconds to about 4.5 seconds.
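For readers who haven't met it, the structure being tuned is the split point of a Karatsuba multiply. This is a minimal Python sketch, not the golfed Perl (which works on base-10**9 limb arrays); it only illustrates how the split size, here m, feeds the recursion -- the min(mindigits - 1, maxdigits / 2) rule above is the analogue of the tweak in the snippet:

```python
def karatsuba(x, y):
    # Base case: small enough for native multiplication.
    if x < 10**9 or y < 10**9:
        return x * y
    dx, dy = len(str(x)), len(str(y))
    mindigits, maxdigits = min(dx, dy), max(dx, dy)
    # Split-point choice analogous to the post's change:
    # min(mindigits - 1, maxdigits // 2) instead of mindigits // 2.
    m = min(mindigits - 1, maxdigits // 2)
    p = 10**m
    xh, xl = divmod(x, p)   # high/low decimal halves of x
    yh, yl = divmod(y, p)   # high/low decimal halves of y
    a = karatsuba(xh, yh)
    b = karatsuba(xl, yl)
    # Three recursive multiplies instead of four:
    c = karatsuba(xh + xl, yh + yl) - a - b
    return a * p * p + c * p + b
```

The split choice matters because it controls how balanced the three recursive subproblems are when the operands have very different lengths.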

A lot of benchmarking, profiling, and false leads later, I finally had a version (more complex and less readable than above) which finished in 3.3 seconds or so.

Which was still just barely over the execution timeout.

In it, carrying is done manually. 999999999 * 999999999 can be added to itself 18 times before the running total begins to lose precision. So I did a carry every 16 additions.

But how often does 999999999 * 999999999 really occur in the input data? Isn't something like 333333333 * 333333333 statistically a lot more likely?
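Both limits are easy to check numerically. Assuming the accumulator is a 64-bit unsigned integer (my inference; the post doesn't say), the worst-case product of two 9-digit limbs fits 18 times, while a typical limb like 333333333 leaves roughly nine times the headroom:

```python
U64_MAX = 2**64 - 1

worst = 999_999_999 * 999_999_999      # largest product of two 9-digit limbs
typical = 333_333_333 * 333_333_333

print(U64_MAX // worst)                # 18 additions before overflow
print(U64_MAX // typical)              # 166 additions before overflow
```

Which is why a per-sample carry_lim well above 16 can still produce an exact result on real input data.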

So I wrote a script to assist in a binary search for how long I could postpone carrying within each multiplication of the input. It printed out stuff like

    Sample id 29, 38863124 ... 04402533 (2503 digits)
    With carry_lim at 50, result 38863124 ... 04402533 (2503 digits) is Succeeded.
    With carry_lim at 75, result 38863124 ... 04402533 (2503 digits) is Failed.
At which point, I'd tweak the parameters and rerun.

Then, copying the results (carry_lim just 1 below failure, for elements 29 - 37) back to the first script yielded a substantial performance improvement -- from 3.3 seconds to about 2.8 seconds.

Toom-cook took so long to comprehend, write, profile, and benchmark that it wasn't ready for use before the challenge deadline had passed.