Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Karatsuba and Toom-cook multiplication

by ohcamacj (Sexton)
on Feb 11, 2014 at 22:36 UTC ( #1074529=CUFP: print w/ replies, xml ) Need Help??

I recently implemented two of the algorithms that GMP uses for "fast" multiplication, in perl.

Karatsuba, was really easy to implement. It's a little faster if it's recursive, with a recursion threshold of about 1200 digits (and using plain multiplication, below that); but for demonstration; it's clearer to have neither a threshold nor recursion.

Toom-cook, was a difficult and fun challenge to implement. Initially, both of the descriptions of it, were totally incomprehensible. There's a speedup if all of the *1 *-1 and *0 multiplications are eliminated; but for demonstration, it's clearer to leave them in.

The test script,

#!/usr/bin/perl -w use strict; use Getopt::Long; my $gmp = 0; my $perl = 1; my $verbose = 1; my $normal = 1; my $extended = 0; my $extended_2 = 0; my $check = 1; GetOptions("gmp" => sub { $gmp = 1; $perl = 0; }, "perl" => sub { $gmp = 0; $perl = 1; }, "verbose!" => \$verbose, "quiet" => sub {$verbose = 0}, "normal-tests!" => \$normal, "extended-tests" => \$extended, "second-extended-tests" => \$extended_2, "check!" => \$check, ); #It was hard enough to debug without the test cases changing every inv +ocation. srand(3); if($gmp){ #eval required instead of plain "use", because of #umm . . . not sure why. eval 'use Math::BigInt lib => "GMP";' } if($perl){ eval 'use Math::BigInt;'; } my @tests_sym = map( { [ Math::BigInt->new($$_[0]), Math::BigInt->new( +$$_[1]) ] } ([7, 5], [27, 9], [9, 27], [33, 41], [78, 61], [49, 34], [12, 345], [123, 34], [123, 456], [688, 549], [449, 941], [509829, 632449], [404404, 101049], [747523, 148580], [482693, 713702], [qw(973394745225717708155 243871059976131889949) +], [qw(722507641183546983116 660835710456936257744) +], [8910357, 5457354], #length not evenly divisible + by 3 [4986839, 7764259], ) ); my @tests_asym = map( { [ Math::BigInt->new($$_[0]), Math::BigInt->new +($$_[1]) ] } ([9369, 23], [31, 5873], [62396136, 9608], [59641089, 8338], [4277, 88260312], [4662, 40180375], [qw(67051724350761 4470074254322332022626889194 +)], [qw(5308191357528912569265023332 57514777441177 +)], [23190, 91306063], #assymetric assymetric, leng +ths don't [45591478, 16201], #differ by exactly a factor +of 2. [1095, 874850417], [482548840, 4481] ) ); if($extended){ for(1..3){ push(@tests_sym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 400) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 400) ) ) ]); push(@tests_asym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 533) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 266) ) ) ]); } for(1..2){ push(@tests_sym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 1575) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 1575) ) ) ]); push(@tests_asym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 2100) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 1050) ) ) ]); } } if($extended_2){ for(1..10){ push(@tests_sym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 3500) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 3500) ) ) ]); push(@tests_asym, [ Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 4666) ) ), Math::BigInt->new( join("", int(rand(9) + 1), map( {int(rand(10))} 2 + .. 2333) ) ) ]); } } sub karatsuba_mul { (my $x, my $y) = @_; my $expected; printf("Karatsuba.\n"); printf("Sample: $x (%d digits) * $y (%d digits)", length($x), leng +th($y)); if($check){ $expected = $x * $y; printf(", expecting $expected (%d digits)\n", length($expected +)); }else{ print("\n"); } if($x->length < 2 || $y->length < 2){ printf(" Skipping; length %d or %d too short. \n\n", length($ +x), length($y)); return $x * $y; } #decide upon b my $mindigits = $x->length <= $y->length ? $x->length : $y->length +; my $blog10 = int($mindigits / 2); my $x1 = Math::BigInt->new(substr($x->bstr, 0, $x->length - $blog1 +0)); my $y1 = Math::BigInt->new(substr($y->bstr, 0, $y->length - $blog1 +0)); my $x0 = Math::BigInt->new(substr($x->bstr, $x->length - $blog10, +$blog10)); my $y0 = Math::BigInt->new(substr($y->bstr, $y->length - $blog10, +$blog10)); if($verbose){ printf(" b is %d digits, 1%s, splitting input into:\n", $blog +10, "0" x $blog10); printf(" (%s,%s)(%s,%s)\n", $x1,$x0,$y1,$y0); } my $x1y1 = $x1 * $y1; my $x0y0 = $x0 * $y0; my $mid = ($x1 + $x0) * ($y1 + $y0) - $x1y1 - $x0y0; my $top_ex = Math::BigInt->new($x1y1->bstr . ("0" x ($blog10 * 2)) +); my $mid_ex = Math::BigInt->new($mid->bstr . ("0" x $blog10)); my $sum = $top_ex + $mid_ex + $x0y0; if($verbose){ printf(" x1y1: %s, x0y0: %s\n", $x1y1, $x0y0); print(" mid: ($x1 + $x0)($y1 + $y0) - $x1y1 - $x0y0 => $x1*$y +0 + $y1*$x0\n"); printf(" Intermediate: %s, %s, %s\n", $top_ex, $mid_ex, $x0y0 +); } my $res_res = $check ? ($expected->bcmp($sum) == 0 ? "Succeeded" : + "Failed") : "N/A -- check skipped"; printf(" Final result is %s (%d digits), (%s)\n", $sum->bstr, sca +lar($sum->length), $res_res); print("\n"); return $sum; } sub toom_cook_mul { (my $x, my $y) = @_; my $expected; printf("Toom-cook.\n"); printf("Sample: $x (%d digits) * $y (%d digits)", length($x), leng +th($y)); if($check){ $expected = $x * $y; printf(", expecting $expected (%d digits)\n", length($expected +)); }else{ print("\n"); } (my $s, my $l) = length($x) <= length($y) ? ($x, $y) : ($y, $x); if( (length($s) + length($l) < 6) || length($s) < 2 ){ printf(" Skipping; length %d or %d too short. \n\n", length($ +x), length($y)); return $x * $y; } if(length($l) / length($s) <= 1.5 ){ #Not sure what this cutoff po +int is supposed #to be. Clearly, equal-length args get symmetric, and 2x lengt +h difference args #get toom-asym; but what should the cutoff be? print(" Selected symmetric path.\n") if $verbose; #There are three steps. Calculate, interpolate, and assemble. #Here, we'll be using the matrixes # 1 0 0 # 1 1 1 # 1 -1 1 # 1 2 4 # 0 0 1 #and # 1 0 0 0 0 # -0.5 1 -0.333 -0.167 2 # -1 0.5 0.5 0 -1 # 0.5 -0.5 -0.167 0.167 -2 # 0 0 0 0 1 my $blog10 = int(length($s) / 3); if($verbose){ printf(" b is %d digits, 1%s, splitting input into:\n", $ +blog10, "0" x $blog10); } my $s_str = $s->bstr; my $l_str = $l->bstr; my @parts_s = map( { Math::BigInt->new($_) } (substr($s_str, length($s_str) - $blog10, $ +blog10), substr($s_str, length($s_str) - $blog10 * +2, $blog10), substr($s_str, 0, length($s_str) - $blog10 + * 2)) ); my @parts_l = map( { Math::BigInt->new($_) } (substr($l_str, length($l_str) - $blog10, $ +blog10), substr($l_str, length($l_str) - $blog10 * +2, $blog10), substr($l_str, 0, length($l_str) - $blog10 + * 2)) ); my @s_inter = ( $parts_s[0] * 1 + $parts_s[1] * 0 + $parts_s[2 +] * 0, $parts_s[0] * 1 + $parts_s[1] * 1 + $parts_s[2 +] * 1, $parts_s[0] * 1 + $parts_s[1] * -1 + $parts_s +[2] * 1 , $parts_s[0] * 1 + $parts_s[1] * 2 + $parts_s[2 +] * 4, $parts_s[0] * 0 + $parts_s[1] * 0 + $parts_s[2 +] * 1); my @l_inter = ( $parts_l[0] * 1 + $parts_l[1] * 0 + $parts_l[2 +] * 0, $parts_l[0] * 1 + $parts_l[1] * 1 + $parts_l[2 +] * 1, $parts_l[0] * 1 + $parts_l[1] * -1 + $parts_l +[2] * 1 , $parts_l[0] * 1 + $parts_l[1] * 2 + $parts_l[2 +] * 4, $parts_l[0] * 0 + $parts_l[1] * 0 + $parts_l[2 +] * 1); #We need one extra digit to handle 2/3rds and suchlike occurin +g as #intermediate values and getting truncated off, causing errors +. my @mul_i = ($s_inter[0] * $l_inter[0] * 10, $s_inter[1] * $l_inter[1] * 10, $s_inter[2] * $l_inter[2] * 10, $s_inter[3] * $l_inter[3] * 10, $s_inter[4] * $l_inter[4] * 10); my @mul_res = ($mul_i[0]*1 + $mul_i[1]*0 + $mul_i[2]*0+ $mul_i +[3]*0 + $mul_i[4]*0, $mul_i[0]/-2 + $mul_i[1]*1 + $mul_i[2]/-3 + $mu +l_i[3]/-6 + $mul_i[4]*2 , $mul_i[0]*-1 + $mul_i[1]/2 + $mul_i[2]/2 + $mul +_i[3]*0 + $mul_i[4]*-1 , $mul_i[0]/2 + $mul_i[1]/-2 + $mul_i[2]/-6 + $mu +l_i[3]/6 + $mul_i[4]*-2 , $mul_i[0]*0 + $mul_i[1]*0 + $mul_i[2]*0 + $mul_ +i[3]*0 + $mul_i[4]*1 ); if($verbose){ print(" short: (" . join(",", reverse(@parts_s)) . ")\n +"); print(" long : (" . join(",", reverse(@parts_l)) . ")\n +"); printf(" Intermediate results (\@mul_i) are:\n"); printf(" %8s\n", $_) for @mul_i; printf(" Intermediate results (\@mul_res) are:\n"); printf(" %8s\n", $_) for @mul_res; } my @rmul_res = map( { my $rval = $_->bround($_->length - 1)->b +str; substr($rval, 0, -1); } @mul_res); if($verbose){ printf(" Intermediate results (\@round_mul_res) are:\n"); printf(" %8s\n", $_) for @rmul_res; } my @add_i = map( { Math::BigInt->new($_) } ($rmul_res[0] . "" x $blog10, $rmul_res[1] . "0" x $blog10, $rmul_res[2] . "00" x $blog10, $rmul_res[3] . "000" x $blog10, $rmul_res[4] . "0000" x $blog10 ) ); my $res = $add_i[0] + $add_i[1] + $add_i[2] + $add_i[3] + $add +_i[4]; my $res_res = $check ? ($expected == $res ? "Succeeded" : "Fai +led") : "N/A -- check skipped"; printf(" Final result is $res (%d digits), ($res_res)\n", len +gth($res)); printf("\n"); return $res; }else{ print(" Selected assymmetric path.\n"); #There are three steps. Calculate, interpolate, and assemble. #Here, we'll be using the matrixes # 1 0 0 0 # 1 1 1 1 # 1 -1 1 -1 # 1 2 4 8 # 0 0 0 1 #and # 1 0 # 1 1 # 1 -1 # 1 2 # 0 1 #and # 1 0 0 0 0 # -0.5 1 -0.333 -0.167 2 # -1 0.5 0.5 0 -1 # 0.5 -0.5 -0.167 0.167 -2 # 0 0 0 0 1 my $if_s = int(length($s) / 2); my $if_l = int(length($l) / 4); my $blog10 = $if_s <= $if_l ? $if_s : $if_l; if($verbose){ printf(" b is %d digits, 1%s, splitting input into:\n", $ +blog10, "0" x $blog10); } my $s_str = $s->bstr; my $l_str = $l->bstr; my @parts_s = map( { Math::BigInt->new($_) } (substr($s_str, length($s_str) - $blog10, $ +blog10), substr($s_str, 0, length($s_str) - $blog10 +)) ); my @parts_l = map( { Math::BigInt->new($_) } (substr($l_str, length($l_str) - $blog10, $ +blog10), substr($l_str, length($l_str) - $blog10 * +2, $blog10), substr($l_str, length($l_str) - $blog10 * +3, $blog10), substr($l_str, 0, length($l_str) - $blog10 + * 3)) ); if($verbose){ print(" short: (" . join(",", reverse(@parts_s)) . ")\n +"); print(" long : (" . join(",", reverse(@parts_l)) . ")\n +"); } my @s_inter = ( $parts_s[0] * 1 + $parts_s[1] * 0, $parts_s[0] * 1 + $parts_s[1] * 1, $parts_s[0] * 1 + $parts_s[1] * -1, $parts_s[0] * 1 + $parts_s[1] * 2, $parts_s[0] * 0 + $parts_s[1] * 1,); my @l_inter = ( $parts_l[0]*1 + $parts_l[1]*0 + $parts_l[2]*0 ++ $parts_l[3]*0, $parts_l[0]*1 + $parts_l[1]*1 + $parts_l[2]*1 ++ $parts_l[3]*1, $parts_l[0]*1 + $parts_l[1]*-1 + $parts_l[2]* +1 + $parts_l[3]*-1, $parts_l[0]*1 + $parts_l[1]*2 + $parts_l[2]*4 ++ $parts_l[3]*8, $parts_l[0]*0 + $parts_l[1]*0 + $parts_l[2]*0 ++ $parts_l[3]*1); my @mul_i = ($s_inter[0] * $l_inter[0] * 10, $s_inter[1] * $l_inter[1] * 10, $s_inter[2] * $l_inter[2] * 10, $s_inter[3] * $l_inter[3] * 10, $s_inter[4] * $l_inter[4] * 10); if($verbose){ printf(" Intermediate results (\@mul_i) are:\n"); printf(" %8s\n", $_) for @mul_i; } my @mul_res = ($mul_i[0]*1 + $mul_i[1]*0 + $mul_i[2]*0+ $mul_i +[3]*0 + $mul_i[4]*0, $mul_i[0]/-2 + $mul_i[1]*1 + $mul_i[2]/-3 + $mu +l_i[3]/-6 + $mul_i[4]*2 , $mul_i[0]*-1 + $mul_i[1]/2 + $mul_i[2]/2 + $mul +_i[3]*0 + $mul_i[4]*-1 , $mul_i[0]/2 + $mul_i[1]/-2 + $mul_i[2]/-6 + $mu +l_i[3]/6 + $mul_i[4]*-2 , $mul_i[0]*0 + $mul_i[1]*0 + $mul_i[2]*0 + $mul_ +i[3]*0 + $mul_i[4]*1 ); if($verbose){ printf(" Intermediate results (\@mul_res) are:\n"); printf(" %8s\n", $_) for @mul_res; } my @rmul_res = map( { my $rval = $_->bround($_->length - 1)->b +str; substr($rval, 0, -1); } @mul_res); if($verbose){ printf(" Intermediate results (\@round_mul_res) are:\n"); printf(" %8s\n", $_) for @rmul_res; } my @add_i = map( { Math::BigInt->new($_) } ($rmul_res[0] . "" x $blog10, $rmul_res[1] . "0" x $blog10, $rmul_res[2] . "00" x $blog10, $rmul_res[3] . "000" x $blog10, $rmul_res[4] . "0000" x $blog10 )); my $res = $add_i[0] + $add_i[1] + $add_i[2] + $add_i[3] + $add +_i[4]; my $res_res = $check ? ($expected == $res ? "Succeeded" : "Fai +led" ) : "N/A -- check skipped"; printf(" Final result is $res (%d digits), ($res_res)\n", len +gth($res)); print("\n"); return $res; } } foreach my $specref (@tests_sym, @tests_asym){ karatsuba_mul(@$specref); toom_cook_mul(@$specref); }

References.
https://gmplib.org/manual/Karatsuba-Multiplication.html
https://gmplib.org/manual/Toom-3_002dWay-Multiplication.html
https://gmplib.org/manual/Unbalanced-Multiplication.html
http://en.wikipedia.org/wiki/Karatsuba_algorithm
http://en.wikipedia.org/wiki/Toom%E2%80%93Cook_multiplication

Comment on Karatsuba and Toom-cook multiplication
Download Code
Re: Karatsuba and Toom-cook multiplication
by choroba (Abbot) on Feb 11, 2014 at 22:58 UTC
    Cool! I implemented Karatsuba in a programming competition years ago, too. Any plans to add Schönhage–Strassen?
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      No . . . this was for a pragmatic reason. The recent recursive function challenge on golf.shinh.org, lacked any perl solutions, as of a week or two ago. I thought "WTF ?" surely that would only be about 130 bytes if I used Math::BigInt; which is installed on the golf server.

      Problem was, that it took about 7 or 8 seconds to execute, using Math::BigInt; and the execution timeout was only 3 seconds or so. Math::BigInt::GMP would solve it almost instantly; but it wasn't installed on the golf server; and I lacked module install permission, or patience to beg for it to be installed.

      My first attempt was actually to use unpack() to print the solution from binary digits. And found that there's a 20 kb size limit on solutions; which prevents using unpack() to print out a 53,000 digit number.

      (Execution times might be misremembered).
      Karatsuba, cut the time down from 7 or 8 seconds, to about 6.5 seconds. A change that I initially didn't understand the value of

      from $blog10 = int($mindigits / 2) to my $low = $mindigits - 1; my $high = int($maxdigits / 2); my $blog10 = $low <= $high ? $low : $high;
      cut the execution time, from about 6.5 seconds, to about 4.5 seconds.

      A lot of benchmarking, profiling, and false leads later, I finally had a version (more complex and less readable than above) which finished in 3.3 seconds or so.

      Which was still, just barely greater than the execution timeout.

      In it, carrying is done manually. 999999999 * 999999999 can be added to itself, 18 times, before it begins to lose precision. So, I did a carry every 16 additions.

      But, how often does 999999999 * 999999999 really occur in the input data? Isn't stuff like 333333333 * 333333333 statistically, a lot more likely?

      So, I wrote a script to assist in performing a binary search for the maximum length of time I could postpone carrying, within each multiplication of the input. It printed out stuff like

      Sample id 29, 38863124 ... 04402533 (2503 digits) With carry_lim at 50, result 38863124 ... 04402533 (2503 digits) is +Succeeded. With carry_lim at 75, result 38863124 ... 04402533 (2503 digits) is +Failed.
      At which point, I'ld tweak the parameters, and rerun.

      Then, copying the results (carry_lim just 1 below failure, for elements 29 - 37), back to the first script, there was a substantial performance improvement -- from 3.3 seconds to about 2.8 seconds.

      Toom-cook, took so long to comprehend, write, profile, and benchmark; it wasn't ready for use, before the challenge deadline had already passed.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: CUFP [id://1074529]
Approved by choroba
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (14)
As of 2014-10-20 10:36 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (75 votes), past polls