comment on

So I'm playing around with recursive subroutines and have results that are a bit confusing. Consider a normal recursive function:

sub factorial {
    my $num = shift;
    return 1 if 1 == $num;
    return $num * factorial($num = 1);
}

Ordinarily, we expect this to have quite a bit of overhead. Part of the reason for this is that we have deferred part of the calculation. For example, here's what this looks like if we try to calculate 5!


1:  factorial(5)
2:  (5 * factorial(4))
3:  (5 * (4 * factorial(3)))
4:  (5 * (4 * (3 * factorial(2))))
5:  (5 * (4 * (3 * (2 * factorial(1)))))
6:  (5 * (4 * (3 * (2 * 1))))
7:  (5 * (4 * (3 * 2)))
8:  (5 * (4 * 6))
9:  (5 * 24)
10: (120)

The problem with this is that we must defer part of the calculation until we get to the terminal condition (1 == $num, or step 5) of the recursive function. Then we go back through the call stack and finish the calculations. In theory, we get around this by building our our recursive function in such a way so that we don't need to walk back through the call stack, but instead have the value calculated every step of the way and return directly from the last call, which could allow us to skip steps 6 through 9 above. We can do this by rewriting our factorial function to keep track of the current value of the factorial and using the magical goto &sub notation.

sub factorial {
    my ($num) = @_;
    @_ = (1, 1, $num);
    goto &_factorial;
}
                                                                      
+                                   
sub _factorial {
    my ($result, $counter, $max) = @_;
    return $result if $counter > $max;
    @_ = (($result * $counter), ($counter + 1), $max);
    goto &_factorial;
}
[download]

After playing with this, I realized that I could do even better by using a lexically scoped subroutine and skipping the symbol table lookup, though it winds up being more cumbersome to program.

my $_factorial;
$_factorial = sub {
    my ($result, $counter, $max) = @_;
    return $result if $counter > $max;
    @_ = (($result * $counter), ($counter + 1), $max);
    goto $_factorial;
};

sub factorial {
    my ($num) = @_;
    @_ = (1, 1, $num);
    goto $_factorial;
}
[download]

While my benchmarks show that the lexically scoped subroutine with goto is slightly faster than the first version using goto, it's considerably slower than the straight recursive function. pdcawley noted in Pure Perl tail call optimization that using this form of goto is not very fast; I'm curious to know why it's so ridiculously slow. I've tested this on 5.8.0 and 5.9.1 (I mention 5.9.1 to make it clear that the memory leak associated with assignment to @_ and calling goto is not the culprit). Benchmark code and results below. Did I do something stupid?

I might add that I know an iterative solution is faster. I'm just trying to figure out why the goto is slow.

#!/usr/local/bin/perl
use warnings;
use strict;
use Benchmark;
use Test::More tests => 3;
 
my $_fact3;
$_fact3 = sub {
    my ($result, $counter, $max) = @_;
    return $result if $counter > $max;
    @_ = (($result * $counter), ($counter + 1), $max);
    goto $_fact3;
};
 
my $number = shift or die "No number supplied";
 
validate_factorial_routines($number);
 
print "Timing factorial of $number\n";
 
timethese(10000, {
    'recursion' => sub { fact1($number) },
    'typeglob'  => sub { fact2($number) },
    'lexical'   => sub { fact3($number) },
});
 
sub validate_factorial_routines {
    my $num = shift;
    my @result = (fact1($num), fact2($num), fact3($num));
    is(fact1(5), 120, 'fact1 should return the correct amount');
    is(fact2(5), 120, 'fact2 should return the correct amount');
    is(fact3(5), 120, 'fact3 should return the correct amount');
}
 
sub fact1 {
    my ($num) = @_;
    return _fact1(1,1,$num);
}
 
sub _fact1 {
    my ($result, $counter, $max) = @_;
    return $result if $counter > $max;
    return _fact1(($result * $counter), ($counter + 1), $max);
}
 
sub fact2 {
    my ($num) = @_;
    @_ = (1, 1, $num);
    goto &_fact2;
}
 
sub _fact2 {
    my ($result, $counter, $max) = @_;
    return $result if $counter > $max;
    @_ = (($result * $counter), ($counter + 1), $max);
    goto &_fact2;
}
 
sub fact3 {
    my ($num) = @_;
    @_ = (1, 1, $num);
    goto $_fact3;
}

__END__
1..3
ok 1 - fact1 should return the correct amount
ok 2 - fact2 should return the correct amount
ok 3 - fact3 should return the correct amount
Testing factorial of 30
120
Benchmark: timing 10000 iterations of lexical, recursion, typeglob...
   lexical:  4 wallclock secs ( 4.05 usr +  0.00 sys =  4.05 CPU) @ 24
+69.14/s (n=10000)
 recursion:  1 wallclock secs ( 1.45 usr +  0.00 sys =  1.45 CPU) @ 68
+96.55/s (n=10000)
  typeglob:  5 wallclock secs ( 4.28 usr +  0.00 sys =  4.28 CPU) @ 23
+36.45/s (n=10000)
[download]

Update: I realized that I did one thing different in the tests. If I (uselessly) duplicate the assignment to @_ in &_fact1, it's not much faster, but it's still faster.

1..3
ok 1 - fact1 should return the correct amount
ok 2 - fact2 should return the correct amount
ok 3 - fact3 should return the correct amount
Timing factorial of 30
Benchmark: timing 10000 iterations of lexical, recursion, typeglob...
   lexical:  4 wallclock secs ( 4.05 usr +  0.00 sys =  4.05 CPU) @ 24
+69.14/s (n=10000)
 recursion:  4 wallclock secs ( 3.87 usr +  0.00 sys =  3.87 CPU) @ 25
+83.98/s (n=10000)
  typeglob:  4 wallclock secs ( 4.28 usr +  0.00 sys =  4.28 CPU) @ 23
+36.45/s (n=10000)
[download]

Cheers,
Ovid

New address of my CGI Course.

In reply to Why is goto &sub slow? by Ovid

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


good chemistry is complicated, and a little bit messy -LW
	PerlMonks