Lies, Damn Lies and Benchmarks

Sometimes you feel like you need a slap to the head.

Yesterday was such a day: I finally figured out that a strange phenomenon, which had been discussed at one Perl Mongers meeting as well as in a BOF at a Perl Workshop, was actually caused by a very bad benchmark. Lies and Damn Lies, indeed.

The phenomenon was that a benchmark showed that it was faster to do:

my $self = shift;
my %param = @_;

rather than:

my ($self,%param) = @_;

This seems counter-intuitive, but the benchmark^* was repeatable and showed that using a shift and a separate assignment of the hash was more than three times as fast as doing it in one list assignment. Just to make sure, I ran the benchmark with both 5.8.3 and 5.6.2. The benchmark and the result:

use Benchmark qw(cmpthese);
cmpthese( -2,{
 list => sub { my ($self,%param) = @_ },
 shiftit => sub { my $self = shift; my %param = @_ },
} );
__END__
5.8.3       Rate    list shiftit
list     78392/s      --    -73%
shiftit 292680/s    273%      --
5.6.2       Rate    list shiftit
list     94936/s      --    -68%
shiftit 297717/s    214%      --

While preparing some articles about micro-optimizations for Perl Monks, I decided to test this some more, because I suddenly realised that I had been testing this without any parameters actually being passed. So I figured I'd do a run with parameters actually being passed. And everything changed. Observe:

sub list { my ($self,%param) = @_ }
sub shiftit { my $self = shift; my %param = @_ }

cmpthese( -2,{
 list => sub { list( qw(foo bar baz) ) },
 shiftit => sub { shiftit( qw(foo bar baz) ) },
} );
__END__
5.8.3      Rate shiftit    list
shiftit 70621/s      --    -11%
list    79125/s     12%      --
5.6.2      Rate shiftit    list
shiftit 89341/s      --    -10%
list    98866/s     11%      --

Huh? What's this? So doing it in one list assignment apparently is more efficient if you pass enough parameters to the subroutine. Hmmm... but what if we don't have parameters for the hash assignment? Surely then it would be faster to use the approach using shift()? Nope.

cmpthese( -2,{
 list => sub { list( qw(foo) ) },
 shiftit => sub { shiftit( qw(foo) ) },
} );
__END__
5.8.3       Rate shiftit    list
shiftit 164545/s      --    -19%
list    204158/s     24%      --
5.6.2       Rate shiftit    list
shiftit 179408/s      --    -12%
list    203990/s     14%      --

The list assignment was still more efficient. Huh? Had I been testing wrong? Ok, surely without any parameters passed, it would be faster to use shift()? Again, nope!

cmpthese( -2,{
 list => sub { list() },
 shiftit => sub { shiftit() },
} );
__END__
5.8.3       Rate shiftit    list
shiftit 217350/s      --    -12%
list    247334/s     14%      --
5.6.2       Rate shiftit    list
shiftit 212031/s      --    -14%
list    246447/s     16%      --

So what was the difference between my original benchmark and this one, apart from the overhead of calling an extra subroutine for each iteration? What was I missing?

The thing I was missing was in this little piece of documentation in perlsub:

To call subroutines:

           NAME(LIST);    # & is optional with parentheses.
           NAME LIST;     # Parentheses optional if predeclared/imported.
           &NAME(LIST);   # Circumvent prototypes.
           &NAME;         # Makes current @_ visible to called subroutine.

and indeed, if I changed the call from foo() to &foo, the following benchmark came about:

cmpthese( -2,{
 list => sub { &list },
 shiftit => sub { &shiftit },
} );
__END__
5.8.3       Rate    list shiftit
list    126315/s      --    -62%
shiftit 333577/s    164%      --
5.6.2       Rate    list shiftit
list    135814/s      --    -61%
shiftit 343986/s    153%      --

And indeed, that's a lot closer to the original benchmark.
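The effect of the bare &NAME; form can be seen without Benchmark at all. What follows is a minimal sketch, not part of the original benchmark; the helper names (count_args, eat_one, call_with_parens, call_with_amp, call_and_check) are made up for this illustration. It shows that a sub called as &NAME; sees its caller's @_, and that a shift inside it removes an element from the caller's @_ as well.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this illustration only.
sub count_args { return scalar @_ }          # how many arguments are visible?
sub eat_one    { shift; return scalar @_ }   # shift removes the first element of @_

sub call_with_parens { return count_args(); }   # callee gets a fresh, empty @_
sub call_with_amp    { return &count_args; }    # callee sees the caller's @_
sub call_and_check {
    &eat_one;            # the callee's shift acts on *this* sub's @_
    return scalar @_;    # so one element is gone when we look afterwards
}

my $with_parens = call_with_parens(qw(foo bar baz));
my $with_amp    = call_with_amp(qw(foo bar baz));
my $left_over   = call_and_check(qw(foo bar baz));

print "count_args() saw $with_parens argument(s)\n";
print "&count_args  saw $with_amp argument(s)\n";
print "after &eat_one the caller has $left_over argument(s) left\n";
```

In the original benchmark the subs were inlined into the anonymous subs handed to cmpthese, so they operated directly on whatever @_ Benchmark happened to leave there, which is the same situation as the &NAME; calls here.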

What further conclusions can be drawn from this? Not sure, I guess I'll leave that as an exercise to the reader. ;-)

This just goes to show that you should always check, doublecheck and triplecheck your benchmarks.
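One way to do such a check is to run every candidate a few times outside the timing loop and confirm that it really receives the input you intended, every time. The following is only a sketch under assumptions: the test input in @args, the %candidates/%wrapped layout and the RUN_BENCHMARK environment switch are all invented for this example, and the added return statements cost a little time in both candidates alike.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed test input: an invocant plus one key/value pair.
my @args = qw(foo bar baz);

# Each candidate returns what it unpacked, so the results can be compared.
my %candidates = (
    list    => sub { my ($self, %param) = @_;          return join ',', $self, sort keys %param },
    shiftit => sub { my $self = shift; my %param = @_; return join ',', $self, sort keys %param },
);

# Wrap each candidate so that every call gets its own @_ built from @args,
# instead of whatever happens to be in the benchmark harness's @_.
my (%wrapped, %seen);
for my $name (keys %candidates) {
    my $code = $candidates{$name};
    $wrapped{$name} = sub { $code->(@args) };

    # Call it several times: the result must not drift between iterations.
    my @results = map { $wrapped{$name}->() } 1 .. 3;
    die "$name changes between iterations\n"
        if grep { $_ ne $results[0] } @results;
    $seen{$name} = $results[0];
}

die "candidates disagree\n" unless $seen{list} eq $seen{shiftit};
print "both candidates unpacked: $seen{list}\n";

# Only time the code once the sanity checks have passed.
cmpthese( -2, \%wrapped ) if $ENV{RUN_BENCHMARK};
```

Setting RUN_BENCHMARK=1 in the environment runs the actual comparison; without it the script only performs the checks.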

Liz

^*Please note that benchmarks can be off by 5 to 10% between runs. I've run each benchmark multiple times, but some of them were run while on battery power, and others were run when my iBook was plugged in. Within one benchmark, I always kept the situation consistent, so the results between 5.6.2 and 5.8.3 of a benchmark can be compared (keeping in mind the 5 - 10% uncertainty for each run, of course).


Replies are listed 'Best First'.
Re: Lies, Damn Lies and Benchmarks by ysth (Canon) on Mar 21, 2004 at 23:59 UTC
I'm not sure what your final point is. Doing this:

cmpthese( -2,{
 list => sub { &list },
 shiftit => sub { &shiftit },
} );

seems just plain broken, and I wouldn't have been surprised if it even made Benchmark roll over and die. You shouldn't be making any assumptions about what is in @_ in your outer sub {}, much less modifying it. Trying this:

use Benchmark 'cmpthese';
cmpthese(1, { tryit => sub { $save = \@_ }});
use Data::Dumper;
$Data::Dumper::Deparse = 1;
print Dumper $save;
__END__
$VAR1 = [
 1,
 sub { $save = \@_; }
];

shows that in fact, cmpthese's parameters are still in @_, and doing:

cmpthese 3, { tryit => sub { push @save, [@_]; shift } };
print Dumper \@save;
$VAR1 = [
 [ 3, sub { push @save, [@_]; shift @_; } ],
 [ $VAR1->[0][1] ],
 []
];

shows that your shift has blown them away after the first 2 iterations.
Re: Re: Lies, Damn Lies and Benchmarks by xdg (Monsignor) on Mar 22, 2004 at 15:38 UTC
The final example isn't a recommendation -- it's an explanation of why the original benchmark was doing what it did and why it was giving counterintuitive results. Though I'll agree that it probably could have used a "don't try this at home" kind of warning, given the implications of the perlsub excerpt.

-xdg

Code posted by xdg on PerlMonks is public domain. It has no warranties, express or implied. Posted code may not have been tested. Use at your own risk.
Re: Lies, Damn Lies and Benchmarks by graff (Chancellor) on Mar 22, 2004 at 02:49 UTC
"This just goes to show that you should always check, doublecheck and triplecheck your benchmarks."

Or, perhaps it means that people should limit their use of Benchmark::cmpthese()... they should only compare alternatives that actually have some specific relevance in the context of a given application. I think you've shown that by trying to isolate a couple of syntactic variants -- stripping away all "confounding factors" -- in order to benchmark their "intrinsic" speed, you end up testing some obscure aspect of the perl interpreter whose impact becomes irrelevant once you put those test cases back into the real world.
Re: Lies, Damn Lies and Benchmarks by Juerd (Abbot) on Mar 22, 2004 at 08:32 UTC
I think the strangest phenomenon is that none of the many people involved saw the obvious... :) /me doesn't remember having seen any &, though.
Re: Lies, Damn Lies and Benchmarks by Aristotle (Chancellor) on Mar 24, 2004 at 20:08 UTC
If you run your code under warnings, Perl complains about `Odd number of elements in hash assignment`. Nothing unexpected. But if you benchmark just one of the versions at a time, there's an interesting nugget to be found. The `list` bench is predictable and boring. The screen fills with warnings emitted at a constant rate. But the `shift` benchmark exhibits a rather inexplicable pattern. The fact that the warnings eventually stop is easy to understand: modifying the caller's `@_` is a side effect that persists across iterations. What is really strange is that the warnings are emitted at a progressively slower rate. Why? I don't know. I clawed around in the bowels of Benchmark.pm briefly but didn't find it very pleasant to read, so I gave up. Maybe someone more motivated than me wants to pick up this riddle.

Makeshifts last the longest.

