Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

Re: Benchmarking Quiz

by Arguile (Hermit)
on Nov 11, 2001 at 16:50 UTC ( #124661=note: print w/replies, xml ) Need Help??

in reply to Benchmarking Quiz

"Premature optimisation is the root of all evil." -- Donald Knuth

Knuth's famous quotation applies to many of these constructs. Especially as readability is often never included in such benchmarks :). I was glad to see your warning at the top of the quiz but I thought I'd add my own.

Question 9-10 in particular can be very misleading; especially given that you don't show your method. As it's showing a counting loop, instead of an array iteration loop, it could easily trap the unwary. See Efficient Looping Constructs for some good discussion on this.

I'm also a little inclined to question your methodology. Are you using subs with lexicals to make sure they don't have any interaction with each other? I'm not saying you're doing anything wrong, just some of these results suprised me and don't compare with my benchmarks. This could just be differences in our approach and choice of test strings.

As a side note, consider for ($i=1; $i < $n+1; $i++). You're doing an extra assignment in there every loop. That slows it down considerably for ($i=1; $i <= $n; $i++) is quite a bit faster (and clearer IMO, though not as clear as Perl-ish for). Seeing a foreach along the lines of for my $i (1..$n) might have been nice too, and the results might suprise you :).

Counting loops have already been covered, but I can't seem to find a good node comparing C style for(;;) loops against Perl style foreach when iterating over lists/arrays (probably their most common usage). So read on if you want a quick benchmark.

I'm not sure how much, if any, good the inital mem grab does. I thought I detected a difference in the speed the first test ran at higher numbers (and therefore less iterations) but I could be wrong. If a more senior monk could advise on whether my thinking is faulty in this, it would be appreciated.

#!/usr/bin/perl -w use strict; use Benchmark qw(cmpthese); # Yes I do want it to be 5000 if $ARGV[0] is 0 ;) my $n = $ARGV[0] || 5000; # Grab a chunk of memory and give it to perl. my @array = (1..50000); undef @array; cmpthese(-10, { a => "&a($n)", b => "&b($n)", c => "&c($n)", d => "&d($n)", }); sub a { my @a = (1..shift); my $m; for ( @a ) { $m = $_; }; } sub b { my @a = (1..shift); my $m; for my $i ( @a ){ $m = $i }; } sub c { my @a = (1..shift); my $m; $m = $_ for ( @a ); } sub d { my @a = (1..shift); my $m; for (my $i=0; $i <= $#a; $i++) { $m = $a[$i]; }; }

I leave the running of this as an exercise to the reader. Note that -10 in cmpthese() means that each sub will run aproximately 10CPU seconds. To learn how to benchmark your own code, visit turnstep's excellent tutorial

Update: Yes I know I could have just passed a copy of an array, but it's throw away code and copy->paste had already worked it's magic ;).

Replies are listed 'Best First'.
Re: Re: Benchmarking Quiz
by jlongino (Parson) on Nov 12, 2001 at 03:06 UTC
    The purpose of the quiz was not to claim the superiority of any one given snippet over another. Only in the context that it is presented. We should question the methodology of others, particularly when they start making claims about efficiency and performance.

    Take for example the classic example of the bubble sort (which BTW, was mentioned in a reply to a SOPW recently). To say that it is inefficient is a generalization. It can be more efficient than almost any other sort given the right conditions. But it would tick me off if responder had made the generalization and didn't qualify it (they did qualify it). This doesn't happen frequently enough.

    As for questions 9 & 10, yes they can be misleading (and that was part of the point) but only if you don't examine it closely or over-generalize. Also, if the results were all predictable, it wouldn't make for a very interesting meditation.

    Any differences in your benchmarks whether seemingly major or minor can have a big impact on the metrics. I realize that you already know this, because you knew enough to question it. What worries me is all the people who see metrics and just blindly accept them.

    The program used for question 9 was a simple cmpthese:

    use strict; use Benchmark qw(cmpthese); cmpthese (-30, { a => 'foreach (1..$n){$m = $_}', b => '$m = $_ for (1..$n)', c => 'for ($i=1;$i<$n+1;$i++) {$m = $i}' });
    No, I didn't go to the extremes I did in question 1 (see my reply to blakem above). For question 10, I did however and wrote one program for each snippet. I also ran them for $n=10,000,000 several times and got the same results:
    use strict; use Benchmark; my ($i, $m); my $t0 = new Benchmark; $_ = "123:45678"; my $n = 5000000; # snippet goes here. my $t1 = new Benchmark; my $td = timediff($t1, $t0); print "x: ",timestr($td),"\n";
    BTW: thanks for adding the reference to turnsteps tutorial. I read it before starting with and it does cover the basics (which I intentionally left out due to the length of my original post, a poor decision in retrospect). Another interesting read is How to Lie with Statistics by Darrell Huff, illustrated by Irving Geis. Not that I lied here of course :)


Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://124661]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (5)
As of 2017-06-23 02:45 GMT
Find Nodes?
    Voting Booth?
    How many monitors do you use while coding?

    Results (533 votes). Check out past polls.