http://www.perlmonks.org?node_id=46767


in reply to Re: Craftier
in thread Craftier

I'm afraid there are serious problems with the Benchmark code that you posted. It is important to make sure all your code snippets do the right thing before you benchmark them, and to make sure the benchmark itself is doing the right thing. I hope it will be instructive if I detail the issues.

There are two problems with the tr/// solution: \s is not special inside tr///, and /d is required for tr/// to delete characters. There are also two problems with the split solution: you are splitting $_ using $junk as the delimiter, and you are joining with a space instead of an empty string.
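The snippets being criticized are not quoted in this node, so the following is only a sketch of the four mistakes as I have described them, not a copy of the posted code. The sample string and all variable names ($no_d, $bad_class, @wrong and so on) are made up for illustration.

```perl
use strict;
use warnings;

my $junk = "The quick\tbrown fox\n";

# Without /d and with an empty replacement list, tr/// only counts matches;
# the string itself is left unchanged.
(my $no_d = $junk) =~ tr/ \t\r\n//;

# \s is not a whitespace class inside tr///, so the space survives even with /d.
my $bad_class = "a b";
{
    no warnings;    # silence any complaint about the unrecognized escape
    $bad_class =~ tr/\s//d;
}

# Listing the whitespace characters explicitly and adding /d does the job.
(my $tr_ok = $junk) =~ tr/ \t\r\n//d;

# split with a single argument treats it as the PATTERN and splits $_,
# so $junk is never split at all.
local $_ = "unrelated text";
my @wrong = split $junk;

# join ' ' puts a space back between the words; join '' removes it.
my $joined_space = join ' ', split ' ', $junk;
my $joined_empty = join '',  split ' ', $junk;

print "tr without /d changed nothing\n" if $no_d eq $junk;
print "tr/\\s//d left: [$bad_class]\n";
print "tr with list and /d: [$tr_ok]\n";
print "split \$junk returned: [@wrong]\n";
print "join ' ': [$joined_space]\n";
print "join '':  [$joined_empty]\n";
```

Only the third and the last results have had their whitespace removed, which is why the timings for the other forms do not measure the task being compared.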

There are also problems with the benchmark itself. $junk is a lexical, so it is not visible to code compiled inside the Benchmark module. Since you passed quoted strings, your snippets were compiled there and operated on the empty global $junk instead of your string. Once that problem is fixed, a second one appears: each code snippet modifies $junk in place, so only the first execution of the first snippet would have any work to do; all the remaining iterations would be processing a string that had already been stripped of whitespace.
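Both effects can be reproduced without Benchmark. In this minimal sketch, StringRunner is a hypothetical stand-in for any module that compiles code handed to it as a string; it is not how Benchmark is actually implemented, only an illustration of the same scoping rule.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module that compiles code passed as a string.
# It is defined before $junk is declared, so the lexical is not in scope here.
package StringRunner;

sub run {
    my ($code) = @_;
    no strict 'vars';    # let the string refer to package globals
    return eval "package main; $code";
}

package main;

my $junk = "The quick brown fox";

# String form: $junk resolves to the undefined global $main::junk.
my $from_string = StringRunner::run('defined $junk ? length $junk : "undef"');

# Closure form: the anonymous sub captures the lexical $junk.
my $from_sub = sub { length $junk }->();

print "string eval saw: $from_string\n";    # string eval saw: undef
print "closure saw: $from_sub\n";           # closure saw: 19

# In-place modification: after the first pass there is nothing left to strip.
my $copy = $junk;
my $first_pass  = ($copy =~ s/\s+//g) || 0;
my $second_pass = ($copy =~ s/\s+//g) || 0;

print "substitutions: first=$first_pass second=$second_pass\n";    # first=3 second=0
```

Passing code references and working on a fresh copy of the string in every iteration avoids both problems.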

Here is an improved benchmark:

    #!perl
    use Benchmark;

    my $junk = 'The quick brown fox Jumped over the lazy dog';

    timethese(-10, {
        'split' => sub { $x = join '', split ' ', $junk; },
        'trans' => sub { ($x = $junk) =~ tr/ \t\r\n//d; },
        'subst' => sub { ($x = $junk) =~ s/\s+//g; },
    });

and the new results are:

    Benchmark: running split, subst, trans, each for at least 10 CPU seconds...
         split: 43912.46/s
         subst: 66211.19/s
         trans: 197755.00/s

As you can see, the translation solution is actually the big winner at roughly three times the speed of the substitution, and the substitution is only about 1.5 times as fast as split/join.
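Before trusting any timings, it is worth confirming that every candidate produces the same output. Here is one way that check might look; the %strip hash, the $expected string and the sample input with mixed whitespace are my own choices for illustration.

```perl
use strict;
use warnings;

my $junk = "The quick brown fox\tJumped over\r\nthe lazy dog\n";

# Each candidate returns a stripped copy and leaves $junk untouched.
my %strip = (
    split => sub { join '', split ' ', $junk },
    trans => sub { (my $x = $junk) =~ tr/ \t\r\n//d; $x },
    subst => sub { (my $x = $junk) =~ s/\s+//g; $x },
);

my $expected = 'ThequickbrownfoxJumpedoverthelazydog';

for my $name (sort keys %strip) {
    my $got = $strip{$name}->();
    print "$name: ", ($got eq $expected ? "ok" : "MISMATCH [$got]"), "\n";
}
```

Note that the three are equivalent only for this kind of input: \s also matches a form feed, which the explicit tr/// list above does not include.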