Hmm. I disbelieve that using alternation is as efficient as looping over a list of patterns. I believe the following benchmark backs me up:
tilly gives: 1600
chetlin gives: 1600
Benchmark: running chetlin, tilly, each for at least 5 CPU seconds...
chetlin: 9 wallclock secs ( 5.52 usr + 0.00 sys = 5.52 CPU) @ 333.70/s (n=1842)
tilly: 10 wallclock secs ( 5.09 usr + 0.00 sys = 5.09 CPU) @ 104.52/s (n=532)
Here's the code for it; do feel free to slap me around if I made a thinko:
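my @patterns = qw/foo bar baz blarch/;
# tilly: a single regex that alternates over all the patterns
my $tilly   = qr/(@{[join "|", @patterns]})/;
# chetlin: one compiled regex per pattern, tried one after another
my @chetlin = map qr/$_/, @patterns;
my $target  = "foo baz blarcy foo blarch" x 400;

sub tilly {
    my $count;
    $count++ while ($target =~ /$tilly/g);
    # report the count only when called directly, not from Benchmark's eval
    print STDERR "tilly gives: $count\n" if ((caller)[1] !~ /eval/);
}

sub chetlin {
    my $count;
    for (@chetlin) { $count++ while ($target =~ /$_/g) }
    # report the count only when called directly, not from Benchmark's eval
    print STDERR "chetlin gives: $count\n" if ((caller)[1] !~ /eval/);
}

# sanity check: both approaches should find the same number of matches
tilly();
chetlin();

use Benchmark;
# a negative count means "run each sub for at least that many CPU seconds"
timethese(-5, { tilly   => \&tilly,
                chetlin => \&chetlin,
              });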
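
If anyone wants to check the counts without the caller trick, here is a minimal sketch of the same two approaches as subs that return their counts. The names count_alternation and count_loop are mine, made up for illustration, and I have put the target in @ARGV-free, self-contained form so it runs as is. It only checks that the two approaches agree; it says nothing about speed by itself.

```perl
use strict;
use warnings;

my @patterns = qw/foo bar baz blarch/;
my $target   = "foo baz blarcy foo blarch" x 400;

# One compiled alternation: a single pass over the target.
my $alternation = do { my $alt = join "|", @patterns; qr/($alt)/ };

# One compiled regex per pattern: one pass over the target per pattern.
my @separate = map { qr/$_/ } @patterns;

# Hypothetical helper: count matches using the single alternation regex.
sub count_alternation {
    my $count = 0;
    $count++ while $target =~ /$alternation/g;
    return $count;
}

# Hypothetical helper: count matches by looping over the separate regexes.
sub count_loop {
    my $count = 0;
    for my $re (@separate) {
        $count++ while $target =~ /$re/g;
    }
    return $count;
}

printf "alternation: %d, loop: %d\n", count_alternation(), count_loop();
```

Both should report 1600 on this target, the same as above. To time them, the two subs can be handed to cmpthese from the Benchmark module (use Benchmark qw(cmpthese); then cmpthese(-5, { alternation => \&count_alternation, loop => \&count_loop })), which prints a relative comparison table instead of raw rates.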
In general, my credo is to avoid alternation at all costs. I would be interested in seeing what a benchmark of your optimized alternation (ref. the pointer you gave above) would give.
-dlc