It includes clever use of the vec operator.
Maybe there's historical reasons for using vec over substr, but nowadays it's much faster to use substr. The for my $foo ($from .. $to) { ... } construct is also slightly more efficiently than for (my $foo = $from; $foo <= $to; $foo++) { ... }. My guess is that the C-style loop was used as the "range loop" wasn't optimized back then and actually generated a list of, in this case, a million numbers.
I changed the for loop and did a direct translation to using substr instead of vec, and here's the results:
Benchmark: running substr, vec for at least 60 CPU seconds...
substr: 61 wallclock secs (62.41 usr + 0.02 sys = 62.42 CPU) @ 1
+.09/s (n=68)
vec: 58 wallclock secs (60.64 usr + 0.02 sys = 60.66 CPU) @ 0
+.73/s (n=44)
s/iter vec substr
vec 1.38 -- -33%
substr 0.918 50% --
Here's the code:
use strict;
use warnings;
use Benchmark qw(cmpthese timethese);
my %subs;
######################################################################
+#########
my $UPPER = 1_000_000;
$subs{vec} = sub {
my $sieve = "";
GUESS: for my $guess (2 .. $UPPER) {
next GUESS if vec($sieve,$guess,1);
#print "$guess\n";
for (my $mults = $guess * $guess; $mults <= $UPPER; $mults +=
+$guess) {
vec($sieve,$mults,1) = 1;
}
}
};
$subs{substr} = sub {
my $sieve = '0' x ($UPPER + 1);
GUESS: for my $guess (2 .. $UPPER) {
next GUESS if substr($sieve,$guess,1);
#print "$guess\n";
for (my $mults = $guess * $guess; $mults <= $UPPER; $mults +=
+$guess) {
substr($sieve,$mults,1,1);
}
}
};
######################################################################
+#########
cmpthese(timethese(-60, \%subs));
lodin