small improvements:
sub test_regexps2 {
my ($seq1, $seq2, $window, $mismatch) = @_;
my $retval = '';
my %cache;
my @mask = (0) x (length ($seq2) - $window);
for my $start (0 .. (length ($seq1) - $window)) {
my $part = substr ($seq1, $start, $window);
$retval .= $cache{$part} ||= do {
my $regex = build_regexp ($part, $mismatch);
my @res = @mask;
while ($seq2 =~ m/(?=$regex)/g) {
$res[ pos $seq2 ] = 1;
}
join '', @res, "\n";
};
}
$retval;
}
benchmarks for length 200, 400 and 600
200:
Rate orig_poster test_regexps test_regexps2
orig_poster 9.70/s -- -51% -81%
test_regexps 19.6/s 103% -- -61%
test_regexps2 49.8/s 413% 153% --
400:
Rate orig_poster test_regexps test_regexps2
orig_poster 2.44/s -- -52% -84%
test_regexps 5.08/s 109% -- -67%
test_regexps2 15.5/s 535% 205% --
600:
Rate orig_poster test_regexps test_regexps2
orig_poster 1.06/s -- -54% -86%
test_regexps 2.30/s 117% -- -70%
test_regexps2 7.75/s 633% 237% --