Which method are you talking about, with respect to those 5 seconds? The XOR method outlined above takes just ~0.05 secs on my 4-year-oldish machine, for 10,000 comparisons against a common 40-char target (with ~3-5 deviations per sequence):
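use strict;
use warnings;

# Build a random 40-char target and 10,000 test sequences derived from it.
my @set = qw(A T C G);
my $target = join "", map $set[rand @set], 1..40;
my @tests;
for (1..10000) {
    my $test = $target;
    # Up to 5 point substitutions (a draw may repeat a position or pick the same letter).
    for (1..5) {
        substr($test, rand(length($target)), 1) = $set[rand @set];
    }
    push @tests, $test;
}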
use Time::HiRes qw(time);
my $start = time();
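# Reverse lookup table: XOR byte => "from->to".
# Masking the target side with tr/ATCG/HRDZ/ makes all 16 XOR values distinct.
my %change;
for my $t1 (@set) {
    my $t = $t1;
    $t =~ tr/ATCG/HRDZ/;
    for my $t2 (@set) {
        $change{ $t ^ $t2 } = "$t1->$t2";
    }
}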
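$target =~ tr/ATCG/HRDZ/;
for my $test (@tests) {
    my $diff = $target ^ $test;
    # \x09 \x06 \x07 \x1d are H^A, R^T, D^C, Z^G, i.e. the unchanged positions.
    while ($diff =~ /([^\x09\x06\x07\x1d])/g) {
        my $pos = pos($diff);        # offset just past the match = 1-based position
        my $change = $change{$1};    # e.g. "A->T"
        # do something with them...
    }
}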
printf "%.3f secs\n", time() - $start;
__END__
0.052 secs
Storing away the results somewhere or doing something else with them will presumably take considerably longer than computing them...
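
In case the tr/ATCG/HRDZ/ step looks arbitrary: as far as I can tell, it is there because a plain string XOR cannot tell A->T from T->A (both give the same byte), whereas with the target masked all 16 letter pairs XOR to different bytes. Here is a minimal sketch of the same idea on a single pair of short sequences. The sub name xor_diffs and the two sample strings are made up for illustration, and it assumes both strings have the same length and contain only A, T, C, G.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the script above): return the substitutions
# that turn $target into $test as [1-based position, "from->to"] pairs.
sub xor_diffs {
    my ($target, $test) = @_;

    # Reverse lookup table: XOR byte => "from->to".
    my @set = qw(A T C G);
    my %change;
    for my $from (@set) {
        (my $masked = $from) =~ tr/ATCG/HRDZ/;
        for my $to (@set) {
            $change{ $masked ^ $to } = "$from->$to";
        }
    }
    # Sanity check: the masking only works if all 16 XOR bytes are distinct.
    die "XOR codes collide" unless keys(%change) == 16;

    # Mask a copy of the target, then XOR it against the test string.
    (my $masked_target = $target) =~ tr/ATCG/HRDZ/;
    my $diff = $masked_target ^ $test;

    my @found;
    # Skip the four bytes that mean "same letter" (H^A, R^T, D^C, Z^G).
    while ($diff =~ /([^\x09\x06\x07\x1d])/g) {
        push @found, [ pos($diff), $change{$1} ];
    }
    return @found;
}

# Made-up sample data: positions 5 and 8 differ.
my @found = xor_diffs("ATCGATCG", "ATCGTTCA");
printf "%d: %s\n", @$_ for @found;
__END__
5: A->T
8: G->A
```

Note that pos() reports the offset just past the match, so the positions come out 1-based; subtract 1 if you want substr-style offsets.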