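Mapping the characters of one of the strings to a different alphabet before the XOR works around the problems with the plain XOR approach. For example, with tr/ATCG/HRDZ/ each of the 16 possible (target, other) base pairs XORs to a distinct byte, so one byte of the result identifies both that a position differs and which substitution occurred: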
my %change;    # reverse lookup table: XOR byte => "target->other"
my @t = qw(A T C G);
for my $t1 (@t) {
    for my $t2 (@t) {
        my $t = $t2;
        $t =~ tr/ATCG/HRDZ/;    # remap the second base, exactly as diff() does
        $change{ $t1 ^ $t } = "$t1->$t2";
    }
}
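sub diff {
    my ($target, $str) = @_;
    $str =~ tr/ATCG/HRDZ/;
    my $diff = $target ^ $str;    # bytewise string XOR
    # \x09, \x06, \x07, \x1d are A^H, T^R, C^D, G^Z, i.e. positions where the bases match
    while ($diff =~ /([^\x09\x06\x07\x1d])/g) {
        # after a one-character match, pos() equals its 1-based position
        printf " %d: %s\n", pos($diff), $change{$1};
    }
}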
my $target = "ATTCCGGG";
for (qw(ATTGCGGG ATACCGGC)) {
    print "\n$target\n$_\n";
    diff($target, $_);
}
__END__

ATTCCGGG
ATTGCGGG
 4: C->G

ATTCCGGG
ATACCGGC
 3: T->A
 8: G->C
The reverse lookup table just needs to be set up once, and the remaining operations (string bit operation, tr///, m//g) should all be pretty fast.
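The table is only unambiguous if no two base pairs produce the same XOR byte. A quick way to confirm that for tr/ATCG/HRDZ/ is sketched below; the names %seen and $collisions are illustrative and not part of the code above.

```perl
use strict;
use warnings;

my @bases = qw(A T C G);
my %seen;               # XOR byte => the "target->other" pair that produced it
my $collisions = 0;     # incremented if two pairs ever share an XOR byte

for my $t1 (@bases) {
    for my $t2 (@bases) {
        (my $mapped = $t2) =~ tr/ATCG/HRDZ/;    # same mapping as in diff()
        my $key = $t1 ^ $mapped;                # string XOR of two one-character strings
        $collisions++ if exists $seen{$key};
        $seen{$key} = "$t1->$t2";
    }
}

printf "%d distinct XOR bytes, %d collisions\n", scalar(keys %seen), $collisions;
# prints "16 distinct XOR bytes, 0 collisions"
```

The same loop can be used to vet any other candidate mapping: change the tr/// and check that the collision count stays at zero.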
(As the keys in the lookup table are single characters with ordinal values < 256, you could in theory also set up an array and use ord() of the XOR character as the index.)
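A minimal sketch of that array variant follows. It assumes both strings have equal length and contain only A, T, C and G; the helper name diff_positions is hypothetical, and it returns the report lines instead of printing them.

```perl
use strict;
use warnings;

my @bases = qw(A T C G);
my @change;    # indexed by ord() of the XOR byte; unused slots stay undef

for my $t1 (@bases) {
    for my $t2 (@bases) {
        next if $t1 eq $t2;                     # matching bases are never looked up
        (my $mapped = $t2) =~ tr/ATCG/HRDZ/;
        $change[ ord($t1 ^ $mapped) ] = "$t1->$t2";
    }
}

# Hypothetical helper: returns one "position: from->to" string per mismatch.
sub diff_positions {
    my ($target, $str) = @_;
    $str =~ tr/ATCG/HRDZ/;
    my $diff = $target ^ $str;
    my @out;
    while ($diff =~ /([^\x09\x06\x07\x1d])/g) {
        push @out, sprintf("%d: %s", pos($diff), $change[ ord($1) ]);
    }
    return @out;
}

print "$_\n" for diff_positions("ATTCCGGG", "ATACCGGC");
# prints "3: T->A" then "8: G->C"
```

Whether the array lookup is measurably faster than the hash is not something this sketch establishes; with only twelve entries the difference is likely small, so benchmark before switching.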