in reply to
Re^5: counting the number of 16384 pattern matches in a large DNA sequence
in thread counting the number of 16384 pattern matches in a large DNA sequence
an untested variation:
# grab each maximal run of valid bases, then count every 7-mer inside it
while (/([ACGT]{7,})/g) {
    for my $ix (0 .. length($1) - 7) {
        ++$index{substr($1, $ix, 7)};
    }
}
This regular expression should process every character in the string just once, so it could be an order of magnitude faster than yours, which tries to match the look-ahead pattern at every character.
But that is just guessing... could you benchmark it?
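In case it helps, here is a rough sketch of how the two approaches could be compared with the core Benchmark module. The look-ahead version is only my assumption of what your code looks like, and the random test sequence and the sub names (count_lookahead, count_runs) are made up for illustration, so swap in your real data and pattern.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a random 100_000-character sequence with an
# occasional N, so that the runs of [ACGT] are actually interrupted.
my @alphabet = (qw(A C G T)) x 25;
push @alphabet, 'N';
my $seq = join '', map { $alphabet[rand @alphabet] } 1 .. 100_000;

# Assumed form of the original approach: a look-ahead tried at every position.
sub count_lookahead {
    my ($dna) = @_;
    my %index;
    ++$index{$1} while $dna =~ /(?=([ACGT]{7}))/g;
    return \%index;
}

# The variation from this post: grab each maximal run, then slide a window.
sub count_runs {
    my ($dna) = @_;
    my %index;
    while ($dna =~ /([ACGT]{7,})/g) {
        my $run = $1;
        for my $ix (0 .. length($run) - 7) {
            ++$index{substr($run, $ix, 7)};
        }
    }
    return \%index;
}

# Sanity check first: both subs should produce identical counts.
my ($a_counts, $b_counts) = (count_lookahead($seq), count_runs($seq));
my $same = keys(%$a_counts) == keys(%$b_counts)
    && !grep { ($b_counts->{$_} // 0) != $a_counts->{$_} } keys %$a_counts;
print $same ? "counts agree\n" : "counts differ\n";

# Run each sub for at least one CPU second and print a comparison table.
cmpthese(-1, {
    lookahead => sub { count_lookahead($seq) },
    runs      => sub { count_runs($seq) },
});
```

The numbers will depend on the perl version and on how often the real sequence contains characters outside ACGT, so treat the result on random data as a hint only.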