in reply to Analysing a (binary) string. (Solved)
- It is impossible for the string not to start with a complete copy of the pattern. Your example 'bcd abcd abcd ab' is really 'bcda bcda bcda b', because any rotation of the pattern is itself a repeating pattern. So without loss of generality you can always assume that the string starts with the pattern.
- Because of the occasional errors in the string, you need to replace equality with a check that tolerates a few differences, e.g. for some offset $i and a given $tolerance:
my $n = length $$strref;
# rotate the string by $i characters and XOR it with the original
my $diff = $$strref ^ ( substr( $$strref, -$i ) . substr( $$strref, 0, -$i ) );
# count the non-NUL bytes, i.e. the positions where the rotated string differs
my $ndiff = $diff =~ tr/\0//c;
return 1 if $ndiff / $n < $tolerance;
- A brute-force method (iterating over all offsets) probably has quadratic runtime as a function of string length, but it may stop early most of the time, provided the repeating pattern is not too long compared to the overall string and the tolerance is not set too small.
- The skip-ahead method from earlier discussions (Finding repeat sequences.) is not reliable because of the errors, but tye has already proposed an alternative.
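For what it's worth, the tolerant comparison and the brute-force scan over offsets described above could be put together roughly as follows. This is only a sketch: the sub names rotation_mismatch and find_period, the sample strings and the tolerance of 0.2 are made up for illustration and are not from the original thread.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of positions at which the string differs
# from itself rotated by $i characters. Assumes 0 < $i < length of string.
sub rotation_mismatch {
    my ( $strref, $i ) = @_;
    my $n = length $$strref;
    # move the last $i characters to the front
    my $rotated = substr( $$strref, -$i ) . substr( $$strref, 0, -$i );
    # both operands are strings, so ^ is a bytewise string XOR;
    # equal characters give "\0", differing ones give a non-NUL byte
    my $diff  = $$strref ^ $rotated;
    my $ndiff = ( $diff =~ tr/\0//c );
    return $ndiff / $n;
}

# Hypothetical brute-force scan: try offsets in increasing order and
# return the first one whose mismatch ratio is below $tolerance.
sub find_period {
    my ( $strref, $tolerance ) = @_;
    my $n = length $$strref;
    for my $i ( 1 .. $n - 1 ) {
        return $i if rotation_mismatch( $strref, $i ) < $tolerance;
    }
    return;    # no offset within tolerance
}

my $clean = ( 'abcd' x 10 ) . 'ab';    # 42 characters, period 4
my $noisy = $clean;
substr( $noisy, 5, 1 ) = 'x';          # inject one error

print find_period( \$clean, 0.2 ), "\n";    # prints 4
print find_period( \$noisy, 0.2 ), "\n";    # prints 4
```

Note that when the string length is not a multiple of the period, the wrap-around at the start of the rotated string can contribute up to $i mismatches on its own (4 of 42 in the clean sample). The tolerance therefore has to leave room for that, which is one reason the approach works best when the pattern is short compared to the whole string.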