in reply to Finding repeat sequences.
Consider the following:
use strict;
use warnings;
# 1 2 3 4 5
+ 6
# 12345678901234567890123456789012345678901234567890123456
+7890
my $search = 'abcdefabcababeabcdefabcababeabcdefabcababeabcdefabcababe
+abcd';
print "searching '$search'\n length=" . length($search) . "\n";
# FIRST, FIND THE LENGTH OF THE TAIL: THIS IS THE SHORTEST SUBSTRING
# WHICH MATCHES THE START OF THE STRING.
# TRY TO EXPAND THIS NUMBER AS RAPIDLY AS POSSIBLE.
my $tail_length = 1;
my $tail_step = int(length($search) / 2);
while ($tail_step > 0) {
$tail_length += $tail_step
while ( substr($search, 0, $tail_length + $tail_step)
eq substr($search, ($tail_length+$tail_step),
($tail_length+$tail_step)));
$tail_step = int($tail_step / 2);
}
print "tail is '" . substr($search, 0, $tail_length) .
"' length=$tail_length\n";
# ===
my $body_length = length($search)  $tail_length;
print "body is '$body_length' characters long ... \n";
# THE BODY MUST CONTAIN AN INTEGRAL NUMBER OF OCCURRENCES
# OF THE STRING.
# THEREFORE, FIND THE LONGEST ONE THAT OCCURS TWICE.
my $longest = $body_length;
my $n = $body_length  1;
while ($n > 1) {
if ( ($body_length % $n) == 0) {
print "try $n\n";
if ( substr($search, 0, $n) eq substr($search, $n, $n) ) {
$longest = $n;
last;
}
}
$n;
}
print "longest string '" . substr($search, 1, $longest) .
"' length=$longest\n";
The “key” to this puzzle, as I read it, is the tail. This is the shortest string at the end of the string which matches the beginning of the string. The example code tries aggressively to find that number by adding poweroftwosmaller successive increments as many times as it can. (Note: might this introduce a flaw, vs. a simple 2up count?)
The second part of the algorithm now tries to find the longest string which occurs an integral number of times in the leading (repeated ...) portion. We know that the length of this string must be a mathematical factor, i.e. (length mod factor == 0). Only the first and second occurrence must be considered. If none can be found, the string consists of one nonrepeated occurrence.
Re^2: Finding repeat sequences. by sundialsvc4 (Abbot) on Jun 19, 2013 at 04:22 UTC 
for $t from 1 to int(length(string) / 2):
next unless the first $t chars equal the last $t;
# i.e. "$t" must be a plausible tailsize.
$r = length  $t;
$incr = int($r / 2);
while( $incr > 0) {
next unless ($incr % $r == 0);
# i.e. must be a possible repeatedblock size in this space
solution is found if string(0, $incr) eq string($incr, $incr);
# i.e. if we find one repetition then it must be the answer.
$incr;
}
}
As before, start by looking for a tail. If you think you found one, look for repetitions in possiblysized increments of the remaining string.
This approach will consider all successivelylarger candidates for the "tail," up to and including the largest tail, which is "half of it." For each tailsize, it looks for one repetition of all possible sizes which would fit evenly within this area, knowing that the lefthand portion is defined to consist only of repetitions. Continue for all tailsizes, and for each of these, all repsizes, until the first success is found.
It works only for data which does, in fact, match this assertion.
 [reply] [d/l] 

$s = 'abcdabcdabceabcdabcdabceab';
for (1..length($s)1)
{
print substr($s,0,$_) and last if substr($s,$_) eq substr($s,0
+,length($s)$_);
}
abcdabcdabce  [reply] [d/l] 

I think that the pseudocoded solution is the strongest one, if it is made to stop after finding the longest possible substring (then last), and if it is allowed to find all possible (reasonablylong) tails, once again starting with the longest one. (These would be what a human being would probably consider to be the best “right answers.”)
With veryshort tails, as written it might produce wrong answers if it considers only the first two occurrences (consider string 'aaba' if the tail were merely 'a' .. incorrect). But the essential idea, I think, is still valid.
One reason why I wrote it this way was in an effort to avoid “churning” memory when dealing with exceptionallylong strings. You don’t need to consider any string that isn’t a tail, nor, within the head, any repeatedstring candidate that won’t fit. That subdivides the problem into two searches, both of which have only a few possibilities each. It might not yet be bugfree, but it ought to be close . . .
 [reply] 
