Syntactic Confectionery Delight PerlMonks

### Re: finding longest common substring

by duff (Parson)
 on Nov 19, 2003 at 22:12 UTC ( #308424=note: print w/replies, xml ) Need Help??

in reply to finding longest common substring

One method that might work well enough would be to concatenate the strings and then look for the longest non-overlapping repetition. You'll have to worry about string boundaries though. Here's some code for finding the longest repeated substring in case you take to the idea:

```sub repeated_substring {
my (\$ssl,\$pos) = (1,-1);
my \$len = length(\$_[0]);
my \$i = 0;
while (\$i < \$len-2*\$ssl) {
if (index(\$_[0],substr(\$_[0],\$i,\$ssl),\$i+\$ssl) == -1) { \$i++; ne
+xt }
\$pos = \$i; \$ssl++;
}
return \$pos == -1 ? "" : substr(\$_[0], \$pos, \$ssl-1);
}

If you joined the strings with some character that won't appear in the strings (a colon say), then you could modify the above such that as soon as you hit a colon, stop

Update: I just noticed that you're blindly grabbing the first element in the list. An optimization would be to sort the list of strings by length and always start with the shortest one (assuming you continue using your method).

DOH! I just realized that my method won't work at all!

Update: Okay ... I'm stubborn. I know it. Here's how to *make* it work with the repeated_substring() routine:

```sub findlcs {
my @ret;
for my \$i (0..\$#_-1) {
for my \$j (\$i..\$#_) {
my \$str = join ":", @_[\$i,\$j];
my \$ans = repeated_substring(\$str);
push @ret, \$ans;
}
}
return (sort { length(\$a) <=> length(\$b) } @ret)[0];
}
Whew! Boy is that inefficient and ugly! :-)

revdiablo, you did it well. Don't knock your implementation.

Another! update: I realized on my way to pick up the kids that even this method fails. I sure hope revdiablo is watching this to see what could have happened to him :-)

Create A New User
Node Status?
node history
Node Type: note [id://308424]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (5)
As of 2020-11-24 09:05 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?

No recent polls found

Notices?