Sorry. The code is flawed. It does produce all the common substrings, but it will often select the wrong "longest".
The problem occurs because if a substring occurs twice in one of the input strings, and not at all in one of the others, it's count will be the same as if it had appeared once in both, The selection mechanism, the longest key who's count is equal to the number of input strings is bogus, but suffuciently convincing that it worked for all 5 sets of test data I tried it on!
I'm trying to think of an efficient way of counting how many of the original strings each substring is found in, but the only one I've come up with so far would limit the number of input strings to 32. A couple of other ideas I tried worked, but carry enough overhead to make the method less interesting.
I'll keep looking at it, but maybe my "surprise at the simplicity and efficiency" was the red flag that should have told me that I was missing something! Still, nothing ventured, nothing gained.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
Hooray!
Wanted!
|