Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

Re: finding longest common substring

by QM (Parson)
on Nov 19, 2003 at 23:06 UTC ( #308439=note: print w/replies, xml ) Need Help??

in reply to finding longest common substring

See QOTW Expert #14 for a similar discussion, which may prove helpful (or not).

Quantum Mechanics: The dreams stuff is made of

Replies are listed 'Best First'.
Re: finding longest common substring
by thospel (Hermit) on Nov 20, 2003 at 14:27 UTC
    QOTW 14 solves a slightly different problem, the longest repeated substring in a single string which is somewhat related

    Here is my suffix trees based QOTW entry updated for LCS. In a sense this is the fastest type of algorithm possible, since it's guaranteed linear in both time and space relative to the combined string length (add one termination character to each string, and actually there is also O(number_of_strings) due to the way I currently mark the strings, though that can be improved upon) Unfortunately the needed complex datastructures make it use so much memory and need so much setup that many of the clever solutions in QOTW 14 can be converted to effectively faster solutions.

      I was referred to your node above from my earlier request about Generalized Suffix Tree.
      Since I am very interested to the use of it for bioinformatics purpose.

      I just want to clarify:
      • Does you implementation above covers the "generalized" suffix tree?
        Or is it the same with S.Yona's module?
      • If so, do you have a CPAN module for it?
      • Otherwise, I would really appreciate if you can kindly point me where can I find the actual Perl implementation for it, if you happen to know one.

      Thanks so much beforehand.
      Hope to hear from you again.

        No, it's not a generalized suffix tree. What I implemented here is basically the suffixtree of the concatenation of the strings, which has the same sort of space and time complexity for operations as a generalized suffix tree.

        Note that I don't expect that a pure perl version of suffix trees will be very usable for huge strings since the memory overhead of the datastructures will just be too much. This stuff needs an XS module.

Re: Re: finding longest common substring
by duff (Parson) on Nov 19, 2003 at 23:49 UTC

    In fact, my unsubmitted entry to QOTW #14 is where my repeated_substring() routine came from

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://308439]
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others surveying the Monastery: (3)
As of 2021-01-16 08:25 GMT
Find Nodes?
    Voting Booth?