Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^2: Challenge: Fast Common Substrings

by blokhead (Monsignor)
on Apr 04, 2007 at 16:02 UTC ( #608313=note: print w/ replies, xml ) Need Help??


in reply to Re: Challenge: Fast Common Substrings
in thread Challenge: Fast Common Substrings

++ Wow, thank you for introducing me to suffix trees. What an interesting concept, and how refreshing to see a linear-time algorithm for constructing such a creature. I see you've used the javascript applet at this page, which others may want to check out.

However, I'd like to slightly revise the algorithm you outlined. Consider the following example:

string = ababc%bc$ | |(3:abc%bc$)|leaf |(1:ab)| | |(5:c%bc$)|leaf tree:| | |(3:abc%bc$)|leaf |(2:b)| | | |(6:%bc$)|leaf | |(5:c)| | | |(9:$)|leaf | | |(6:%bc$)|leaf |(5:c)| | |(9:$)|leaf | |(6:%bc$)|leaf | |(9:$)|leaf
"ab" appears twice in the first string, and so it gives a node with two leaves. The actual condition you should check is whether a node has one leaf containing the % separator and another leaf without the % symbol.

blokhead


Comment on Re^2: Challenge: Fast Common Substrings
Download Code
Re^3: Challenge: Fast Common Substrings
by lima1 (Curate) on Apr 04, 2007 at 16:25 UTC
    Or even easier: check the positions of the substrings (<=7 and > 7 in my example).
Re^3: Challenge: Fast Common Substrings (O(n)?)
by tye (Cardinal) on Apr 04, 2007 at 21:53 UTC

    The page you link to mentions being able to build them in O(n) but then only really describes how to go from a suffix tree for string $x to one for string $x.$c (1==length$c) in O(length $x). Using that algorithm would require O(N*N) to build the suffix tree for a string of length N.

    So I'm not sure I believe the O(N) claim for building the whole suffix tree based on that page.

    - tye        

      The naive algorithm requires O(N*N). The Ukkonen algorithm needs only O(N). If you want to understand it - it is not trivial - I recommend Gusfields book (Algorithms on Strings,...).

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://608313]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (8)
As of 2014-10-21 06:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (97 votes), past polls