Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Re: Analysing a (binary) string. (Solved)

by hexcoder (Chaplain)
on Jun 28, 2013 at 13:36 UTC ( #1041254=note: print w/replies, xml ) Need Help??

in reply to Analysing a (binary) string. (Solved)


just for completeness (since its solved). A suitable data structure could be that of a suffix tree or a suffix array. It allows finding the longest repeating substring(s) in linear time.

I found this page instructive:

I am using it in the detection of cut-and-paste code.

  • Comment on Re: Analysing a (binary) string. (Solved)

Replies are listed 'Best First'.
Re^2: Analysing a (binary) string. (Solved)
by BrowserUk (Pope) on Jun 28, 2013 at 14:08 UTC

    Two questions:

    1. Have you ever built a suffix (or any kind of) tree in Perl with 30 million nodes?
    2. What effect do you think the presence of errors in the repeats would have upon the suffix trees ability to detect them?

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      1. Not yet, but a suffix array has lower space requirements than a suffix tree.

      2. In Dan Gusfield's book 'Algorithms on Strings, Trees, and Sequences' several inexact matching solutions based on suffix trees are discussed (for example in chapter 12.2). If there are up to k errors without insertions and deletions, it is the k-mismatch problem (see chap. 9.4) with a solution running in O(km), where k is the number of errors and m is the length of the string. Otherwise if the errors shall include insertions/deletions it is called the k-difference problem.

      Hope that helps.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1041254]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (11)
As of 2017-01-16 12:14 GMT
Find Nodes?
    Voting Booth?
    Do you watch meteor showers?

    Results (150 votes). Check out past polls.