PerlMonks  

What is not clear to me from your description is whether you are looking for the longest substring with at least one repeat, or whether you are looking for the arbitrary length substring with the highest repeat count, or whether you are looking for the substring which, along with its (adjacent?) repeats comprises the longest length, or something else. Can you provide some more information and examples?

I thought I had described the problem exactly. Constructing examples is hard -- I have a program running (for 4+ hours now) generating controlled random strings and trying to find exceptional cases.

I'll try the description (unsatisfactory) again.

The complete string will consist of, and only of, one or more repetitions of a substring, but the last repetition may be truncated. In code:

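    my $substring = getsubstring();
    my $string    = $substring x int( rand $N );

    # Trim fewer than length($substring) characters from the end, but only
    # when more than one copy is present. A trim of 0 is skipped, because
    # substr( $string, -0 ) would select (and delete) the whole string.
    my $trim = int( rand length( $substring ) );
    substr( $string, -$trim ) = ''
        if $trim and length $string > length $substring;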

That is, all these are valid strings and all have 'fred' as their substring:

fredf fredfr fredfre fredfred fredfredf fredfredfr fredfredfre
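The rule above can be checked mechanically. What follows is only a minimal sketch, not code from the thread: `is_truncated_repeat` is a hypothetical helper name, and it assumes (as the examples suggest) that at least one complete copy of the substring must be present before any truncated tail.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns 1 if $string
# is one or more full copies of $sub followed by an optional proper
# prefix of $sub, and 0 otherwise.
sub is_truncated_repeat {
    my ( $string, $sub ) = @_;
    my $len = length $sub;

    # Assumption: an empty substring, or a string shorter than one full
    # copy, is not considered valid.
    return 0 if $len == 0 or length( $string ) < $len;

    # Build just enough copies to cover $string, then compare prefixes.
    my $copies = int( length( $string ) / $len ) + 1;
    return substr( $sub x $copies, 0, length $string ) eq $string ? 1 : 0;
}

# Try a few of the strings listed above, plus two that break the rule.
print "$_: ", is_truncated_repeat( $_, 'fred' ) ? "valid" : "invalid", "\n"
    for qw( fredf fredfred fredfredfre fre fredx );
```

This only verifies a known candidate; it does not discover the substring when it is unknown.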

With regard to suffix trees, I feel I would probably need a prefix tree (Trie) instead, but these strings can be very long and every implementation of Trie I've seen would not handle them.
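For what it is worth, one tree-free alternative is the prefix (failure) function from Knuth-Morris-Pratt matching, which is a different technique from the Trie mentioned above and is not something proposed in this post. The sketch below is hedged accordingly: `shortest_period` is a hypothetical name, it returns the shortest prefix whose repetition (with a possibly truncated last copy) reproduces the string, and it keeps one integer per character in a Perl array, which may itself be too much memory for very long strings.

```perl
use strict;
use warnings;

# Hypothetical helper (an illustration, not the poster's method): returns
# the shortest prefix $p of $s such that $s is a prefix of $p repeated.
sub shortest_period {
    my ( $s ) = @_;
    my $n = length $s;
    return '' if $n == 0;

    # $fail[$i] is the length of the longest proper prefix of
    # substr( $s, 0, $i + 1 ) that is also a suffix of it.
    my @fail = ( 0 ) x $n;
    my $k = 0;
    for my $i ( 1 .. $n - 1 ) {
        my $c = substr( $s, $i, 1 );
        # Fall back to shorter borders until the next character matches.
        $k = $fail[ $k - 1 ] while $k > 0 and $c ne substr( $s, $k, 1 );
        $k++ if $c eq substr( $s, $k, 1 );
        $fail[$i] = $k;
    }

    # The shortest period is the string length minus its longest border.
    return substr( $s, 0, $n - $fail[ $n - 1 ] );
}

print shortest_period( $_ ), "\n" for qw( fredf fredfred fredfredfre );
```

Note that this yields the shortest candidate: if the generating substring is itself periodic (for example 'abab'), the result is the shorter unit ('ab'), and a single untruncated copy such as 'aaba' yields 'aab'.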


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^2: Finding repeat sequences. by BrowserUk
in thread Finding repeat sequences. by BrowserUk
