PerlMonks  

Re: Finding repeat sequences.

by LanX (Canon)
on Jun 21, 2013 at 01:03 UTC


in reply to Finding repeat sequences.

Still struggling to understand the task...

is

    $str = ($pattern x $n ) . substr($pattern,0,$k)

with

    0 <= $k < ($l = length($pattern))

and the task is to find the maximum $pattern for a given $str that fits these constraints?

If yes, some simple mathematics should already considerably minimize the set of possible combinations you need to investigate with regexes.

Cheers Rolf

( addicted to the Perl Programming Language)

test
    DB<109> $pattern='abcdabcdabce'
     => "abcdabcdabce"
    DB<110> $n=2,$k=2
     => (2, 2)
    DB<111> $str = ($pattern x $n ) . substr($pattern,0,$k)
     => "abcdabcdabceabcdabcdabceab"
    DB<112> $str eq 'abcdabcdabceabcdabcdabceab'
     => 1


Re^2: Finding repeat sequences.
by BrowserUk (Pope) on Jun 21, 2013 at 01:24 UTC
    and the task is to find maximum $pattern to fit these constraints?

    Um. I cannot see any errors in that. So yes.

    If yes, some simple mathematics should already considerably minimize the set of possible combinations you need to investigate with regexes.

    Hm. A realistic, but relatively small, example from my test harness:

    b:64000 in s: 640028748 hdb :: 24.290438 s

    L=64,000, N=10,000, K=28,740.

    But those could equally well be: L=16,000, N = 40,001, K=12,740; or (thousands*) of other permutations.

    I don't think it helps.

    (*I'm being very, very conservative; my best guess is 100s of millions.)
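
    To make the arithmetic concrete, here is a minimal sketch using only the L, N and K figures quoted above. The helper names `total_length` and `split_for` are made up for illustration; they are not part of any test harness in this thread.

```perl
use strict;
use warnings;

# Total length implied by an (L, N, K) triple:
# N full repeats of an L-char pattern plus a K-char tail.
sub total_length {
    my ($l, $n, $k) = @_;
    return $l * $n + $k;
}

# For a fixed total length $m, choosing a pattern length $l fixes N and K.
sub split_for {
    my ($m, $l) = @_;
    return (int($m / $l), $m % $l);
}

my $m = total_length(64_000, 10_000, 28_740);
printf "m = %d\n", $m;

for my $l (64_000, 16_000) {
    my ($n, $k) = split_for($m, $l);
    printf "L=%d N=%d K=%d\n", $l, $n, $k;
}
```

    Both triples add up to the same total length, and every other candidate L yields its own (N, K) pair, so the length alone does not narrow the search.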


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Here is a regex solution that works for the shortest possible tail of length k:

          DB<127> $str
           => "abcdabcdabceabcdabcdabceab"
          DB<128> $str=~/^((.+?).*)\2$/; $rest=$1, $tail=$2
           => ("abcdabcdabceabcdabcdabce", "ab")
          DB<129> $rest =~ /^(.+?)\1*$/; $1
           => "abcdabcdabce"

      needs to be extended for longer possible tails.
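
      As a standalone script, the two debugger steps above might look roughly like this. The wrapper name `pattern_by_regex` is invented for illustration; the regexes are the ones from the session, unchanged.

```perl
use strict;
use warnings;

# Hypothetical wrapper (name is illustrative) around the two regexes above.
sub pattern_by_regex {
    my ($str) = @_;

    # Step 1: $tail is the shortest non-empty prefix that also ends $str;
    # $rest is everything before that final copy of $tail.
    my ($rest, $tail) = $str =~ /^((.+?).*)\2$/
        or return;

    # Step 2: $pattern is the shortest unit that tiles $rest exactly.
    my ($pattern) = $rest =~ /^(.+?)\1*$/
        or return;

    return ($pattern, $tail);
}

my ($pattern, $tail) = pattern_by_regex('abcdabcdabceabcdabcdabceab');
print "pattern=$pattern tail=$tail\n";   # prints: pattern=abcdabcdabce tail=ab
```

      The limitation shows up with a string such as 'aabaabaa' ('aab' twice plus the tail 'aa'): the shortest tail found is 'a', and the second regex then no longer finds 'aab'. Strings with an empty tail ($k == 0) are not handled specially either.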

      But taking the dimensions of your data I doubt that regexes are appropriate.

      You could test every candidate $pattern that repeats at least once (or x times): calculate $k = $m % $l with $m = length($str), check whether $str starts and ends with the same substring $tail of length $k, and then check whether the pattern keeps repeating.

      Or start by eliminating all possible $tails and check whether the length $l of a repeating pattern is a divisor of the length of $rest.
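
      A minimal sketch of the first idea without regexes, assuming the goal is the shortest $pattern. The function name `shortest_pattern` is made up here, and the full comparison builds a copy of the whole string, so this is only meant to show the logic, not to cope with data of the size mentioned earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: find the shortest $pattern such that
#   $str eq ($pattern x $n) . substr($pattern, 0, $k)
# with 0 <= $k < length($pattern). Returns ($pattern, $n, $k).
sub shortest_pattern {
    my ($str) = @_;
    my $m = length $str;

    for my $l (1 .. $m) {
        my $n = int($m / $l);   # number of full repeats
        my $k = $m % $l;        # length of the partial tail

        # Cheap filter: $str must start and end with the same $k characters.
        next if $k && substr($str, 0, $k) ne substr($str, $m - $k);

        # Full check: does the candidate really keep repeating?
        my $pattern = substr($str, 0, $l);
        return ($pattern, $n, $k)
            if $str eq ($pattern x $n) . substr($pattern, 0, $k);
    }
    return;   # only reached for the empty string
}

my ($pattern, $n, $k) = shortest_pattern('abcdabcdabceabcdabcdabceab');
print "pattern=$pattern n=$n k=$k\n";   # prints: pattern=abcdabcdabce n=2 k=2
```

      Because $l == $m always passes with $n == 1 and $k == 0, the loop returns something for every non-empty string.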

      Had no time to check all the other posted solutions and don't wanna reinvent the wheel, so I better stop here! =)

      HTH

      Cheers Rolf

      ( addicted to the Perl Programming Language)

        That looks suspiciously like a close variation on choroba's attempt.

        Had no time to check all the other posted solutions and don't wanna reinvent the wheel,

        All the tested solutions, along with how they fared in my test harness, are nicely grouped together in Re: Finding repeat sequences. (Results:Part 1).


Re^2: Finding repeat sequences.
by hdb (Parson) on Jun 21, 2013 at 10:54 UTC

    It is to find the shortest pattern, otherwise $n==1 always.

    Correction: replaced $n=1 with $n==1
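
    A small illustration of why "maximum" degenerates, using the test string from earlier in the thread. The helper name `fits` is hypothetical; it only checks the constraint from the first post.

```perl
use strict;
use warnings;

# Hypothetical check: is $str equal to $pattern repeated as often as it fits,
# followed by a proper prefix of $pattern?
sub fits {
    my ($str, $pattern) = @_;
    my $l = length $pattern or return 0;
    my $n = int(length($str) / $l);
    my $k = length($str) % $l;
    return $str eq ($pattern x $n) . substr($pattern, 0, $k) ? 1 : 0;
}

my $str = 'abcdabcdabceabcdabcdabceab';

# The whole string always fits, with $n == 1 and $k == 0 ...
print "whole string fits\n" if fits($str, $str);
# ... so only the shortest fitting pattern is interesting.
print "abcdabcdabce fits\n" if fits($str, 'abcdabcdabce');
print "abcd does not fit\n" unless fits($str, 'abcd');
```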
