Beefy Boxes and Bandwidth Generously Provided by pair Networks
"be consistent"
 
PerlMonks  

Re: Finding repeat sequences. (Results:Part 1)

by BrowserUk (Pope)
on Jun 19, 2013 at 16:17 UTC ( #1039799=note: print w/ replies, xml ) Need Help??


in reply to Finding repeat sequences.

Below are the results from my first pass -- verifying basic functionality -- of (my adaptions of) the 8 solutions from tye, choroba, DamianConway, tobyink, AnomalousMonk, hdb, Eily, sundialsvc4 in this thread:

C:\test>1039630-b.pl14 -SHOW=0 Looking for 'fredfre' in 'fredfrefredfr' tye found 'fredfrefred'; excluded from further consideration svc4 found 'redfrefredf'; excluded from further consideration Looking for 'fredfre' in 'fredfrefredf' Looking for 'fredfre' in 'fredfrefred' hdb found 'none found'; excluded from further consideration Looking for 'fredfre' in 'fredfrefre' Looking for 'fredfre' in 'fredfrefr' Looking for 'fredfre' in 'fredfref' Looking for 'fredfre' in 'fredfre' Looking for 'fredfre' in 'fredfrefredfrefredfr' choroba found 'fredfrefredfre'; excluded from further consideration Looking for 'fredfre' in 'fredfrefredfrefredf' Looking for 'fredfre' in 'fredfrefredfrefred' Looking for 'fredfre' in 'fredfrefredfrefre' Looking for 'fredfre' in 'fredfrefredfrefr' Looking for 'fredfre' in 'fredfrefredfref' Looking for 'fredfre' in 'fredfrefredfre' Eily found 'fredfrefred'; excluded from further consideration Partisipants in performance tests: anomalous tobyink damianc

If authors want to correct (my adaptions of) their solutions that's great, (but please don't moan at me If I screwed the pouch adapting them to subroutines :).

Here's the test harness:


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.


Comment on Re: Finding repeat sequences. (Results:Part 1)
Select or Download Code
Re^2: Finding repeat sequences. (Results:Part 2. The winner)
by BrowserUk (Pope) on Jun 19, 2013 at 18:06 UTC

    After exclusions on functionality, the three left standing -- DamianConway, tobyink & AnomalousMonk -- went forward to performance testing where the latter quickly fell by the wayside.

    Of the remaining two, tobyink's solution is hands down winner with a cumulative 66 seconds version DamianConway's 1670 seconds:

    ... my @bases = ( 'fred', join( '', 'a'..'z' ), unpack( 'b*', pack 'Q*', 0 .. 99 ), ); my %res; for my $base ( @bases ) { for my $reps ( 1, 10, 100, 1000 ) { my $str = $base x $reps . substr( $base, 0, rand( length $base + ) ); for my $test ( keys %tests ) { my $start = time; my $res; eval { local $SIG{ALRM} = sub { die "timeout\n" }; alarm $T * $reps; $res = $tests{ $test }->( \$str ) // 'none found'; alarm 0; }; if( $@ eq "timeout\n" ) { delete $tests{ $test }; warn "$test timed out [@{[ $T * $reps ]}]; excluded\n" +; next; } my $stop = time; warn "$test: Found '$res' instead of '$base'\n" unless $re +s eq $base; $res{ $test }{ length( $base ) }{ $reps } = $stop - $start +; $res{ $test }{ length( $base ) }{ all } += $stop - $start; $res{ $test }{ all } += $stop - $start; printf "b:%5u in s:%10u %10s :: %f s\n", length( $base ), length( $str ), $test, $stop - $start ; } } } pp \%res; __END__ [18:23:33.12] C:\test>1039630-b.pl14 -SHOW=0 -T=10 tye found 'fredfrefred'; excluded from further consideration svc4 found 'redfrefredf'; excluded from further consideration hdb found 'none found'; excluded from further consideration choroba found 'fredfrefredfre'; excluded from further consideration Eily found 'fredfrefred'; excluded from further consideration Partisipants in performance tests: anomalous tobyink damianc anomalous timed out [10]; excluded { anomalous => { 4 => { 1 => "0.000113964080810547", 10 => "0.000288963317871094 +", 100 => "0.00277900695800781", 1000 => "0.0275580883026123", all => "0.0307400226593018", }, 26 => { 1 => "0.0011751651763916", 10 => "0.00826501846313477", +100 => "0.118575096130371", 1000 => "4.02753186225891", all => "4.15554714202881", }, all => "4.18628716468811", }, damianc => { 4 => { 1 => "0.000234842300415039", 10 => "0.000370979309082031", + 100 => "0.000273942947387695", 1000 => "0.000956058502197266", all => "0.00183582305908203", }, 26 => { 1 => "0.000357866287231445", 10 => "0.000438928604125977" +, 100 => "0.00239300727844238", 1000 => "0.0216219425201416", all => "0.0248117446899414", }, 6400 => { 1 => "1.69514513015747", 10 => "12.9105410575867", 100 +=> "135.586049079895", 1000 => "1520.64571499825", all => "1670.83745026588", }, all => "1670.86409783363", }, tobyink => { 4 => { 1 => "0.000294923782348633", 10 => "0.000179052352905273", + 100 => "5.60283660888672e-005", 1000 => "5.10215759277344e-005", all => "0.000581026077270508", }, 26 => { 1 => "7.60555267333984e-005", 10 => "0.000123977661132813 +", 100 => "0.00015711784362793", 1000 => "0.000526189804077148", all => "0.000883340835571289", }, 6400 => { 1 => "0.071134090423584", 10 => "0.232213973999023", 10 +0 => "4.46547484397888", 1000 => "61.8524870872498", all => "66.6213099956512", }, all => "66.6227743625641", }, }

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Congrats to tobyink!

      Thanks to your efforts, BrowserUK, I realized my mistakes in code and thinking. Should you ever get around to re-run your tests, here is a corrected version, don't know whether it is faster then tobyink's.

      hdb => sub { my $input = shift; my $length = length $$input; my $i = 0; my $possible; my $j; while( 1 ) { $possible = substr $$input, 0, ++$i; $possible = substr $$input, 0, $i=$j if( ($j = index $$input, $p +ossible, $i) > 0 ); return $possible if substr( $$input, $i ) eq substr($$input, 0, +$length - $i); } },

        Update: (I sent this to hdb as a /msg; and then decided it belongs in here): That is blinding! I've been looking for a solution that avoided being O(length string). Now I can stop looking. Thankyou!

        Nice one++ Thank you :) (0.228s -v- 70.6s)

        Partisipants in performance tests: hdb tobyink b: 4 in s: 7 hdb :: 0.000154 s b: 4 in s: 7 tobyink :: 0.000169 s b: 4 in s: 43 hdb :: 0.000098 s b: 4 in s: 43 tobyink :: 0.000333 s b: 4 in s: 402 hdb :: 0.000156 s b: 4 in s: 402 tobyink :: 0.000190 s b: 4 in s: 4003 hdb :: 0.000250 s b: 4 in s: 4003 tobyink :: 0.000233 s b: 26 in s: 44 hdb :: 0.000126 s b: 26 in s: 44 tobyink :: 0.000104 s b: 26 in s: 274 hdb :: 0.000040 s b: 26 in s: 274 tobyink :: 0.000106 s b: 26 in s: 2614 hdb :: 0.000031 s b: 26 in s: 2614 tobyink :: 0.000086 s b: 26 in s: 26008 hdb :: 0.000190 s b: 26 in s: 26008 tobyink :: 0.000492 s b: 6400 in s: 8044 hdb :: 0.000483 s b: 6400 in s: 8044 tobyink :: 0.052703 s b: 6400 in s: 69681 hdb :: 0.001241 s b: 6400 in s: 69681 tobyink :: 0.240158 s b: 6400 in s: 641930 hdb :: 0.017612 s b: 6400 in s: 641930 tobyink :: 4.962130 s b: 6400 in s: 6404352 hdb :: 0.208475 s b: 6400 in s: 6404352 tobyink :: 65.351273 s { hdb => { 26 => { 1 => "0.000125885009765625", 10 => "4.00543212890625e-005", 100 => "3.09944152832031e-005", 1000 => "0.000190019607543945", all => "0.000386953353881836", }, 4 => { 1 => "0.000154018402099609", 10 => "9.79900360107422e-005", 100 => "0.000156164169311523", 1000 => "0.000249862670898438", all => "0.000658035278320313", }, 6400 => { 1 => "0.000483036041259766", 10 => "0.00124096870422363", 100 => "0.0176119804382324", 1000 => "0.208475112915039", all => "0.227811098098755", }, all => "0.228856086730957", }, tobyink => { 26 => { 1 => "0.000103950500488281", 10 => "0.000106096267700195", 100 => "8.60691070556641e-005", 1000 => "0.000491857528686523", all => "0.000787973403930664", }, 4 => { 1 => "0.000169038772583008", 10 => "0.000333070755004883", 100 => "0.000190019607543945", 1000 => "0.000233173370361328", all => "0.000925302505493164", }, 6400 => { 1 => "0.0527029037475586", 10 => "0.240158081054688", 100 => "4.96213006973267", 1000 => "65.3512728214264", all => "70.6062638759613", }, all => "70.6079771518707", }, }

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re^2: Finding repeat sequences. (definitions; part $n)
by tye (Cardinal) on Jun 20, 2013 at 00:40 UTC
    Looking for 'fredfre' in 'fredfrefredfr' tye found 'fredfrefred'; excluded from further consideration

    Looks like you changed your definition of your desired results, so it was silly to use a "solution" tailored to a different definition.

    This particular example matches my original speculation of what made sense, which means you might want to try my original solution. Though, I didn't wade through everything trying to find the various redefinitions of what was desired (besides the ones given in reply to my nodes) and then sort them chronologically in order to figure out what the new requirements actually are.

    - tye        

      There is no new requirement. Re-read the OP.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1039799]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (8)
As of 2014-12-27 16:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (177 votes), past polls