Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re^7: Finding repeat sequences.

by hdb (Prior)
on Jun 21, 2013 at 17:21 UTC ( #1040203=note: print w/ replies, xml ) Need Help??


in reply to Re^6: Finding repeat sequences.
in thread Finding repeat sequences.

Much clearer!

I like especially the fact that you do not need the length of the input. Another improvement could be that you use the fact that if the call to index fails once it will always fail going forward, like this:

hdb => sub { my $r = shift; my $i = 1; my $j = 1; until( substr( $$r, $i ) eq substr( $$r, 0, -$i ) ) { if( $j > 0 ) { $j = index( $$r, substr( $$r, 0, $i ), $i ); $i = $j > 0 ? $j : ++$i; } else { $i++; } } return substr( $$r, 0, $i ) },

avoiding the cost of index and substr paying with one if.

And Damian's curse hit! Because it has to be $i and not $i+1 in the call to index. I think that's what he meant with off-by-1 error.


Comment on Re^7: Finding repeat sequences.
Download Code
Re^8: Finding repeat sequences.
by BrowserUk (Pope) on Jun 21, 2013 at 17:36 UTC
    And Damian's curse hit! Because it has to be $i and not $i+1 in the call to index. I think that's what he meant with off-by-1 error.

    Sorry, but you are wrong. You are relying on your eyes rather than tests.

    Here is your original labelled 'hdb', the above version labelled 'hdb2' and my refactor labelled 'buks' going through the test harness:

    [17:49:26.08] C:\test>1039630-b.pl14 -SHOW=0 -T=10 Looking for 'fredfre' in 'fredfrefredfr' Looking for 'fredfre' in 'fredfrefredf' Looking for 'fredfre' in 'fredfrefred' Looking for 'fredfre' in 'fredfrefre' Looking for 'fredfre' in 'fredfrefr' Looking for 'fredfre' in 'fredfref' Looking for 'fredfre' in 'fredfre' Looking for 'fredfre' in 'fredfrefredfrefredfr' Looking for 'fredfre' in 'fredfrefredfrefredf' Looking for 'fredfre' in 'fredfrefredfrefred' Looking for 'fredfre' in 'fredfrefredfrefre' Looking for 'fredfre' in 'fredfrefredfrefr' Looking for 'fredfre' in 'fredfrefredfref' Looking for 'fredfre' in 'fredfrefredfre' Participants in performance tests: hdb buks hdb2 b: 4 in s: 43 hdb :: 0.000070 s b: 4 in s: 43 buks :: 0.000033 s b: 4 in s: 43 hdb2 :: 0.000102 s b: 4 in s: 401 hdb :: 0.000388 s b: 4 in s: 401 buks :: 0.000099 s b: 4 in s: 401 hdb2 :: 0.000104 s b: 4 in s: 4002 hdb :: 0.000080 s b: 4 in s: 4002 buks :: 0.000126 s b: 4 in s: 4002 hdb2 :: 0.000125 s b: 26 in s: 280 hdb :: 0.000026 s b: 26 in s: 280 buks :: 0.000029 s b: 26 in s: 280 hdb2 :: 0.000138 s b: 26 in s: 2621 hdb :: 0.000121 s b: 26 in s: 2621 buks :: 0.000080 s b: 26 in s: 2621 hdb2 :: 0.000093 s b: 26 in s: 26007 hdb :: 0.000209 s b: 26 in s: 26007 buks :: 0.000239 s b: 26 in s: 26007 hdb2 :: 0.000243 s b:640000 in s: 6775097 hdb :: 0.253329 s b:640000 in s: 6775097 buks :: 0.381257 s hdb2 timed out [100]; excluded b:640000 in s: 64243613 hdb :: 2.816375 s b:640000 in s: 64243613 buks :: 2.833577 s b:640000 in s: 640543222 hdb :: 28.591489 s b:640000 in s: 640543222 buks :: 28.325861 s { 26 => { 10 => { buks => "2.88486480712891e-005", hdb => "2.59876251220703e-005", hdb2 => "0.000138044357299805", }, 100 => { buks => "8.0108642578125e-005", hdb => "0.000120878219604492", hdb2 => "9.29832458496094e-005", }, 1000 => { buks => "0.000238895416259766", hdb => "0.00020909309387207", hdb2 => "0.000242948532104492", }, all => { buks => "0.00034785270690918", hdb => "0.000355958938598633", hdb2 => "0.000473976135253906", }, }, 4 => { 10 => { buks => "3.29017639160156e-005", hdb => "7.00950622558594e-005", hdb2 => "0.000102043151855469", }, 100 => { buks => "9.918212890625e-005", hdb => "0.000388145446777344", hdb2 => "0.000104188919067383", }, 1000 => { buks => "0.000126123428344727", hdb => "7.98702239990234e-005", hdb2 => "0.000124931335449219", }, all => { buks => "0.000258207321166992", hdb => "0.000538110733032227", hdb2 => "0.00033116340637207", }, }, 640000 => { 10 => { buks => "0.381257057189941", hdb => "0.253329038619995" +}, 100 => { buks => "2.83357691764832", hdb => "2.81637501716614" }, 1000 => { buks => "28.3258609771729", hdb => "28.5914890766144" }, all => { buks => "31.5406949520111", hdb => "31.6611931324005" }, }, all => { buks => "31.5413010120392", hdb => "31.6620872020721", hdb2 => "0.000805139541625977", }, }

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Sir, you are correct. I was thinking of a string of 'a's but that one case is already covered by the initial until condition.

      Interesting that my last modification timed out. Did you run it with $i or $i+1?

        Interesting that my last modification timed out. Did you run it with $i or $i+1?

        Exactly as you posted it (c&p).

        I just added print "i:$i j:$j"; to the bottom of the loop and it shows what happens:

        i:1 j:1 i:1 j:1 i:1 j:1 i:1 j:1 i:1 j:1 i:1 j:1 i:1 j:1 i:1 j:1 ...

        Your concern about my use of $i+1 ignores the difference in the relative ordering of the ++$i increment in my version and yours. (The original.)

        And your addition of the if statement never avoids the need to perform the index, so is pure overhead.

        As for the possibility of out-by-one errors:

        I'll trade the need to fix them, for the ability to make them; when the alternative is to rely upon the (unseen, unknown, uncontrollable, and possibly non-existent) generic optimisations of a declarative tool to convert a high-level -- and potentially completely broken -- abstract description of my problem into an efficient solution.

        Had I opted for that approach, I would have settled on Damian's original solution. There would have been no incentive for him to improve it; or you to correct and improve yours.

        The result would have been that my application for this; currently running on one of my 4 cores and projected to run for 7 to 10 days; would have projected to run for 143 1110 years! In other words, it would have been abandoned.

        Far too often people settle for the first solution on offer -- cpan -- and never consider the affect their choices have.

        So yes, I'll willingly trade the need to correct out-by-one errors; for the ability to make them in the first place.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1040203]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (15)
As of 2014-09-18 15:08 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    How do you remember the number of days in each month?











    Results (116 votes), past polls