### Re^7: Finding repeat sequences.

by hdb (Monsignor)
 on Jun 21, 2013 at 17:21 UTC ( #1040203=note: print w/replies, xml ) Need Help??

in reply to Re^6: Finding repeat sequences.

Much clearer!

I like especially the fact that you do not need the length of the input. Another improvement could be that you use the fact that if the call to index fails once it will always fail going forward, like this:

```    hdb => sub {
my \$r = shift;
my \$i = 1;
my \$j = 1;
until( substr( \$\$r, \$i ) eq substr( \$\$r, 0, -\$i ) ) {
if( \$j > 0 ) {
\$j = index( \$\$r, substr( \$\$r, 0, \$i ), \$i );
\$i = \$j > 0 ? \$j : ++\$i;
} else {
\$i++;
}
}
return substr( \$\$r, 0, \$i )
},

avoiding the cost of index and substr paying with one if.

And Damian's curse hit! Because it has to be \$i and not \$i+1 in the call to index. I think that's what he meant with off-by-1 error.

Replies are listed 'Best First'.
Re^8: Finding repeat sequences.
by BrowserUk (Pope) on Jun 21, 2013 at 17:36 UTC
And Damian's curse hit! Because it has to be \$i and not \$i+1 in the call to index. I think that's what he meant with off-by-1 error.

Sorry, but you are wrong. You are relying on your eyes rather than tests.

Here is your original labelled 'hdb', the above version labelled 'hdb2' and my refactor labelled 'buks' going through the test harness:

```[17:49:26.08] C:\test>1039630-b.pl14 -SHOW=0 -T=10
Looking for 'fredfre' in 'fredfrefredfr'
Looking for 'fredfre' in 'fredfrefredf'
Looking for 'fredfre' in 'fredfrefred'
Looking for 'fredfre' in 'fredfrefre'
Looking for 'fredfre' in 'fredfrefr'
Looking for 'fredfre' in 'fredfref'
Looking for 'fredfre' in 'fredfre'
Looking for 'fredfre' in 'fredfrefredfrefredfr'
Looking for 'fredfre' in 'fredfrefredfrefredf'
Looking for 'fredfre' in 'fredfrefredfrefred'
Looking for 'fredfre' in 'fredfrefredfrefre'
Looking for 'fredfre' in 'fredfrefredfrefr'
Looking for 'fredfre' in 'fredfrefredfref'
Looking for 'fredfre' in 'fredfrefredfre'
Participants in performance tests: hdb buks hdb2
b:    4 in s:        43        hdb :: 0.000070 s
b:    4 in s:        43       buks :: 0.000033 s
b:    4 in s:        43       hdb2 :: 0.000102 s
b:    4 in s:       401        hdb :: 0.000388 s
b:    4 in s:       401       buks :: 0.000099 s
b:    4 in s:       401       hdb2 :: 0.000104 s
b:    4 in s:      4002        hdb :: 0.000080 s
b:    4 in s:      4002       buks :: 0.000126 s
b:    4 in s:      4002       hdb2 :: 0.000125 s
b:   26 in s:       280        hdb :: 0.000026 s
b:   26 in s:       280       buks :: 0.000029 s
b:   26 in s:       280       hdb2 :: 0.000138 s
b:   26 in s:      2621        hdb :: 0.000121 s
b:   26 in s:      2621       buks :: 0.000080 s
b:   26 in s:      2621       hdb2 :: 0.000093 s
b:   26 in s:     26007        hdb :: 0.000209 s
b:   26 in s:     26007       buks :: 0.000239 s
b:   26 in s:     26007       hdb2 :: 0.000243 s
b:640000 in s:   6775097        hdb :: 0.253329 s
b:640000 in s:   6775097       buks :: 0.381257 s
hdb2 timed out [100]; excluded
b:640000 in s:  64243613        hdb :: 2.816375 s
b:640000 in s:  64243613       buks :: 2.833577 s
b:640000 in s: 640543222        hdb :: 28.591489 s
b:640000 in s: 640543222       buks :: 28.325861 s
{
26 => {
10   => {
buks => "2.88486480712891e-005",
hdb  => "2.59876251220703e-005",
hdb2 => "0.000138044357299805",
},
100  => {
buks => "8.0108642578125e-005",
hdb  => "0.000120878219604492",
hdb2 => "9.29832458496094e-005",
},
1000 => {
buks => "0.000238895416259766",
hdb  => "0.00020909309387207",
hdb2 => "0.000242948532104492",
},
all  => {
buks => "0.00034785270690918",
hdb  => "0.000355958938598633",
hdb2 => "0.000473976135253906",
},
},
4 => {
10   => {
buks => "3.29017639160156e-005",
hdb  => "7.00950622558594e-005",
hdb2 => "0.000102043151855469",
},
100  => {
buks => "9.918212890625e-005",
hdb  => "0.000388145446777344",
hdb2 => "0.000104188919067383",
},
1000 => {
buks => "0.000126123428344727",
hdb  => "7.98702239990234e-005",
hdb2 => "0.000124931335449219",
},
all  => {
buks => "0.000258207321166992",
hdb  => "0.000538110733032227",
hdb2 => "0.00033116340637207",
},
},
640000 => {
10   => { buks => "0.381257057189941", hdb => "0.253329038619995"
+},
100  => { buks => "2.83357691764832", hdb => "2.81637501716614" },
1000 => { buks => "28.3258609771729", hdb => "28.5914890766144" },
all  => { buks => "31.5406949520111", hdb => "31.6611931324005" },
},
all => {
buks => "31.5413010120392",
hdb  => "31.6620872020721",
hdb2 => "0.000805139541625977",
},
}

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Sir, you are correct. I was thinking of a string of 'a's but that one case is already covered by the initial until condition.

Interesting that my last modification timed out. Did you run it with \$i or \$i+1?

Interesting that my last modification timed out. Did you run it with \$i or \$i+1?

Exactly as you posted it (c&p).

I just added print "i:\$i j:\$j"; to the bottom of the loop and it shows what happens:

```i:1 j:1
i:1 j:1
i:1 j:1
i:1 j:1
i:1 j:1
i:1 j:1
i:1 j:1
i:1 j:1
...

Your concern about my use of \$i+1 ignores the difference in the relative ordering of the ++\$i increment in my version and yours. (The original.)

And your addition of the if statement never avoids the need to perform the index, so is pure overhead.

#### As for the possibility of out-by-one errors:

I'll trade the need to fix them, for the ability to make them; when the alternative is to rely upon the (unseen, unknown, uncontrollable, and possibly non-existent) generic optimisations of a declarative tool to convert a high-level -- and potentially completely broken -- abstract description of my problem into an efficient solution.

Had I opted for that approach, I would have settled on Damian's original solution. There would have been no incentive for him to improve it; or you to correct and improve yours.

The result would have been that my application for this; currently running on one of my 4 cores and projected to run for 7 to 10 days; would have projected to run for 143 1110 years! In other words, it would have been abandoned.

Far too often people settle for the first solution on offer -- cpan -- and never consider the affect their choices have.

So yes, I'll willingly trade the need to correct out-by-one errors; for the ability to make them in the first place.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Create A New User
Node Status?
node history
Node Type: note [id://1040203]
help
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (6)
As of 2018-03-23 15:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
When I think of a mole I think of:

Results (294 votes). Check out past polls.

Notices?