in reply to Re^6: Finding repeat sequences.
in thread Finding repeat sequences.
Why would the code be updated? -- I don't know. ... Why would anyone modify it, when it performs the required task as is? -- Again, I don't know.
So you try to predict the future; ie, guess. I don't.
If at some point in the future the code needs modification, I'll adapt it. Then.
If at some point after that the code requires modification again, or I encounter another task that I realise it can be adapted to, then I'll consider trying to generalise it. But right now, it has one, and only one, very specific purpose.
And I'll willingly and knowingly take the nearly 3 orders of magnitude performance gain for that task now over any potential savings in future maintenance costs.
I'm very firmly of the opinion -- based on my years of experience -- that premature generalisation has cost this industry far more than premature optimisation ever has or ever will, both financially and in terms of its reputation for spending a fortune developing huge, all-encompassing, singing & dancing solutions that never work and, quietly or otherwise, just end up in the bit bucket.
And look...you're now modifying the code! That itself is a potential source of bugs. You may never have accidentally left a "debugging print" in code, but I certainly have. I've even shipped code with them left in.
Hand on heart, no, I never have.
But then, I don't use test harnesses that steal my output and summarise it into a bunch of meaningless statistics.
Equally, nor do I do my explorations on my 'live' code. (I.e. the function in the actual application is very unlikely to be an anonymous subroutine stored in a hash under a key called hdb. Nor is it likely to be called find_substring().)
In fact, it is quite likely to not look much like hdb's implementation at all. Now I've found and understood the algorithm, I'll almost certainly re-write it to better fit the nature of the application.
E.g. I'd probably pass in a reference to the bitstring, convert it to the bytestring internally, and return a packed tuple that encapsulates the compressed bitvector as (say):
return pack 'L L Q*', $reps, $bits, substr( $$bitvector, 0, int( ( $_ + 63 ) / 64 ) * 8 );
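To make the shape of such a packed tuple concrete, here is a minimal sketch of a pack/unpack round trip. It is only an illustration, not the code from the actual application: the names pack_repeat() and unpack_repeat() are hypothetical, the slice length is assumed to be $bits rounded up to whole 64-bit words, and the payload is packed with 'a*' (raw bytes) on the assumption that the slice is to be copied verbatim; 'Q*' would instead expect a list of integers.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): pack the repeat count, the
# length of the repeating unit in bits, and the unit's bits into one string.
sub pack_repeat {
    my( $bitvector, $reps, $bits ) = @_;    # $bitvector is a REFERENCE to the bit string

    # Assumption: keep the unit rounded up to whole 64-bit words (8 bytes each).
    my $bytes = int( ( $bits + 63 ) / 64 ) * 8;

    # Slice off the leading bytes; null-pad if the vector is shorter than that.
    my $unit = substr( $$bitvector, 0, $bytes );
    $unit .= "\0" x ( $bytes - length $unit );

    # 'L' is an unsigned 32-bit integer; 'a*' copies the bytes unchanged.
    return pack 'L L a*', $reps, $bits, $unit;
}

# Hypothetical inverse: recover ( $reps, $bits, $unit ) from the packed tuple.
sub unpack_repeat {
    my( $packed ) = @_;
    return unpack 'L L a*', $packed;
}
```

A caller would then do something like my( $reps, $bits, $unit ) = unpack_repeat( pack_repeat( \$bitvector, 3, 10 ) ); and use vec() on $unit to read individual bits back out.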
This thread is all about algorithm, not implementation. (Which still leaves me wondering if hdb's algorithm couldn't be encapsulated into a regex?)
Sure, sometimes you need subtlety. And sometimes you have to write "manual" code.... But I will continue to believe that such code should be the exception, not the paradigm.
I completely agree; but were this a paradigmatic problem, I probably wouldn't have needed to ask for help.