Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris

Re: Challenge: Fast Common Substrings

by Moron (Curate)
on Apr 04, 2007 at 10:01 UTC ( #608236=note: print w/replies, xml ) Need Help??

in reply to Challenge: Fast Common Substrings

# more update: this is the tested version now!! sub CountSubstrings { # $string1, $string2, $substr_length my $l = pop @_; my %found = (); my %match = (); my $first = 2; my $string1 = $_[0]; for my $string ( @_ ) { $first --; my $ls = length( $string ); my $limit = $ls - $l + 1; for ( my $i = 0; $i < $limit; $i++ ) { my $sbstr = substr( $string, $i, $l ); $first or defined ( $match{$sbstr} ) && next(); $found{ $sbstr }{ $string } ||= 1; $first and next;; defined ( $found{ $sbstr }{ $string1 } ) and $match{ $sbstr } = 1; } } Count( keys %match ); } sub Count { 1+$#_ };
Description of alggorithm: Store all substrings of string 1 of the given length in subhash $found{ $string1 }; do the same for string 2 but in addition lookup in the first hash and record any matches in the %match hash. The suggested opportunities for optimisation include (a) only need to check matches for the substring iteration of the second string. (b) only need to build the substring for string 2 if there is no match found in the match hash.


Free your mind

Replies are listed 'Best First'.
Re^2: Challenge: Fast Common Substrings
by BrowserUk (Pope) on Apr 04, 2007 at 10:25 UTC
    # untested

    True. It doesn't even compile. And when you've corrected the numerous errors, it doesn't work. And it's hard to see how you thought it would. Care to enlighten? (And did you notice the word "fast" in the title?)

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I think many people will see this is only an algorithm outline and will take the word "untested" as an approriate hint in that regard. I am multitasking at the moment.

      re "enlighten" - I think under the circumstances the most appropriate enlightment I can offer is to improve your lateral thinking In particular I recommend Six Thinking Hats. You would benefit from the insight that you tend to be stuck wearing a black hat (negative logic) and need to learn how to take it off at will and replace it with other forms of thinking when needed. If you need evidence - In your post you say on the one hand you find it difficult to see how I expect it to work and yet you go on to imply an ability to assess its performance when corrected. That's like saying the performance of anything in a category is X without realising the scope you have already accepted for that category, making it impossible to make any sensible such judgement. Conversely, all I am offering is a means to use hashes to solve the problem and I sincerely expect this to perform well when corrected given the pure simplicity of the algorithm and the opportunities it uses to cut out unnecessary iteration. I think you will find that many beginners out there will find it easy enough to correct this code and get a fast-performing result. So why not you, eh? You're clearly not nearly a beginner are you. So what else can it be ... ?


      Free your mind

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://608236]
[Marshall]: Hi folks - can't sleep - just hanging out
marto waves
[Marshall]: marto howdy!

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (6)
As of 2018-03-18 12:45 GMT
Find Nodes?
    Voting Booth?
    When I think of a mole I think of:

    Results (230 votes). Check out past polls.