Beefy Boxes and Bandwidth Generously Provided by pair Networks
laziness, impatience, and hubris
 
PerlMonks  

Comment on

( #3333=superdoc: print w/ replies, xml ) Need Help??

The following is a copy of Re^9: Fast common substring matching posted by bioMan, but the nesting was so deep I though I'd "bring it to the top".

I have done a quick test. Quick because of your blazing fast algorithm. My runtimes for 3 to 200 strings is given below. The times come from the Time::HiRes::gettimeofday function in your program. I set $subStrSize = 256.

3,0.001355 4,0.002366 5,0.003744 6,0.005187 7,0.006808 8,0.008807 9,0.011459 10,0.014172 11,0.017328 12,0.020473 13,0.023491 14,0.027905 15,0.031813 16,0.036426 17,0.043177 18,0.067228 19,0.052098 20,0.059449 30,0.132611 40,0.235241 50,0.367828 60,0.510851 70,0.695534 80,0.913879 90,1.178992 100,1.427899 125,2.254988 150,3.269179 200,5.831225

These results are from a second copy of your program that I created. It appears my first copy still has a bug.

Yes, there must be a bug in my original copy. Running my full data set with my backup copy of your program gave a runtime of 14.759568. I'd say that's a little better than three years! I'll check the results to see how they compare with some of my old results from limited data sets. That check could take a little time :-)

The problem could be with the way PerlIDE runs scripts. All the times above are from the command line (Win2000).

I ran the full data set with the second copy of the program from PerlIDE and got a runtime of 15.147881. So the problem is with my first copy of your program.

I really like your algorithm. It's speed will allow me to modify my original data and test a couple of theories. You've got a real winner.


Perl is Huffman encoded by design.

In reply to Re: Fast common substring matching by GrandFather
in thread Fast common substring matching by GrandFather

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • Outside of code tags, you may need to use entities for some characters:
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?
    Username:
    Password:

    What's my password?
    Create A New User
    Chatterbox?
    and the web crawler heard nothing...

    How do I use this? | Other CB clients
    Other Users?
    Others wandering the Monastery: (5)
    As of 2014-07-26 00:27 GMT
    Sections?
    Information?
    Find Nodes?
    Leftovers?
      Voting Booth?

      My favorite superfluous repetitious redundant duplicative phrase is:









      Results (175 votes), past polls