Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

comment on

( [id://3333]=superdoc: print w/replies, xml ) Need Help??
Johnnywang,

If I understand the problem correctly, then from an algorithmic standpoint, what you are looking for is unrealistic, at least to be solved in any reasonable amount of time. The problem is this: the only way to guarantee finding every common continuous subsequence of any length is to do it by brute force. You have to check every subsequence from size m up to the size of the smallest sequence.

In the example you give, you have to check every subsequence from size 2 to size 5. I did a quick math check (so the exact number might be wrong), but the number of check would be in the neighborhood of 600. And this is with only 3 sequences of sizes 11, 11, and 5. Just adding 1 number to each of the 3 sequences would add hundreds of more required checks. The growth of the problem with every additional sequence is polynomial.

For you to want to check 1000's of sequences that may be 1000's of numbers long would require in the billions of required checks which would take much longer than I'm sure you are willing to accept. This is one of those NP problems that CS majors learn about in algorithm classes (traveling salesman, coloring problem, hamiltonian circuit problem, etc).

If someone can come up with an algorithm that is not brute force, thus making the problem not NP, I would love to see it.

davidj

In reply to Re: Seeking algorithm for finding common continous sub-patterns by davidj
in thread Seeking algorithm for finding common continous sub-patterns by johnnywang

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":



  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.
Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others browsing the Monastery: (4)
As of 2024-04-25 14:42 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found