It strikes me that something like this has already been encountered in bioinformatics. Try Algorithms on Strings, Trees and Sequences by Dan Gusfield or Computational Molecular Biology by Pavel Pevzner.
In bioinformatics, an individual sequence can be considered an array, so comparing a pair of sequences is like comparing one array with another. The dimensionality increases with the number of sequences you want to compare. Typically a two-dimensional (pairwise) comparison is carried out n-1 times on the data set, query against each of the other sequences, and each comparison yields a statistical score. You then pop the query sequence from the data set and repeat the comparison with the remaining sequences until only one is left in the set. The statistical scores are used to sort the results in terms of relatedness.
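The pop-and-compare loop described above might look roughly like the sketch below. It is in Python only for brevity, and the same loop translates directly to Perl. The names identity_score and all_pairs are made up for illustration, and the score is a toy one (fraction of identical positions, no alignment and no statistics), so treat it as a placeholder for whatever real scoring you use.

```python
def identity_score(a, b):
    """Toy score (an assumption, not a real alignment statistic):
    fraction of positions that match, relative to the longer sequence."""
    if not a and not b:
        return 1.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / max(len(a), len(b))


def all_pairs(seqs):
    """Take a query, compare it with every remaining sequence, drop it,
    and repeat until one sequence is left. Returns (score, name1, name2)
    tuples sorted from most to least similar."""
    remaining = dict(seqs)  # copy so the caller's data is untouched
    results = []
    while len(remaining) > 1:
        name = next(iter(remaining))   # current query
        query = remaining.pop(name)    # pop it from the data set
        for other, seq in remaining.items():
            results.append((identity_score(query, seq), name, other))
    results.sort(key=lambda r: r[0], reverse=True)
    return results


seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTGTACCA"}
ranked = all_pairs(seqs)
for score, a, b in ranked:
    print(f"{a} vs {b}: {score:.3f}")
```

For n sequences this makes n(n-1)/2 comparisons in total, which is why the pairwise step tends to dominate the running time on large sets.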
This might then be represented as a tree of sequences with branches and proximity indicating closeness of similarity, or a multiple sequence alignment where the distance of two sequences from each other in the alignment indicates their degree of similarity. You might look into a program called ClustalW for some examples of how this is done.
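To give a feel for how pairwise scores can become a tree, here is a minimal sketch of average-linkage clustering on a distance table. This is a simplified illustration and not necessarily what ClustalW does internally. The function name cluster and the distance values (taken here as 1 minus a similarity score) are assumptions for the example.

```python
def cluster(names, dist):
    """Repeatedly merge the two closest clusters (average linkage).
    dist maps frozenset({a, b}) -> distance. Returns a nested tuple tree."""
    # Each entry is (subtree, list of member names).
    clusters = [(n, [n]) for n in names]

    def avg_distance(members_a, members_b):
        pairs = [(x, y) for x in members_a for y in members_b]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            ((i, j) for i in range(len(clusters))
                    for j in range(i + 1, len(clusters))),
            key=lambda ij: avg_distance(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        (tree_i, mem_i), (tree_j, mem_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), mem_i + mem_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


# Hypothetical distances between three sequences.
dist = {
    frozenset(("s1", "s2")): 0.125,
    frozenset(("s1", "s3")): 0.5,
    frozenset(("s2", "s3")): 0.375,
}
tree = cluster(["s1", "s2", "s3"], dist)
print(tree)
```

Sequences that are merged early sit close together in the nested result, which is the same idea as short branches joining near the tips of a tree.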
I hope this adds some fuel to your fire.
yet another biologist hacking perl....