http://www.perlmonks.org?node_id=998026


in reply to Re: supertree construction in perl
in thread supertree construction in perl

Hi BrowserUk, Did you find any solution to my problem. Please help me on this. I have dealt with all the aspects of the algorithm in isolation but am not able to put it all together(as is required by the algorithm). Help me on this.

Replies are listed 'Best First'.
Re^3: supertree construction in perl
by BrowserUk (Patriarch) on Oct 09, 2012 at 15:45 UTC

    Sorry, but as I said, I find the "algorithm description" in the paper you cited totally incomprehensible. It could be an alchemist's recipe or Archimedes' exclamation when stubbed his toe getting out of the bath for all the sense I can make of it. Constructed to impress rather than convey information. Further, I personally think that (much of) the literature on the subject is wrong, but I haven't reached the point of proof yet.

    I did come across this description of Aho's algorithm which at a cursory glance seems to be infinitely clearer:

    The following procedure is an implementation of the algorithm presented by Aho et al. (1981) with modifications for dealing with incompatibilities.
    1. A subset of the orthologous-repeats table is created, in which only “relevant” rows (organisms) are considered (initially all rows, since all organisms are being considered). Within this subset of rows, only those columns in which at least two rows have a 1 and one row has a 0 are considered.
    2. Using this subset of the original repeat occurrence table, a graph is created by iterating through the columns. If two rows both have a 1 in a given column, an edge of weight 1 is created between the two corresponding organisms. If an edge already exists between those two organisms, its weight is incremented by 1.
    3. Multiple connected components are sought within the graph. If the graph contains a single connected component, weak edges must be eliminated. This is accomplished by removing edges, beginning with those of weight 1 and incrementally removing edges of greater weight, until multiple connected components arise.
    4. Steps 1-3 are recursively applied to each connected component containing more than two organisms. The “relevant” rows in each run are the organisms within the connected component.

    Consider the above example illustrated in Table 3B. The phylogenetic inference of column i is supported by column h, and column k is supported by column l. Thus, in the shared-repeat graph, edges (A, B) and (C, D) have weight 2, while the edge (A, C) has weight 1. Removing the minimum weight edge is akin to removing column j, which has the least support.

    I haven't pursued it because I'd rather finish or abandon my own experiments before I get familiar with someone else's method; but maybe it will help you to bring your code together. Good luck.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

      BrowserUK here is the original paper by Alfred Aho(1981, SIAM journal of computing) on which my research is based. The paper is more approachable by computer scientists unlike the recent ones that I posted in here. http://www.smallfiles.org/download/2927/Aho_et_al_paper.pdf.html
        The paper is more approachable by computer scientists

        Hm. "Approachable"? Makes it sound like I might be intimidated by the other paper. I'm not. I'm disgusted by it.

        It annoys me that academia continues to require that thesis be written in such an obfuscated manner; that they continue to emphasis form over content; that thesis writing has become an art-form an end unto itself, devoid of purpose beyond impressing the promoters of that art.

        The purpose should be to: describe a problem, the history of that problem, a (new) approach to solving it, and the analysis of that solution; as clearly and simply as possible. But it is rarely ever done that way.

        the original paper by Alfred Aho(1981, SIAM journal of computing) on which my research is based.

        I've only given it a once over so far, but frankly, it does not seem any better to me. Presumably, as you've come back here looking for help, you don't get it either.

        The procedure on the second page that purports to be an "algorithm description" is nothing of the sort. It is more of a "wish list". It is the equivalent of calling the following a recipe for a Lemon Meringue Pie:

        1. Make some pastry;
        2. Fill it with some lemon filling;
        3. Top it with some meringue;
        4. Cook it.

        Maybe the details are described somewhere in the pages of words; but

          the language used is so obscure;

          I've a pretty good vocabulary, but I had to look up the meaning of 'consonant' in this context; and there are a dozen, more common, simpler words that could be substituted for it.

          And there are many other examples: 'tableau' instead of 'table'; etc.

        • Terminology is used without explanation or reference;

          WTF is a "distinguished variable"? Wikipedia doesn't have an explanation. Google has a whole 10,000 uses of the term; but no definitions that I could find.

        • Likewise, notation is introduced without either explanation -- or purpose;

          compute Zrc $1, $2, , Sr;

          Compute what? (And more to the point: How?)

        The algorithm may or may not be good; but the description is utter meaningless garbage.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        RIP Neil Armstrong