http://www.perlmonks.org?node_id=995858


in reply to phylogenetic tree construction using perl

I don't know much about this, but have you tried this suite of modules to help you: Bio-Phylo? There are some Tree building modules in there.

There are no stupid questions, but there are a lot of inquisitive idiots.

Re^2: phylogenetic tree construction using perl
by zing (Beadle) on Sep 26, 2012 at 21:06 UTC
    Hi greengaroo, the algorithm is this:

    Pictorial representation: http://picpaste.com/triplets-IQMFT1QY.jpg

    Triplets: S = ('b,c|a', 'a,c|d', 'd,e|b')
    Species:  L = {a,b,c,d,e}

    TreeConstruct(S):
    1. Let L be the set of species in S. Build G(L), the auxiliary graph.
    2. Let C1, C2, ..., Cq be the set of connected components in G(L).
    3. If q > 1, then
       - For i = 1, 2, ..., q, let S(i) be the set of triplets in S labeled by the set of leaves in C(i).
       - Let T(i) = TreeConstruct(S(i)).
       - Let T be a tree formed by connecting all T(i) with the same parent node. Return T.
    4. If q = 1 and C1 contains exactly one leaf, return the leaf; else return fail.

    I have updated the code and now it takes input connections in form of triplets and prints the connected components of the graph.

    use strict;
    use warnings;
    use Graph;
    use YAML;

    @ARGV = ('b,c|a', 'a,c|d', 'd,e|b') unless @ARGV;

    # Map the first species of each triplet to the species it is paired with.
    my %HoA;
    foreach ( @ARGV ) {
        m/^([a-z])[,]([a-z])[|]([a-z])$/
            or die "Bad triplet: $_\n";
        push @{$HoA{$1}}, $2;
    }

    print "\n===========\@HoA=====\n";
    print "from->to\n";
    while (my ($key, $values) = each %HoA) {
        print $key, "=> [", join(',', @$values), "]\n";
    }

    # Build the undirected graph with one edge per triplet pair.
    my $g = Graph->new( undirected => 1 );
    for my $src ( keys %HoA ) {
        for my $tgt ( @{ $HoA{$src} } ) {
            $g->add_edge($src, $tgt);
        }
    }

    # Collect, per connected component, the adjacency lists of its nodes.
    my @subgraphs = $g->connected_components;
    my @allgraphs;
    for my $subgraph ( @subgraphs ) {
        push @allgraphs, {};
        for my $node ( @$subgraph ) {
            if ( exists $HoA{ $node } ) {
                $allgraphs[-1]{$node} = [ @{ $HoA{$node} } ];
            }
        }
    }

    print "----connected components------------";
    print Dump \@allgraphs;

    -------------OUTPUT----------------

    ===========@HoA=====
    from->to
    a=> [c]
    b=> [c]
    d=> [e]
    ----connected components---------------
    - a:
        - c
      b:
        - c
    - d:
        - e

    Hope this helps you get an idea

      I'm still reading through the Triplet Methods paper, so I'm certain my understanding is not only limited but wrong. Nonetheless, section 7.4.1 ("Reconstruct a network by a sorting network") caught my eye for the simple reason that Algorithm-Networksort exists on CPAN and I am its author.

      So if the module can be of use to you, great. If there's a feature that you need from it that's doable, I'd be more than happy to add it to the module. Let me know.

      By the way, your link to your jpeg on picpaste doesn't display anything.

        Thanks for pointing it out, jgamble. The picpaste image was figure 7.4 from the paper, which depicts the algorithm for better understanding. The algorithm is Alfred Aho's famous algorithm, which biologists have adopted to construct trees. The problem is that it deals with connected components, subgraphs, recursion, etc. all at once. So I'm having trouble proceeding, though I'm still trying to solve it in bits and pieces.

        The algorithm TreeConstruct described in section 7.2.3 is my main concern.
        Hi jgamble, did you get the algorithm? Please help me with it. http://picpaste.com/Z1SVFTT6.png

        This is what a triplet looks like. The link below shows three triplets (a,b|c) (a,c|d) (d,e|b) and their consensus supertree. http://picpaste.com/Nu0ON9uo.jpg
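
For the TreeConstruct steps listed earlier in the thread, the recursion over connected components can be sketched in plain Perl roughly as follows. This is only a minimal sketch under some assumptions, not the paper's reference implementation: the helper names (parse_triplet, components, tree_construct, to_newick) are made up for illustration, the leaf set is passed explicitly alongside the triplets (otherwise a component such as {d,e}, whose only triplet mentions b, would lose its leaves), the connected components are found with a hand-rolled depth-first search instead of the Graph module, and the tree is returned as nested array references printed in Newick-style parentheses.

```perl
use strict;
use warnings;

# Parse 'x,y|z' into [x, y, z]; x and y are the more closely related pair.
sub parse_triplet {
    my ($text) = @_;
    my ($x, $y, $z) = $text =~ /^([a-z]),([a-z])\|([a-z])$/
        or die "Bad triplet: $text\n";
    return [$x, $y, $z];
}

# Connected components of the auxiliary graph: the vertices are the leaves,
# and every triplet x,y|z contributes the undirected edge x-y.
sub components {
    my ($leaves, $triplets) = @_;
    my %adj;
    $adj{$_} = [] for @$leaves;
    for my $t (@$triplets) {
        my ($x, $y) = @$t;
        push @{ $adj{$x} }, $y;
        push @{ $adj{$y} }, $x;
    }
    my (%seen, @components);
    for my $start (sort @$leaves) {
        next if $seen{$start}++;
        my @stack = ($start);
        my @members;
        while (defined(my $v = pop @stack)) {
            push @members, $v;
            push @stack, grep { !$seen{$_}++ } @{ $adj{$v} };
        }
        push @components, [ sort @members ];
    }
    return @components;
}

# Returns a leaf name, an array ref of subtrees, or undef for "fail".
sub tree_construct {
    my ($leaves, $triplets) = @_;
    return $leaves->[0] if @$leaves == 1;      # step 4: single leaf

    my @components = components($leaves, $triplets);
    return undef if @components == 1;          # step 4: q = 1, several leaves

    my @children;
    for my $component (@components) {          # step 3: recurse per component
        my %in;
        $in{$_} = 1 for @$component;
        # Keep only the triplets whose three leaves all lie in this component.
        my @sub = grep { $in{ $_->[0] } && $in{ $_->[1] } && $in{ $_->[2] } }
                  @$triplets;
        my $child = tree_construct($component, \@sub);
        return undef unless defined $child;
        push @children, $child;
    }
    return \@children;                         # children share one parent node
}

# Render the nested structure with Newick-style parentheses.
sub to_newick {
    my ($tree) = @_;
    return $tree unless ref $tree;
    return '(' . join(',', map { to_newick($_) } @$tree) . ')';
}

my @triplets = map { parse_triplet($_) } ('b,c|a', 'a,c|d', 'd,e|b');
my %species;
$species{$_} = 1 for map { @$_ } @triplets;
my $tree = tree_construct([ sort keys %species ], \@triplets);
print defined($tree) ? to_newick($tree) . ";\n" : "fail\n";
```

With the three triplets 'b,c|a', 'a,c|d', 'd,e|b' this prints ((a,(b,c)),(d,e)); and a contradictory set such as 'a,b|c' together with 'b,c|a' collapses into a single component with three leaves, so the sketch reports fail.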