Hi all ,
I had this problem of creating a supertree by recursive Alfred Aho's algorithm.So I divided the problem into its the key concepts and dealt with them one by one. I'll first put up my code I built with help of people on this forum and then later on I'll explain my problem.So here's what I've build till now
# perl code.pl data
# ___data____
#b c a
#a c d
#d e b
#This program read the triplets from file named "data" and returns the
#supertree.
#### NOTE ::: SuperTree part hasnt been incorporated yet.
use strict;
use warnings;
use Data::Dumper;
use Graph;
use Data::Dump qw/ pp /;
####READ IN THE INPUT DATA ########
my @triplets; # Get all the triplets
while (<>) {
push @triplets, [ split ];
}
#Make a deep copy of @triplets
my @triplet_deep_copy = map { [@$_] } @triplets;
#####AUXILIARY GRAPH G(L) #######
# In order to generate the G(L) first of all extract first two column
+s of @triplets into another matrix
my @auxiliary_edges=@triplets;
splice(@$_, 2, 1)
foreach @auxiliary_edges;
print "----EDGE LIST TO BUILD AUXILIARY GRAPH-----\n";
print Dumper \@auxiliary_edges;
##### CONNECTED COMPONENTS ##########
my $auxiliary_graph = Graph->new( undirected => 1 );
my @from;
my @to;
for (my $p = 0; $p <= 2; $p++) {
$from[$p]=$triplets[$p][0];
}
for (my $q = 0; $q <= 2; $q++) {
$to[$q]=$triplets[$q][1];
}
for (my $r = 0; $r <= 2; $r++) {
$auxiliary_graph->add_edge($from[$r], $to[$r]);
}
my @subgraphs = $auxiliary_graph->connected_components; # Subgraphs
my $V = $auxiliary_graph->vertices; # Number of taxa
my $connected_components=scalar @subgraphs; #Get the number of connect
+ed components
###### FINDING THE TRIPLETS WHICH ARE SUBSET(OR INDUCED BY) OF EACH OF
+ THE CONNECTED COMPONENTS######
Main(@auxiliary_edges);
exit(0);
sub induced {
my $trip = shift;
my @matches;
for my $QT ( @_ ) {
for my $triplet ( @$trip ) {
my %seen; # my %Pie;
undef @seen{@$QT};
delete @seen{@$triplet};
if ( keys( %seen ) <= ( @$QT - @$triplet ) ) {
push @matches, $triplet;
}
} ## end for my $triplet ( @$trip )
} ## end for my $QT ( @_ )
return @matches;
}## end sub induced
sub Main {
my $tree = Graph->new( undirected => 1 );
my $dad='u';
$tree->add_vertex($dad);
for my $one (@subgraphs) {
my @matches = induced( \@triplet_deep_copy, $one );
print "\nTriplet induced by subgraph ", pp( $one => { MATCHES =>
+ \@matches } ), "\n\n";
}
}
So this is what I have written till now. Now let me explain my problem.
___INPUT(set of triplets)____
b c a
a c d
d e b
Set of species/vertices=a,b,c,d,e
Now build the auxiliary graph by selecting first two vertices of each of the triplets,i.e.
b c
a c
d e
The auxiliary graph thus generated will be
a-c-b d-e
The number of connected components in this auxiliary graph (q)=2 (viz. a-c-b and d-e)
The algorithm I need to implement is this:-
TreeConstruct(S)
1. Let L be the set of species appear in S. Build G(L)
2. Let C1 , C2 , . . . , Cq be the set of connected components in G(L)
3. If q >1, then
• For i = 1, 2, . . . , q, let Si be the set of triplets in S labeled
+by the set
of leaves in Ci .
• Let Ti = TreeConstruct(Si )
• Let T be a tree formed by connecting all Ti with the same parent nod
+e.
Return T.
4. If q=1 and C1 contains exactly one leaf, return the leaf; else retu
+rn fail.
The progression will be like this:-
1. Initially we have q=2 (a-c-b & d-e). So introduce an internal verte
+x (u) and make these connected components child of u.
u=> a-c-b;
d-e;
2. Select component 1 = a-c-b. Check all lines from INPUT which are a
+subset of this component1.First line of INPUT i.e. "b c a" is a subse
+t of component1.
3."b c a" now becomes the INPUT for the program and it is recursed aga
+in with this INPUT(Now for input "b c a" the auxiliary graph will be
+"b-c" & "a",i.e. two connected components,thus q=2 ...)
Final output(SUPERTREE) for the given input should be like this
u => u => d
=> e
=> u => a
=> u => b
=> c
TRIPLETS(input) and SUPERTREE(output) look like these
http://ars.sciencedirect.com/content/image/1-s2.0-S0166218X10000983-gr1.jpg
The picture link above has the exact triplets for my problem and the exact supertree(output) expected
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.