Greetings,
I'm exploring the Graph::Traversal::DFS in order to compute all possible paths of a directed graph.
This solution here:
http://www.perlmonks.org/?node_id=382052
Seems to do what I want: provides all paths of a $tree variable as an array of array references.
The problem is that the code is too obscure for me understand at the moment and I've already
implemented Graph.pm in my code elsewhere and would like to continue using it.
All paths shouldn't be a major computational problem in this case because the graphs will be relatively simple
and I could get the components first as a simplification.
Here's a simple example for a set of vertices called Group 26:
print qq[$_>{dna}\t$_>{score}\n] foreach @{$RNA_SEQ{$RNA[0]}>{alignments}};
opera_scaffold_55153 28.9909638554217
opera_scaffold_12001 26.5813253012048
opera_scaffold_221202 19.6159638554217
opera_scaffold_220117 15.210843373494
opera_scaffold_184033 1.35542168674699
# The relationship among the vertices is in an array of hash reference called @E.
# Each element of @E looks like this:
{v1 => $vertex1, v2 => $vertex2, rna => $rna, state => 1 }
# build a directed graph with this data, vertices have a weight ( scor
+e )
my $g = Graph>new();
for my $edge ( @E )
{
my %weights = map { $_>{dna} => $_>{score} } @{$RNA_SEQ{$edge>{
+rna}}>{alignments}};
$g>add_weighted_vertices( $edge>{v1}, $weights{$edge>{v1}}, $ed
+ge>{v2}, $weights{$edge>{v2}});
$g>add_edge ( $edge>{v1}, $edge>{v2} );
}
# Print out the graph to see what it looks like
foreach ( @E )
{
my $w1 = $g>get_vertex_weight($_>{v1});
my $w2 = $g>get_vertex_weight($_>{v2});
print qq[$_>{v1} > $_>{v2}\t$w1 $w2\n];
}
# Output the edges of the graph with weights for each vertex
opera_scaffold_55153 > opera_scaffold_221202 28.9909638554217 19.6159638554217
opera_scaffold_221202 > opera_scaffold_184033 19.6159638554217 1.35542168674699
opera_scaffold_184033 > opera_scaffold_220117 1.35542168674699 15.210843373494
opera_scaffold_220117 > opera_scaffold_12001 15.210843373494 26.5813253012048
# traverse the graph using DFS
my $t = Graph::Traversal::DFS>new($g);
my @v = $t>preorder;
print qq[Preorder:\n];
print qq[$_\t] foreach @v;
Preorder:
opera_scaffold_221202 opera_scaffold_184033 opera_scaffold_220117 opera_scaffold_12001 opera_scaffold_55153
@v = $t>postorder;
print qq[Postorder:\n];
print qq[$_\t] foreach @v;
Postorder:
opera_scaffold_12001 opera_scaffold_220117 opera_scaffold_184033 opera_scaffold_221202 opera_scaffold_55153
my @r = $t>roots;
print qq[Roots:\n];
print qq[$_\t] foreach @r;
Roots:
opera_scaffold_221202 opera_scaffold_55153
Postorder gives me what I want in this example, albeit in the reverse order. However this is a simple example in which
the graph has just one single path.
Group 16 looks like this:
Graph Edges:
opera_scaffold_131077 > opera_scaffold_45770 2.43611177454024 54.8841652734655
opera_scaffold_102945 > opera_scaffold_23837 13.8762837353714 19.871029376642
Preorder:
opera_scaffold_45770 opera_scaffold_102945 opera_scaffold_23837 opera_scaffold_131077
Postorder:
opera_scaffold_45770 opera_scaffold_23837 opera_scaffold_102945 opera_scaffold_131077
Roots:
opera_scaffold_45770 opera_scaffold_102945 opera_scaffold_131077
Clearly not what I'm looking for. There's two paths in this graph represented by the two edges:
opera_scaffold_131077 > opera_scaffold_45770 and
opera_scaffold_102945 > opera_scaffold_23837
Again, I need a way that, given a graph, I can get back all possible paths.
At that point I need to determine the cumulative score for each path, and then choose the one with the highest score.
Graph::Traversal seems to offer a lot of options but I'm not sure how to use them all and the documentation isn't very good.
Can anyone offer some help on this problem PLEASE. THANKS!.
