Re^17: Finding All Paths From a Graph From a Given Source and End Node

by BrowserUk (Pope)
on May 22, 2011 at 09:22 UTC ( #906140=note )


in reply to Re^16: Finding All Paths From a Graph From a Given Source and End Node
in thread Finding All Paths From a Graph From a Given Source and End Node

In that case, try this:

    #! perl -slw
    use strict;

    sub _pathsFrom {
        my( $code, $graph, $start, $path, $seen ) = @_;
        return $code->( @$path, $start ) unless exists $graph->{ $start };
        for my $next ( @{ $graph->{ $start } } ) {
            if( exists $seen->{ "$start-$next" } ) {
                return $code->( @$path, $start );
            }
            else {
                _pathsFrom( $code, $graph, $next, [ @$path, $start ], { %$seen, "$start-$next", undef } );
            }
        }
    }

    sub pathsFrom(&@) { _pathsFrom( @_, [], {} ) }

    my %graph = (
        a => [ 'b' ],
        b => [ 'c' ],
        c => [ 'd', 'j' ],
        d => [ 'e', 'g' ],
        g => [ 'h' ],
        h => [ 'c' ],
        j => [ 'k' ],
    );

    pathsFrom{ print join '->', @_; } \%graph, 'a';

    __END__
    c:\test>904729-2
    a->b->c->d->e
    a->b->c->d->g->h->c->j->k
    a->b->c->j->k

This version uses the identical algorithm but is somewhat more efficient, because it indexes @_ directly instead of copying the arguments into lexical variables:

    use enum qw[ CODE GRAPH START PATH SEEN ];

    sub _pathsFrom2 {
        return $_[CODE]->( @{ $_[PATH] }, $_[START] )
            unless exists $_[GRAPH]->{ $_[START] };
        for ( @{ $_[GRAPH]->{ $_[START] } } ) {
            if( exists $_[SEEN]->{ $_[START] . "-$_" } ) {
                return $_[CODE]->( @{ $_[PATH] }, $_[START] );
            }
            else {
                _pathsFrom2(
                    @_[ CODE, GRAPH ], $_,
                    [ @{ $_[PATH] }, $_[START] ],
                    { %{ $_[SEEN] }, $_[START] . "-$_", undef }
                );
            }
        }
    }

    sub pathsFrom(&@) { _pathsFrom2( @_, [], {} ) }

BTW: If you have a dataset that forms a decent-sized test, I'd like to get a copy, as I've found generating legal directed graphs quite difficult.
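For anyone who wants synthetic test data in the meantime, here is a minimal sketch of a random graph generator. It assumes that "legal" just means the same hash-of-arrays shape as %graph above, with no self-loops and no duplicate edges. The name randomGraph and its parameters are hypothetical, not something from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): build a random directed graph
# in the same hash-of-arrays form as %graph above.
#   $n       number of nodes, named n0 .. n($n-1)
#   $maxOut  maximum number of outgoing edges per node
#   $seed    optional seed, so that a run can be repeated
sub randomGraph {
    my( $n, $maxOut, $seed ) = @_;
    srand( $seed ) if defined $seed;
    my %graph;
    for my $i ( 0 .. $n - 1 ) {
        my %targets;
        for ( 1 .. int rand( $maxOut + 1 ) ) {
            my $j = int rand $n;
            next if $j == $i;          # no self-loops
            $targets{ "n$j" } = 1;     # hash keys discard duplicate edges
        }
        # Nodes without outgoing edges are left out of the hash, so they
        # act as terminal nodes for the path search.
        $graph{ "n$i" } = [ sort keys %targets ] if %targets;
    }
    return \%graph;
}

my $graph = randomGraph( 20, 3, 42 );
print "$_ => [ @{ $graph->{ $_ } } ]\n" for sort keys %$graph;
```

Graphs built this way can contain cycles, and the number of paths grows quickly with $maxOut, so it is probably wise to start with small values.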


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^18: Finding All Paths From a Graph From a Given Source and End Node
by eMBR_chi (Acolyte) on Jun 04, 2011 at 19:58 UTC

    Hi BrowserUK

    Thanks so much for your help, and sorry for the late response. I've tried to get some test data as you requested. It's quite raw and needs to be processed into a hash listing, but it does give an idea of the nature of the data I am dealing with.

    You will notice that in some cases no enzyme is supplied. In those cases the relationship between substrate and product suffices for the graph connections. In other cases, such as the flow from (S)-1-Phenylethanol -> Acetophenone (lines 72 and 73), the reaction can be catalysed by two different enzymes. This should not necessarily result in two routes, as I can embed the information on the two enzyme possibilities in the same edge.

    I hope that supplying an 800-line dataset like this is okay with you. Feel free to truncate it as needed. Hopefully there will be enough lines to form connections. Given its relatively large size, I have decided to paste it outside of PerlMonks.

    Please find it at http://codepaste.net/9gs3da. Note that it is tab-delimited, with the enzyme catalysing each connection supplied in the middle column.

    Once more, thanks a lot. Cheers
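The "hash listing" step described above could be sketched roughly as follows. This assumes each row is substrate, enzyme, product, separated by tabs (the thread only confirms that the enzyme is in the middle column). The sub name readReactions is hypothetical, and the sample rows use invented placeholder enzyme and product names. Enzymes are stored per edge in a nested hash, so two enzymes for the same reaction do not create two routes.

```perl
use strict;
use warnings;

# Assumed layout (not confirmed by the thread): substrate, enzyme, product,
# separated by tabs; the enzyme field may be empty.
sub readReactions {
    my( $fh ) = @_;
    my( %graph, %enzymes );
    while( my $line = <$fh> ) {
        $line =~ s/[\r\n]+\z//;
        next unless $line =~ /\S/;
        # The limit of -1 keeps trailing empty fields.
        my( $from, $enzyme, $to ) = split /\t/, $line, -1;
        next unless defined $to and length $from and length $to;
        # First sighting of this edge: add it to the adjacency list.
        push @{ $graph{ $from } }, $to unless exists $enzymes{ $from }{ $to };
        $enzymes{ $from }{ $to } ||= [];
        push @{ $enzymes{ $from }{ $to } }, $enzyme if length $enzyme;
    }
    return ( \%graph, \%enzymes );
}

# Placeholder rows: the enzyme names and the last product are invented.
my $sample = join '',
    "(S)-1-Phenylethanol\tenzyme-A\tAcetophenone\n",
    "(S)-1-Phenylethanol\tenzyme-B\tAcetophenone\n",
    "Acetophenone\t\tproduct-X\n";

open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
my( $graph, $enzymes ) = readReactions( $fh );
close $fh;

for my $from ( sort keys %$graph ) {
    for my $to ( @{ $graph->{ $from } } ) {
        my @by = @{ $enzymes->{ $from }{ $to } };
        printf "%s -> %s [%s]\n", $from, $to,
            @by ? join( ', ', @by ) : 'no enzyme given';
    }
}
```

A nested hash is used for the enzymes rather than a joined "from-to" key because the compound names themselves contain hyphens.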

      That dataset is an excellent test sample. Thank you.

      BTW: CSV (Comma Separated Values) data normally has actual commas separating the values :). Using whitespace to separate values that themselves contain whitespace gave me pause for thought. (I wonder how the thou-shalt-not-parse-csv-yourself brigade would handle that?)

      For my purposes, I just ignored the middle column. I tacked the data onto the end of my test script, and it found and output all possible paths from all possible starting points in under 0.05 seconds. That includes remembering which starting point generated the longest path, which turned out to be "2,4,4-Trimethyl-1-pentanol":

      c:\test>904729-3 >nul
      Took 0.04846
      2,4,4-Trimethyl-1-pentanol
        ->2,4,4-Trimethylpentanal
        ->2,4,4-Trimethylpentanoate
        ->2,4,4-Trimethylpentanoyl-CoA
        ->2,4,4-Trimethylpent-2-enoyl-CoA
        ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA
        ->2,4,4-Trimethyl-3-oxopentanoyl-CoA
        ->2,4,4-Trimethyl-3-oxopentanoate
        ->2,2-Dimethyl-3-pentanone
        ->1-Hydroxy-4,4-dimethylpentan-3-one
        ->4,4-Dimethyl-3-oxopentanal

      2,4,4-Trimethyl-1-pentanol
        ->2,4,4-Trimethylpentanal
        ->2,4,4-Trimethylpentanoate
        ->2,4,4-Trimethylpentanoyl-CoA
        ->2,4,4-Trimethylpent-2-enoyl-CoA
        ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA
        ->2,4,4-Trimethyl-3-oxopentanoyl-CoA
        ->Pivalyl-CoA

      2,4,4-Trimethyl-1-pentanol
        ->2,4,4-Trimethylpentanal
        ->2,4,4-Trimethylpentanoate
        ->2,4,4-Trimethylpentanoyl-CoA
        ->2,4,4-Trimethylpent-2-enoyl-CoA
        ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA
        ->2,4,4-Trimethyl-3-oxopentanoyl-CoA
        ->Propanoyl-CoA
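A driver for a run like the one described above (every starting point, timed, longest path remembered) might look something like the sketch below. It is not the script that produced that output. The enumerator, walk, is a swapped-in depth-first search that uses each edge at most once per path, and treating every hash key as a starting point is an assumption. It is demonstrated on the small %graph from the first post rather than on the enzyme data.

```perl
use strict;
use warnings;
use Time::HiRes qw[ time ];

# Swapped-in enumerator (not the thread's _pathsFrom): follow every edge not
# yet used on the current path, and report the path when no unused edge is left.
sub walk {
    my( $code, $graph, $node, $path, $used ) = @_;
    my @path     = ( @$path, $node );
    my $usedHere = $used->{ $node } || {};
    my @next     = grep { !$usedHere->{ $_ } } @{ $graph->{ $node } || [] };
    return $code->( @path ) unless @next;
    for my $to ( @next ) {
        # Copy the bookkeeping so sibling branches do not see this edge as used.
        my %used = ( %$used, $node => { %$usedHere, $to => 1 } );
        walk( $code, $graph, $to, \@path, \%used );
    }
}

# The small graph from the first post.
my %graph = (
    a => [ 'b' ],
    b => [ 'c' ],
    c => [ 'd', 'j' ],
    d => [ 'e', 'g' ],
    g => [ 'h' ],
    h => [ 'c' ],
    j => [ 'k' ],
);

my $began = time;
my $count = 0;
my @longest;

# Assumption: every node with outgoing edges is a possible starting point.
for my $start ( sort keys %graph ) {
    walk( sub {
        ++$count;
        @longest = @_ if @_ > @longest;   # keep the first path of maximal length
    }, \%graph, $start, [], {} );
}

printf STDERR "Took %.5f\n", time - $began;
print "$count paths; longest: ", join( '->', @longest ), "\n";
```

Because each path uses an edge at most once, the recursion depth is bounded by the number of edges, so the walk terminates even on cyclic graphs.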

      The script excluding most of the data:


