http://www.perlmonks.org?node_id=906134


in reply to Re^15: Finding All Paths From a Graph From a Given Source and End Node
in thread Finding All Paths From a Graph From a Given Source and End Node

Oh Great!

I appear to have used the word 'elongation' in a somewhat confusing manner. I didn't consider the fact that A->B->C->G->H->C->D->E is actually an elongation of A->B->C->D->E. The intended meaning is that, having arrived at E, if there is nothing beyond E to which the path can go, then the path ends there.

And! The focus of the work is to list the possible paths from a given node, without getting stuck in a loop somewhere. This would mean that for the hypothetical example we are using here the following paths should result:

a. A->B->C->D->E

b. A->B->C->J->K

c. A->B->C->D->G->H->C->J->K

And! Yes. Your model of what is going on is so very viable.

Of course we all would rather like to take any of the short routes. In nature however, the shortest routes sometimes are not the best and the objective here is to outline all the possible routes; and number them appropriately 'route1, route2, route3...', affording one the chance to look closely at each route and see its workability.
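To make the 'route1, route2, route3...' idea concrete, here is a minimal sketch of one way to collect and number the routes for the hypothetical graph above. The helper name `all_routes` and the rule "never reuse an edge within a single route" are assumptions made for illustration, not something settled earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: walk the graph depth-first, never reusing an edge
# within a single route, and collect every route that cannot be extended.
sub all_routes {
    my ( $graph, $start ) = @_;
    my @routes;
    my $walk;
    $walk = sub {
        my ( $node, $path, $used ) = @_;
        my @path = ( @$path, $node );
        # Outgoing edges that this route has not travelled yet.
        my @open = grep { !$used->{"$node\t$_"} } @{ $graph->{$node} || [] };
        if ( !@open ) { push @routes, \@path; return }
        $walk->( $_, \@path, { %$used, "$node\t$_" => 1 } ) for @open;
    };
    $walk->( $start, [], {} );
    undef $walk;    # break the closure's reference to itself
    return @routes;
}

my %graph = (
    A => ['B'], B => ['C'], C => [ 'D', 'J' ], D => [ 'E', 'G' ],
    G => ['H'], H => ['C'], J => ['K'],
);

my @routes   = all_routes( \%graph, 'A' );
my @labelled = map { 'route' . ( $_ + 1 ) . ': ' . join( '->', @{ $routes[$_] } ) }
    0 .. $#routes;
print "$_\n" for @labelled;
```

For this graph the sketch lists the same three routes as a, b and c above, numbered in the order they are discovered.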

Thanks a lot for your time. It's good to hear of a possible solution.

Re^17: Finding All Paths From a Graph From a Given Source and End Node
by BrowserUk (Pope) on May 22, 2011 at 09:22 UTC

    In that case, try this:

    #! perl -slw
    use strict;

    sub _pathsFrom {
        my( $code, $graph, $start, $path, $seen ) = @_;

        # A node with no outgoing edges ends the path.
        return $code->( @$path, $start ) unless exists $graph->{ $start };

        my $extended = 0;
        for my $next ( @{ $graph->{ $start } } ) {
            # Skip an edge this path has already travelled; that is what breaks loops.
            next if exists $seen->{ "$start-$next" };
            $extended = 1;
            _pathsFrom( $code, $graph, $next, [ @$path, $start ], { %$seen, "$start-$next", undef } );
        }

        # Every outgoing edge has already been travelled: the path ends here.
        $code->( @$path, $start ) unless $extended;
    }

    sub pathsFrom(&@) { _pathsFrom( @_, [], {} ) }

    my %graph = (
        a => [ 'b' ], b => [ 'c' ], c => [ 'd', 'j' ], d => [ 'e', 'g' ],
        g => [ 'h' ], h => [ 'c' ], j => [ 'k' ],
    );

    pathsFrom{ print join '->', @_; } \%graph, 'a';

    __END__
    c:\test>904729-2
    a->b->c->d->e
    a->b->c->d->g->h->c->j->k
    a->b->c->j->k

    This version uses the identical algorithm but is somewhat more efficient:

    use enum qw[ CODE GRAPH START PATH SEEN ];    # enum is a CPAN module

    sub _pathsFrom2 {
        return $_[CODE]->( @{ $_[PATH] }, $_[START] ) unless exists $_[GRAPH]->{ $_[START] };

        my $extended = 0;
        for ( @{ $_[GRAPH]->{ $_[START] } } ) {
            next if exists $_[SEEN]->{ $_[START] . "-$_" };
            $extended = 1;
            _pathsFrom2( @_[ CODE, GRAPH ], $_, [ @{ $_[PATH] }, $_[START] ], { %{ $_[SEEN] }, $_[START] . "-$_", undef } );
        }

        $_[CODE]->( @{ $_[PATH] }, $_[START] ) unless $extended;
    }

    sub pathsFrom(&@) { _pathsFrom2( @_, [], {} ) }

    BTW: If you have a dataset that forms a decent-sized test, I'd like to get a copy, as I've found generating legal directed graphs quite difficult.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Hi BrowserUK

      Thanks so much for your help, and sorry for the late response. I've put together some test data as you requested. It's quite raw and needs to be processed into a hash listing, but it does give an idea of the nature of the data I am dealing with.

      You will notice that in some cases no enzyme is supplied. In those cases the relationship between substrate and product suffices for the graph connections. In other cases, like the flow from (S)-1-Phenylethanol -> Acetophenone (lines 72 and 73), the reaction can be catalysed by two different enzymes. This should not necessarily result in two routes, as I can embed the information on both enzyme possibilities in the same edge.

      I hope that supplying an 800-line dataset like this is okay by you. Feel free to truncate it as needed. Hopefully there will be enough lines to form connections. Given the relatively large size, I have decided to paste it outside of PerlMonks.

      Please find it at http://codepaste.net/9gs3da. Note that it is tab delimited, with the enzyme catalysing each connection supplied in the middle column.
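For illustration, here is a rough sketch of how tab-delimited lines of that shape (substrate, enzyme, product) might be turned into a hash listing, with both enzymes of a doubly catalysed reaction stored on a single edge. The helper name `build_graph`, the layout of `%enzymes`, and the names `enzyme_1`, `enzyme_2` and `product_X` are placeholders invented for this example, not values taken from the dataset.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "substrate<TAB>enzyme<TAB>product" lines into
#   %graph   : substrate => [ products ]   (each product listed once)
#   %enzymes : "substrate<TAB>product" => [ enzymes seen for that edge ]
sub build_graph {
    my @lines = @_;
    my ( %graph, %enzymes );
    for my $line (@lines) {
        chomp( my $copy = $line );
        next unless $copy =~ /\S/;    # skip blank lines
        my ( $substrate, $enzyme, $product ) = split /\t/, $copy, 3;
        next unless defined $product && length $product;
        my $edge = "$substrate\t$product";
        # Add the edge to the graph only the first time it is seen.
        push @{ $graph{$substrate} }, $product unless exists $enzymes{$edge};
        $enzymes{$edge} ||= [];
        # The enzyme column may be empty; the edge is still kept.
        push @{ $enzymes{$edge} }, $enzyme if defined $enzyme && length $enzyme;
    }
    return ( \%graph, \%enzymes );
}

# Placeholder rows: the enzyme names and product_X are made up.
my @sample = (
    "(S)-1-Phenylethanol\tenzyme_1\tAcetophenone",
    "(S)-1-Phenylethanol\tenzyme_2\tAcetophenone",
    "Acetophenone\t\tproduct_X",
);

my ( $graph, $enzymes ) = build_graph(@sample);
print join( ', ', @{ $enzymes->{"(S)-1-Phenylethanol\tAcetophenone"} } ), "\n";
```

Keeping the enzymes in a separate hash keyed by edge means the graph itself stays a plain hash of arrays, so two enzymes for one reaction do not produce two routes.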

      Once more thanks a great lot. Cheers

        That dataset is an excellent test sample. Thank you.

        BTW. CSV (Comma Separated Values) data normally actually has commas separating the values :). Using white space to separate values that themselves contain whitespace gave me pause for thought. (I wonder how the thou-shalt-not-parse-csv-yourself brigade would handle that?)

        For my purposes, I just ignored the middle column. I tacked the data onto the end of my test script, and it found and output all possible paths from all possible starting points in under 0.05 seconds. It also remembers which starting point generated the longest path, which turned out to be "2,4,4-Trimethyl-1-pentanol":

        c:\test>904729-3 >nul
        Took 0.04846
        2,4,4-Trimethyl-1-pentanol ->2,4,4-Trimethylpentanal ->2,4,4-Trimethylpentanoate ->2,4,4-Trimethylpentanoyl-CoA ->2,4,4-Trimethylpent-2-enoyl-CoA ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA ->2,4,4-Trimethyl-3-oxopentanoyl-CoA ->2,4,4-Trimethyl-3-oxopentanoate ->2,2-Dimethyl-3-pentanone ->1-Hydroxy-4,4-dimethylpentan-3-one ->4,4-Dimethyl-3-oxopentanal
        2,4,4-Trimethyl-1-pentanol ->2,4,4-Trimethylpentanal ->2,4,4-Trimethylpentanoate ->2,4,4-Trimethylpentanoyl-CoA ->2,4,4-Trimethylpent-2-enoyl-CoA ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA ->2,4,4-Trimethyl-3-oxopentanoyl-CoA ->Pivalyl-CoA
        2,4,4-Trimethyl-1-pentanol ->2,4,4-Trimethylpentanal ->2,4,4-Trimethylpentanoate ->2,4,4-Trimethylpentanoyl-CoA ->2,4,4-Trimethylpent-2-enoyl-CoA ->2,4,4-Trimethyl-3-hydroxypentanoyl-CoA ->2,4,4-Trimethyl-3-oxopentanoyl-CoA ->Propanoyl-CoA
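As a rough illustration of the idea described above (try every starting point, time the run, remember the longest path), here is a small self-contained sketch on the toy graph from earlier in the thread. It is not the script that produced the output above; the names `walk`, `$count` and `@longest` are assumptions made for this example.

```perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Hypothetical traversal: call $code with every path from $node that never
# reuses an edge and cannot be extended any further.
sub walk {
    my ( $code, $graph, $node, $path, $used ) = @_;
    my @path = ( @$path, $node );
    my @open = grep { !$used->{"$node\t$_"} } @{ $graph->{$node} || [] };
    return $code->(@path) unless @open;
    walk( $code, $graph, $_, \@path, { %$used, "$node\t$_" => 1 } ) for @open;
    return;
}

my %graph = (
    a => ['b'], b => ['c'], c => [ 'd', 'j' ], d => [ 'e', 'g' ],
    g => ['h'], h => ['c'], j => ['k'],
);

my $began = time();
my ( $count, @longest ) = (0);

# Every node that has outgoing edges is tried as a starting point.
for my $start ( sort keys %graph ) {
    walk(
        sub { ++$count; @longest = @_ if @_ > @longest },
        \%graph, $start, [], {},
    );
}

printf "Took %.5f\n", time() - $began;
print "$count paths; longest: ", join( '->', @longest ), "\n";
```

Because the comparison is a strict `>`, the first path found at a given length is the one that is kept.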

        The script excluding most of the data:

