Re^5: Finding All Paths From a Graph From a Given Source and End Node

by BrowserUk (Pope)
on Nov 02, 2010 at 02:52 UTC (#868922)


in reply to Re^4: Finding All Paths From a Graph From a Given Source and End Node
in thread Finding All Paths From a Graph From a Given Source and End Node

If you can do what you need to do with the paths by getting them one at a time, rather than all at once, then this version of findPaths() runs much faster and uses far less memory. (It runs faster because it uses much less memory.)

For example, on a randomly generated graph that has 10,000 paths, it takes 1.09 seconds instead of 38 seconds for my previous version. In the process, this uses less than 7MB where the previous version used 240MB.

And on a graph that has just over 5 million paths, this still uses just 7MB and completes in 14 1/2 minutes.

I project that the previous version would take 5 1/2 hours to complete, except that it would require 12GB to do so, and I only have 4GB.

It's still the same order of complexity (it's essentially the same algorithm), but the simple expedient of avoiding allocating and reallocating zillions of small chunks of RAM means that in the real world it is much faster. Which confirms once again my feelings about big O.

Anyway, if your algorithm can be adapted to operate this way, you might find it useful.

sub _findPaths2 {
    my( $code, $graph, $start, $end, $path, $seen ) = @_;
    return $code->( @$path, $end ) if $start eq $end;
    $seen->{ $start } = 1;
    for ( grep !$seen->{ $_ }, @{ $graph->{ $start } } ) {
        _findPaths2( $code, $graph, $_, $end, [ @$path, $start ], { %$seen } );
    }
}

sub findPaths2(&@) {
    _findPaths2( @_, [], {} );
}

findPaths2{
    print join ' ', @_;
} \%graph, $start, $end;

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^6: Finding All Paths From a Graph From a Given Source and End Node
by neversaint (Deacon) on Nov 02, 2010 at 07:36 UTC
    Dear BrowserUK,
    Thanks so much. This update is truly invaluable. May you be repaid for your generosity in helping poor guys like me. I truly owe you much (as always).

    ---
    neversaint and everlastingly indebted.......
Re^6: Finding All Paths From a Graph From a Given Source and End Node
by LanX (Archbishop) on Nov 02, 2010 at 09:39 UTC
    Did you also benchmark other code? Like in this post?

    Re^2: Finding All Paths From a Graph From a Given Source and End Node

    (I can't see the point in copying around the %seen hash :)

    Furthermore this code can be easily linearized to avoid the function-call overhead...

    Anyway the benchmarks highly depend on the nature of those "randomly generated graphs".

    In general there are still plenty of possible optimizations left to speed up such a search.

    Cheers Rolf

    UPDATE:

    1) or even the current path of a DFS.
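    The point about not copying %seen or the current path can be sketched as a backtracking search. This is only an illustration under assumptions, not code from the thread: each_path_backtrack is a hypothetical name, the four-node graph is made up, and the callback collects the paths into @found purely so the small result can be shown. One %seen hash and one @path array are shared by the whole search and undone on the way back up.

```perl
use strict;
use warnings;

# Hypothetical name: each_path_backtrack. A single %seen hash and a single
# @path array serve the whole search; each level undoes its own changes
# before returning, so nothing is copied per call.
sub each_path_backtrack {
    my( $code, $graph, $start, $end ) = @_;
    my( %seen, @path );
    my $visit;
    $visit = sub {
        my $node = shift;
        push @path, $node;
        if( $node eq $end ) {
            $code->( @path );              # complete path: hand it over
        }
        else {
            $seen{ $node } = 1;            # mark on the way down
            for my $next ( @{ $graph->{ $node } || [] } ) {
                $visit->( $next ) unless $seen{ $next };
            }
            delete $seen{ $node };         # unmark on the way back up
        }
        pop @path;
    };
    $visit->( $start );
    undef $visit;    # break the closure's reference to itself
    return;
}

# Small made-up graph, for illustration only.
my %graph = (
    A => [ 'B', 'C' ],
    B => [ 'C', 'D' ],
    C => [ 'D' ],
    D => [],
);

my @found;
each_path_backtrack( sub { push @found, join ' ', @_ }, \%graph, 'A', 'D' );
print "$_\n" for @found;
```

    On this graph the sketch prints "A B C D", "A B D" and "A C D". Whether it is actually faster than the copying version on large graphs would need to be benchmarked.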

      > Did you also benchmark other code? Like in this post?

      No. Here's the 5 million path tree, the command line and timing, but from a cursory glance at code you reference, you're going to require a lot (10GB or more) of memory.

      c:\test>868031 -S=7529 A Z
      {
        A => ["N", "W", "J", "L", "C", "E", "X", "T", "O", "H"],
        C => ["Z", "J", "Q", "U", "S", "T", "N", "P", "D", "O"],
        E => ["P", "G", "U", "F", "X", "A", "Y", "K"],
        H => ["W", "O", "J"],
        J => ["Z", "U", "B", "Q", "N", "I", "V", "F", "C", "P"],
        K => ["X", "H", "J", "C", "P", "W", "E", "S", "Q"],
        L => ["C", "B", "V", "A", "S", "J", "O", "H"],
        N => ["R", "G", "K", "N", "Q", "W", "C", "U", "E", "V"],
        O => ["K", "G", "X", "A", "Z", "W"],
        P => ["O", "G", "F", "T", "E", "U", "L", "H", "B", "R"],
        Q => ["V", "N", "X", "U", "D", "M", "S", "C", "R", "G"],
        R => ["D", "S", "K", "X", "O", "U"],
        S => ["Q", "E", "T", "P", "G", "Z"],
        U => ["Y", "V", "U", "X", "R", "W", "M", "G", "K", "N", "A"],
        W => ["P", "E", "G", "Y"],
        X => ["Z", "H", "R", "L", "J", "W", "A", "E", "X", "T", "D"],
        Y => ["J", "X", "G"],
        Z => ["K", "A", "Z"],
      }
      5014604
      FP2 took 981.75 secs for 5014604 paths

      If you prefer the 10,000 path graph as a test:

      c:\test>868031 -S=7367 A Z
      {
        A => ["F", "B", "U", "Z", "J", "C", "Q", "H"],
        D => ["W", "X", "F", "M", "K", "Y"],
        G => ["E", "P", "U"],
        H => ["K", "Q", "S", "T", "X", "G", "D", "B"],
        I => ["U", "Q", "K", "D"],
        K => ["O", "R", "A", "L", "X", "N", "C", "M"],
        L => ["G", "I", "A", "O", "N", "J", "D", "S", "R", "V", "M"],
        N => ["K", "L", "V", "Z", "U"],
        O => ["W", "K", "D", "I", "A", "J", "M", "T", "Y", "Z", "P"],
        P => ["B", "G"],
        R => ["V", "B", "G", "P"],
        T => ["M", "I", "N", "K", "D", "U", "A", "V", "W"],
        U => ["V", "Z", "J", "A", "E"],
        V => ["M", "G", "O", "F", "W", "Y", "P", "S"],
        X => ["K", "S", "B", "P", "N", "T", "W", "Z", "H", "R", "F"],
        Y => ["Q", "K", "J", "U", "G", "M", "P"],
        Z => ["A", "M", "Z", "Q", "W", "N", "G", "J", "L", "H"],
      }
      10062
      FP2 took 2.10 secs for 10062 paths
      > (I can't see the point in copying around the %seen hash, or even the current path of a DFS. :)
      > Furthermore this code can be easily linearized to avoid the function-call overhead...
      > In general there are still plenty of possible optimizations left to speed up such a search.

      Improvements or alternatives welcome :) . I'm not claiming the fastest on the block, just better than my own previous efforts.

      neversaint expressed his interest in the time complexity of my previous version, so I set out to improve it. Given the combinatorial explosion that can result from the all-paths traversal of apparently quite simple graphs, moving to an iterator rather than an accumulating generator seemed the logical route.
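      The difference between an accumulating generator and an iterator-style callback can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code that was benchmarked above: all_paths and each_path are hypothetical names and the four-node graph is made up. The first sub builds and returns every path at once, so its memory use grows with the number of paths; the second hands each path to a callback as soon as it is complete and keeps nothing afterwards.

```perl
use strict;
use warnings;

# Accumulating form (hypothetical name all_paths): returns a list of array
# refs, one per path, so every path is held in memory at the same time.
sub all_paths {
    my( $graph, $start, $end, $seen ) = @_;
    return [ $end ] if $start eq $end;
    my %seen = ( %{ $seen || {} }, $start => 1 );
    return map { [ $start, @$_ ] }
           map { all_paths( $graph, $_, $end, \%seen ) }
           grep { !$seen{ $_ } } @{ $graph->{ $start } || [] };
}

# Callback form (hypothetical name each_path): passes each finished path to
# $code and retains only the path currently being explored.
sub each_path {
    my( $code, $graph, $start, $end, $path, $seen ) = @_;
    $path ||= [];
    return $code->( @$path, $end ) if $start eq $end;
    my %seen = ( %{ $seen || {} }, $start => 1 );
    each_path( $code, $graph, $_, $end, [ @$path, $start ], \%seen )
        for grep { !$seen{ $_ } } @{ $graph->{ $start } || [] };
    return;
}

# Small made-up graph, for illustration only.
my %graph = ( A => [ 'B', 'C' ], B => [ 'C', 'D' ], C => [ 'D' ], D => [] );

my @stored = all_paths( \%graph, 'A', 'D' );         # all paths held at once
my $count  = 0;
each_path( sub { ++$count }, \%graph, 'A', 'D' );    # only a counter is kept
print scalar( @stored ), " stored, $count counted\n";
```

      Both forms visit the same three paths on this graph (the sketch prints "3 stored, 3 counted"); only the first has to keep them all.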


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        > but from a cursory glance at code you reference, you're going to require a lot (10GB or more) of memory.

        Why?

        I'm running it right now on my busy netbook, and it has already reached 4080000 paths after 1230 secs!

        Cheers Rolf

        1) Actually, I don't immediately know a clever way to monitor the memory consumption of Perl's data structures, but the ps -aux output is stable:

        USER PID %CPU %MEM VSZ RSS TTY STAT START
        lanx 23724 99.4 0.2 5384 2492 pts/0 Rs+ 15:30 16:24 /usr/bin/perl -w /tmp/graph_path.pl

        update: 5014604 in 1562 secs

        ehm? :)

        Z => ["K", "A", "Z"],

        Cheers Rolf
