http://www.perlmonks.org?node_id=1067990

Dirk80 has asked for the wisdom of the Perl Monks concerning the following question:

Hello,

there is a start place and a destination place. In the middle there are 22 other places. I want to find a short way (not the shortest because I know that this would consume too much time to calculate it) from the start place to the destination place considering all places between. The goal is that all 24 places were visited exactly once.

My input is the distance in km between all places.

My idea was to take the Graph::Simple module. I tried it with 4 places.

Here my code snippet:

use strict; use warnings; use Graph::Simple; use Data::Dumper; my $g = Graph::Simple->new ( is_directed => 0, is_weighted => 1); $g->add_edge( 'A', 'B', 200 ); $g->add_edge( 'A', 'C', 700 ); $g->add_edge( 'A', 'D', 700 ); $g->add_edge( 'B', 'C', 500 ); $g->add_edge( 'B', 'D', 500 ); $g->add_edge( 'C', 'D', 600 ); my @path = $g->shortest_path('A', 'D'); print Dumper (\@path);

The result is

$VAR1 = [ 'A', 'B', 'D' ];

But why is my point "C" not considered? I would need a route from "A" to "D" which visits also "B" and "C".

What is your suggestion to solve this problem? Is it a good idea to take a graph module to solve it?

Thank you very much.

Greetings,

Dirk

Replies are listed 'Best First'.
Re: Travelling problem
by tobyink (Canon) on Dec 20, 2013 at 21:42 UTC

    The shortest_path method is not designed for solving the travelling salesman problem. It just finds the shortest way to get from A to D following the edges you've defined. A-B-D (weight 700) is shorter than A-C-D (1300), A-B-C-D (1300), or A-C-B-D (1700). The direct route A-D (weight 700) has an equal to A-B-D; in the case of multiple routes with equal weight, I don't know whether which one is returned is random or deterministic.

    Graph may have what you need.

    use Moops; class Cow :rw { has name => (default => 'Ermintrude') }; say Cow->new->name

      Thank you very much! This sounds very interesting. I'll try that SPT_Dijkstra algorithm tomorrow.

        I tried the Graph module. I had a deep look at the SPT_Dijkstra algorithm and tried it with some sample data.

        I think that I now understand the algorithm. It takes a start vertex and the result are the paths from the start vertex to all other vertices.

        But this is not what I need. I need only one path from the start to the end which takes all vertices between into account.

        I see no way how to solve my problem with the graph module. Or do I see something wrong? I'll now try another way. Nevertheless thank you. It was interesting to play with this Graph module.

        Update:

        The Graph module does not seem to be the problem, but my lack of knowledge about graph algorithms. I'm reading now about these algorithms. When I understood them, then I'll start my implementation.

Re: Travelling problem
by GrandFather (Saint) on Dec 20, 2013 at 21:35 UTC
    But why is my point "C" not considered?

    Because "shortest path" doesn't visit every node. The example given in the documentation makes that fairly clear.

    True laziness is hard work
Re: Travelling problem
by educated_foo (Vicar) on Dec 21, 2013 at 01:59 UTC
    If you just want a "decent" path quickly, use an algorithm that starts with a minimum spanning tree, e.g. the Christofides Algorithm. A Graph::* module may or may not be useful for this.
Re: Travelling problem
by LanX (Saint) on Dec 21, 2013 at 16:59 UTC
    Are you aware that
    • you are¹ describing the well known travelling salesman problem?

    • "perfect" solutions for non-trivial cases of NP-complete problems are unknown?

    • there are very good solutions for real world problems though?

    Please check if you really need all criteria!

    Pick at most 2 of 3 and we might be able to help:

    • Fast (i.e. polynomial²) computation

    • Proven minimal path

    • unknown number of nodes > n (n ~ 20 ?)

    If it's just a theoretical question and you can't limit the requirements, then I suppose a branch and bound algorithm might be the best approach, but it won't be faster than brute force in some edge cases.

    HTH! =)

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    update

    had a short glance at the WP article and it describes much better what I wanted to tell.

    footnotes

    ¹) almost

    ²) corrected, educated foo++

      Fast (i.e. non-polynomial) computation
      You mean "i.e. polynomial," right? NP is exponential ;-). I heartily agree with your post, and defer to my favorite MJD talk for the details.

      EDIT: Nut graph:

      next time someone tells you to give up on your problem because it's NP-complete, ignore them.
        > You mean "i.e. polynomial,"

        yes, kind of "non-exponential" =)

        Was a short night! :)

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        update

        > next time someone tells you to give up on your problem because it's NP-complete, ignore them.

        Unfortunately well described problems are rare here.

      Yes, you are right. The "travelling salesman problem" exactly describes my problem. Thank you!!!

      I have 24 places and I know the distance between these places in km. I have to visit the first place, then 22 places in any order and then the 24th place. I don't need a perfect solution. I just want to find a short way, not the shortest one. And the computation time should not be too long. An approximation is enough.

        The Christofides Algorithm mentioned by educated foo guaranties a solution better than 1.5 times the optimum.

        Nota bene: since you have a fixed start and end for an "open" hamiltonian path (i.e. not a circle like in the TSP) you'll need to adjust the algorithm accordingly. ¹

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        update

        ¹) after reading two SO discussions (search "minimal hamilton path") you just need to add a dummy point neighboring (only) the desired start- and end-point with distance 0 (!)

        Like this any TSP algorithm will chose start and end as neighbors for a hamilton circle with the same accumulated length like a optimized hamilton path, you just need to ignore the two dummy edges at the end.

        I have 24 places and I know the distance between these places in km. I have to visit the first place, then 22 places in any order and then the 24th place.

        Could you post a sample dataset?


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Travelling problem (Anyone better 86850?)
by BrowserUk (Patriarch) on Dec 22, 2013 at 02:56 UTC

    Given the sample dataset supplied in Re^4: Travelling problem, anyone got a better (lower) total distance than 86850?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      BrowserUk:

      Prompted by LanX's suggestion to use a genetic algorithm, I put one together (code in the readmore section below).

      So far, both times I ran it, it found a path totalling 84860. I just started a longer run (100,000 generations with a larger population (300) to see if anything interesting pops up.

      Update: Oops! Replied to the wrong node. Also, should've refreshed the page. When I started coding, there weren't so many replies!

      Update 2: I let the other run go for about 80K generations, but it never found anything better.

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

        Similar methodology to mine, and the same problem.

        Many times it will find the minima well within your 1000 generations; but on those occasions where it settles into a false minima; it doesn't (seem to; limited runs) matter how many more generations you run it for; it will never find it.

        That's what I've been trying to find a solution to for the last couple of days. So far, without much success.

        The problem appears to be that if you discard too vigorously, you settle into re-trying variations of the same paths over and over without ever introducing any "new blood".


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      Best I could get is also 86850. Used a simple genetic algorithm - mutate using crossover points, exchange, reverse, and a bit of shuffling. Then cull all the losers and repeat. Quite ruthless really. First time I have tried this approach, seems very powerful.

      Update: Sorry BrowserUK, I updated it within about 20 seconds of posting, you are indeed quick off the mark.

        Quite ruthless really. First time I have tried this approach, seems very powerful.

        You mustn't be too ruthless though. You have to allow some of the less good candidates to evolve otherwise the algorithm will lock into a local minima and never explore further.

        For example, there are at least 8 better solutions than the 86850:

        84860 85075 85294 85469 85509 85684 85982 86197

        But these will never be discovered by minor evolution from the 86850 solution.

        You have to allow some radical variations (with significantly less good scores) to evolve for a while to discover these better ones.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
        Best I could get is also 8650.

        I think you dropped a digit? (The absolute minimum is 74,283 and that's impossible :)


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        > First time I have tried this approach, seems very powerful.

        Well not sure if this problem isn't too easy for a good test, considering that the naive approach already gives a solution of 95166, which is probably good enough in most cases.¹

        I have the impression that a little branch and bound with just the 3 or 4 shortest edges per node (instead of just one) would quickly produce much better results.

        Cheers Rolf

        ( addicted to the Perl Programming Language)

        ¹) at least for the OP.

      There are better solutions than 84860:

      83907: 0 1 10 15 6 11 7 2 18 19 17 14 4 21 9 13 22 8 3 12 5 20 16 83666: 0 1 10 15 6 11 7 2 18 19 17 14 21 9 4 13 22 8 3 12 5 20 16 83633: 0 1 10 15 6 11 7 2 18 19 17 14 21 4 9 13 22 8 3 12 5 20 16 82991: 0 1 10 15 6 11 7 2 18 19 17 4 14 21 9 13 22 8 3 12 5 20 16

        Wow, I'm impressed about your ability to find that fast a very good solution. I don't know the perfect value. But 84860km seems to be a very short distance for the 24 places.

        Of course I want to try to write my own code. But it would be interesting for me which algorithm you used. Until now I did not have success with the Graph module. See Re^3: Travelling problem

        Now I'll try the minimal hamilton path.

        Ignore this!. Programmer error! There are better solutions than 84860:

        83907: 0 1 10 15 6 11 7 2 18 19 17 14 4 21 9 13 22 8 3 12 5 20 16 83666: 0 1 10 15 6 11 7 2 18 19 17 14 21 9 4 13 22 8 3 12 5 20 16 83633: 0 1 10 15 6 11 7 2 18 19 17 14 21 4 9 13 22 8 3 12 5 20 16 82991: 0 1 10 15 6 11 7 2 18 19 17 4 14 21 9 13 22 8 3 12 5 20 16
Re: Travelling problem
by sundialsvc4 (Abbot) on Dec 20, 2013 at 22:43 UTC

    I know of no practical way to solve the TS problem that doesn’t involve recursion.   It could actually be fairly simple, and wouldn’t necessarily occupy too much CPU time, since all adjacent edges which lead to an already-visited city would be immediately disqualified.   If the recursion successfully reached a depth equal to the number of cities, you know that the problem has been solved.   (It is the “shortest path” bugaboo, not present here, that makes TS computationally intractable.)

Re: Travelling problem (genetic algorithm)
by LanX (Saint) on Dec 22, 2013 at 00:39 UTC
    Just for fun: )

    I found a ruby blog describing a genetic algorithm attacking TSP.

    I have a deep fascination for this approach, maybe someone wants to play around (porting to Perl is easy)

    Additionally some one may want to try a meta-genetic-algo by modifying the mix of different mutation/sex types and measure success by time till the limit given by Christofides Algorithm is reached.

    =)

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    See Also

    Algorithm::Evolutionary

Re: Travelling problem (lower bound >= 79593) (+ example)
by LanX (Saint) on Dec 29, 2013 at 11:24 UTC
    idea
    After some meditation I was able to improve the lower bound considerably.

    The idea is simple, in an undirected graph each edge appears twice in the incidence matrix.²

    Taking the OP's data structure that means 2 values from each row¹!

    Except the first and the last row, since it's not a closed Hamilton circle (otherwise we would additionally have an edge (0,23) (or (23,0) respectively) to close the path )

    So if we sum up all these values we get exactly the double of the path length. (see also Double_counting_(proof_technique))

    So taking the first and second shortest edge, we can estimate for each inner row

    edge1 + edge2 >= shortest1 + shortest2

    for the start and stop row we get edge1 >= shortest1

    Including shortest2 improves the bound considerably and we get 79592.5.

    That is an optimal solution must be bigger or equal 79593

    code

    here the code if someone needs to reproduce it:

    sub lower_bound1 { my @sorted; for my $idx ($start..$stop) { my $x=0; my @order = sort { $a->[1] <=> $b->[1] } map { [$x++,$_] } @{ $dat +a[$idx] }; push @sorted,\@order; } #pp \@sorted; my $min=0; for my $idx ($start..$stop) { my $shortest1=$sorted[$idx][1][1]; # 1. shortest my $shortest2=$sorted[$idx][2][1]; # 2. shortest $min+=$shortest1; unless ($idx==$start or $idx == $stop) { $min+=$shortest2; } #pp $min,$shortest1,$shortest2, $sorted[$idx]; } return $min/2; }

    I hacked also a lower_bound2() routine taking advantage of more constraints, but this didn't improve the result that much (about 150). I doubt that this complicated approach is of much interest.

    what for

    Good lower bound estimations are important to cut of subtrees in branch-and-bound algorithms!

    If you know that e.g. the remaining 22 edges have a length of at least 76000 and you know already a temporary optimum of 84000, it doesn't worth it to check starting edges of length 8000 or more. And so on.

    Since for each generation differences to the rows shortest sum up, the possible subtree get constantly smaller and a full solution comes into reach.

    immediate consequence

    The OP's specific problem is not a very hard example of the TSP.

    hdb was able to produce a temporary solution of 84860 within 30 seconds, which is how we know now within a 5% margin of the optimum, if not already the optimum.

    For practical use this is more than acceptable.

    Cheers Rolf

    ( addicted to the Perl Programming Language)

    ¹) starting indexing with 0

    update

    ²) contrived example

    0 1 2 3 4 min1 min2 0 0 1 (2) 3 1 1 1 1 0 (10) (7) 1 1+1 10+7 2 (2) (10) 0 15 3 2+3 10+15 3 3 (7) 15 0 (2) 2+3 7+15 4 1 1 3 (2) 0 1 shortest0-4: length:21 0--2--1--3--4 2 10 7 2 lower_bound1: sum(min1)/2 = 14/2 = 7 lower_bound2: 19 = sum(min2)/2 - 15 + 1 + 1 (excluding col 0 and 4 for double rows, striking the maximum)
Re: Travelling problem
by LanX (Saint) on Dec 22, 2013 at 05:14 UTC
    FWIW:

    The naive approach (start at first and successively take the shortest edge to a yet unvisited node) gives a path of length 95166.

    Thats not bad considering that the lower bound for any possible solution is already at 70135! (summing up all minimal edges per node and striking the longest of those edges again)

    NB: This doesn't mean that such a solution exists, thats fairly utopic.

    Cheers Rolf

    ( addicted to the Perl Programming Language)