Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

OT(ish) - Best Search Algorithm

by Melly (Hermit)
on Oct 15, 2007 at 09:02 UTC ( #644868=perlquestion: print w/ replies, xml ) Need Help??
Melly has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monkeys,

This isn't really a perl question (although I'll be coding in perl), but please bestow your wisdom on this poor benighted traveller...

I've given myself a little fun project of writing a facebook app - what I want to do is to try and find a "circle of friends" - i.e. starting from myself, find a pathway back to myself. For example, if I am "A", and I have two friends "B" and "D", and "C" is not a friend of mine, but is a friend of "B" and "D", then a valid circle would be: A->B->C->D->A (as an extra note, this circle would not be valid if "B" and "D" are friends, since "A" would then be superfluous to the circle).

Anyways, I've got some rules to implement, but the basic question is what would be the best general algorithm to use in this case? The A* algorithm looks possible, but I'd be interested to hear alternative suggestions.

Thanks, and sorry for the OT post.

map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
Tom Melly, pm (at) cursingmaggot (stop) co (stop) uk

Comment on OT(ish) - Best Search Algorithm
Download Code
Re: OT(ish) - Best Search Algorithm
by Corion (Pope) on Oct 15, 2007 at 09:08 UTC

    For my deeper understanding, is this definition of circle valid?

    A "circle of friends" is a set of (site) members where each member has exactly two "friendship" relations in the whole set

    I think this definition fits your description, but I'm not sure. My mental image is of a circle of persons, each holding hands - is that the right image?

    Solutions for that could be any of the "circle detection" algorithms for graphs, but I don't know how how efficient they are. I'm not sure where A* comes into play as I see no sensible metric on which to optimize...

      A "circle of friends" is a set of (site) members where each member has exactly two "friendship" relations in the whole set

      Hmm, I think so - each member can have many "friendships", but the answer will only have two "friendships" per person in the solution (i.e. superfluous friendships will be ignored).

      I think this definition fits your description, but I'm not sure. My mental image is of a circle of persons, each holding hands - is that the right image?

      Yes, but with some constraints - the first friend in the pathway must not be known to the last friend in the pathway (i.e. I know both of them, but they don't know each other).

      map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
      Tom Melly, pm@tomandlu.co.uk
      I don't think that'll work. People can have more friends than just two. They can even share more friends than 2.

      For example, A is befriended with B, which is befriended with C and E, where C is befriended with D, and D befriended with A (the original circle), but E is also befriended with F which in turn is befriended with A.

      So there's 2 circles of friends, and A and B are in both: A-B-C-D-A and A-B-E-F-A.

        Your example still fits with my (tentative) definition. For both circles, it is true that each member has exactly two friendship relations in it. My definition does not say anything about the number of overall friendship relations on the whole site, but only about friendship relations within a circle.

        But I think I need a second thing in the definition, but I'm not sure which one is "better":

        • A "circle of friends" must have at least three members
        • There must be at least one person for every member of the circle without a direct "friend" relation

        The two are not equivalent, because the first rule allows circles of three persons, while the latter allows circles starting at four members.

Re: OT(ish) - Best Search Algorithm
by lima1 (Curate) on Oct 15, 2007 at 09:10 UTC

      Thanks - I'll take another look at Dijkstra (I think it was the first algo I found, and led me to A*)

      map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
      Tom Melly, pm@tomandlu.co.uk
        To use Dijkstra's you'd have to modified it accept the start as the destination, ensured all positive weights (doesn't work for negative cycles), and progressively updated the weights on the edges that you've already found to be part of a cycle so they wouldn't be detected again; but then some edges will be part of other cycles..and on and on and on - in other words, it would be a very BAD way to find all of your circles of friends.
Re: OT(ish) - Best Search Algorithm
by bart (Canon) on Oct 15, 2007 at 11:07 UTC
    The keywords to look for, IMO, are cycle detection. More specifically, you're looking for cycles that are not trivial, thus: longer than 3 nodes.

    HTH.

      I agree, definitely cycle detection. Dijkstra's Algorithm and A* are for finding the shortest path from any given node to all nodes in the graph. However, let's look at the problem description more closely.

      For example, if I am "A", and I have two friends "B" and "D", and "C" is not a friend of mine, but is a friend of "B" and "D", then a valid circle would be: A->B->C->D->A (as an extra note, this circle would not be valid if "B" and "D" are friends, since "A" would then be superfluous to the circle).

      You want to find cycles in a connected graph, to put it in the terminology that will give you most hits when googling. So you have here a cycle of four nodes (assuming friendship is mutual, i.e. the graph is undirected):

      A --- B | | | | | | D --- C

      And this would be a cycle you want, provided that there no edges between A and C or B and D:

      A----B | / | | / | |/ | D----C

      So, you want to find the shortest cycle that includes a given node, if such a cycle exists. (Sorry for being verbose.)

      Detecting cycles is easier than finding the shortest one. Since a cycle is simply a path that begins from the same node where it ends, you could just find the shortest paths to all nodes from a given node, say A, then for all nodes reachable from A, find a path back to A that does not include any of the nodes on the shortest paths. If you find such a path for node B, you have a cycle from A to B and back to A.

      For example, supposing you have a pathfinding algorithm:

      for my $node ($graph->nodes) { my @paths = $graph->find_shortest_paths_starting_from($node); # Now, suppose @paths is an array containing arrays of # nodes that are the actual nodes on the shortest paths, # with the first being $node and the last being the end of # the path. For instance, [ $node, $other_node, ..., $end_node ] # for the shortest path from $node to $end_node. # Next, for each endpoint, try to find a path back, but # remove the nodes from consideration that are on the # known shortest path. This is to prevent the algorithm # from re-discovering the shortest path, and will ensure # that the two possible found paths are mutually exclusive, # i.e. do not contain same nodes. (Otherwise you would have # an even shorter cycle... could this be used for our benefit?) for my $path (@paths) { my $new_graph = $graph->copy; # Slice away the first and last nodes, as they are still of # interest to us. $new_graph->remove_nodes(@$path[1 .. $#{@$path}-1]); my @paths_back = $new_graph->find_shortest_paths_starting_from($pa +th->[-1]); for my $path_back (@paths_back) { if ($path_back->[-1] eq $node) { print "We have a cycle involving $node!\n"; } } } }

      Caveat: I don't know much about graph theory, so it's up to you to figure out or prove if this really finds cycles and if it does, if those are the shortest ones. The running time of this naive approach is probably also horrible. But you should be able to get some results.

      Whether that is the shortest is harder to find out.

      --
      print "Just Another Perl Adept\n";

Re: OT(ish) - Best Search Algorithm
by Gavin (Canon) on Oct 15, 2007 at 11:13 UTC
    A look at the Prolog modules should bring a suitable solution.
    zaxo describes a similar scenario Perl and Prolog with a procedural sematics solution.
Re: OT(ish) - Best Search Algorithm
by perlfan (Curate) on Oct 15, 2007 at 15:08 UTC
    If you want to find all "circles of friends," i.e., all cycles to which you are a member, then you need to use a modified depth first traversal algorithm that begins and ends with you.

    It is similar to finding all acyclic s-d paths, except in your case s and d (source and destination) are the same - you.

    The algorithm is fast, especially if you don't have many friends - like most perl hackers :).
    # Pseudo-code array dflabels[] for each node I in graph dflables[I] = 0 s := 0 d := 0 # same as "s", which is you # as recursive function acyclic (start_node) if 1 > dflabels[start_node]# max num visits is 1 dflabels[start_node]++ for each i in adjacent_nodes(start_node) acyclic(i) end for dflabels[start_node]-- # unlike strict dft decrement else # indicates we've been here before if start_node == s # check if this is YOU print "Cycle back to me found!" end if end if return end function acyclic # initiate call acyclic(s)
    Of course, the above pseudo-code doesn't facilitate storing the actual paths, this is easily done using input and output parameters. Hope that helps.
Re: OT(ish) - Best Search Algorithm
by blokhead (Monsignor) on Oct 15, 2007 at 15:28 UTC
    Here is a simple approach:

    Say your starting vertex is A. First, find the biconnected component containing A. Biconnectivity means that there is no single vertex whose removal would disconnect the graph. But a consequence of this condition is what we're really looking for:

    • Two vertices are in the same biconnected component if and only if there is a cycle in the graph that visits both.

    Now that you have A's biconnected component, there are a few cases:

    • The component contains just the single vertex A. This means A has 0 or 1 friends, and a cycle of your definition is not possible.
    • The component has more than just A, but A is immediate friends with everyone in the component. Then all of A's non-immediate friends are outside of this component and therefore do not share a common cycle with A. This means that there is no cycle of your definition (you insist that the cycle visit a non-immediate-friend of A).
    • Otherwise, there is a non-immediate friend of A within the component, say B. By the definition of biconnectivity, there is a cycle that visits both A and B. You can find it by any sort of search (breadth-first or depth-first) from A. Just find two vertex disjoint paths from A to B. You can restrict your search to within this biconnected component for efficiency (the cycle connecting the two vertices must stay within the biconnected component).

    Sorry this is at such a high level with no code ;), but I think you should have no problem, if you were already considering implementing A* search on your own. Check out Graph.pm, it has a method for computing biconnected components, and that will be the main step in this algorithm.

    Now, if you are interested in finding the smallest such cycle, I'm not sure exactly how to do it. You might want to find the non-immediate-friend of A who is closest to A, and use him as the basis of your cycle. But I don't think this is guaranteed to give the shortest cycle overall. But it would at least be a reasonable heuristic.

    blokhead

Re: OT(ish) - Best Search Algorithm
by Melly (Hermit) on Oct 15, 2007 at 15:48 UTC

    Wow. It's a while since I visited perlmonks, and now I'm wondering why. "Cycle detection" it is.

    Needless to say, I haven't been slacking, and I quickly realised that my problem is not really about finding shortest paths in a known graph, but is more about building the graph and then looking for any valid path.

    I think the approach I'm going to take is this (mainly via sql via perl):

    • Pick 1st friend (n)
    • Get list of friends that I know but n doesn't (these will be the target end-points)
    • Put mutual friends of me and n in a temporary ignore table
    • Check n's friends for a match with target list (will always fail on first iteration)
    • Store n->n's friends in a temporary table with level 1 code since we're at the first iteration
    • Now check n's friends friends for a match with my target list, storing them as level 2
    • Don't bother checking anyone who is already in the table (i.e. avoid going round in circles)
    • Keep going like this until we either find one of our targets or hit the max iterations (e.g. no more than 10 hops)
    • If a match is found, use the table to work out the route
    • If no match was found, but we ran out of iterations, move to next friend and start again, but ignore anyone we checked via the first friend

    Of course, this could get pretty silly - if, on average, everyone has about 15 friends, then I'm going to be checking a lot of pathways - that guy with the chessboard and the rice ain't got nothing on me...

    map{$a=1-$_/10;map{$d=$a;$e=$b=$_/20-2;map{($d,$e)=(2*$d*$e+$a,$e**2 -$d**2+$b);$c=$d**2+$e**2>4?$d=8:_}1..50;print$c}0..59;print$/}0..20
    Tom Melly, pm (at) cursingmaggot (stop) co (stop) uk
      >Wow. It's a while since I visited perlmonks, and now I'm wondering why.

      haha - yeah...there are many monks who lurk around for a shot at problems just like this :)

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://644868]
Approved by Corion
Front-paged by Old_Gray_Bear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (12)
As of 2014-10-31 20:12 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    For retirement, I am banking on:










    Results (223 votes), past polls