Beefy Boxes and Bandwidth Generously Provided by pair Networks
Do you know where your variables are?
 
PerlMonks  

Re: Challenge: Find median string in a list

by blokhead (Monsignor)
on Jul 04, 2007 at 19:18 UTC ( [id://624930]=note: print w/replies, xml ) Need Help??


in reply to Challenge: Find median string in a list

This problem in graph theory is known as the graph center problem. So here's a cop-out solution using Graph.pm:
use strict; use warnings; use Text::LevenshteinXS; use Graph; my $data = lc qq[Lorem ipsum dolor sit amet, consectetuer adipiscing e +lit. Mauris vulputate nunc. Pellentesque pretium. Nam tortor. Vivamus sed + eros ut arcu consectetuer auctor. Nunc ultrices nisi. Phasellus congue se +m quis nulla. Sed consequat. Etiam consectetuer. Sed ultricies libero at ma +gna commodo nonummy. Sed quis arcu. Integer massa lectus, ultrices non, +faucibus eget, sagittis eget, lacus. ]; my @words = $data =~ m/\w+/g; my $g = Graph::Undirected->new; for my $i (0 .. $#words) { for my $j ($i+1 .. $#words) { $g->add_weighted_edge( @words[$i,$j], distance(@words[$i,$j]) ); } } my $apsp = $g->all_pairs_shortest_paths; my ($center, $radius) = (undef, 1_000_000); for (@words) { my $x = $apsp->vertex_eccentricity($_); ($center, $radius) = ($_, $x) if $x < $radius; } print "$center ($radius)\n";

I'm sure there should be a more efficient algorithm for graph center than computing all pairs shortest paths (although it makes a difference if the bottleneck is computing Levenshtein distance or dealing with the big graph), but this is a starting point. Especially interesting would be an algorithm that explored the graph in an "on-line" fashion to avoid precomputing all the pairwise distances.

BTW, Graph.pm does offer a center_vertices method for the APSP object, but it appears to be broken (at least in my version of the module).

blokhead

Replies are listed 'Best First'.
Re^2: Challenge: Find median string in a list
by Limbic~Region (Chancellor) on Jul 05, 2007 at 01:05 UTC
    blokhead,
    This solution took nearly 5 hours (compared to 0.28 seconds). I am not sure how well it would have done if center_vertices would have worked.

    Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://624930]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others chilling in the Monastery: (2)
As of 2024-04-20 03:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found