Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Re: A Better Word Morph Builder

by Solo (Deacon)
on Jun 29, 2006 at 19:44 UTC ( [id://558426]=note: print w/replies, xml ) Need Help??


in reply to A Better Word Morph Builder

Well, it certainly takes a while to build the db the first time... (over 15 mins so far... maybe LevenshteinXS isn't quite right on Win?) Here's my offing that was written for clarity, rather than efficiency. Then again, maybe I'll say it's optimized for startup time, rather than solution time... who likes those long 'loading...' screens, anyway?

Update:I should have realized that I'm using the yawl-0.3.2 word list, and the time to build the db is probably affected by that!

use strict; use warnings; use List::Compare; use Benchmark qw( cmpthese ); my $lword = shift || 'love'; my $rword = shift || 'hate'; die unless length($rword) == length($lword); my $length = length($lword); my $wordlist; open(FH,'word.list') or die; while(<FH>) { chomp; undef($wordlist->{lc $_}) if length($_) == $length; } print solo($lword,$rword); #cmpthese( ... ); sub solo { my ( $rdepth,$ldepth,$rex,$lex ) = (0,0,1,1); my $left = { $_[0] => [$_[0]] }; my $right = { $_[1] => [$_[1]] }; while ( 1 ) { # compare the intersection of the leaf nodes my $lc = List::Compare->new( { lists => [ [keys %$left], [keys %$right] ], accelerated => 1, unsorted => 1, } ); my @int = $lc->get_intersection(); if ( @int ) { my $solve = shift @int; pop @{$right->{$solve}}; return join("\n", @{$left->{$solve}}, reverse @{$right->{$ +solve}}) . "\n"; } # pick a side to expand the search in if ( $lex && $rdepth > $ldepth ) { $lex = expandTree($left,++$ldepth); } elsif ( $rex ) { $rex = expandTree($right,++$rdepth); } else { return "No solution."; } } } sub expandTree { my ($tree, $depth) = @_; my ($word, $path, $expanded); while ( ($word, $path) = each %$tree ) { if ( @$path == $depth ) { for my $i (0..(length($word)-1) ) { for my $letter ( 'a'..'z' ) { my $try = $word; substr($try,$i,1) = $letter; next if exists $tree->{$try}; if ( exists $wordlist->{$try} ) { $tree->{$try} = [ @{$tree->{$word}}, $try ]; $expanded = 1; } } } } } return $expanded; }

--Solo

--
You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

Replies are listed 'Best First'.
Re^2: A Better Word Morph Builder
by Ieronim (Friar) on Jun 29, 2006 at 20:32 UTC
    I have benchmarked your method. If i am not mistaken, is's another example of bidirectional search, isn't it?

    Benchmark results:

    Rate find_path solo find_path2 transform find_path 5.91/s -- -62% -75% -97% solo 15.7/s 166% -- -35% -92% find_path2 24.1/s 308% 53% -- -87% transform 188/s 3084% 1097% 681% --
    Note: Your solution seems to be suitable only for basic English charset:
    for my $letter ( 'a'..'z' ) {
    And what about the Unicode? ;)
      Thanks for the benchmark! You're right, it is bi-directional. And you caught me, 'a'..'z' was a precomputed optimization based on the set of characters found in my wordlist. ;) Does 2of12 include additional characters?

      --Solo

      --
      You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.
        2of12 includes only [a-z'0-9-]. But as i am Russian, i wanted the algorithm to work with any language ;) And it does, but takes a HUGE amount of time to process my 1Mb russian wordlist :(
Re^2: A Better Word Morph Builder
by Limbic~Region (Chancellor) on Jun 30, 2006 at 18:59 UTC
    Solo,
    Thanks! I also benchmarked your solution. You can see how you did here.

    Cheers - L~R

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://558426]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others taking refuge in the Monastery: (4)
As of 2024-03-19 08:30 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found