Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Re: A Better Word Morph Builder

by Solo (Deacon)
on Jun 29, 2006 at 19:44 UTC ( #558426=note: print w/replies, xml ) Need Help??

in reply to A Better Word Morph Builder

Well, it certainly takes a while to build the db the first time... (over 15 mins so far... maybe LevenshteinXS isn't quite right on Win?) Here's my offing that was written for clarity, rather than efficiency. Then again, maybe I'll say it's optimized for startup time, rather than solution time... who likes those long 'loading...' screens, anyway?

Update:I should have realized that I'm using the yawl-0.3.2 word list, and the time to build the db is probably affected by that!

use strict; use warnings; use List::Compare; use Benchmark qw( cmpthese ); my $lword = shift || 'love'; my $rword = shift || 'hate'; die unless length($rword) == length($lword); my $length = length($lword); my $wordlist; open(FH,'word.list') or die; while(<FH>) { chomp; undef($wordlist->{lc $_}) if length($_) == $length; } print solo($lword,$rword); #cmpthese( ... ); sub solo { my ( $rdepth,$ldepth,$rex,$lex ) = (0,0,1,1); my $left = { $_[0] => [$_[0]] }; my $right = { $_[1] => [$_[1]] }; while ( 1 ) { # compare the intersection of the leaf nodes my $lc = List::Compare->new( { lists => [ [keys %$left], [keys %$right] ], accelerated => 1, unsorted => 1, } ); my @int = $lc->get_intersection(); if ( @int ) { my $solve = shift @int; pop @{$right->{$solve}}; return join("\n", @{$left->{$solve}}, reverse @{$right->{$ +solve}}) . "\n"; } # pick a side to expand the search in if ( $lex && $rdepth > $ldepth ) { $lex = expandTree($left,++$ldepth); } elsif ( $rex ) { $rex = expandTree($right,++$rdepth); } else { return "No solution."; } } } sub expandTree { my ($tree, $depth) = @_; my ($word, $path, $expanded); while ( ($word, $path) = each %$tree ) { if ( @$path == $depth ) { for my $i (0..(length($word)-1) ) { for my $letter ( 'a'..'z' ) { my $try = $word; substr($try,$i,1) = $letter; next if exists $tree->{$try}; if ( exists $wordlist->{$try} ) { $tree->{$try} = [ @{$tree->{$word}}, $try ]; $expanded = 1; } } } } } return $expanded; }


You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.

Replies are listed 'Best First'.
Re^2: A Better Word Morph Builder
by Ieronim (Friar) on Jun 29, 2006 at 20:32 UTC
    I have benchmarked your method. If i am not mistaken, is's another example of bidirectional search, isn't it?

    Benchmark results:

    Rate find_path solo find_path2 transform find_path 5.91/s -- -62% -75% -97% solo 15.7/s 166% -- -35% -92% find_path2 24.1/s 308% 53% -- -87% transform 188/s 3084% 1097% 681% --
    Note: Your solution seems to be suitable only for basic English charset:
    for my $letter ( 'a'..'z' ) {
    And what about the Unicode? ;)
      Thanks for the benchmark! You're right, it is bi-directional. And you caught me, 'a'..'z' was a precomputed optimization based on the set of characters found in my wordlist. ;) Does 2of12 include additional characters?


      You said you wanted to be around when I made a mistake; well, this could be it, sweetheart.
        2of12 includes only [a-z'0-9-]. But as i am Russian, i wanted the algorithm to work with any language ;) And it does, but takes a HUGE amount of time to process my 1Mb russian wordlist :(
Re^2: A Better Word Morph Builder
by Limbic~Region (Chancellor) on Jun 30, 2006 at 18:59 UTC
    Thanks! I also benchmarked your solution. You can see how you did here.

    Cheers - L~R

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://558426]
[james28909]: well, i can stream from ffmpeg to mpc-hc with udp:// port. guess i just need to grab these packets and slap a rtp header on it and encrypt it with linsodium and send it.
[james28909]: in the rtp header, there is a nonce, i can just make this any number and increment it per packet?

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (5)
As of 2017-05-23 05:17 GMT
Find Nodes?
    Voting Booth?