Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re: Junk NOT words

by pizza_milkshake (Monk)
on Nov 01, 2002 at 20:37 UTC ( #209839=note: print w/ replies, xml ) Need Help??


in reply to Junk NOT words

this seems to do the trick

#!/usr/bin/perl -l # file storing all legitimate words, one per line my $DICT = qq(/usr/share/dict/words); # terms to test for validity my @TEST = qw( pizzamilkshake perlmonks pearlmonks hellothere heytherebaby hey123 timexyz whereangelsare ); # build dict hash my %WORDS; open DICT, "< $DICT" or die $!; while (<DICT>){ chomp; # chop off whitespace if it's there $WORDS{uc $_}++; # force UC, add key to %WORDS } close DICT; my (@subs, $giveup, $p1, $p2, $sub); for my $test (@TEST){ $test = uc $test; # force word to UC at the beginning @subs = (); # reset sub-match array $giveup = $p1 = $p2 = 0; # not giving up, starting at the start while ( !$giveup && # else we've thrown in the towel $p1 < length($test) # else found match ){ $p1++; $sub = substr $test, $p2, $p1-$p2; # grab next substring #print STDERR "sub: $sub"; if ($WORDS{$sub}){ # if it matches a legal word... #print STDERR "MATCH ($sub) in $test"; push @subs, [ $p1, $p2 ]; # successful path, save it $p2 = $p1; # advance p2 to the end of the current match } elsif ($p1 >= length($test)){ # at the end of the string wit +h no match # if the entire string doesn't match a word or we have now +here to # backtrack... if ($p2 == 0 || @subs == 0){ #print STDERR "giving up on $test"; $giveup++; # nowhere to go } else { #print STDERR "backtracking..."; # reset p1 and p2 to last state and try to get a longe +r match ($p1, $p2) = @{$subs[$#subs]}; pop @subs; # delete last item... it's path is a dead e +nd } } } print ("$test: " . ($giveup ? "NO" : "YES")); }
perl -MLWP::Simple -e'getprint "http://parseerror.com/p"' |less


Comment on Re: Junk NOT words
Download Code
Replies are listed 'Best First'.
Re: Re: Junk NOT words
by pizza_milkshake (Monk) on Nov 02, 2002 at 06:56 UTC
    i've put an updated copy of this script at:

    http://www.parseerror.com/perl/nonsense-test.txt

    i've added some simple optimizations to noticeable speed up performance on longer examples. perl -e'$_=q#: 13_2: 12/"{>: 8_4) (_4: 6/2"-2; 3;-2"\2: 5/7\_/\7: 12m m::#;s#:#\n#g;s#(\D)(\d+)#$1x$2#ge;print'

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://209839]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others perusing the Monastery: (7)
As of 2015-07-31 04:01 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (274 votes), past polls