Re: Junk NOT words

by Anonymous Monk
on Nov 01, 2002 at 04:58 UTC


in reply to Junk NOT words

How about indexing the dictionary file and then working your way through the string character by character, matching against the index? When the next letter leads to no further branches in the index, take what has matched so far as a word. If the next word then runs into a dead end, go back and try the previous word again minus one character.
Example:
(excuse me for not actually checking against a dictionary for obscure words)
w>h>e>r>e>r
r=dead end
"where" removed
a>n>g>e>l>s>a
a=dead end
"angels" removed
a>r>e>a>l
l = dead end
"area" removed
l>l>t>h
not making sense, backup
a>r>e>|
stop at "a"
"are" removed
a>l>l>t
t=dead end
"all" removed
ugh... hope you get the idea.
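
For what it's worth, here is a rough sketch of the idea above in Python. It assumes the "index" is a trie built from nested dicts, and the word list is a tiny made-up one, not a real dictionary file. The names build_trie, segment and _END are only for illustration.

```python
_END = object()  # end-of-word marker; can never collide with a character key

def build_trie(words):
    """Index the word list as a trie of nested dicts."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True
    return root

def segment(text, trie, start=0):
    """Split text[start:] into dictionary words, or return None if impossible."""
    if start == len(text):
        return []
    node = trie
    ends = []  # every position where a dictionary word ends
    for i in range(start, len(text)):
        node = node.get(text[i])
        if node is None:  # dead end: no further branches in the index
            break
        if _END in node:
            ends.append(i + 1)
    # Try the longest word first; on a dead end later on, back up to a shorter one.
    for end in reversed(ends):
        rest = segment(text, trie, end)
        if rest is not None:
            return [text[start:end]] + rest
    return None

# Hypothetical word list, just enough to follow the walk-through.
trie = build_trie(["where", "angels", "angel", "are", "area", "all"])
print(segment("whereangelsareall", trie))  # ['where', 'angels', 'are', 'all']
```

Note how "area" is tried first and abandoned once "ll" goes nowhere, after which "are" and "all" fit.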


Re: Re: Junk NOT words
by Anonymous Monk on Nov 01, 2002 at 16:26 UTC
    The problem with this algorithm is that it has VERY bad worst-case performance: it is O(2^n), where n is the length of the string. As the strings get longer, exhaustive backtracking quickly becomes impractical. Some sort of heuristic is needed.
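
    To make the worst case concrete, here is a small Python sketch that counts the recursive calls made by a plain longest-first backtracking search. The input is a contrived assumption: a word list holding every run of "a" up to length n, and a string of n "a"s followed by a "b" that can never match. The name count_calls is made up for this example.

```python
def count_calls(text, words, start=0):
    """Return (segmentation or None, number of recursive calls made)."""
    if start == len(text):
        return [], 1
    calls = 1
    # Longest candidate first, backing up to shorter ones on failure.
    for end in range(len(text), start, -1):
        if text[start:end] in words:
            rest, sub = count_calls(text, words, end)
            calls += sub
            if rest is not None:
                return [text[start:end]] + rest, calls
    return None, calls

for n in (4, 8, 12, 16):
    words = {"a" * k for k in range(1, n + 1)}  # contrived word list
    result, calls = count_calls("a" * n + "b", words)
    print(n, result, calls)  # calls comes out as 2**n, result is None
```

    Every extra "a" doubles the number of calls, because each position is explored again for every way of reaching it.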
