Words in Words
by sarchasm (Acolyte)
on Sep 30, 2011 at 17:51 UTC
sarchasm has asked for the
wisdom of the Perl Monks concerning the following question:
Good day! I'm trying to write a script that takes a list of words and checks to see if each word exists within another word in the list. The specifics are as follows:
1) The word list contains 640,000 entries
2) A word cannot match itself (i.e., "a" cannot match "a")
3) A word cannot match its own plural, even if the plural is a different word (i.e., "a" cannot match "as")
4) A word cannot match itself followed by an apostrophe-s "'s" (i.e., "a" cannot match "a's", but "a" can match "aa's")
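To make rules 2 through 4 concrete, they could be expressed as a small predicate along these lines. This is only a sketch: the sub name `is_valid_match` is hypothetical, and it assumes "plural" means the word with a single "s" appended, as in the example for rule 3.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $small counts as a "word in a word"
# of $big under rules 2-4 above, and 0 otherwise.
sub is_valid_match {
    my ($small, $big) = @_;
    return 0 if $big eq $small;              # rule 2: a word cannot match itself
    return 0 if $big eq $small . "s";        # rule 3: nor its simple plural
    return 0 if $big eq $small . "'s";       # rule 4: nor its possessive
    return index($big, $small) >= 0 ? 1 : 0; # otherwise a plain substring test
}
```

With this definition, `is_valid_match("a", "aa's")` is true, while `is_valid_match("a", "as")` and `is_valid_match("a", "a's")` are false.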
Coming from a mainframe background, I am trying to use loops, but the performance is horrible. From what I've read, hashing seems to be the way to go, but I think I am still implementing this as a loop and getting very poor performance (100 records in 20 seconds).
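For illustration only (this is not the code from the attempt mentioned in this post), one way a hash can replace the word-against-word loop is to enumerate the substrings of each word and look each one up in a hash of the whole list. The sketch below assumes the list fits in memory; the sub name `words_in_words` and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sub: takes the word list and returns a hash ref mapping
# each word to the list of other words from the list found inside it.
sub words_in_words {
    my @words = @_;
    my %in_list = map { $_ => 1 } @words;   # constant-time membership test

    my %found;
    for my $big (@words) {
        my $len = length $big;
        my %seen;                           # report each substring only once
        for my $start (0 .. $len - 1) {
            for my $size (1 .. $len - $start) {
                my $piece = substr($big, $start, $size);
                next if $seen{$piece}++;
                next unless $in_list{$piece};
                # rules 2-4: skip the word itself, its plural, its possessive
                next if $big eq $piece
                     || $big eq $piece . "s"
                     || $big eq $piece . "'s";
                push @{ $found{$big} }, $piece;
            }
        }
    }
    return \%found;
}
```

The work per word then depends on the square of that word's length rather than on the 640,000 entries in the list, which is the point of using the hash instead of comparing every word against every other word.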
Here is what I've tried so far:
Any suggestions would be greatly appreciated. Thank you!