
Re: abbreviation checking

by dree (Monsignor)
on Dec 02, 2002 at 20:34 UTC (#217047)


in reply to abbreviation checking

Metaphone and Double Metaphone are better than Soundex.

If you have to compare phrases, there is Text::PhraseDistance.


Re: Re: abbreviation checking
by Anonymous Monk on Dec 02, 2002 at 22:05 UTC
    While making an MP3-renaming script, which attacks a problem similar to yours, I used a combination of Metaphone and "distance" modules. My approach:
    1. Get a list of "known-good" words. I use already-verified MP3 filenames as a source of these.
    2. Calculate their Metaphones.
    3. Calculate the Metaphone of any new words and look for matches. If none, see if there are any matches with a distance of 1 or 2. Distances larger than 2 produce too many matches.
    4. Have the user confirm the 'corrections'.
    It's not an exact science, and human intervention is unavoidable if correctness matters.
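
    A rough sketch of steps 1 to 3, in Python so that it runs with no extra modules. Everything named here is an assumption made for illustration: `phonetic_key` is a crude consonant-skeleton stand-in and NOT the real Metaphone algorithm, `edit_distance` is plain Levenshtein distance, and the "distance" from step 3 is assumed to be measured between the phonetic keys rather than the words themselves. Step 4 (user confirmation) is left to the caller.

```python
def phonetic_key(word):
    """Crude stand-in for Metaphone (hypothetical helper, not the real algorithm).

    Keeps the first letter, drops later vowels plus H, W and Y,
    and collapses runs of the same consonant.
    """
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    key = [letters[0]]
    for c in letters[1:]:
        if c in "AEIOUYHW":
            continue
        if c != key[-1]:
            key.append(c)
    return "".join(key)


def edit_distance(a, b):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def suggest(word, known_good, max_distance=2):
    """Return known-good words that could be corrections for `word`.

    Exact phonetic-key matches win; otherwise fall back to keys
    within `max_distance` edits, closest first.
    """
    # Steps 1 and 2: index the known-good words by their phonetic key.
    index = {}
    for good in known_good:
        index.setdefault(phonetic_key(good), []).append(good)

    # Step 3: look for an exact key match first.
    key = phonetic_key(word)
    if key in index:
        return list(index[key])

    # No exact match: accept keys at a small edit distance.
    scored = []
    for other_key, words in index.items():
        d = edit_distance(key, other_key)
        if d <= max_distance:
            scored.extend((d, w) for w in words)
    return [w for _, w in sorted(scored)]


known = ["Beatles", "Metallica", "Nirvana"]
print(suggest("Beetles", known))
print(suggest("Nervada", known))
print(suggest("Zappa", known))
```

    With a real Metaphone implementation in place of `phonetic_key` the keys would differ, but the lookup logic stays the same. The returned list is only a set of candidates; they still have to be shown to the user for confirmation.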
