Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

Re: Levenshtein distance: calculating similarity of strings

by hossman (Prior)
on Mar 25, 2003 at 01:57 UTC ( [id://245592]=note: print w/replies, xml ) Need Help??


in reply to Levenshtein distance: calculating similarity of strings

In addition to Text::Levenshtein (whose docs look short, but sufficient to me) Jarkko has a great module that does the same thing, but with a lot more fine control: String::Approx.

It provides an LRU cache for what he calls the "matching engines" of a pattern (so that testing the distance between $string1 and $string3 doesn't incur a bunch of overhead you allready spent on calculating hte distance between $string1 and $string2), it provides lots of usefull methods that utilize the Levenshtein Distance to do splicing, slicing, or finding the location of substring ... and he provides modifiers that let you say things like "allow only inserts or substitutions of characters, but no deleters" or tell me if these strings are 90% similar"

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://245592]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others lurking in the Monastery: (3)
As of 2024-03-29 06:24 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found