In addition to Text::Levenshtein (whose docs look short, but sufficient to me) Jarkko has a great module that does the same thing, but with a lot more fine control: String::Approx.
It provides an LRU cache for what he calls the "matching engines" of a pattern (so that testing the distance between $string1 and $string3 doesn't incur a bunch of overhead you allready spent on calculating hte distance between $string1 and $string2), it provides lots of usefull methods that utilize the Levenshtein Distance to do splicing, slicing, or finding the location of substring ... and he provides modifiers that let you say things like "allow only inserts or substitutions of characters, but no deleters" or tell me if these strings are 90% similar"
-
Are you posting in the right place? Check out Where do I post X? to know for sure.
-
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big>
<blockquote> <br /> <dd>
<dl> <dt> <em> <font>
<h1> <h2> <h3> <h4>
<h5> <h6> <hr /> <i>
<li> <nbsp> <ol> <p>
<small> <strike> <strong>
<sub> <sup> <table>
<td> <th> <tr> <tt>
<u> <ul>
-
Snippets of code should be wrapped in
<code> tags not
<pre> tags. In fact, <pre>
tags should generally be avoided. If they must
be used, extreme care should be
taken to ensure that their contents do not
have long lines (<70 chars), in order to prevent
horizontal scrolling (and possible janitor
intervention).
-
Want more info? How to link
or How to display code and escape characters
are good places to start.
|