PerlMonks  

Re: similar texts !?

by Albannach (Monsignor)
on Jul 12, 2003 at 13:35 UTC ( [id://273624]=note )


in reply to similar texts !?

Here's another vote for Text::Levenshtein, which I have found very handy for comparing strings (mostly for detecting data entry errors), especially strings that mix letters and numbers, though I too wish I could get the XS version working.
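As a rough illustration of the data-entry case, here is a minimal sketch using the `distance` function exported by Text::Levenshtein. The part numbers are made up for the example, and picking the lowest-distance candidate is only one possible way to use the scores.

```perl
use strict;
use warnings;
use Text::Levenshtein qw(distance);

# Hypothetical data: a list of known part numbers and one keyed-in value.
my @known = qw(AB1234X AB1334X CD9876Y);
my $typed = 'AB1243X';

# Given a list of candidates, distance() returns one edit distance per candidate.
my @dist = distance($typed, @known);

# Pick the index of the candidate with the smallest distance.
my ($best) = sort { $dist[$a] <=> $dist[$b] } 0 .. $#known;

printf "%s is closest to %s (distance %d)\n",
    $typed, $known[$best], $dist[$best];
```

Note that plain Levenshtein counts a transposed pair of characters as two edits, so a swapped "34"/"43" scores 2 rather than 1.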

I'd also like to point out Text::Metaphone as a soundex on steroids, as I've found soundex to be too insensitive at times. Note, however, that Metaphone ignores everything but letters, which may limit its usefulness to you.
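A quick way to see what Metaphone makes of your own data is to print the codes side by side. This is only a sketch: the names below are invented, and I have deliberately not listed the output, so run it against your installed version to see how the digits are handled.

```perl
use strict;
use warnings;
use Text::Metaphone;    # exports Metaphone() by default

# Hypothetical inputs: two spellings of one name, plus two names that
# differ only in their trailing digits.
my @names = ('Knopfler', 'Nopfler', 'Track01', 'Track02');

for my $name (@names) {
    # Metaphone() returns the phonetic code for a single word.
    printf "%-10s => %s\n", $name, Metaphone($name);
}
```

If the last two lines print the same code, Metaphone alone cannot tell those names apart, and you would need a second comparison (such as edit distance) on the raw strings.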

I think BrowserUk points out a serious problem in the case of MP3 files. However, most filenames I've seen use fairly standard separators between "fields", so you could split each name into fields, compare the fields of two MP3 names in all possible pairings, and take the best-scoring set of pairings as the most likely match. This will of course be much slower than comparing entire names, but there are probably only 3 or 4 fields per name, so you shouldn't be looking at run times greater than the lifetime of the universe either.
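The field-pairing idea might look something like the sketch below. Everything in it is an assumption made for illustration: the separators (hyphens and underscores), the helper names `fields`, `best_pairing` and `name_distance`, the padding of the shorter name with empty fields, and the use of summed edit distance as the score. The small `edit_distance` sub is a plain-Perl stand-in so the example has no CPAN dependency; `distance` from Text::Levenshtein could be used instead.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Plain dynamic-programming edit distance (stand-in for Text::Levenshtein).
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min($prev[$j] + 1, $cur[$j - 1] + 1, $prev[$j - 1] + $cost);
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Split a filename into lower-cased fields. Assumes '-' and '_' separate fields.
sub fields {
    my ($name) = @_;
    $name =~ s/\.mp3\z//i;
    return grep { length } map { lc } split /\s*[-_]+\s*/, $name;
}

# Try every pairing of left fields with right fields (both lists must be the
# same length) and return the lowest total edit distance found.
sub best_pairing {
    my ($left, $right) = @_;
    return 0 unless @$left;
    my ($first, @rest) = @$left;
    my $best;
    for my $i (0 .. $#$right) {
        my @others = @$right;
        my ($partner) = splice @others, $i, 1;
        my $total = edit_distance($first, $partner) + best_pairing(\@rest, \@others);
        $best = $total if !defined $best || $total < $best;
    }
    return $best;
}

# Score two filenames; the shorter field list is padded with empty fields.
sub name_distance {
    my @x = fields($_[0]);
    my @y = fields($_[1]);
    push @x, '' while @x < @y;
    push @y, '' while @y < @x;
    return best_pairing(\@x, \@y);
}

# Same fields in a different order and with a different separator.
print name_distance('Pink Floyd - Money.mp3', 'money_pink floyd.mp3'), "\n";   # prints 0
```

With n fields per name this tries n! pairings, so 3 or 4 fields means 6 or 24 sets of comparisons per pair of files: slower than one whole-name comparison, but nowhere near universe-lifetime territory.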

--
I'd like to be able to assign to an luser
