I know that title is a little deceiving, but then again, after you see what I want to be able to do, it might not be all that deceiving...
I'm having a problem with what I thought was a small project. I have a list of individual names and company names, all together in one long list. Nothing signifies whether an entry is a person's name or a company name. I have to compare it against another list and find possible conflicts. Anyone who has done this can imagine the possibilities, so I'll just list a few here...
A search for "Aetna Insurance Company" should match...
Aetna Insurance Company
Aetna Ins. Co.
Aetna Insurance Co.
Aetna Ins. Company
Aetna Ins Company
Aetna Insurance Co
Aetna Ins Co
That's only a small sample of the variations I've come up with...
A search for "Sam Jones" should bring up...
Sam J. Jones
Sam J Jones
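To make the requirement concrete, here is a rough sketch of the kind of normalization I have in mind: reduce every entry to a comparison key and compare the keys instead of the raw strings. The `%ABBREV` table and the `match_key` name are placeholders I made up for illustration, and dropping every single-letter word is an assumption that would wrongly mangle a name like "A B C Corp".

```perl
use strict;
use warnings;

# Hypothetical abbreviation table; a real one would need many more entries.
my %ABBREV = (
    ins => 'insurance',
    co  => 'company',
);

# Reduce a name to a comparison key: lowercase it, strip punctuation,
# drop single-letter words (assumed to be initials), expand abbreviations.
sub match_key {
    my ($name) = @_;
    my $s = lc $name;
    $s =~ s/[^a-z0-9\s]//g;                          # "Ins." -> "ins", "J." -> "j"
    my @words = grep { length($_) > 1 } split ' ', $s;
    return join ' ', map { $ABBREV{$_} // $_ } @words;
}

print match_key('Aetna Ins. Co.'), "\n";   # aetna insurance company
print match_key('Sam J. Jones'), "\n";     # sam jones
```

With something like this, all seven Aetna variants above collapse to the same key, and both Sam Jones variants collapse to "sam jones". It only handles the cases I have thought of, though, which is why I'm asking about something fuzzier.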
This is why I put "fuzzy logic" in the title. Is there any kind of Perl module out there that can do this? Even a bunch of modules that each handle part of it would be good. One in particular my boss brought up is Soundex. However, I'm finding that Soundex is really only good for misspellings like Smith and Smyth.
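To show what I mean about Soundex, here is a small hand-rolled version of the standard American Soundex rules. It is my own simplified sketch, not the code from any module, and `soundex_code` is just a name I picked. It codes one word at a time, which is how I have been applying it.

```perl
use strict;
use warnings;

# Letter-to-digit table for American Soundex; vowels, H, W and Y have no code.
my %CODE;
@CODE{qw(B F P V)}         = (1) x 4;
@CODE{qw(C G J K Q S X Z)} = (2) x 8;
@CODE{qw(D T)}             = (3) x 2;
$CODE{L}                   = 4;
@CODE{qw(M N)}             = (5) x 2;
$CODE{R}                   = 6;

# Return the 4-character Soundex code for a single word.
sub soundex_code {
    my ($word) = @_;
    my $w = uc $word;
    $w =~ s/[^A-Z]//g;
    return '' unless length $w;

    my @letters = split //, $w;
    my $first   = shift @letters;
    my $out     = $first;
    my $prev    = $CODE{$first} // 0;

    for my $ch (@letters) {
        next if $ch eq 'H' or $ch eq 'W';   # H and W do not separate equal codes
        my $c = $CODE{$ch} // 0;            # vowels get 0 and do separate them
        $out .= $c if $c && $c != $prev;
        $prev = $c;
    }
    return substr($out . '000', 0, 4);      # pad or truncate to 4 characters
}

printf "%-10s %s\n", $_, soundex_code($_) for qw(Smith Smyth Insurance Ins);
# Smith      S530
# Smyth      S530
# Insurance  I526
# Ins        I520
```

Smith and Smyth get the same code, but "Insurance" and "Ins" do not, so Soundex on its own does nothing for the abbreviation problem.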
I can't imagine this is an easy question to answer, and I'm sure there's not going to be one all-encompassing module to do this. I'm just hoping for some pointers and maybe some modules that could do bits and pieces of this.
Thanks in advance Monks...