Well, I'd take a look at Lingua-EN-MatchNames, first off.
from the description :
You have two databases of person records that need to be synchronized or matched up,
but they use different keys--maybe one uses SSN and the other uses employee id.
The only fields you have to match on are first and last name.
That's what this module is for.
Just feed the first and last names to the C<name_eq()> function, and it returns
C<undef> for no possible match, and a percentage of certainty (rank) otherwise.
The ranking system isn't very scientific, and gender isn't considered, though
it probably should be.
It's got some good examples, and you can set up "fuzzy searching", so you can match, e.g. "G. W. Bush, Jr" to "George Bush Jr" with a degree of confidence.
and if that doesn't work, I'd try Text::soundex or just grepping through the strings.
Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
Read Where should I post X? if you're not absolutely sure you're posting in the right place.
Please read these before you post! —
Posts may use any of the Perl Monks Approved HTML tags:
- a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
|
For: |
|
Use: |
| & | | & |
| < | | < |
| > | | > |
| [ | | [ |
| ] | | ] |
Link using PerlMonks shortcuts! What shortcuts can I use for linking?
See Writeup Formatting Tips and other pages linked from there for more info.
|
|