Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

Re: De Duping Street Addresses Fuzzily

by Crian (Chaplain)
on Feb 01, 2005 at 12:19 UTC ( #426869=note: print w/ replies, xml ) Need Help??


in reply to De Duping Street Addresses Fuzzily

We are doing something simular in our company (for addresses is germany) as one small part of our jobs. Therefore the addresses get plausibilised (if I can say this in this way(?)) with some kind of technic simulare to the technics described in the posts above.

If you want accurate results, don't think it's fast done. Perhaps it could be cheaper to buy a complete solution.

Data that comes from humans contains all kind of errors and rubish you can and can not think of. Specially if it's a large mass of data (such as 50000000 addresses or else), there will be almost everything in your data - be aware.


Comment on Re: De Duping Street Addresses Fuzzily

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://426869]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (14)
As of 2014-07-22 15:50 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (117 votes), past polls