PerlMonks  

Re: Extracting structured data from unstructured text - just how difficult would this be?

by moklevat (Priest)
on Feb 21, 2008 at 16:18 UTC (#669300)


in reply to Extracting structured data from unstructured text - just how difficult would this be?

In response to your question, I'm going to say "quite difficult", or at least very time-consuming. On the other hand, if the point is to get the work done, then I think Amazon has already created the system you are looking for with Mechanical Turk.


Re^2: Extracting structured data from unstructured text - just how difficult would this be?
by clinton (Priest) on Feb 21, 2008 at 16:23 UTC
    That may just be a brilliant solution. Good thinking, Batman!

    The only downside is that we have to verify their work, which may be almost as time-consuming.

      Perhaps you could set your system up to have duplicate data entry, and then diff the duplicate entries to flag potential problems. Alternatively, you could set up a second Turk task to compare and verify entries.
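
      The duplicate-entry idea could be sketched roughly like this. It is only a minimal illustration in Python, not anything Mechanical Turk provides: the record layout (a dict of field name to value, keyed by a record id), the sample data, and the names normalize and flag_mismatches are all assumptions made up for the example.

```python
def normalize(value):
    """Collapse whitespace and case so trivial typing differences are not flagged."""
    if value is None:
        return ""
    return " ".join(str(value).split()).lower()


def flag_mismatches(entry_a, entry_b):
    """Compare two independent data-entry passes over the same records.

    Both arguments are hypothetical {record_id: {field: value}} dicts.
    Returns {record_id: [fields that differ]} for records needing review.
    """
    flagged = {}
    for record_id in sorted(set(entry_a) | set(entry_b)):
        a = entry_a.get(record_id)
        b = entry_b.get(record_id)
        if a is None or b is None:
            # Only one worker entered this record, so it cannot be cross-checked.
            flagged[record_id] = ["<missing in one pass>"]
            continue
        differing = sorted(
            field
            for field in set(a) | set(b)
            if normalize(a.get(field)) != normalize(b.get(field))
        )
        if differing:
            flagged[record_id] = differing
    return flagged


# Made-up sample data: two workers keyed in the same source documents.
pass_one = {
    "rec1": {"name": "Acme Corp", "phone": "555-0100"},
    "rec2": {"name": "Widget Ltd", "phone": "555-0199"},
    "rec3": {"name": "Foo Inc", "phone": "555-0123"},
}
pass_two = {
    "rec1": {"name": "acme corp ", "phone": "555-0100"},
    "rec2": {"name": "Widget Ltd", "phone": "555-0190"},
}

for record_id, fields in flag_mismatches(pass_one, pass_two).items():
    print(record_id, "needs review:", ", ".join(fields))
```

      Only the flagged records would then need a human look (or go to the second verification task), so the checking cost scales with the disagreement rate rather than with the total number of entries. Two workers can of course make the same mistake, so agreement is evidence of correctness, not proof.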
