PerlMonks  

Re: Extracting structured data from unstructured text - just how difficult would this be?

by moklevat (Priest)
on Feb 21, 2008 at 16:18 UTC ( #669300=note )


in reply to Extracting structured data from unstructured text - just how difficult would this be?

In response to your question, I'm going to say "quite difficult", or at least very time-consuming. On the other hand, if the point is to get the work done, then I think Amazon has already created the system you are looking for: the Mechanical Turk.


Re^2: Extracting structured data from unstructured text - just how difficult would this be?
by clinton (Priest) on Feb 21, 2008 at 16:23 UTC
    That may just be a brilliant solution. Good thinking, Batman!

    The only downside is that we have to verify their work, which may be almost as time-consuming as doing the extraction ourselves.

      Perhaps you could set your system up for duplicate data entry, with each item entered independently by two workers, and then diff the two entries to flag potential problems. Alternatively, you could set up a second Turk task to compare and verify entries.
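
      The duplicate-entry check could be sketched roughly as below. This is only an illustration under assumptions: the record layout (a dict keyed by record id, each holding a dict of fields) and the names `normalize` and `flag_mismatches` are hypothetical, not part of any Mechanical Turk API, and the sample records are made up.

```python
def normalize(value):
    """Collapse whitespace and case so trivial typing differences are not flagged."""
    if value is None:
        return ""
    return " ".join(str(value).split()).casefold()


def flag_mismatches(entry_a, entry_b):
    """Compare two independent entry passes.

    Both arguments are assumed to map record id -> {field name: value}.
    Returns {record id: [field names that differ]}; a record present in
    only one pass is flagged with the marker "<missing record>".
    """
    flagged = {}
    for record_id in sorted(set(entry_a) | set(entry_b)):
        rec_a = entry_a.get(record_id)
        rec_b = entry_b.get(record_id)
        if rec_a is None or rec_b is None:
            flagged[record_id] = ["<missing record>"]
            continue
        fields = sorted(set(rec_a) | set(rec_b))
        diffs = [
            field for field in fields
            if normalize(rec_a.get(field)) != normalize(rec_b.get(field))
        ]
        if diffs:
            flagged[record_id] = diffs
    return flagged


# Made-up sample data: two workers entering the same three records.
first_pass = {
    "rec1": {"name": "Acme Ltd", "city": "London"},
    "rec2": {"name": "Widget Co", "city": "Leeds"},
    "rec3": {"name": "Foo Inc", "city": "York"},
}
second_pass = {
    "rec1": {"name": "acme  ltd", "city": "London"},
    "rec2": {"name": "Widget Co", "city": "Leads"},
}

for record_id, fields in flag_mismatches(first_pass, second_pass).items():
    print(record_id, "needs review:", ", ".join(fields))
```

      Only the flagged records would then need a human look (or a second Turk task), which is where the time saving over verifying everything would come from. Note that two workers making the same mistake would slip through, so this reduces the verification load rather than removing it.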
