PerlMonks  

Re^2: Extracting structured data from unstructured text - just how difficult would this be?

by clinton (Priest)
on Feb 21, 2008 at 16:51 UTC (#669314)


in reply to Re: Extracting structured data from unstructured text - just how difficult would this be?
in thread Extracting structured data from unstructured text - just how difficult would this be?

I was thinking about something along exactly these lines, so we may just be two talking @$$'$

What would be interesting is looking for "contextual words" to resolve ambiguity: does May refer to the month or to a daughter, and is London a place or Jack London? It would be impossible to predict all of these ambiguities in advance, so the "training" makes a lot of sense to me.
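To make the "contextual words" idea a bit more concrete, here is a rough sketch in Python. It is only an illustration under my own assumptions, not anything proposed in the thread: the class name `ContextGuesser`, the sense labels "month" and "person", and the training sentences are all made up. It counts the words seen around each labelled sense during training, then picks the sense whose known context words overlap most with a new sentence.

```python
import re
from collections import Counter, defaultdict


def tokens(text):
    """Lower-case the text and return its alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


class ContextGuesser:
    """Hypothetical helper: guesses a word's sense from neighbouring words."""

    def __init__(self):
        # sense label -> counts of context words seen with that sense
        self.counts = defaultdict(Counter)

    def train(self, sentence, target, sense):
        """Record every word in the sentence (except the target) for this sense."""
        for word in tokens(sentence):
            if word != target.lower():
                self.counts[sense][word] += 1

    def guess(self, sentence, target):
        """Return the best-scoring sense, or None if no context word is known."""
        words = [w for w in tokens(sentence) if w != target.lower()]
        scores = {
            sense: sum(seen[w] for w in words)
            for sense, seen in self.counts.items()
        }
        if not scores or max(scores.values()) == 0:
            return None
        return max(scores, key=scores.get)


# Made-up training examples, labelled by hand.
guesser = ContextGuesser()
guesser.train("We will meet in May or June next year", "May", "month")
guesser.train("The invoice is due on the 3rd of May", "May", "month")
guesser.train("His daughter May was born in Leeds", "May", "person")
guesser.train("May and her brother visited us", "May", "person")

print(guesser.guess("Their daughter May lives abroad", "May"))
print(guesser.guess("Payment is due in May", "May"))
```

A raw overlap count like this is about the crudest scoring possible and will be fooled by common words such as "in" or "the"; the point is only that the ambiguous cases are learned from labelled examples rather than predicted up front.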

Of course, you will never achieve 100% accuracy but I don't think you want to.

Absolutely correct - we don't depend on this data; it just adds value when we can extract it.

Thanks for the input.

Clint

