Re^2: Extracting structured data from unstructured text - just how difficult would this be?

by clinton (Priest)
on Feb 21, 2008 at 16:51 UTC (#669314)


in reply to Re: Extracting structured data from unstructured text - just how difficult would this be?
in thread Extracting structured data from unstructured text - just how difficult would this be?

I was thinking about something along exactly these lines, so we may just be two talking @$$'$.

What would be interesting is looking for "contextual words" to resolve ambiguity: does May refer to the month or the daughter? Is London the place, or Jack London? It would be impossible to predict all of these ambiguities in advance, so the "training" approach makes a lot of sense to me.
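To make the contextual-words idea a bit more concrete, here is a rough sketch of a tiny naive Bayes classifier that guesses the sense of an ambiguous word from the words around it. The class name, the sense labels, and the training sentences are all hypothetical and made up for illustration; a real system would need far more training data and probably a smarter tokenizer.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


class ContextDisambiguator:
    """Hypothetical naive Bayes model over context words (illustration only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # sense -> context word counts
        self.sense_counts = Counter()            # sense -> number of examples
        self.vocab = set()

    def train(self, sense, sentence):
        """Record the words of one hand-labelled example sentence."""
        words = tokenize(sentence)
        self.word_counts[sense].update(words)
        self.sense_counts[sense] += 1
        self.vocab.update(words)

    def classify(self, sentence):
        """Return the sense whose training examples best match the context."""
        words = tokenize(sentence)
        total = sum(self.sense_counts.values())
        best_sense, best_score = None, float("-inf")
        for sense, counts in self.word_counts.items():
            # Log prior plus add-one smoothed log likelihood of each word.
            score = math.log(self.sense_counts[sense] / total)
            denom = sum(counts.values()) + len(self.vocab)
            for word in words:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Made-up training examples for the two senses of "May".
model = ContextDisambiguator()
model.train("month", "they were married in May of that year")
model.train("month", "he died early in May after a long illness")
model.train("person", "his daughter May was born in London")
model.train("person", "May married her husband Jack")

print(model.classify("She arrived in May of last year"))
print(model.classify("May was the daughter of his wife"))
```

The point is only that the surrounding words ("in", "year" versus "daughter", "his") carry the signal, which is why training on labelled examples is more practical than trying to enumerate every ambiguity by hand.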

Of course, you will never achieve 100% accuracy but I don't think you want to.

Absolutely correct - we don't depend on this data, it just adds value when we can extract it.

Thanks for the input.

Clint

