PerlMonks
Re: Will it work?
by Marshall (Canon) on Mar 15, 2016 at 21:04 UTC ( [id://1157847]=note )
Without an example, I have trouble answering your question. However, if the situation is one where a very detail-oriented person who knows minimal English could sit there, look at 1,000 papers, and summarize the results by extracting certain key phrases, even without knowing exactly what they mean, then the probability is high that a program can be written to do that.

Programs don't work well with "sort of" or "interpret what you think about this...". "Recommend: Yes/No" is something that a program can detect. "I'm leaning towards voting Yes, but at this time, I am unsure" is something that a program has close to zero chance of figuring out.

To have a chance at this, you need to identify some key phrases and a syntax that a very, very literal, detailed person could use to extract your info. This very, very literal, detailed person (the program) will do its job flawlessly, but only within very strict rules. You could wind up in a situation where the program can do 900 of 1,000 files with a clear result, yet you are left with 100 to do manually. This has to do with the "rules" and whether the detailed savant (the program) can tell if it got a valid result or not. I've worked with situations where the program can get to 99.5% with certainty, but for the other 0.5%, it knows that it is not certain.

Update: 0.5% may not seem like a lot, but if there are 350,000 records, that is 1,750 records, which is a big deal.

Try to find some simple rules where you are absolutely certain that the correct result has been found. Then see what that percentage is. If it is 90%, then you are probably in pretty good shape, as the program did 90% of the work! To get something like this completely automated, the program may need to start applying some ad-hoc rules that involve some uncertainty, and that means that the program will guess "wrong" some of the time. You have to decide whether that matters or not.
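To make the "strict rules, and know when you are not certain" idea concrete, here is a minimal sketch. It is in Python purely for illustration, and everything in it is an assumption on my part: the `Recommend: Yes/No` line format, the file names, and the helper names `classify`, `triage`, and `coverage` are hypothetical, since I have not seen your actual files. The point is only that the program reports a result when exactly one unambiguous answer is found and puts everything else in a manual pile.

```python
import re

# Hypothetical strict rule: a line that consists only of
# "Recommend: Yes" or "Recommend: No" (case-insensitive).
RECOMMEND = re.compile(r"^\s*Recommend:\s*(yes|no)\s*$",
                       re.IGNORECASE | re.MULTILINE)


def classify(text):
    """Return 'yes' or 'no' if the rule yields exactly one answer, else None."""
    answers = {match.lower() for match in RECOMMEND.findall(text)}
    if len(answers) == 1:
        return answers.pop()
    # No matching line, or conflicting lines: the program is not certain.
    return None


def triage(documents):
    """Split {name: text} into certain results and a manual-review list."""
    certain, manual = {}, []
    for name, text in documents.items():
        result = classify(text)
        if result is None:
            manual.append(name)
        else:
            certain[name] = result
    return certain, manual


def coverage(certain, manual):
    """Percentage of documents the strict rule handled with certainty."""
    total = len(certain) + len(manual)
    return 100.0 * len(certain) / total if total else 0.0


if __name__ == "__main__":
    # Made-up sample documents, for illustration only.
    docs = {
        "a.txt": "Summary of the paper...\nRecommend: Yes\n",
        "b.txt": "recommend: NO",
        "c.txt": "I'm leaning towards voting Yes, but at this time, I am unsure",
        "d.txt": "Recommend: Yes\nRecommend: No",
    }
    certain, manual = triage(docs)
    print(certain, manual, coverage(certain, manual))
```

Run something like this over the real files first and look at the coverage number before adding any looser, ad-hoc rules. Whatever lands in the manual list is exactly the set where the program knows it is not certain.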