PerlMonks
Re: Extracting text from PDF. No really
by chrisdolan (Beadle) on Mar 29, 2008 at 03:20 UTC ( [id://677164] )
I'm the author of CAM::PDF. Even under the best circumstances, getpdftext.pl produces barely readable output. My module doesn't have a renderer, so the text extraction is a total hack that I tossed into the module for fun. I'm quite pleased that other tools have produced good results! CAM::PDF (which I barely maintain anymore, I'm sorry to say) is optimized for high-performance, low-level editing of PDF documents.