weismat has asked for the wisdom of the Perl Monks concerning the following question:
Hello Monks,
I would like to parse a rather simple but large PDF file.
I can copy and paste the content page by page, so the text is stored as real text, not as images.
I looked at the PDF-API2 documentation and found it very unwieldy. How would you approach parsing the text content of a PDF document? Do you have any hints on what I should look at? I found a lot for creating PDFs, but nothing for parsing them.
Thanks!
Update: I want to stress that no images are involved and that I can use the Windows copy and paste function. For the moment I have implemented an AutoIt solution which creates a text file from around 4000 copy-and-paste operations. I would like to have a clean solution for the future.
Replies are listed 'Best First'.
Re: PDF Parsing
by marto (Cardinal) on Nov 28, 2007 at 11:06 UTC
Re: PDF Parsing
by dragonchild (Archbishop) on Nov 28, 2007 at 14:38 UTC
Re: PDF Parsing
by Starky (Chaplain) on Nov 28, 2007 at 15:31 UTC
by Anonymous Monk on Dec 03, 2007 at 16:49 UTC
by ademmler (Novice) on Dec 03, 2007 at 16:55 UTC
by Anonymous Monk on Jan 03, 2008 at 15:18 UTC
Re: PDF Parsing
by runrig (Abbot) on Nov 28, 2007 at 21:37 UTC
by weismat (Friar) on Dec 01, 2007 at 07:55 UTC
Re: PDF Parsing
by toma (Vicar) on Nov 30, 2007 at 07:39 UTC