PerlMonks
Re: How to extract image captions from a PDF file using perl
by chrestomanci (Priest) on Nov 17, 2010 at 10:01 UTC ( [id://871973]=note )
Perhaps you could convert your PDF files to SVG using Inkscape, and then parse the resulting SVG with one of the standard XML processing libraries. Inkscape has a command-line mode that can do almost anything you can do in the GUI:

    inkscape -f Input_file.pdf -l Output_file.svg
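As a rough sketch of that idea in Perl: the script below runs the Inkscape command from this reply and then lists the contents of every SVG `<text>` element using XML::LibXML. Several things here are assumptions rather than anything stated above: that XML::LibXML is the XML library you pick, that Inkscape writes the captions out as `<text>` elements, and that your Inkscape accepts the `-f` and `-l` options (newer releases may spell them differently). The helper names `pdf_to_svg` and `svg_text_strings` are made up for illustration. Working out which strings are actually captions is left to you.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Convert a PDF to plain SVG using the command line quoted in the post.
# Assumption: the installed Inkscape understands -f (input) and -l (plain SVG output).
sub pdf_to_svg {
    my ($pdf, $svg) = @_;
    system('inkscape', '-f', $pdf, '-l', $svg) == 0
        or die "inkscape failed for $pdf (exit status $?)\n";
    return $svg;
}

# Return the whitespace-normalised text of every <text> element
# found in an SVG document passed in as a string.
sub svg_text_strings {
    my ($svg_xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $svg_xml);
    my $xpc = XML::LibXML::XPathContext->new($doc);
    $xpc->registerNs(svg => 'http://www.w3.org/2000/svg');

    my @strings;
    for my $node ($xpc->findnodes('//svg:text')) {
        my $text = $node->textContent;   # joins the text of any nested <tspan>s
        $text =~ s/\s+/ /g;              # collapse runs of whitespace
        $text =~ s/^ | $//g;             # trim the ends
        push @strings, $text if length $text;
    }
    return @strings;
}

# Only do the conversion when called as: script.pl input.pdf output.svg
if (@ARGV == 2) {
    my ($pdf, $svg) = @ARGV;
    pdf_to_svg($pdf, $svg);
    open my $fh, '<:raw', $svg or die "Cannot read $svg: $!\n";
    my $xml = do { local $/; <$fh> };
    close $fh;
    print "$_\n" for svg_text_strings($xml);
}
```

This prints every text string in the page, not just captions, so you would probably still need to filter the list, for example by matching a "Figure N" prefix or by looking at each element's position relative to the images.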