PerlMonks  

Re^2: extract text from pdf

by LANTI (Sexton)
on Apr 24, 2012 at 08:46 UTC (#966763)


in reply to Re: extract text from pdf
in thread extract text from pdf

If I just want the PDF's text to use for whatever (save it in a database, ...), I found this line quite convenient:

use feature 'say';
my $txt = `pdftotext whatever.pdf -` or die 'ERROR running pdftotext';
say $txt;
Or, if the file name is in a variable and the PDF contains umlauts or other non-ASCII characters:
# note: the single quotes around $path break if the path itself contains a single quote
my $command_line = qq{pdftotext -enc 'UTF-8' '$path' -};
my $text = `$command_line` or die 'ERROR running pdftotext';
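
If the path may contain quotes or other shell metacharacters, one option is to skip the shell entirely. The following is only a sketch, not a drop-in replacement: run_capture and pdf_to_text are hypothetical helper names, and it assumes pdftotext is on the PATH and that your perl supports list-form pipe open (standard on Unix-like systems). It also decodes the UTF-8 bytes into Perl characters and checks the exit status instead of treating empty output as an error.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: run a command without a shell and return its
# output decoded from UTF-8.
sub run_capture {
    my @cmd = @_;
    # List-form pipe open bypasses the shell, so spaces or quotes in
    # the arguments need no escaping.
    open(my $fh, '-|', @cmd) or die "Cannot start $cmd[0]: $!";
    binmode $fh;
    my $bytes = do { local $/; <$fh> };    # slurp everything
    # close on a pipe returns false if the command exited non-zero.
    close $fh or die "$cmd[0] failed (exit status " . ($? >> 8) . ")";
    return decode('UTF-8', $bytes // '');
}

# Hypothetical wrapper around the same pdftotext call as above.
sub pdf_to_text {
    my ($path) = @_;
    return run_capture('pdftotext', '-enc', 'UTF-8', $path, '-');
}
```

Usage would then be something like my $text = pdf_to_text($path); the returned string holds characters rather than bytes, so encode it again (or set an output layer) before printing or storing it.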
