PerlMonks  

Re: PDF Text

by MidLifeXis (Monsignor)
on Jun 12, 2008 at 18:04 UTC


in reply to PDF Text

Search CPAN to see whether anything there is useful. CAM::PDF seems to have a couple of functions that might work.

Extracting the layout from a PDF file into a text file might still be problematic. In particular, if a page contains no text at all, only a graphic image of the page (a scan, for instance), there is nothing to extract, and you would need some sort of OCR solution instead.
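
A rough, untested sketch of what that might look like with CAM::PDF, assuming its documented new, numPages and getPageText methods do what you need. The helper names page_looks_image_only and extract_pages are made up for this illustration, and treating an empty page as "probably a scanned image" is only a heuristic:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when a page's extracted text is undefined,
# empty, or whitespace only, which hints that the page may be an image
# that needs OCR. Returns 0 otherwise.
sub page_looks_image_only {
    my ($text) = @_;
    return 1 if !defined($text);
    return $text =~ /\S/ ? 0 : 1;
}

# Hypothetical helper: returns a list of [page_number, text] pairs.
sub extract_pages {
    my ($path) = @_;
    require CAM::PDF;    # loaded lazily so the helper above works without it
    my $pdf = CAM::PDF->new($path)
        or die "Cannot open $path: $CAM::PDF::errstr\n";
    return map { [ $_, $pdf->getPageText($_) ] } 1 .. $pdf->numPages();
}

# Only runs when a PDF path is given on the command line.
if (@ARGV) {
    for my $page (extract_pages($ARGV[0])) {
        my ($n, $text) = @$page;
        if (page_looks_image_only($text)) {
            warn "Page $n has no extractable text; it may need OCR\n";
            next;
        }
        print "--- page $n ---\n$text\n";
    }
}
```

Run it as `perl pdftext.pl some.pdf > some.txt` (the script name is arbitrary). Do not expect the original layout to survive; you get the text in roughly the order it is stored on each page.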

--MidLifeXis
