PerlMonks  

pdf2doc2rtf2html2txt

by petemar1 (Pilgrim)
on Sep 08, 2000 at 16:26 UTC

petemar1 has asked for the wisdom of the Perl Monks concerning the following question:

Has anyone ever heard of taking an Adobe Acrobat (PDF) file and stripping it down through progressively less formatted file types (to MS Word, then to rich text, then to HTML, and finally to vanilla text)?

- m peters - www gwangwa com -

Replies are listed 'Best First'.
Re: pdf2doc2rtf2html2txt
by KM (Priest) on Sep 08, 2000 at 16:36 UTC
    I believe there is a PDF-to-text translator (or a plugin with an API). Since text is the lowest common denominator, I would extract the text first and then convert it to those other formats. You may want to look at the PDF-related CPAN modules and see which ones can read a PDF file; from one of those you could likely just pull the text out. If I get a chance, I will try to find a URL for you, or you can use Google :)

    Cheers,
    KM
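
    The approach suggested above (get plain text out of the PDF first, then build the other formats from it) could be sketched roughly as follows. This is only an illustration under stated assumptions, not something from the original reply: it assumes the external pdftotext command-line tool (shipped with Xpdf and Poppler) is installed and on the PATH, and the sub names pdf_to_text and text_to_html are hypothetical helpers made up for this example. Only the text-to-HTML step is shown; the Word and RTF steps are left out.

```perl
use strict;
use warnings;

# Assumption: the external `pdftotext` tool (Xpdf/Poppler) is installed.
# It is not part of Perl and was not named in the original thread.
sub pdf_to_text {
    my ($pdf_path) = @_;
    # List-form pipe open avoids the shell; "-" tells pdftotext to
    # write the extracted text to standard output.
    open(my $fh, '-|', 'pdftotext', $pdf_path, '-')
        or die "Cannot run pdftotext: $!";
    local $/;                      # slurp mode: read all output at once
    my $text = <$fh>;
    close($fh) or die "pdftotext failed on $pdf_path";
    $text = '' unless defined $text;
    return $text;
}

# Hypothetical helper: wrap plain text in minimal HTML, emitting one <p>
# per blank-line-separated paragraph and escaping &, < and >.
sub text_to_html {
    my ($text) = @_;
    my @paras;
    for my $para (split /\n\s*\n/, $text) {
        $para =~ s/^\s+|\s+$//g;   # trim leading/trailing whitespace
        next unless length $para;
        $para =~ s/&/&amp;/g;      # escape & first so entities stay intact
        $para =~ s/</&lt;/g;
        $para =~ s/>/&gt;/g;
        push @paras, "<p>$para</p>";
    }
    return "<html><body>\n" . join("\n", @paras) . "\n</body></html>\n";
}
```

    With a PDF on disk (input.pdf is a placeholder name), the two steps chain together as: print text_to_html(pdf_to_text('input.pdf'));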

Re: pdf2doc2rtf2html2txt
by merlyn (Sage) on Sep 08, 2000 at 16:45 UTC
