PerlMonks  

Re: pdf2txt?

by freddo411 (Chaplain)
on Oct 09, 2003 at 22:14 UTC ( [id://298126]=note )


in reply to pdf2txt?

You have a very difficult job in front of you. PDF isn't a format that translates back nicely into ASCII.

I know for certain that when a long paragraph is visually wrapped into several lines in a PDF, the text that makes up the paragraph is broken into several strings (one for each line). This causes problems when you want to write sensible plain ASCII back out.
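To make the line-splitting problem concrete, here is a minimal sketch of gluing the per-line strings back into one paragraph. It is written in Python for brevity, and the logic ports to Perl directly. The function name `join_fragments` and the sample strings are made up for illustration, and the hyphen rule is only a heuristic: it cannot tell a word split across lines from a real hyphen that happens to fall at a line end.

```python
def join_fragments(fragments):
    """Rejoin per-line strings into one paragraph (heuristic sketch)."""
    paragraph = ""
    for fragment in fragments:
        text = fragment.strip()
        if not text:
            continue  # skip empty strings
        if paragraph.endswith("-"):
            # Assumption: a trailing hyphen marks a word split across lines.
            paragraph = paragraph[:-1] + text
        elif paragraph:
            paragraph += " " + text
        else:
            paragraph = text
    return paragraph


# Hypothetical strings, as they might come out of a PDF one line at a time.
lines = ["PDF stores a wrapped para-", "graph as one string", "per visual line."]
print(join_fragments(lines))
```

Even this much assumes you already know which strings belong to the same paragraph, which the PDF itself does not tell you.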

There are other issues as well, having to do primarily with getting the text into the correct order in the ASCII file.
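As a rough illustration of the ordering problem, the sketch below sorts text fragments by their position on the page. It is in Python for brevity. It assumes you have already extracted each string together with its x and y coordinates, in the default PDF coordinate system where y grows upward from the bottom of the page. The name `reading_order`, the `line_tolerance` parameter, and the sample data are all hypothetical.

```python
def reading_order(fragments, line_tolerance=2.0):
    """Order (x, y, text) fragments top-to-bottom, then left-to-right."""
    # Highest y first, so fragments on the same baseline end up adjacent.
    by_height = sorted(fragments, key=lambda f: -f[1])
    rows = []
    for frag in by_height:
        # Same row if the baseline is close to that of the row's first fragment.
        if rows and abs(rows[-1][0][1] - frag[1]) <= line_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    # Within each row, read left to right.
    return [" ".join(f[2] for f in sorted(row, key=lambda f: f[0])) for row in rows]


# Hypothetical fragments, in the arbitrary order a PDF might store them.
fragments = [
    (200.0, 700.0, "world"),
    (72.0, 680.0, "second line"),
    (72.0, 700.5, "hello"),
]
print(reading_order(fragments))
```

A plain positional sort like this falls apart on multi-column pages, tables, and rotated text, which is a large part of why the job is harder than it looks.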

Unless you are "cherry picking" a string or two, you'll be happier if you can redefine your problem in another way....

Cheers

-------------------------------------
Nothing is too wonderful to be true
-- Michael Faraday
