PerlMonks  

Re: pdf custom properties

by fluffyvoidwarrior (Monk)
on Sep 21, 2012 at 16:56 UTC (#994946)


in reply to pdf custom properties

This book is largely Perl based:
PDF Hacks: 100 Industrial-Strength Tips & Tools
by Sid Steward.

It'll probably be on the O'Reilly bookshelf if you can't get it from Amazon.

PDF is a tricky and complex format, but it is primarily text based, so Perl can often hack into it.

Have you tried writing a marker string using Acrobat Pro or similar, then searching for the marker with vim? If it remains plain text, you can probably find it and identify the surrounding PDF syntax that gives you its location. You can then use a regex to write your text data at the same placeholder location in other PDF files. (I've used this technique before on PDFs and it often works a treat, but sometimes it's not that easy.)
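A minimal sketch of that marker trick in Perl, under some loud assumptions: the marker survived as plain text (not inside a compressed stream and not stored as UTF-16), and the replacement is padded to exactly the marker's length so that no byte offset in the file moves. The sub name replace_marker, the marker string, and the /CustomProp key are all made up for illustration.

```perl
use strict;
use warnings;

# Replace a plain-text marker inside raw PDF bytes with new text of the
# same byte length, so nothing after it shifts position in the file.
# replace_marker() and its arguments are hypothetical names for this sketch.
sub replace_marker {
    my ($pdf, $marker, $text) = @_;

    die "replacement is longer than the marker\n"
        if length($text) > length($marker);
    # Parentheses and backslashes delimit/escape PDF literal strings.
    die "replacement needs escaping, not handled here\n"
        if $text =~ /[()\\]/;

    # Pad with spaces so the replacement is exactly as long as the marker.
    my $padded = $text . (' ' x (length($marker) - length($text)));

    # \Q...\E makes the marker match literally, not as a regex.
    my $count = ($pdf =~ s/\Q$marker\E/$padded/g);
    return ($pdf, $count || 0);
}

# Usage on an in-memory fragment (hypothetical key and marker).
my $fragment = "<< /CustomProp (XXMARKERXXXXXXXX) >>";
my ($patched, $n) = replace_marker($fragment, 'XXMARKERXXXXXXXX', 'Invoice 42');
print "$n replacement(s): $patched\n";
```

For a real file you would slurp it with binmode set on the filehandle, run the sub, and write the result to a new file the same way. The trailing spaces end up in the stored value, so trim them when you read the property back.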

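Remember though, a PDF isn't a simple data file. It's a structured document descended from PostScript, with a table of byte offsets for its objects and explicit lengths on its streams, so you'll break it if you aren't careful.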

There are some PDF APIs available (PDFlib, etc.), but they aren't cheap.


Re^2: pdf custom properties
by bulk88 (Priest) on Sep 23, 2012 at 04:09 UTC
    Most PDFs are compressed. An uncompressed PDF is a plain-text file that is human readable (but your eyes will bleed from the endless vector graphics tokens). There are some command-line tools which can compress/uncompress your PDF file and defragment it or optimize it for progressive download. The PDF format allows stream generation, with a server creating a PDF as a non-seekable stream: the allocation table (the cross-reference table) goes at the end. If a token/block isn't in the final allocation table it is effectively free space, but it still wastes space in the PDF file. A PDF file's object tree allows reference loops, BTW.
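    To make the "table goes on the end" point concrete, here is a rough Perl sketch that reads the startxref pointer from the tail of a PDF and counts how often the FlateDecode (zlib) filter is named, which is a quick hint as to whether the streams are compressed. The helper names xref_offset and flate_count are invented for this example, and the sample string is a toy built only to exercise the code, not a valid PDF.

```perl
use strict;
use warnings;

# Return the byte offset stored after the last "startxref" keyword,
# i.e. where the final cross-reference section starts, or undef.
# xref_offset() is a hypothetical helper name for this sketch.
sub xref_offset {
    my ($pdf) = @_;
    # The pointer sits near the end of the file; look at the last 1024 bytes.
    my $tail = length($pdf) > 1024 ? substr($pdf, -1024) : $pdf;
    # The greedy .* makes this capture the last startxref in the tail.
    return $tail =~ /.*startxref\s+(\d+)\s+%%EOF/s ? $1 : undef;
}

# Count occurrences of the /FlateDecode filter name, a rough sign that
# stream data is zlib-compressed. flate_count() is also hypothetical.
sub flate_count {
    my ($pdf) = @_;
    my $count = () = $pdf =~ m{/FlateDecode}g;
    return $count;
}

# Toy input: the body comes first, the table and its pointer come last.
my $body   = "%PDF-1.4\n1 0 obj\n<< /Filter /FlateDecode /Length 0 >>\nendobj\n";
my $sample = $body
           . "xref\n0 1\n0000000000 65535 f \ntrailer\n<< /Size 1 >>\n"
           . "startxref\n" . length($body) . "\n%%EOF\n";

my $offset = xref_offset($sample);
print "xref section starts at byte $offset\n" if defined $offset;
print "FlateDecode mentions: ", flate_count($sample), "\n";
```

    On a real file, read it with binmode first. An incrementally updated PDF can have several startxref entries, and this sketch only reports the last one.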
