PerlMonks  

Re: pdf custom properties

by fluffyvoidwarrior (Monk)
on Sep 21, 2012 at 16:56 UTC ( [id://994946]=note )


in reply to pdf custom properties

This book is largely Perl based:
PDF Hacks: 100 Industrial-Strength Tips & Tools
by Sid Steward.

It'll probably be on the O'Reilly Bookshelf if you can't get it from Amazon.

PDF is a tricky and complex format, but it is primarily text based, so Perl can often hack into it.

Have you tried writing a marker string using Acrobat Pro or similar, then searching for the marker with vim? If it remains plain text, you can probably find it and identify the PostScript-like placeholders in the PDF file that give you its location. You can then use a regex to write your text data at the same placeholder location in other PDF files. (I've used this technique before on PDFs and it often works a treat, but sometimes it's not that easy.)
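A minimal sketch of the marker approach in Perl, assuming the marker survives as plain text inside an uncompressed literal string. The sub name replace_marker and the marker XXMARKERXX are made up for illustration. The replacement is padded to the marker's length so that no byte offset in the file moves.

```perl
use strict;
use warnings;

# Replace every occurrence of $marker in $bytes with $text, padded with
# spaces to the marker's length so the file size and offsets stay the same.
# Returns the modified copy and the number of replacements made.
sub replace_marker {
    my ($bytes, $marker, $text) = @_;

    die "replacement is longer than the marker\n"
        if length($text) > length($marker);

    # These characters are special inside a PDF literal string ( ... ),
    # so this sketch refuses them instead of trying to escape them.
    die "replacement contains (, ) or a backslash\n"
        if $text =~ /[()\\]/;

    my $padded = $text . ' ' x (length($marker) - length($text));
    my $count  = ($bytes =~ s/\Q$marker\E/$padded/g);
    return ($bytes, $count || 0);
}

# Hypothetical usage (file names are assumptions):
#   open my $in, '<:raw', 'template.pdf' or die $!;
#   my $pdf = do { local $/; <$in> };
#   my ($new, $n) = replace_marker($pdf, 'XXMARKERXX', 'hello');
#   open my $out, '>:raw', 'filled.pdf' or die $!;
#   print {$out} $new;
```

If the count comes back as 0, the marker is probably inside a compressed stream or stored in another encoding, and this trick won't work on that file as-is.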

Remember though, a PDF isn't a simple data file. It's a structured format descended from PostScript, with byte offsets recorded in the file itself, so you'll break it if you aren't careful.

There are some PDF APIs available (PDFlib, etc.), but they aren't cheap.

Re^2: pdf custom properties
by bulk88 (Priest) on Sep 23, 2012 at 04:09 UTC
    Most PDFs are compressed. An uncompressed PDF is a plain-text file that is human readable (but your eyes will bleed from the endless vector graphics tokens). There are some command-line tools that can compress/uncompress a PDF and defragment it or optimize it for progressive download. The PDF format allows stream generation, with a server creating the PDF as a non-seekable stream. The allocation table (the cross-reference table) goes at the end. If a token/block isn't in the final allocation table, it is effectively free space, but it still wastes space in the PDF file. A PDF file's object tree allows reference loops, BTW.
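To make the "table goes at the end" point concrete, here is a rough Perl sketch that reads the startxref offset from the tail of a file and checks that it still points at a classic xref table. The sub names are made up for illustration. It assumes a classic table; newer files may use a cross-reference stream instead, which this check does not recognise.

```perl
use strict;
use warnings;

# Return the byte offset stored after the last "startxref" keyword,
# or undef if no such trailer is found in the last 1024 bytes.
sub xref_offset {
    my ($bytes) = @_;
    my $tail = length($bytes) > 1024 ? substr($bytes, -1024) : $bytes;
    my $offset;
    # Keep the last match, since an updated file can have several trailers.
    $offset = $1 while $tail =~ /startxref\s+(\d+)\s+%%EOF/g;
    return $offset;
}

# True (1) if the recorded offset points at the keyword "xref".
# A false result means either the offsets have shifted (for example after
# an edit that changed the file length) or the file uses an xref stream.
sub xref_is_classic_table {
    my ($bytes) = @_;
    my $offset = xref_offset($bytes);
    return 0 unless defined $offset && $offset < length $bytes;
    return substr($bytes, $offset, 4) eq 'xref' ? 1 : 0;
}
```

Running a check like this before and after a regex edit is a cheap way to see whether the edit moved anything, though it says nothing about the per-object offsets inside the table.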
