PerlMonks  

I've been contemplating the state of the art of manipulating PDFs in Perl. The field is littered with the corpses of CPAN modules that try to make it easy to work with PDFs, but I settled on two as being the most useful: PDF::API2 and CAM::PDF. I welcome anyone's comments pointing out things I've missed or other useful tidbits.

My original motivation was a project in which I needed to input an existing PDF (generated by some unknown method) and prepend a coversheet containing a barcode derived from some metadata (passed in as separate arguments; not from the file itself). The barcodes are so people can fax them back to me and I can route the documents, but that's a different story.

If you like counting pixels and keeping track of text's baseline and things like that, you'll love PDF::API2. It's meant to be a low-level tool, and if you want very fine-grained control of your layouts, it's the tool for you. The best examples I found are

(that's an amazingly short list for such a complicated package, but "lack of examples" seems to be a common complaint). The other tool I've used for building PDFs is wkhtmltopdf, which isn't Perl. If you're not above system calls, though, it's not bad.
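To give a flavor of the baseline-and-points style described above, here is a minimal sketch of laying out a page with PDF::API2. It is not the code from the project; the sub name `build_demo_page`, the coordinates, the sample strings, and the output filename are all made up for illustration. It only uses the long-standing `corefont`, `text`, `gfx`, and `stringify` calls.

```perl
use strict;
use warnings;
use PDF::API2;

# Build a one-page US Letter PDF and return it as a string of bytes.
# Coordinates are in points (1/72 inch), measured from the bottom-left corner.
# build_demo_page is a hypothetical helper name, not part of PDF::API2.
sub build_demo_page {
    my ($title, $note) = @_;

    my $pdf  = PDF::API2->new();
    my $page = $pdf->page();
    $page->mediabox('Letter');    # 612 x 792 points

    my $bold    = $pdf->corefont('Helvetica-Bold');
    my $regular = $pdf->corefont('Helvetica');

    # Text is positioned by its baseline: (72, 700) is one inch in from
    # the left edge and 700 points up from the bottom.
    my $text = $page->text();
    $text->font($bold, 24);
    $text->translate(72, 700);
    $text->text($title);

    $text->font($regular, 12);
    $text->translate(72, 670);
    $text->text($note);

    # A thin horizontal rule between the two lines of text.
    my $gfx = $page->gfx();
    $gfx->linewidth(1);
    $gfx->move(72, 690);
    $gfx->line(540, 690);
    $gfx->stroke();

    return $pdf->stringify();
}

# Write the page to the file named on the command line, or to a default name.
my $out = shift(@ARGV) // 'api2-demo.pdf';
open my $fh, '>:raw', $out or die "Cannot write $out: $!";
print {$fh} build_demo_page('Cover Sheet', 'Routing code: EXAMPLE-0001');
close $fh or die "Cannot close $out: $!";
```

Nothing here flows or wraps for you; every string lands exactly where `translate` puts it, which is the fine-grained control (and the bookkeeping) that comes with the module.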

As a low-level tool for creating PDFs, PDF::API2 is everything I want. For reading PDFs, my experience is a bit more mixed. There is a known issue with some features of PDF 1.5 and up. That is a problem for my project, because I consume PDFs people make and "please go back and save this as version 1.4" isn't an option.

To manipulate existing PDFs, CAM::PDF works fine. As of version 1.58, it doesn't claim broad support for PDF versions beyond 1.5, but my experience is that it can read any PDF I've thrown at it. It bills itself as a PDF manipulation library, and it can do all the helpful things like rearrange pages, import pages from another document, and even clever tricks like swapping out one image for another. So if you have a document and want to tweak it or learn about it, CAM::PDF is a good choice.
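As a rough illustration of the "tweak it or learn about it" side of CAM::PDF, the sketch below reports the size of each page and writes a copy that keeps only the first page. The sub names `page_report` and `keep_first_page` are hypothetical helpers, and the file names come from the command line; the CAM::PDF calls used are `new`, `numPages`, `getPageDimensions`, `extractPages`, and `cleanoutput`.

```perl
use strict;
use warnings;
use CAM::PDF;

# Return an array ref with one hash per page: page number, width, height.
# getPageDimensions is documented as returning (x, y, width, height) in points.
sub page_report {
    my ($path) = @_;
    my $doc = CAM::PDF->new($path)
        or die "Cannot read $path: $CAM::PDF::errstr\n";

    my @report;
    for my $n (1 .. $doc->numPages()) {
        my ($x, $y, $width, $height) = $doc->getPageDimensions($n);
        push @report, { page => $n, width => $width, height => $height };
    }
    return \@report;
}

# Write a copy of $in to $out containing only page 1.
# Returns the page count of the document that was written.
sub keep_first_page {
    my ($in, $out) = @_;
    my $doc = CAM::PDF->new($in)
        or die "Cannot read $in: $CAM::PDF::errstr\n";

    $doc->extractPages(1);      # drop every page except the ones listed
    $doc->cleanoutput($out);    # write a freshly structured file
    return $doc->numPages();
}

# Usage: perl inspect.pl input.pdf [first-page.pdf]
if (@ARGV) {
    my ($in, $out) = @ARGV;
    for my $p (@{ page_report($in) }) {
        printf "page %d: %s x %s points\n", $p->{page}, $p->{width}, $p->{height};
    }
    if (defined $out) {
        printf "wrote %s with %d page(s)\n", $out, keep_first_page($in, $out);
    }
}
```

Page numbers in CAM::PDF start at 1, not 0, which is easy to trip over when looping.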

In our particular case, we combined the two. We use PDF::API2 to create a one-page coversheet document, then use CAM::PDF to prepend it to the original. It's early days, and nobody is trying to mess me up with complicated PDFs yet, but so far it seems to be working out nicely.
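The combination described above might look something like the following sketch. It is an assumption-laden outline, not the actual project code: `prepend_coversheet` is a hypothetical name, the layout is arbitrary, and the routing code is printed as plain text where the real coversheet would carry a barcode. The cover is built in memory with PDF::API2, handed to CAM::PDF as a string, and inserted in front of the original with `prependPDF`.

```perl
use strict;
use warnings;
use PDF::API2;
use CAM::PDF;

# Build a one-page coversheet with PDF::API2 and put it in front of an
# existing PDF using CAM::PDF. Returns the page count of the result.
# The routing code is drawn as text; barcode rendering is left out here.
sub prepend_coversheet {
    my ($original_path, $routing_code, $out_path) = @_;

    # 1. Create the coversheet in memory.
    my $api2 = PDF::API2->new();
    my $page = $api2->page();
    $page->mediabox('Letter');
    my $text = $page->text();
    $text->font($api2->corefont('Helvetica-Bold'), 24);
    $text->translate(72, 700);
    $text->text("Routing code: $routing_code");
    my $cover_bytes = $api2->stringify();

    # 2. Load both documents with CAM::PDF. Its constructor accepts
    #    either a filename or a scalar holding PDF content.
    my $cover = CAM::PDF->new($cover_bytes)
        or die "Cannot parse coversheet: $CAM::PDF::errstr\n";
    my $orig = CAM::PDF->new($original_path)
        or die "Cannot read $original_path: $CAM::PDF::errstr\n";

    # 3. Insert the cover before page 1 of the original and save.
    $orig->prependPDF($cover);
    $orig->cleanoutput($out_path);
    return $orig->numPages();
}

# Usage: perl coversheet.pl original.pdf ROUTING-CODE combined.pdf
if (@ARGV == 3) {
    my $pages = prepend_coversheet(@ARGV);
    print "wrote $ARGV[2] with $pages page(s)\n";
}
```

Passing the cover as a string avoids a temporary file, and since only CAM::PDF ever opens the incoming document, PDF::API2's trouble with some PDF 1.5+ features never comes into play.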


In reply to State of the art of PDFs in Perl by mcdave
