http://www.perlmonks.org?node_id=920379

ksublondie has asked for the wisdom of the Perl Monks concerning the following question:

I'm having problems finding the right module for my situation.

I have a Perl web app and I want to create different dynamic, downloadable PDFs (when the user performs an action, they get the typical "save as..." dialog box). Of course since this is a web app, I don't want to create the files, just the output stream. I've tried PostScript::Simple, which gives me the placement control I want, but I can't figure out how to get the PDF out of it as a scalar without creating a file (I tried PostScript::Convert). I've also tried PDF::HTML, which gets me the PDF output, but it's not powerful enough to let me control the placement the way I want.

I've searched CPAN, but so far I haven't found anything that will work. Can someone point me in the right direction?

Re: Finding the right PDF module
by kcott (Archbishop) on Aug 16, 2011 at 05:06 UTC

    Take a look at PDF::API2::Simple: the stringify() method looks like it will handle your output requirements (I haven't used that method myself).

    If PDF::API2::Simple doesn't have enough grunt for your app, try PDF::API2.
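
    A minimal sketch of the in-memory approach with plain PDF::API2, assuming a CGI-style script that writes its response to STDOUT. The helper names build_pdf and pdf_response, the page size, the text position and the filename are illustrative assumptions, not something from the original post.

```perl
use strict;
use warnings;
use PDF::API2;

# Build a one-page PDF entirely in memory and return it as a scalar.
sub build_pdf {
    my ($message) = @_;

    my $pdf  = PDF::API2->new();
    my $page = $pdf->page();
    $page->mediabox('Letter');               # 612 x 792 points

    my $font = $pdf->corefont('Helvetica');  # built-in font, nothing to embed
    my $text = $page->text();
    $text->font($font, 14);
    $text->translate(72, 720);               # x, y in points from bottom-left
    $text->text($message);

    # stringify() serializes the document without writing a file.
    return $pdf->stringify();
}

# Wrap the PDF bytes in a CGI-style response that prompts a "save as..." dialog.
# (Hypothetical helper; a framework would normally set these headers for you.)
sub pdf_response {
    my ($bytes, $filename) = @_;
    return "Content-Type: application/pdf\r\n"
         . "Content-Disposition: attachment; filename=\"$filename\"\r\n"
         . "Content-Length: " . length($bytes) . "\r\n\r\n"
         . $bytes;
}
```

    In a CGI script this could be sent with binmode STDOUT; print pdf_response(build_pdf('Hello'), 'report.pdf'); the binmode call matters because the PDF body is binary data.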

    -- Ken

Re: Finding the right PDF module
by CountZero (Bishop) on Aug 16, 2011 at 09:53 UTC
    "Of course since this is a web app, I don't want to create the files, just the output stream"

    I do not know what is so "of course" about that.

    Webservers are very good at serving files. Writing the PDF to the server's filesystem is likely to be faster than issuing a stream and waiting for any slow link in the chain to the client. Your script will terminate faster and be ready for another request, while the web-server can take all the time needed to get the file to the client.

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little nor too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

      "Of course" probably means that the pages are unique anyway, so there is no point in storing them all. #903763 discusses HTML::HTMLDoc. I personally use HTMLDOC directly. It performs well but requires you to stick to simple HTML without CSS.
        I can only speak from my personal experience, but the applications I made stored the unique results of each request in a file (they were spreadsheets, but that is irrelevant for this discussion) and just returned a link to this file to the client requesting it. All the heavy lifting of making the file was done by the Perl script, and the sending of the file to the client was done by the Apache server. As far as the webserver was concerned, this was just a static file. I found it most efficient. Every so often a cron job would reap all "old" files to reclaim disk space.
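
        The reaping step could look something like the sketch below, run from cron. The function name, the .pdf extension and the age threshold are assumptions for illustration; the files described above were spreadsheets, and the details of the real job are not given.

```perl
use strict;
use warnings;

# Delete *.pdf files in $dir that were last modified more than
# $max_age_days ago. Returns the list of paths actually removed.
sub reap_old_files {
    my ($dir, $max_age_days) = @_;
    my @removed;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    for my $name (readdir $dh) {
        next unless $name =~ /\.pdf\z/;
        my $path = "$dir/$name";
        next unless -f $path;

        # -M is the file's age in days, measured from script start time.
        next unless -M $path > $max_age_days;

        if (unlink $path) {
            push @removed, $path;
        }
        else {
            warn "Could not remove $path: $!";
        }
    }
    closedir $dh;
    return @removed;
}
```

        A nightly cron entry would then call something like reap_old_files('/path/to/downloads', 1), where the directory is whatever location the webserver serves the generated files from.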

        CountZero

        "...the pages are unique...no point in storing them all..."

        Exactly. I'm leaving that up to the end user.

Use temp files, give wkhtmltopdf a try
by mcsonique (Initiate) on Aug 16, 2011 at 13:31 UTC

    I also think there is nothing wrong with letting the webserver handle the PDF download. If you must deliver the download from the script itself, you may use File::Temp to create temporary files, read them and deliver them.

    I create PDFs using the command line tool wkhtmltopdf. My webapp constructs a temporary HTML file that wkhtmltopdf converts into a nice PDF file, which I then deliver.
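
    The temp-file route can be sketched roughly as follows, using only core modules. The helper name slurp_via_tempfile and the $generator callback are hypothetical: the callback stands in for whatever writes the PDF to a path (for example a call out to wkhtmltopdf), which is not shown here.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Let a generator write a PDF to a temporary path, then return the
# file's contents as a scalar and remove the file again.
sub slurp_via_tempfile {
    my ($generator) = @_;

    # UNLINK => 1 is a safety net: the file is removed at program exit
    # even if we die before the explicit unlink below.
    my ($fh, $path) = tempfile(SUFFIX => '.pdf', UNLINK => 1);
    close $fh;

    $generator->($path);    # hypothetical callback that writes to $path

    open my $in, '<:raw', $path or die "Cannot read $path: $!";
    local $/;               # slurp mode: read the whole file at once
    my $bytes = <$in>;
    close $in;

    unlink $path;
    return $bytes;
}
```

    The returned scalar can then be printed after the usual Content-Type: application/pdf header, so the client never sees the temporary file.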

Re: Finding the right PDF module
by Anonymous Monk on Aug 16, 2011 at 16:19 UTC
    +1 on using the PDF::API2 modules. While the documentation is not very good, plenty of help is available via Google. The API does pretty solid work for what it is meant to do.
      Wow, I can't believe I missed that one! So far, it seems promising, but you're right, the documentation is lacking. Thanks!

        There are a number of example scripts you may also find helpful. They come bundled with the distributions, but you can also read them online (linked from their MANIFEST files): PDF::API2 examples/* and PDF::API2::Simple examples/*. If you dig around in the MANIFESTs, you may find useful snippets in other directories such as t/* and contrib/*.

        -- Ken

Re: Finding the right PDF module
by jdrago999 (Pilgrim) on Aug 18, 2011 at 01:22 UTC

    The few times I've had to output data to a PDF (usually for printing) I've used html2ps and then ps2pdf. It feels like a hack, but it gets the job done well. Plus it's easier for me to visualize the layout as HTML than as a bunch of PDF::API2 method calls. Then I'd use PDF::API2 to combine the resulting PDF files into a single document.