Re^10: blank pdf generated using PDF::API2 (Updated)

No: it worked out of the box. I'm sorry I didn't notice earlier that the file was protected. It didn't occur to me because no password was requested and no message was displayed (and the scripts obviously raised no errors) when I handled the file manually, whether with Acrobat Reader, sejda console, sejda desktop, or even Perl scripts. I'm not a PDF expert, but I presume the password is more an authentication mechanism than a protection, since it doesn't prevent anything from reading the file. But in that case, how is the content deciphered automatically?
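
My best guess, assuming the file uses the standard PDF security handler: it only has an owner (permissions) password, so the user password is empty and any reader can derive the decryption key without asking. Such a file still carries an /Encrypt entry in its trailer (or in its cross-reference stream dictionary), so a quick raw-byte scan would have told me it was protected. A rough sketch of that check follows; `looks_encrypted` is a hypothetical helper name, and the scan is only a heuristic, not a real PDF parse:

```perl
use strict;
use warnings;

# Heuristic: an encrypted PDF has an /Encrypt entry in its trailer or in
# its cross-reference stream dictionary. Scanning the raw bytes can give
# false positives, so treat a hit as "probably protected".
sub looks_encrypted {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!\n";
    local $/;                       # slurp mode: read the whole file at once
    my $bytes = <$fh> // '';
    close $fh;
    # \b keeps /EncryptMetadata from counting as a match.
    return $bytes =~ m{/Encrypt\b} ? 1 : 0;
}

# Runs only when file names are given on the command line.
for my $file (@ARGV) {
    printf "%s: %s\n", $file,
        looks_encrypted($file) ? 'probably protected' : 'no /Encrypt entry found';
}
```

A hit only says that the file is encrypted; it says nothing about which operations the permissions allow.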

Anyway, for CAM::PDF, the script I gave in my first message works exactly as I wrote it, without any password-related code:

use strict;
use warnings;
use CAM::PDF;

my $file = 'file.pdf';
my $oldpdf = CAM::PDF->new($file) or die "$CAM::PDF::errstr\n";

# Keep only the first 100 pages when the document is longer than that.
if ($oldpdf->numPages() > 100) {
    printf "    (%d pages)\n", $oldpdf->numPages();
    $oldpdf->extractPages(1..100);
    $oldpdf->cleanoutput("split_$file");
}

Still with CAM::PDF, the getPageText method works correctly and returns the real text of the file. I also managed to modify some of the data with getPageContent and setPageContent, but not all of it (I tried to obfuscate the file this way, but the resulting PDF was corrupted).

With PDF::API2, on the other hand, the xmpMetadata method, for example, returns unreadable data for that file (I cannot post the result here: it doesn't render correctly on the site).

I'm now looking for a way to use PDF::API2 the same way CAM::PDF works, i.e. by copying the PDF file and then removing the undesired pages, but I'm not sure this is possible.

Thank you again for your help. It's almost time for me to end my week at work, so I'll come back to this subject on Monday. Have a nice weekend, folks.


Re^11: blank pdf generated using PDF::API2 (Updated)
by lennelei (Acolyte) on Jul 25, 2017 at 07:57 UTC

Hi all,

Just a quick update: I didn't find any way to correctly handle the protected PDF with PDF::API2 :(

I finally chose to simply check whether the PDF is longer than 100 pages and, if so, move it to a specific folder. I then split all the big PDFs using sejda-console. As 99% of the PDFs we receive are smaller, sejda is not started too often and the whole process is not significantly longer.
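
The check-and-move step could look roughly like the sketch below. It is not the exact script: the sub names, `$MAX_PAGES` and the command-line layout are made up for illustration. It counts pages with CAM::PDF (which reads this file without trouble) and leaves the sejda-console call out.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Copy qw(move);
use File::Spec;

my $MAX_PAGES = 100;    # threshold mentioned above

# True when the page count exceeds the limit.
sub is_too_big {
    my ($pages, $limit) = @_;
    return $pages > $limit ? 1 : 0;
}

# Page count via CAM::PDF, loaded lazily so the helper above can be
# used (and tested) without the module installed.
sub page_count {
    my ($file) = @_;
    require CAM::PDF;
    no warnings 'once';    # $CAM::PDF::errstr is only mentioned once here
    my $pdf = CAM::PDF->new($file) or die "$CAM::PDF::errstr\n";
    return $pdf->numPages();
}

# Move a big PDF into $big_dir. Returns the new path, or nothing when
# the file is small enough to be processed in place.
sub route_pdf {
    my ($file, $big_dir) = @_;
    return unless is_too_big(page_count($file), $MAX_PAGES);
    my $target = File::Spec->catfile($big_dir, basename($file));
    move($file, $target) or die "Cannot move $file to $target: $!\n";
    return $target;
}

# Hypothetical usage: perl route.pl BIG_DIR file1.pdf file2.pdf ...
if (@ARGV >= 2) {
    my ($big_dir, @files) = @ARGV;
    for my $file (@files) {
        my $moved = route_pdf($file, $big_dir);
        print $moved ? "$file -> $moved\n" : "$file kept\n";
    }
}
```

Everything that lands in the big-PDF folder is then split separately with sejda-console.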

Thank you all for your help. Hopefully PDF::API2 will be able to handle those PDFs as well some day :)

Best regards.


