Re: Problem merging thousands of PDFs with PDF::API2: 'Deep recursion on subroutine "PDF::API2::Basic::PDF::Objind::release"'

by zwon (Abbot)
on Oct 13, 2014 at 16:33 UTC (#1103645)


in reply to [SOLVED] Problem merging thousands of PDFs with PDF::API2: 'Deep recursion on subroutine "PDF::API2::Basic::PDF::Objind::release"'

You're re-opening $out_pdf for every new input file; that may be one of the reasons. Try opening the output PDF file just once, before the while loop.

Re^2: Problem merging thousands of PDFs with PDF::API2: 'Deep recursion on subroutine "PDF::API2::Basic::PDF::Objind::release"'
by ateague (Monk) on Oct 13, 2014 at 17:56 UTC
    You're re-opening $out_pdf for every new input file, that may be one of the reasons. Try to open output pdf file just once before the while loop.

    That certainly is a problem. However, according to the PDF::API2 docs for the update method, $out_pdf is removed from memory during the first iteration of the loop, as soon as it has been written out to the merged PDF:

    $pdf->update()
        Saves a previously opened document.

    <...>

    $pdf->end()
        Remove the object structure from memory. PDF::API2 contains circular references, so this call is necessary in long-running processes to keep from running out of memory.

        This will be called automatically when you save or stringify a PDF.

    I get the following error when I modify the script to open $out_pdf before the loop:

    Can't call method "new_obj" on an undefined value at C:/Perl64/site/lib/PDF/API2/Basic/PDF/Pages.pm line 92.

    The new, updated script follows:

    #!/usr/bin/perl
    use 5.018;
    use PDF::API2;
    use strict;
    use warnings;

    my $path = "./pdfs/";
    my $out_pdf_file = 'merged.pdf';
    my $out_pdf = PDF::API2->new(-file => $out_pdf_file);

    opendir (my $DIR, $path) or die "Could not open $path:\n$!\n$^E";
    chdir $path;
    while ( my $in_pdf_file = readdir $DIR ) {
        next if $in_pdf_file =~ /^\./;

        my $in_pdf = PDF::API2->open($in_pdf_file) or die "Error opening PDF file [$in_pdf_file]:\n$!\n$^E";
        foreach my $page ( 1 .. $in_pdf->pages() ) {
            $out_pdf->import_page($in_pdf, $page, 0);
        }
        $out_pdf->update();
    }
    closedir $DIR;

    Moving $out_pdf->update(); out of the loop fixes the undefined value error, but the script quickly exhausts all the memory on the computer.

      If you don't need any PDF::API2-specific features, perhaps you can try something else, for example CAM::PDF. It comes with an appendpdf.pl script that does something similar to what you need.
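
      For what it's worth, here is a rough sketch of what that could look like if you call CAM::PDF directly instead of going through appendpdf.pl. It relies on CAM::PDF's new, appendPDF and cleanoutput methods. The helper names pdf_files_in and merge_pdfs, the two command-line arguments, and the filter that keeps only *.pdf files are my own assumptions, not something taken from your script. I have not tried it on thousands of files, so I can't promise it avoids the memory problem.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the PDF paths in $dir, sorted by name.
# Assumption: only names ending in .pdf are wanted; dot-files are skipped.
sub pdf_files_in {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Could not open $dir: $!";
    my @names = sort grep { /\.pdf$/i && !/^\./ } readdir $dh;
    closedir $dh;
    return map { "$dir/$_" } @names;
}

# Hypothetical helper: append every remaining input to the first one,
# then write the combined document to $out_file.
sub merge_pdfs {
    my ($out_file, @inputs) = @_;
    return unless @inputs;

    # Loaded here so the file-listing helper works without CAM::PDF installed.
    require CAM::PDF;

    my $first  = shift @inputs;
    my $merged = CAM::PDF->new($first)
        or die "Error opening PDF file [$first]: $CAM::PDF::errstr";

    for my $file (@inputs) {
        my $next = CAM::PDF->new($file)
            or die "Error opening PDF file [$file]: $CAM::PDF::errstr";
        $merged->appendPDF($next);
    }

    $merged->cleanoutput($out_file);
    return 1;
}

# Runs only when both arguments are given on the command line.
if (@ARGV == 2) {
    my ($out_file, $dir) = @ARGV;
    merge_pdfs($out_file, pdf_files_in($dir));
}
```

      Run it as `perl merge_cam.pl merged.pdf ./pdfs` (the script name is made up). Unlike your loop, this writes the output only once, at the end, so watch the memory use on a small batch before pointing it at the full directory.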
