http://www.perlmonks.org?node_id=660212


in reply to Re^2: PDF Parsing
in thread PDF Parsing

Hi,

Figuring out how to parse existing PDF files gave me headaches, but after reading the perldoc for PDF::API2::File I figured it out. If you do something like my $foo = PDF::API2->open('bar.pdf');, the file structure is stored in $foo->{'pdf'}. From there you've got the Catalog (see the PDF spec), which you can parse to get the indirect references to the objects you're after (pages and annots, or the AcroForm). Once you've got a hash referring to the item you want to mess with, you can use the read_obj method like this:

    my $pdfapi = PDF::API2->open('foo.pdf');
    my $pdf = $pdfapi->{'pdf'};
    my $object = $pdf->read_obj($indirect_reference_hashref);
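For what it's worth, here is a slightly fuller sketch of the same idea, walking from the file object to the Catalog and then to the page tree. It leans on a few assumptions that are not in the post above: that the low-level file object exposes the trailer's Root entry as $pdf->{'Root'}, that objects have a realise method, and that internal bookkeeping keys start with a space. These are internals of PDF::API2 and may differ between versions, so treat it as a starting point rather than the official way. The default file name foo.pdf is just a placeholder.

```perl
use strict;
use warnings;
use PDF::API2;

# Hypothetical input path; pass your own file as the first argument.
my $file = shift(@ARGV) // 'foo.pdf';

my $pdfapi = PDF::API2->open($file);
my $pdf    = $pdfapi->{'pdf'};    # low-level file object (internal detail)

# Assumption: the trailer's Root entry is the reference to the Catalog.
my $catalog = $pdf->{'Root'};
$catalog->realise;                # load it if it is still only a reference

# Assumption: internal bookkeeping keys start with a space, so skip them.
print "Catalog keys: ",
    join(', ', sort grep { !/^ / } keys %$catalog), "\n";

# Pages is an indirect reference held by the Catalog; read it with read_obj.
my $pages = $pdf->read_obj($catalog->{'Pages'});
print "Page tree keys: ",
    join(', ', sort grep { !/^ / } keys %$pages), "\n";
```

Printing the keys first is a cheap way to see which entries (Pages, AcroForm and so on) a given file actually has before you go digging for annotations or form fields.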