I ended up converting the PDF to a text file using pdftotext.
Then I modified my original program to read from that file.
However, I would still like to know a programmatic solution to this problem,
since I have run into it twice and failed both times.
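One way to make the pdftotext workaround programmatic is to call the tool from Perl and capture its output. This is only a minimal sketch: the helper name pdf_to_text is made up for illustration, it assumes pdftotext is installed and on the PATH, and it relies on pdftotext writing to standard output when the output file is given as "-". It does not address the FlateDecode error itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CAM::PDF): run the external pdftotext
# tool and return the extracted text as one string.
sub pdf_to_text {
    my ($pdf_file) = @_;
    die "No such file: $pdf_file\n" unless -f $pdf_file;

    # "-" as the output name asks pdftotext to write to standard output.
    # Assumption: the file name contains no double quotes or shell metacharacters.
    my $text = qx{pdftotext "$pdf_file" -};
    die "pdftotext failed (exit status $?)\n" if $? != 0;
    return $text;
}

# Usage: perl extract.pl test.pdf > test.txt
print pdf_to_text($ARGV[0]) if @ARGV;
```

The original program can then work on the returned string instead of reading a text file that was converted by hand.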
I ran into the same error, and here is the "empirical" way I found to solve it.
I'm working on a Windows XP machine with ActivePerl 5.10.0 Build 1003. My CAM-PDF module is version 1.13;
its dependencies are Text-PDF version 0.29 and Crypt-RC4 version 2.02.
As CaMelRyder reported, from this simple program:
use strict; use warnings; use CAM::PDF;
my $pdf = CAM::PDF->new("test.pdf"); # test.pdf is the existing pdf
my $string = $pdf->getPageText(1);
open (C, ">", "test.txt"); print C $string; # print extracted string
... I obtained this error:
Failed to open filter FlateDecode (Text::PDF::FlateDecode)
Unrecognized type in parseAny:
1 ═╩▓▀O▒:♠n╩...
The really strange thing was that after adding ONE LINE at the beginning of the program:
@INC=('C:/Perl/lib','C:/Perl/site/lib','.'); # THIS LINE IS ADDED
use strict; use warnings; use CAM::PDF;
my $pdf = CAM::PDF->new("test.pdf"); # test.pdf is the existing pdf
my $string = $pdf->getPageText(1);
open (C, ">", "test.txt"); print C $string; # print extracted string
everything worked fine and no error message was displayed.
All I did was make a very small change to the Perl special variable @INC, which is (from perldoc):
"The array @INC contains the list of places that the do EXPR, require, or use constructs look for their library files."
Usually the content of @INC is ('C:/Perl/site/lib','C:/Perl/lib','.')
I verified that by typing perl -de0 at the command line of my computer,
which opened the Perl debugger, where I typed: x @INC
To summarize:
usually              => @INC is      ('C:/Perl/site/lib','C:/Perl/lib','.')
to make CAM-PDF work => @INC must be ('C:/Perl/lib','C:/Perl/site/lib','.') => I SWAPPED THE FIRST TWO ELEMENTS OF @INC
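For comparison, the usual way to put a directory at the front of @INC is the lib pragma rather than assigning to @INC directly. The sketch below assumes the ActivePerl paths quoted in this post; it only shows the effect on @INC and is not a confirmed fix for the error.

```perl
use strict;
use warnings;

# Assumption: the ActivePerl layout described in this post.
# "use lib" runs at compile time and puts the directory first in @INC;
# a later duplicate of the same directory is dropped from the list.
use lib 'C:/Perl/lib';

# Show the resulting search order, one directory per line.
print "$_\n" for @INC;
```

The rest of the program (use CAM::PDF and so on) would follow unchanged. Unlike the assignment shown above, this keeps any other directories that were already in @INC.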
I DON'T KNOW THE REASON FOR ALL THIS, AND I AM WAITING FOR AN EXPLANATION FROM A PERLMONK MORE EXPERT THAN ME.
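One possible explanation, which nothing in this thread confirms, is that two different copies of some module exist under C:/Perl/lib and C:/Perl/site/lib, so the order of @INC decides which copy gets loaded. Note that require consults @INC at run time as well, so a change made on the first line of the program can still affect modules that are loaded later. A minimal sketch to check for duplicate copies follows; the helper name find_copies and the choice of modules to look for are assumptions made for illustration.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: return every file under the given directories
# that would satisfy "require $module".
sub find_copies {
    my ($module, @dirs) = @_;
    (my $rel = "$module.pm") =~ s{::}{/}g;
    return grep { -f $_ }
           map  { File::Spec->catfile($_, split m{/}, $rel) }
           grep { !ref $_ } @dirs;    # skip code-ref hooks in @INC
}

# Assumption: these modules are the interesting ones for this error.
for my $module (qw(Compress::Zlib Compress::Raw::Zlib Text::PDF CAM::PDF)) {
    my @found = find_copies($module, @INC);
    print "$module:\n";
    print "  $_\n" for @found;
    print "  (not found)\n" unless @found;
}
```

If a module is listed more than once, the first path shown is the one Perl loads with the current @INC order.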
Hi,
Don't worry if you get the following error:
srir@leo ~$ perl test.pl
Failed to open filter FlateDecode (Text::PDF::FlateDecode)
Unrecognized type in parseAny:
1 V[s6+ߴ/.(|S^MwJgG7m$...
We can fix the above just by installing the following Perl modules, in this order:
Crypt-RC4-2.02
Compress-Raw-Zlib-2.020
Compress-Zlib-2.015
Digest-MD5-2.38
Getopt-Long-2.38
Compress-Zlib-Perl-0.02
IO-Compress-2.020
Pod-Parser-1.38
Test-Simple-0.92
Text-PDF-0.29
Thanks
Phani Krishna Jampala
Sr Software Engineer (Build & Release)
Hi,
Don't worry if you face the following error:
srir@leo ~$ perl test.pl
Failed to open filter FlateDecode (Text::PDF::FlateDecode)
Unrecognized type in parseAny:
1 V[s6+ߴ/.(|S^MwJgG7m$...
We can resolve the above issue by installing the following Perl modules, in this order:
Crypt-RC4-2.02
Compress-Raw-Zlib-2.020
Compress-Zlib-2.015
Digest-MD5-2.38
Getopt-Long-2.38
Compress-Zlib-Perl-0.02
IO-Compress-2.020
Pod-Parser-1.38
Test-Simple-0.92
Digest-MD5-2.38
Text-PDF-0.29
CAM-PDF-1.52
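Before and after installing, it may help to see which of these modules Perl can actually load and in which version. The following is a minimal sketch; the helper name module_version is made up for illustration, and the module names are my mapping from some of the distribution names above (for example IO::Compress::Gzip stands in for the IO-Compress distribution).

```perl
use strict;
use warnings;

# Hypothetical helper: return the version of a module if it can be loaded,
# 'unknown' if it loads but declares no version, or undef if it fails to load.
sub module_version {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    return undef unless eval { require $file; 1 };
    my $version = eval { $module->VERSION };
    return defined $version ? $version : 'unknown';
}

# Assumption: module names corresponding to part of the list above.
my @modules = qw(
    Crypt::RC4 Compress::Raw::Zlib Compress::Zlib Digest::MD5
    Getopt::Long IO::Compress::Gzip Pod::Parser Test::Simple
    Text::PDF CAM::PDF
);

for my $module (@modules) {
    my $version = module_version($module);
    printf "%-22s %s\n", $module, defined $version ? $version : 'not installed';
}
```

A module reported as not installed, or in an older version than the one listed, is a candidate for reinstalling.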
Thanks
Phani Krishna Jampala
Sr Software Engineer (build & release)
That's odd, because the standard @INC order is:
D:\>perl -le " print for @INC "
D:/Perl/lib
D:/Perl/site/lib
.