in reply to Re: Extracting information from a PDF file
in thread [Updated] Extracting information from a PDF file
"Are your PDF files generated automatically?"
Nope, it is just one ill-made file.
I'll look into rwritepdf.pl, as well as the article on Wikipedia.
Update: Hmm, this gives me the same sort of output I see when I try to print the page's content. Example:
[binary stream data: unprintable characters omitted]
endstream
endobj
26 0 obj
14358
endobj
27 0 obj
<< /Type /Encoding /BaseEncoding /WinAnsiEncoding
   /Differences [ 1 /M /E /O /R /I /A /L /space /S /C /H /T /F /P /r /i /n /c /p /a /l /e /t /y /N /u /s /hyphen /h /o /d /K /g /G /period /slash /b /f ] >>
endobj
28 0 obj
<< /Type /Encoding /BaseEncoding /WinAnsiEncoding
   /Differences [ 1 /space /A /n /d /r /e /a /T /s /h /M /c /i /D /o /l /comma /L /y /P /E /Z /p /u /g /G /t /S /k /J /F /K /C /w /R /m /f /O /quoteright /hyphen /H /b /v /slash /I /N /z /B /colon /one /Q /V ] >>
endobj
xref
0 29
0000000000 65535 f
0000000012 00000 n
0000000061 00000 n
0000000269 00000 n
0000000345 00000 n
0000000516 00000 n
0000021670 00000 n
0000021690 00000 n
0000021735 00000 n
0000021951 00000 n
0000022264 00000 n
0000022482 00000 n
0000022853 00000 n
0000022885 00000 n
0000022929 00000 n
0000023102 00000 n
0000038411 00000 n
0000038432 00000 n
0000038683 00000 n
0000038856 00000 n
0000038888 00000 n
0000038944 00000 n
0000039341 00000 n
0000039361 00000 n
0000051833 00000 n
0000051854 00000 n
0000071486 00000 n
0000071508 00000 n
0000071728 00000 n
trailer
<< /ID [ <e84930d3cb6e2eebd076b1c784a83363> <cafe7596fcd757ed4ed43ba7f84e4976> ] /Info 2 0 R /Root 1 0 R /Size 29 >>
startxref
72004
%%EOF
Is it encoded unconventionally? Not sure what to do now.
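A guess, going only by the readable tail of the dump: both /Encoding objects carry a /Differences array that renumbers the character codes from 1 upward, so the bytes in a text string would not be ASCII even once the page content is readable. Below is a minimal sketch of turning such codes back into characters. It assumes the array from object 27 belongs to the font in use, which the dump does not confirm. The sub names and the small glyph-name table are made up for illustration and are not part of any module.

```perl
use strict;
use warnings;

# First part of the /Differences array from object 27 in the dump
# (shortened here; the full array continues with more names).
my $differences = '[ 1 /M /E /O /R /I /A /L /space /S /C /H /T /F /P /r /i ]';

# Hypothetical lookup for glyph names that are not a single character.
my %glyph_char = (space => ' ', hyphen => '-', period => '.', slash => '/');

# Hypothetical helper: build a code => glyph-name map. A number sets the
# current code; each /name after it takes the next code in sequence.
sub parse_differences {
    my ($array) = @_;
    my (%map, $code);
    for my $token ($array =~ m{(\d+|/[^\s/\[\]]+)}g) {
        if ($token =~ /^\d+$/) {
            $code = $token;
        }
        elsif (defined $code) {
            $map{ $code++ } = substr $token, 1;   # drop the leading slash
        }
    }
    return \%map;
}

# Hypothetical helper: translate each byte through the map, using '?' for
# codes or glyph names this sketch does not know about.
sub decode_codes {
    my ($map, $bytes) = @_;
    my $text = '';
    for my $byte (split //, $bytes) {
        my $name = $map->{ ord($byte) };
        if    (!defined $name)            { $text .= '?' }
        elsif (length($name) == 1)        { $text .= $name }
        elsif (exists $glyph_char{$name}) { $text .= $glyph_char{$name} }
        else                              { $text .= '?' }
    }
    return $text;
}

my $map = parse_differences($differences);
print decode_codes($map, "\x0D\x05\x07\x02"), "\n";   # codes 13, 5, 7, 2 => FILE
```

This only addresses the renumbered codes. If the page's content stream is compressed (the excerpt does not show its dictionary), it would have to be decompressed before any of this applies.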
I'm so adjective, I verb nouns!
chomp; # nom nom nom
Replies are listed 'Best First'.
Re^3: Extracting information from a PDF file
by Perlbotics (Bishop) on Aug 20, 2008 at 22:46 UTC
by Lawliet (Curate) on Aug 20, 2008 at 22:54 UTC
In Section
Seekers of Perl Wisdom