
Re: Comparison word against pdf

by thezip (Vicar)
on Apr 16, 2013 at 18:59 UTC (#1028988)

in reply to Comparison word against pdf

I've done some rudimentary parsing of PDFs using CAM::PDF's getPageText() method, but I was only able to handle files in PDF v1.4 format; I couldn't parse v1.5 or v1.6 files.
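For illustration, here is a minimal sketch of that kind of extraction. The sub name extract_pdf_text is made up for this example; new(), numPages() and getPageText() are CAM::PDF methods. It assumes CAM::PDF is installed from CPAN, and it loads the module lazily so the script still compiles without it.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every page, joined by newlines.
sub extract_pdf_text {
    my ($file) = @_;

    # Loaded at run time so this file compiles even if CAM::PDF is missing.
    require CAM::PDF;

    my $pdf = CAM::PDF->new($file)
        or die "Cannot open $file as a PDF\n";

    my @pages;
    for my $n (1 .. $pdf->numPages()) {
        my $text = $pdf->getPageText($n);
        # A page with no extractable text is kept as an empty string.
        push @pages, defined $text ? $text : '';
    }
    return join "\n", @pages;
}
```

Something like `my $text = extract_pdf_text('report.pdf');` would then give you one string to work with (the file name is a placeholder). Given the limitation mentioned above, expect this to behave on v1.4 files and possibly fail on v1.5 and v1.6.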

I haven't done anything similar with Word documents, but there must be a module around that performs the equivalent extraction.

Once you've extracted the text from each file, you'd need to write the comparator function.
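What the comparator should do depends on what counts as "the same" for you. As one possible starting point, the sketch below treats each text as a bag of words, so differences in line breaks and spacing are ignored, and reports the words whose counts differ. The names words_of and compare_texts are invented for this example, and it uses only core Perl.

```perl
use strict;
use warnings;

# Split on runs of whitespace so layout differences are ignored.
sub words_of {
    my ($text) = @_;
    return grep { length } split /\s+/, $text;
}

# Compare two texts as bags of words.
# Returns a hash ref mapping each word whose counts differ to
# (count in first text) - (count in second text).
sub compare_texts {
    my ($first, $second) = @_;
    my %delta;
    $delta{$_}++ for words_of($first);
    $delta{$_}-- for words_of($second);

    # Drop the words that occur equally often in both texts.
    delete @delta{ grep { $delta{$_} == 0 } keys %delta };
    return \%delta;
}

my $diff = compare_texts("The quick  brown fox", "The quick\nred fox");
print "$_: $diff->{$_}\n" for sort keys %$diff;
# prints:
# brown: 1
# red: -1
```

A positive count means the word appears more often in the first text, a negative count more often in the second. This ignores word order entirely, so if order matters you would need a sequence-based diff instead.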

What can be asserted without proof can be dismissed without proof. - Christopher Hitchens, 1949-2011

