
Re^2: Extracting text from MS Word files on a Linux box

by afoken (Chancellor)
on Jun 21, 2018 at 20:18 UTC


in reply to Re: Extracting text from MS Word files on a Linux box
in thread Extracting text from MS Word files on a Linux box

Have you tried strings? Always used to do the trick before the MS format changed.

A docx file is just a ZIP archive holding a bunch of XML files and some misc files. strings will fail on the archive itself because the contents are compressed, but once it is unpacked, strings will happily dig through the XML files.
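
For anyone who wants something a little more targeted than strings, here is a minimal sketch of the same idea in Python: open the ZIP, read the main document part, and collect the text nodes. It assumes the body text lives in word/document.xml and uses the standard WordprocessingML namespace; the function name docx_text is made up for illustration. It ignores headers, footers, footnotes and comments, and text boxes nested inside a paragraph may show up twice.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_text(source):
    """Return the body text of a .docx file, one line per paragraph.

    `source` may be a file name or a binary file-like object.
    (Hypothetical helper, not part of any library.)
    """
    # A .docx is a ZIP archive; the main body is assumed to be in this member.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # <w:p> is a paragraph, <w:t> holds the actual text of a run.
    for para in root.iter(W_NS + "p"):
        runs = [node.text or "" for node in para.iter(W_NS + "t")]
        paragraphs.append("".join(runs))
    return "\n".join(paragraphs)
```

Calling print(docx_text("report.docx")) (the file name is just an example) prints the plain text without the XML markup that strings would leave in.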

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
