
Re: (Zigster) MSWORD TO TEXT

by zigster (Hermit)
on Apr 12, 2001 at 19:11 UTC

in reply to MSWORD TO TEXT

I use the UNIX command 'strings'. It works fine and dandy with most Word docs I have come across. The output is a little rough, but in most cases I can read the document. It all depends on how clean you want the output.
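For anyone curious what 'strings' is doing under the hood, here is a rough Python sketch of the same idea: scan the raw bytes and keep every run of printable ASCII that is at least four characters long. The function name `extract_strings` and the `min_len` parameter are made up for illustration, and this only approximates the real tool's default behaviour.

```python
import re
import sys


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII in `data` that are at least min_len bytes long.

    Hypothetical helper: an approximation of strings(1), not its actual source.
    """
    # Tab plus space through tilde is treated as "printable" here (an assumption).
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python extract_strings.py some_document.doc
    with open(sys.argv[1], "rb") as handle:
        for line in extract_strings(handle.read()):
            print(line)
```

Raising `min_len` trims more of the binary noise at the cost of dropping short words that sit on their own, which is the "how clean do you want it" trade-off.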


Replies are listed 'Best First'.
Re: Re: (Zigster) MSWORD TO TEXT
by Hero Zzyzzx (Curate) on Apr 12, 2001 at 22:50 UTC

    All I can say about strings is WOW! That works perfectly on Word 2k, WordPerfect 8, and Excel 2k files. Combined with pdftotext you have a nearly complete solution for extracting text from common user docs, which I'm doing for a search engine for a web-based document management site. Just goes to show that if there's something you want to do on Unix/Linux, chances are the tool is already sitting on your hard drive.
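A minimal sketch of how the two tools might be combined for an indexer: choose the extractor by file extension and capture whatever it prints. The helper names `build_command` and `extract_text` are hypothetical, and it assumes both `strings` and `pdftotext` are on the PATH; it is not the code used on the site described above.

```python
import subprocess
from pathlib import Path


def build_command(path):
    """Pick an extraction command for `path` based on its extension (hypothetical helper)."""
    if Path(path).suffix.lower() == ".pdf":
        # pdftotext sends the text to stdout when the output file is "-".
        return ["pdftotext", path, "-"]
    # Everything else (Word, WordPerfect, Excel, ...) falls back to strings.
    return ["strings", path]


def extract_text(path):
    """Run the chosen tool and return its output as text."""
    result = subprocess.run(build_command(path), capture_output=True, check=True)
    # The tools may emit bytes that are not valid UTF-8, so decode leniently.
    return result.stdout.decode("utf-8", errors="replace")
```

Something like `extract_text("report.doc")` would then hand back text ready to feed into the search index.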

      Glad to know it worked for you. I would be very interested in seeing the result when you have completed it. As a full-on UNIX head working in an MS world, a complete toolset for converting MS docs to ASCII would be of great interest to me. Please msg me when/if you complete the tools.


