PerlMonks  

Re: (Zigster) MSWORD TO TEXT

by zigster (Hermit)
on Apr 12, 2001 at 19:11 UTC ( #72070=note )


in reply to MSWORD TO TEXT

I use the UNIX command 'strings'; it works fine and dandy with most Word docs I have come across. The output is a little rough, but in most cases I can read the document. It all depends on how clean you want the output.
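
For anyone without 'strings' to hand, here is a rough sketch of the same idea in Python: pull out every run of printable ASCII that is at least four bytes long. The names extract_strings and dump_strings are made up for illustration, and this is not how the real tool is implemented. It also only sees 8-bit text, so anything a document stores as 16-bit characters will be missed.

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII in data that are at least min_len bytes long."""
    # Printable ASCII is 0x20 (space) through 0x7e (~); tab is accepted as well.
    pattern = re.compile(rb"[\t\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


def dump_strings(path, min_len=4):
    """Print every printable run found in the file at path, one per line."""
    # Read in binary mode: a Word file is not text, so no decoding on the way in.
    with open(path, "rb") as handle:
        for run in extract_strings(handle.read(), min_len):
            print(run)
```

Calling dump_strings on a document should give output that is rough in the same way: readable text mixed with fragments of formatting data. Raising min_len trims some of the noise at the cost of losing short words that stand alone.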
--

Zigster

Re: Re: (Zigster) MSWORD TO TEXT
by Hero Zzyzzx (Curate) on Apr 12, 2001 at 22:50 UTC

    Zigster,
    All I can say about strings is WOW! That works perfectly on Word 2k, WordPerfect 8, and Excel 2k files. Combined with pdftotext you have a nearly complete solution for extracting text from common user docs, which I'm doing for a search engine for a web-based document management site. Just goes to show that if there's something you want to do on Unix/Linux, chances are the tool is already sitting on your hard drive.

      Glad to know it worked for you. I would be very interested in seeing the result when you have completed it. As a full-on UNIX head working in an MS world, I would find a complete toolset for converting MS docs to ASCII of great interest. Please msg me when/if you complete the tools.

      Cheers
      --

      Zigster
