
Re: (Zigster) MSWORD TO TEXT

by zigster (Hermit)
on Apr 12, 2001 at 19:11 UTC ( #72070=note )

in reply to MSWORD TO TEXT

I use the UNIX command 'strings'; it works fine and dandy with most Word docs I have come across. The output is a little rough, but in most cases I can read the document. It all depends on how clean you want the output.
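
For readers without 'strings' to hand, here is a minimal Python sketch of roughly what it does: scan the raw bytes and keep runs of printable ASCII that are at least four characters long. The function name `extract_strings` and the sample bytes are made up for illustration, and the real tool has more options than this; treat it as an approximation, not a reimplementation.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of printable ASCII (plus tab) at least min_len bytes long."""
    # \x20-\x7e covers space through tilde; anything else ends the current run.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # Hypothetical sample: readable text surrounded by binary noise.
    sample = b"\x00\x01Hello, world\x00ab\x00\x02Perl Monks\xff"
    for line in extract_strings(sample):
        print(line)
```

On a real document you would pass in `open(path, "rb").read()`. Short fragments such as "ab" above are dropped, which is why the result is readable but rough.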


Re: Re: (Zigster) MSWORD TO TEXT
by Hero Zzyzzx (Curate) on Apr 12, 2001 at 22:50 UTC

    All I can say about strings is WOW! That works perfectly on Word 2k, WordPerfect 8, and Excel 2k files. Combined with pdftotext you have a nearly complete solution for extracting text from common user docs, which I'm doing for a search engine for a web-based document management site. Just goes to show that if there's something you want to do on Unix/Linux, chances are the tool is already sitting on your hard drive.

      Glad to know it worked for you. I would be very interested in seeing the result when you have completed it. As a full-on UNIX head working in an MS world, a complete toolset for converting MS docs to ASCII would be of great interest to me. Please msg me when/if you complete the tools.
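
The strings-plus-pdftotext combination discussed in this thread could be wired together along these lines. This is only a sketch under assumptions: the helper names `extractor_command` and `extract_text` are hypothetical, the choice of tool is made purely on the file extension, and both `strings` and `pdftotext` are assumed to be installed and on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def extractor_command(path: str) -> List[str]:
    """Pick the external command used to pull text out of a file."""
    if Path(path).suffix.lower() == ".pdf":
        # A trailing "-" asks pdftotext to write the text to stdout.
        return ["pdftotext", str(path), "-"]
    # Everything else (Word, WordPerfect, Excel, ...) falls back to strings.
    return ["strings", str(path)]


def extract_text(path: str) -> str:
    """Run the chosen extractor and return its output as text."""
    result = subprocess.run(extractor_command(path), capture_output=True, check=True)
    # Replace undecodable bytes rather than failing on odd input.
    return result.stdout.decode("utf-8", errors="replace")
```

A search indexer could then call `extract_text` on each uploaded file and feed the result to whatever builds the index.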


