<?xml version="1.0" encoding="windows-1252"?>
<node id="849782" title="Re: Read doc/docx in Linux" created="2010-07-15 10:22:55" updated="2010-07-15 10:22:55">
<type id="11">
note</type>
<author id="664508">
philipbailey</author>
<data>
<field name="doctext">
&lt;p&gt;I have used antiword successfully in the past for reading the text of Word files at the command line.  It doesn't seem to be actively maintained any more, though.&lt;/p&gt;

&lt;p&gt;I also notice that AbiWord has a command-line option for converting Word files to other formats.  You could, of course, use the full GUI version of AbiWord instead, or indeed OpenOffice.&lt;/p&gt;

&lt;p&gt;(Update) I realise, of course, that none of this directly answers the question of reading these files in Perl, but the command-line tools mentioned above are often a practical way to go.&lt;/p&gt;</field>
<field name="root_node">
849659</field>
<field name="parent_node">
849659</field>
</data>
</node>
