<?xml version="1.0" encoding="windows-1252"?>
<node id="881233" title="docx to txt" created="2011-01-08 10:52:05" updated="2011-01-08 10:52:05">
<type id="115">
perlquestion</type>
<author id="872655">
welle</author>
<data>
<field name="doctext">
&lt;p&gt;Hi,&lt;/p&gt;
&lt;p&gt;I need to read a .docx file and convert its content to plain text. No fancy formatting is needed. I am experiencing several issues with Win32::OLE (everything worked fine with .doc files and Word 2003), so I would like a pure Perl solution. For the same purpose with Excel I use Spreadsheet::ParseExcel.&lt;/p&gt;
&lt;p&gt;Does anyone know of a script or module for doing this? Since .docx files are zipped XML files, it should be possible to parse them directly, but it may not be that straightforward, and I don't want to reinvent the wheel. A Google search only turned up docx2txt (http://docx2txt.sourceforge.net/). It could be a starting point, but I am having difficulty figuring out how to integrate it into my application. Any advice would be great. Welle&lt;/p&gt;
</field>
</data>
</node>
