PerlMonks
Re: Need help with perl only parsing of M$ word file
by Brutha (Friar) on Sep 05, 2003 at 08:18 UTC ( [id://289132]=note )
Dennis,

I scan a directory tree of Word files to create an index with SWISH-E. I use Win32::OLE and have M$-Word installed, but I do not need any interaction and Word does not have to be visible. This ties the solution to Windows. Are you dependent on the Windows platform?

None of the tools I found were exactly what I needed. Many come from the Unix world and depend on those handy GNU libraries, but I am on Windows here.

Be aware that after extracting the text you might still have lots of control characters left over from tables etc. I am not interested in bold or italic text, but I do extract the title and other document properties, user-defined properties, and the text.

My solution was straightforward, as with every OLE interaction I have written in Perl: you open the application and the macro editor, press F1 to find the functions, record macros, save the VB script, then translate and extend it to Perl, cutting its length in half.

If somebody is interested, I could post my code as a starting point.

regards
Brutha

And it came to pass that in time the Great God Om spake unto Brutha, the Chosen One: "Psst!"
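For readers who want a starting point, here is a minimal sketch of the approach described above. It is not Brutha's code. The sub names `clean_word_text` and `extract_word_document` are made up for illustration, and the control-character mapping (`\x07` for cell marks, `\x0B` for manual line breaks, `\x0C` for page breaks, `\r` for paragraph marks) is an assumption about what Word returns in `Range.Text`. The OLE part needs Windows with Word installed and has not been tested here; the cleanup sub is plain Perl and runs anywhere.

```perl
use strict;
use warnings;

# Turn the raw string returned by Word into plain text.
# Assumed mapping: \x07 ends a table cell, \x0B is a manual line break,
# \x0C is a page break and \r ends a paragraph.
sub clean_word_text {
    my ($text) = @_;
    $text =~ s/\r?\x07/\t/g;             # cell marks become tabs
    $text =~ s/[\r\x0B\x0C]/\n/g;        # paragraph, line and page breaks become newlines
    $text =~ s/[\x00-\x08\x0E-\x1F]//g;  # drop any remaining control characters
    $text =~ s/[ \t]+\n/\n/g;            # trim trailing blanks on each line
    $text =~ s/\s+\z//;                  # trim trailing whitespace at the end
    return $text;
}

# Open one document in an invisible Word instance and return its
# title, user-defined properties and cleaned text.
# $path should be an absolute path, since Word resolves relative
# paths against its own working directory.
sub extract_word_document {
    my ($path) = @_;

    require Win32::OLE;    # loaded lazily, only available on Windows

    # 'Quit' is called on the Word object when $word goes out of scope.
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die "Cannot start Word: " . Win32::OLE->LastError();
    $word->{Visible} = 0;

    my $doc = $word->Documents->Open({ FileName => $path, ReadOnly => 1 })
        or die "Cannot open $path: " . Win32::OLE->LastError();

    # User-defined properties, collected by index (1-based collection).
    my %custom;
    my $props = $doc->CustomDocumentProperties;
    for my $i (1 .. $props->{Count}) {
        my $p = $props->Item($i);
        $custom{ $p->{Name} } = $p->{Value};
    }

    my %info = (
        title  => $doc->BuiltInDocumentProperties('Title')->{Value},
        custom => \%custom,
        text   => clean_word_text($doc->Content->{Text}),
    );

    $doc->Close(0);        # 0 = wdDoNotSaveChanges
    return \%info;
}
```

On a Windows machine, `my $info = extract_word_document('C:\docs\report.doc');` (a made-up path) would return a hash reference whose `text` entry can be handed to the indexer. Loading Win32::OLE inside the sub keeps the cleanup code usable on other platforms.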