PerlMonks  

Open a Microsoft Word Doc and Save as plain text file

by soon_j (Scribe)
on Jun 07, 2006 at 23:54 UTC ( #554185=perlquestion )
soon_j has asked for the wisdom of the Perl Monks concerning the following question:

Good day, Monks! Do you happen to have a simple routine or Perl module that lets me open a Microsoft Word file and save it in plain text format?

If not, is there a module that can identify whether a file is a Word document, RTF document, plain text, PDF, or image?

I just need to check whether the uploaded file is plain text (the only type I need).

Thanks!

Re: Open a Microsoft Word Doc and Save as plain text file
by blue_cowdawg (Prior) on Jun 08, 2006 at 00:15 UTC
        Do you happen to have a simple routine or Perl module that lets me open a Microsoft Word file and save it in plain text format?

    You know, Kind Monk, there is a CPAN Search Mechanism that allows you to search for such things. I tried it myself (albeit not too strenuously) and came up dry.

    However, if you are trying this on an Evil Empire® system, you might want to investigate doing something with Win32::OLE, but I'm not an expert in its use, so I can't help you further.


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Open a Microsoft Word Doc and Save as plain text file
by sgifford (Prior) on Jun 08, 2006 at 00:28 UTC
    I'm not aware of a pure-Perl solution for parsing MS Word documents. A search on http://freshmeat.net turns up a few programs that can do the conversion, or you can use blue_cowdawg's suggestion of Win32::OLE if you're on a Windows system with Word (and you trust the documents not to do anything nasty, like take over the computer where your script is running).

    File::Type can help you identify the type of the file.

      Thank you, monks, for your input.

      I found this code using CGI to be very good at determining the type of a file:

      use CGI;
      my $cgi  = CGI->new;
      my $file = $cgi->param('file');
      # Content-Type as reported by the uploading browser
      my $type = $cgi->uploadInfo($file)->{'Content-Type'};
        Ah, yes, good idea. It does rely on the client to identify the document, though; if the user's Web browser doesn't know what the file is, it will probably be reported to you as application/octet-stream, or as text/plain if it smells like text.
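        Since the browser's Content-Type can't be fully trusted, one option is to look at the uploaded bytes on the server as well. Below is a minimal sketch using only core Perl; sniff_type is a hypothetical helper name, not something from CGI or File::Type, and the "looks like text" test is a rough heuristic that will call UTF-16 text binary. It relies on well-known file signatures: the OLE2 container header used by legacy .doc files (and by other old Office formats, so 'ole2' does not prove the file is Word), "%PDF-" for PDF, and "{\rtf" for RTF.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module mentioned in this thread):
# classify a file by its leading bytes instead of the client-supplied type.
sub sniff_type {
    my ($path) = @_;

    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $read = read($fh, my $head, 512);
    die "Cannot read $path: $!" unless defined $read;
    close $fh;

    return 'empty' if $read == 0;

    # OLE2 compound document: legacy Word .doc, but also old Excel/PowerPoint.
    return 'ole2' if $head =~ /\A\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1/;
    return 'pdf'  if $head =~ /\A%PDF-/;
    return 'rtf'  if $head =~ /\A\{\\rtf/;

    # Heuristic: control bytes other than tab, LF, FF and CR suggest binary
    # data (this catches most image formats via their headers).
    return 'binary' if $head =~ /[\x00-\x08\x0B\x0E-\x1F]/;

    return 'text';
}
```

        In an upload handler you would call sniff_type on the path of the saved upload and accept the file only when it returns 'text', ideally in addition to checking the reported Content-Type rather than instead of it.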
Re: Open a Microsoft Word Doc and Save as plain text file
by jplindstrom (Monsignor) on Jun 08, 2006 at 10:57 UTC
Re: Open a Microsoft Word Doc and Save as plain text file
by davorg (Chancellor) on Jun 08, 2006 at 15:58 UTC

    It's not Perl, but I find wvware to be very useful.

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg
