PerlMonks  

Open a Microsoft Word Doc and Save as plain text file

by soon_j (Scribe)
on Jun 07, 2006 at 23:54 UTC (#554185=perlquestion)
soon_j has asked for the wisdom of the Perl Monks concerning the following question:

Monks, good day! Do you happen to have a simple routine or Perl module that lets me open a Microsoft Word file and save it in plain text format?

If not, is there a module that can identify the type of a file: whether it's a Word document, RTF document, plain text, PDF, or image?

I just need to check whether the uploaded file is plain text (the only type I need).

Thanks!

Re: Open a Microsoft Word Doc and Save as plain text file
by blue_cowdawg (Monsignor) on Jun 08, 2006 at 00:15 UTC
        Do you happen to have a simple routine or Perl module that lets me open a Microsoft Word file and save it in plain text format?

    You know, Kind Monk, there is a CPAN Search Mechanism that allows you to search for such things. I tried it myself (albeit not too strenuously) and came up dry.

    However, if you are trying this on an Evil Empire® system you might want to investigate doing something with Win32::OLE, but I'm not an expert in its use so can't help you more.


    Peter L. Berghold -- Unix Professional
    Peter -at- Berghold -dot- Net; AOL IM redcowdawg Yahoo IM: blue_cowdawg
Re: Open a Microsoft Word Doc and Save as plain text file
by sgifford (Prior) on Jun 08, 2006 at 00:28 UTC
    I'm not aware of a pure-Perl solution for parsing MS Word documents. A search on http://freshmeat.net turns up a few programs that can do the conversion, or you can use blue_cowdawg's suggestion of Win32::OLE if you're on a Windows system with Word (and you trust the documents not to do anything nasty, like take over the computer where your script is running).
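The Win32::OLE route mentioned above might look roughly like the following. This is a minimal sketch, not a tested recipe from the thread: it assumes a Windows machine with Microsoft Word installed, the sub name `doc_to_text` and the command-line usage are hypothetical, and the value 2 is Word's `wdFormatText` constant written out as a literal. The caveat about untrusted documents applies in full, since Word itself opens the file.

```perl
use strict;
use warnings;
use File::Spec;

# Word's wdFormatText value (plain text) from the WdSaveFormat enumeration.
use constant WD_FORMAT_TEXT => 2;

# Convert the Word document $in to a plain text file $out by automating Word.
# Only works on Windows with Microsoft Word installed.
sub doc_to_text {
    my ($in, $out) = @_;

    # Loaded at run time so this file still compiles on non-Windows systems.
    require Win32::OLE;

    # Word resolves relative paths against its own working directory,
    # so hand it absolute paths.
    ($in, $out) = map { File::Spec->rel2abs($_) } $in, $out;

    # The second argument names the method to call when the object is
    # destroyed, so Word quits even if we die part-way through.
    my $word = Win32::OLE->new('Word.Application', 'Quit')
        or die 'Cannot start Word: ' . Win32::OLE->LastError;
    $word->{Visible} = 0;

    my $doc = $word->Documents->Open($in)
        or die "Cannot open $in: " . Win32::OLE->LastError;
    $doc->SaveAs({ FileName => $out, FileFormat => WD_FORMAT_TEXT });
    $doc->Close(0);    # 0 = wdDoNotSaveChanges
    return $out;
}

# Hypothetical command-line use: perl doc2txt.pl input.doc output.txt
doc_to_text(@ARGV) if @ARGV == 2;
```

Loading Win32::OLE with `require` inside the sub is a design choice for this sketch only; in a Windows-only script a plain `use Win32::OLE;` at the top would be more usual.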

    File::Type can help you identify the type of the file.

      Thank you, monks, for your input.

      I found this code, using CGI, to be very good at determining the type of a file:

      use CGI;
      my $cgi  = CGI->new;
      my $file = $cgi->param('file');
      my $type = $cgi->uploadInfo($file)->{'Content-Type'};  # MIME type as sent by the browser
        Ah, yes, good idea. It does rely on the client to identify the document, though; if the user's Web browser doesn't know what the file is, it will probably be reported to you as application/octet-stream, or as text/plain if it smells like text.
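Because the Content-Type header comes from the client, a server-side check of the uploaded bytes can complement it. The sketch below is one possible approach using only core Perl; it is not what File::Type or CGI do internally. The sub names `sniff_type` and `sniff_file`, the 4096-byte read size, and the "no control characters means text" rule are assumptions made for illustration. The signatures are the well-known leading bytes of each format.

```perl
use strict;
use warnings;

# Leading-byte signatures for the formats mentioned in the question.
my @SIGNATURES = (
    # OLE2 compound file: legacy .doc, but also other old Office formats.
    [ 'application/msword' => qr/\A\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1/ ],
    [ 'application/rtf'    => qr/\A\{\\rtf/ ],
    [ 'application/pdf'    => qr/\A%PDF-/ ],
    [ 'image/png'          => qr/\A\x89PNG\r\n\x1A\n/ ],
    [ 'image/jpeg'         => qr/\A\xFF\xD8\xFF/ ],
    [ 'image/gif'          => qr/\AGIF8[79]a/ ],
);

# Guess a MIME type from the first bytes of a file.
sub sniff_type {
    my ($bytes) = @_;
    for my $sig (@SIGNATURES) {
        my ($type, $re) = @$sig;
        return $type if $bytes =~ $re;
    }
    # No known signature: treat as text unless it contains NUL or other
    # control bytes (tab, newline, form feed, carriage return are allowed).
    # Assumption: UTF-16 text, which contains NUL bytes, is not "plain text".
    return 'text/plain' if $bytes !~ /[\x00-\x08\x0B\x0E-\x1F]/;
    return 'application/octet-stream';
}

# Read the start of a file in binary mode and classify it.
sub sniff_file {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $head = '';
    defined(read($fh, $head, 4096)) or die "Cannot read $path: $!";
    close $fh;
    return sniff_type($head);
}
```

For an upload handler, something like `sniff_file($tmp_path) eq 'text/plain'` could then gate acceptance, where `$tmp_path` stands for wherever the uploaded file was saved.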
Re: Open a Microsoft Word Doc and Save as plain text file
by jplindstrom (Monsignor) on Jun 08, 2006 at 10:57 UTC
Re: Open a Microsoft Word Doc and Save as plain text file
by davorg (Chancellor) on Jun 08, 2006 at 15:58 UTC

    It's not Perl, but I find wvware to be very useful.

    --
    <http://dave.org.uk>

    "The first rule of Perl club is you do not talk about Perl club."
    -- Chip Salzenberg

Node Type: perlquestion [id://554185]
Approved by GrandFather