Manipulating MSWord files in a linux box?
by DaWolf (Curate)
on Oct 04, 2005 at 16:29 UTC
DaWolf has asked for the
wisdom of the Perl Monks concerning the following question:
Greetings, brothers and sisters.
I have a very complicated problem and I was hoping that Perl could give me a hand.
I've seen a lot of nodes about MSWord files, but all of them seem to point to Win32::OLE, which is a great module but one I don't believe will help me, since I need, if at all possible, to do this on a Linux server.
Here's the scenario:
It's basically a file server, which has to control the documents published on it through a web interface.
So, when a user submits a document (typically an MSWord file), the application needs to append a customized header to this file. Please note that by "header" I mean an MSWord document header, basically a table with some information, the company logo, etc.
Problem #1: How to manipulate MSWord files without "opening" MSWord via Win32::OLE?
Then, when a user clicks on the file link, they should be able to view it in the MSWord-MSIE integrated interface BUT they shouldn't be able to change the file.
The only way the user can change the contents of the document is by downloading it first. After making the changes, the user should re-submit the document to the system, which will then append a new header to it, and so on...
Problem #2: How to prevent an MSWord document opened in the MSWord-MSIE integrated interface from being changed?
A possible solution to both problems is to convert the contents of the file to another format, like HTML, and let the user view the contents directly in MSIE.
The .doc file itself would only be available to the user as a download, for making changes.
This solution has a problem, however: the conversion must be perfect.
Is there a module that can make a ***decent*** doc -> html conversion? By decent I mean one that preserves tables, inline images, etc.
As you can see I'm pretty much lost here, so I was hoping that someone could give me some advice, alternatives, etc.