PerlMonks  

Greetings, brothers and sisters.

I have a very complicated problem and I was hoping that Perl could give me a hand.

I've seen a lot of nodes about MSWord files, but all of them seem to point to Win32::OLE, which is a great module but one I don't believe will help me, since I need (if at all possible) to do this on a Linux server.

Here's the scenario:

It's basically a file server, which has to control the documents published on it through a web interface.

So, when a user submits a document (typically an MSWord file), the application needs to add a customized header to this file. Please note that by "header" I mean an MSWord document header: basically a table with some information, the company logo, etc.

Problem #1: How to manipulate MSWord files without "opening" MSWord via Win32::OLE?

Then, when a user clicks on the file link, they should be able to view the file in the MSWord-MSIE integrated interface, BUT they shouldn't be able to change it.

The only way the user could change the contents of the document is after downloading it. After making the changes, the user should re-submit the document to the system, which will then add a new header to it, and so on.

Problem #2: How to prevent an MSWord document opened in the MSWord-MSIE integrated interface from being changed?

A possible solution to both problems is to convert the contents of the file to another format, like HTML, and let the user view the contents directly in MSIE.

The .doc file itself would only be available to the user as a download, for making changes.

This solution has a problem, however: the conversion must be perfect.

Is there a module that can do a ***decent*** doc -> html conversion? By decent I mean one that preserves tables, inline images, etc.
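To make the convert-for-viewing idea concrete, here is a rough sketch of how the server side might look if the conversion were handed to an external command-line converter. Everything in it is an assumption for illustration: the sub names `html_view_path` and `convert_for_viewing` are made up, and the converter is passed in as a command list (for example `['wvHtml']` from the wv package, if it is installed and good enough), assumed to take the input and output file names as its last two arguments. It says nothing about how good the resulting HTML is.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);
use File::Spec;

# Work out where the read-only HTML view of an uploaded .doc should live.
# (Hypothetical helper; the directory layout is an assumption.)
sub html_view_path {
    my ($doc_path, $view_dir) = @_;
    # Strip the directory and a trailing ".doc" (any case) from the name.
    my ($name) = fileparse($doc_path, qr/\.doc/i);
    return File::Spec->catfile($view_dir, "$name.html");
}

# Run an external converter over the uploaded file.
# $converter is an array ref holding the command and any fixed options;
# the input and output paths are appended as the last two arguments.
sub convert_for_viewing {
    my ($converter, $doc_path, $view_dir) = @_;
    my $html_path = html_view_path($doc_path, $view_dir);

    # List form of system() bypasses the shell, so file names with
    # spaces or odd characters are passed through untouched.
    system(@$converter, $doc_path, $html_path) == 0
        or die "conversion of $doc_path failed (status $?)\n";

    return $html_path;
}
```

The web interface would then link to the returned HTML file for viewing and offer the original .doc only as a download, e.g. `convert_for_viewing(['wvHtml'], $uploaded_doc, $view_dir)`.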

As you can see I'm pretty much lost here, so I was hoping that someone could give me some advice, alternatives, etc.

TIA,


In reply to Manipulating MSWord files in a linux box? by DaWolf
