Re: Manipulating MSWord files in a linux box?

by terra incognita (Pilgrim)
on Oct 04, 2005 at 17:01 UTC ( [id://497333] )

in reply to Manipulating MSWord files in a linux box?

Have you thought about saving your Word documents as XML? IIRC you need the Professional edition of Office to do this.

This would allow you to display the document over the web without changing it (IE will figure out and display the file properly using XSL). You can also lock the file down using web server controls. Users would then be able to modify their client-side copy of the doc, but not save it back to the server unless they do it your way.

When a user downloads a document and then saves it back to the server, you can check that the file contains the proper header (both text and images can be supported). If it does not, you can delete the old header and put a new one in its place.

Update:

The body of the Word doc is contained within the <w:body> tags. The header and footer information lives inside the body as well, under the <w:sectPr> node.

Here is a bare body that only contains the word "BODY".

<w:body>
  <wx:sect>
    <w:p>
      <w:r>
        <w:t>BODY</w:t>
      </w:r>
    </w:p>
    <w:p/>
    <w:p/>
    <w:p/>
  </wx:sect>
</w:body>
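
To show how that structure can be read programmatically, here is a minimal Python sketch that pulls the paragraph text out of <w:body>. The namespace URIs and the <w:wordDocument> wrapper are assumptions on my part (the fragment above omits the declarations), and `body_paragraphs` is just an illustrative name. It also assumes the paragraphs sit directly under <wx:sect>, as they do in the fragment.

```python
import xml.etree.ElementTree as ET

# Assumed Word 2003 XML namespaces; the fragment in the post does not declare them.
NS = {
    "w": "http://schemas.microsoft.com/office/word/2003/wordml",
    "wx": "http://schemas.microsoft.com/office/word/2003/auxHint",
}

# The bare body from the post, wrapped in a root element that declares the prefixes.
SAMPLE = """\
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint">
  <w:body>
    <wx:sect>
      <w:p><w:r><w:t>BODY</w:t></w:r></w:p>
      <w:p/>
      <w:p/>
      <w:p/>
    </wx:sect>
  </w:body>
</w:wordDocument>
"""


def body_paragraphs(xml_text):
    """Return the text of each non-empty paragraph directly under <wx:sect>."""
    root = ET.fromstring(xml_text)
    paragraphs = []
    # Only direct children of <wx:sect> are visited, so paragraphs nested
    # inside <w:sectPr> (header/footer text) are not picked up.
    for p in root.findall("w:body/wx:sect/w:p", NS):
        text = "".join(t.text or "" for t in p.iter("{%s}t" % NS["w"]))
        if text:
            paragraphs.append(text)
    return paragraphs


if __name__ == "__main__":
    print(body_paragraphs(SAMPLE))
```

Since this uses only the standard library, it should run the same way on a Linux box with no Office install at all.
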

Here is a document body with the header and footer.

<w:body>
  <wx:sect>
    <w:p>
      <w:r>
        <w:t>BODY</w:t>
      </w:r>
    </w:p>
    <w:p/>
    <w:p/>
    <w:p/>
    <w:sectPr>
      <w:hdr w:type="odd">
        <w:p>
          <w:pPr>
            <w:pStyle w:val="Header"/>
          </w:pPr>
          <w:r>
            <w:t>Header1</w:t>
          </w:r>
          <w:r>
            <w:tab wx:wTab="3525" wx:tlc="none" wx:cTlc="58"/>
          </w:r>
          <w:r>
            <w:tab wx:wTab="4320" wx:tlc="none" wx:cTlc="71"/>
          </w:r>
        </w:p>
      </w:hdr>
      <w:ftr w:type="odd">
        <w:p>
          <w:pPr>
            <w:pStyle w:val="Footer"/>
          </w:pPr>
          <w:r>
            <w:t>Footer1</w:t>
          </w:r>
        </w:p>
      </w:ftr>
      <w:pgSz w:w="12240" w:h="15840"/>
      <w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800" w:header="720" w:footer="720" w:gutter="0"/>
      <w:cols w:space="720"/>
      <w:docGrid w:line-pitch="360"/>
    </w:sectPr>
  </wx:sect>
</w:body>
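
The check-and-replace idea for uploaded files could look roughly like the Python sketch below. It is not a tested production routine: the namespace URIs, the <w:wordDocument> wrapper, the function names (`enforce_header`, `build_header`, `header_text`) and the "Company Confidential" string are all assumptions for illustration. It only looks at the first <w:hdr> in each <w:sectPr>, and it leaves a document with no <w:sectPr> untouched.

```python
import xml.etree.ElementTree as ET

# Assumed Word 2003 XML namespaces; the fragments in the post do not declare them.
W = "http://schemas.microsoft.com/office/word/2003/wordml"
WX = "http://schemas.microsoft.com/office/word/2003/auxHint"
NS = {"w": W, "wx": WX}

# Keep the familiar w:/wx: prefixes when the tree is serialized again.
ET.register_namespace("w", W)
ET.register_namespace("wx", WX)

SAMPLE = """\
<w:wordDocument
    xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml"
    xmlns:wx="http://schemas.microsoft.com/office/word/2003/auxHint">
  <w:body>
    <wx:sect>
      <w:p><w:r><w:t>BODY</w:t></w:r></w:p>
      <w:sectPr>
        <w:hdr w:type="odd"><w:p><w:r><w:t>Header1</w:t></w:r></w:p></w:hdr>
        <w:ftr w:type="odd"><w:p><w:r><w:t>Footer1</w:t></w:r></w:p></w:ftr>
      </w:sectPr>
    </wx:sect>
  </w:body>
</w:wordDocument>
"""


def header_text(hdr):
    """Concatenate the text of every <w:t> run inside a <w:hdr> element."""
    return "".join(t.text or "" for t in hdr.iter("{%s}t" % W))


def build_header(text):
    """Build a minimal <w:hdr> holding a single paragraph with the given text."""
    hdr = ET.Element("{%s}hdr" % W, {"{%s}type" % W: "odd"})
    p = ET.SubElement(hdr, "{%s}p" % W)
    ppr = ET.SubElement(p, "{%s}pPr" % W)
    ET.SubElement(ppr, "{%s}pStyle" % W, {"{%s}val" % W: "Header"})
    r = ET.SubElement(p, "{%s}r" % W)
    ET.SubElement(r, "{%s}t" % W).text = text
    return hdr


def enforce_header(xml_text, required):
    """Return (xml, changed); replace any header whose text differs from `required`."""
    root = ET.fromstring(xml_text)
    changed = False
    for sect_pr in root.iter("{%s}sectPr" % W):
        hdr = sect_pr.find("w:hdr", NS)  # first header only, in this sketch
        if hdr is not None and header_text(hdr) == required:
            continue  # header is already the one we want
        if hdr is not None:
            # Delete the old header and remember where it was.
            index = list(sect_pr).index(hdr)
            sect_pr.remove(hdr)
        else:
            index = 0
        sect_pr.insert(index, build_header(required))
        changed = True
    return ET.tostring(root, encoding="unicode"), changed


if __name__ == "__main__":
    fixed, changed = enforce_header(SAMPLE, "Company Confidential")
    print(changed)
    print(fixed)
```

Comparing the header text is the simplest possible test. A header with images or tab stops would need a stricter comparison, for example against a stored copy of the whole <w:hdr> element.
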
Hope this helps.

Re^2: Manipulating MSWord files in a linux box?
by DaWolf (Curate) on Oct 04, 2005 at 19:36 UTC
    Thanks, it's a very interesting suggestion and I like the idea of the content being in a structured XML file.

    The only bad thing about this is that I can't predict whether the client will have the appropriate Office/MSWord version, and since it's a commercial product it would be bad to force the client to buy software just so they can use mine...

    Update: on a related subject, take a look at Problems with Microsoft's new Office 'XML'. It seems that MS is already causing us trouble...

      I agree - I'd hate for you to force me to use MSWord at all ;-)

      Alternate suggestions. Whether they're nearly as draconian is up to you.

      Manipulate PDF files instead. Require the user to send you a PDF rather than a doc file, and work with that. Link to an open-source print driver that outputs to PDF for those who have older word processors on Windows. Have a link to for all platforms as a good way to write to PDF.

      Manipulate .sxw files instead. Forcing the client to buy software is one thing. Forcing them to use software that doesn't cost them anything is quite a bit different. Different enough? You decide.

      Otherwise, you'll probably be stuck with moving your code over to a Windows box where you can use OLE. Even that is dangerous in a web server - I'm not sure what happens when two users come in at the same time ... ;-) At the very least, test it to be sure.
