http://www.perlmonks.org?node_id=405332


in reply to A copyeditor needs help to get started with a Perl project

I think that MS has very good tools for XML generation and editing. Isn't there a utility for going between Word and XML? XML can represent the full structure of a book while remaining easy to manipulate with Perl and many other tools.

I'm talking off the top of my head here, and would be hard-pressed to give a demonstration, but this is something I would investigate if I were to do another book or more book-scale editing.

What are STM books?

Update:

I found this commercial suite of products for doing the conversions between Word and XML.
perlcapt
-ben

Re^2: A copyeditor needs help to get started with a Perl project
by wordsmith (Acolyte) on Nov 06, 2004 at 18:08 UTC

    Yes, I really should pick up XML because I'm in the publishing industry. I can see 3 milestones for my project:

    (1) Perl with text files.

    (2) Perl with HTML files, to gain the ability to handle Word document elements such as superscripts, fonts, etc. (Or bite the bullet and do it with VB if this approach fails.)

    (3) Perl with XML. At this point, I should have a marketable product and the big bucks should start flowing in. :-)

    What are STM books? STM stands for "scientific, technical, and medical." But it's the IT books that drive us up the wall with the jargon, acronyms, and terms uppercased or not depending on the author's whims. My Perl project is primarily directed toward taming IT books. Most sciences have fairly stable conventions regarding nomenclature, but not IT methinks.

    Would you say "I bought two mouses from the store" or "I bought two mice from the store"? And which would you choose: keystream, key stream, or key-stream? We had one book where it appeared all three ways.
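
    The keystream case fits the first milestone (Perl with text files): scan a manuscript and flag any term that shows up closed, spaced, and/or hyphenated. What follows is only a rough sketch under stated assumptions: plain-text input, ASCII letters only, and a made-up function name (find_variants) and sample sentence. It will also flag legitimate pairs such as "a part" and "apart", so its output is a list of candidates for a human to review, not a verdict.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a hash ref mapping a normalized key
# (e.g. "keystream") to { spelling => count }, keeping only keys that
# were seen with two or more distinct spellings.
sub find_variants {
    my ($text) = @_;
    my %seen;    # key => { spelling => count }

    # Closed words and hyphenated compounds: "keystream", "key-stream".
    while ($text =~ /\b([A-Za-z]+(?:-[A-Za-z]+)*)\b/g) {
        my $form = lc $1;
        (my $key = $form) =~ s/-//g;
        $seen{$key}{$form}++;
    }

    # Adjacent word pairs: "key stream". The lookahead leaves the second
    # word unconsumed, so overlapping pairs are all examined.
    while ($text =~ /\b([A-Za-z]+)(?=\s+([A-Za-z]+)\b)/g) {
        my ($first, $second) = (lc $1, lc $2);
        $seen{"$first$second"}{"$first $second"}++;
    }

    # Keep only the terms spelled more than one way.
    my %variants;
    for my $key (keys %seen) {
        $variants{$key} = $seen{$key}
            if scalar(keys %{ $seen{$key} }) > 1;
    }
    return \%variants;
}

# Made-up sample text for illustration.
my $sample = "The keystream is XORed with the plaintext. "
           . "A key stream must never repeat. "
           . "Each key-stream block is used once.";

my $variants = find_variants($sample);
for my $key (sort keys %$variants) {
    my $forms = $variants->{$key};
    print "$key: ",
        join(", ", map { "$_ ($forms->{$_})" } sort keys %$forms), "\n";
}
```

    On the sample text, the only term reported is keystream, with its three spellings and how often each occurs. The counts make it easy to see which spelling the author favors before settling on one for the style sheet.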

    By the way, are there no well-known freeware Word-to-XML converters?

    Thanks, everybody, for all the help.

      I can't find BYTE Magazine's style guide which has all of these issues pinned down after thousands of man-hours debating them, but I did find two links that are probably even better:

      I have a personal/professional interest in the .doc -> XML subject and am pursuing it as a result of your comments. I'll post an update to this message when I have found out more about it. It appears as though there is a Word plugin or enhancement for going between XML and Word documents.

      Update:

      Here is the Microsoft Toolbox for Word/XML. I'll try it out and let you know.
      perlcapt
      -ben