PerlMonks  


There are several twisty corridors here in the Monastery in which demoronizer cobwebs hang from the ceiling; IMO they're well worth pursuing by anyone interested in cleaning up the .html produced by ANY of MS's Word, Excel or other supposedly WYSIWYG products. Look under the covers, and what you get is remarkable bloat and non-conformant code.

So, a few keywords for future Super_Searchers: "HTML, html, MS, Microsoft, Office, Word, Excel, FrontPage, PowerPoint, Publisher, cleanup, parse" ...and there surely could be more (arguably even Notepad, which when in word-wrap mode adds MS-ish line ends at every displayed wrap position).

davidrw and astroboy offered links to useful alternate tools in Word HTML issues. There's also a bit of discussion re the issues implied in samtregar's remark in this thread.

Self-updating of demoronizer is laid out very nicely by derby in Re^3: Reg Ex to strip MS smart quotes
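
For future searchers who haven't met the problem yet: the heart of what a demoronizer-style cleanup does is map the Windows-1252 "smart" punctuation bytes that Word emits to plain ASCII. What follows is only a minimal sketch of that idea, not the demoronizer's actual code and not derby's approach from the linked node. The sub name demoronize and the lookup table are hypothetical, the table covers only a handful of bytes, and the input is assumed to be a byte string rather than decoded UTF-8.

```perl
use strict;
use warnings;

# Hypothetical, deliberately incomplete table: a few Windows-1252 bytes
# and the ASCII stand-ins chosen for this illustration.
my %cp1252_to_ascii = (
    "\x85" => '...',   # horizontal ellipsis
    "\x91" => "'",     # left single quote
    "\x92" => "'",     # right single quote / apostrophe
    "\x93" => '"',     # left double quote
    "\x94" => '"',     # right double quote
    "\x96" => '-',     # en dash
    "\x97" => '--',    # em dash
);

# Replace every byte listed in the table; anything else passes through.
# Assumes $text holds raw bytes (e.g. read from a file without decoding).
sub demoronize {
    my ($text) = @_;
    $text =~ s/([\x85\x91-\x94\x96\x97])/$cp1252_to_ascii{$1}/g;
    return $text;
}

# Prints: "It's fine" - really...
print demoronize("\x93It\x92s fine\x94 \x96 really\x85"), "\n";
```

A real cleanup of Word's .html has far more to deal with (proprietary attributes, conditional comments, namespaced tags), so treat this as the smallest piece of the job rather than a replacement for the tools linked above.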

But (... sigh! )... even the latest Word->html output does not exactly demonstrate that the allegedly-enlightened giant in Redmond has learned to avoid making the same mistakes in different (i.e., incompatible) ways.

...and, oh yes, a (deprecated) disclaimer: I don't hate W32; I just hate cleaning up MS .html to W3C standards.

Fair warning, also: I should probably use a sig like html 4.01 dinosaur

In reply to Re^2: "Demoronizer" in Best of the Best Users in Perlmonks site by ww
in thread Best of the Best Users in Perlmonks site by l.frankline
