Cleaning up Word HTML is the exact purpose for which Tidy was created. It started as a W3C project, or at least was hosted there for a time. I understand it's an excellent piece of software, though I have only tinkered with it, since I write my HTML in Notepad. *grin*

