In the HTML parsers I've written, I've always gone loose, for several reasons.
in reply to OO style question: how much information to hide?
Personally, I would just make the "strictness" a method you could call so you can have it both ways (carp or croak). The other thing I've done in the parsers I've written is to allow tags to be specified as 'tag' and '/tag' as well, so the problem can be circumvented altogether. The latter fits my thinking well.
- I've seen a lot of malformed HTML.
- I see a lot of odd "HTMLish" tags embedded for processing / templating
- I've done the second of these myself.
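For what it's worth, here is a rough sketch of what I mean by a strictness method plus 'tag' and '/tag' handlers. It is only an illustration, not code from any of my actual parsers: the package name Sketch::TagParser and the methods strict, handler and dispatch are all made up for this example. The only real API used is carp and croak from the core Carp module.

```perl
package Sketch::TagParser;    # hypothetical name, for illustration only
use strict;
use warnings;
use Carp qw(carp croak);

sub new {
    my ($class) = @_;
    # Loose by default: unknown tags warn instead of dying.
    return bless { strict => 0, handlers => {} }, $class;
}

# Getter/setter: true means croak on unknown tags, false means carp.
sub strict {
    my ($self, $value) = @_;
    $self->{strict} = $value ? 1 : 0 if @_ > 1;
    return $self->{strict};
}

# Register a handler under 'tag' (open) or '/tag' (close).
sub handler {
    my ($self, $name, $code) = @_;
    $self->{handlers}{ lc $name } = $code;
    return $self;
}

# Dispatch one tag name, e.g. 'b' or '/b'.
sub dispatch {
    my ($self, $name) = @_;
    my $code = $self->{handlers}{ lc $name };
    return $code->($name) if $code;

    my $msg = "No handler for tag '$name'";
    croak $msg if $self->{strict};    # strict: die, blaming the caller
    carp $msg;                        # loose: warn and carry on
    return;
}

package main;

my $p = Sketch::TagParser->new;
$p->handler('b'  => sub { 'open bold' });
$p->handler('/b' => sub { 'close bold' });

print $p->dispatch('/b'), "\n";
$p->dispatch('blink');                # loose mode: warns, keeps going
$p->strict(1);
eval { $p->dispatch('blink'); 1 } or print "strict mode died: $@";
```

Because the close tag gets its own handler, the caller decides what an unbalanced or odd tag means, and the strict switch only matters for tags nobody registered.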
"To be civilized is to deny one's nature."