I know. The first is an example of illegal HTML (at least, illegal as of XHTML 1.0) and the second is an example of nesting, as I mentioned. In the application Cody Pendant is (writing|maintaining) I would personally treat both as acceptable exceptions: neither will screw up more than the poster's own message. As I understood it, the biggest problem with open-ended links or otherwise broken HTML was that the rest of the page would be screwed up as well. These two will get rendered as
<A HREF = link>FOO</A>
and
<!-- -->FOO</A>
respectively (assuming Cody Pendant swaps characters for entities).
LAI
:eof
The point is to detect wrong or illegal HTML, so assuming
the given text validates is silly. If it validated, the
whole exercise would be futile. Also, the first example is
valid HTML, and has always been valid HTML. In the second example, no nesting is going on. There's just one A element.
Abigail
As I understood the problem, the goal was not necessarily to detect wrong or illegal HTML, but to make sure the output was valid so that posts further down the page are not screwed up. I never suggested that the input be assumed to be valid; in fact, my suggested solution works by detecting valid anchors and rendering everything else as text (with entities). I feel that my suggestion, while not complete, at least helps keep user mistakes or ignorance from affecting other posts.
Oh, and when I mentioned nesting, I meant that the comment inside the anchor element would be treated by my regex like nesting. I know that what you wrote was in fact an example of a legal comment inside a single A element, but since there is no reason for a user to comment the code in a BBS post I felt the mangling of that was an acceptable loss.
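To make the idea concrete, here is a minimal sketch of that whitelist approach in Python rather than the regex I originally posted. Everything in it is an assumption for illustration: the `ANCHOR` pattern, the `sanitize` name, and the rule that only double-quoted http(s) URLs with markup-free link text count as a "valid anchor". It is not a full validator, just a way to show that anything the pattern does not recognise comes out as harmless text.

```python
import html
import re

# Hypothetical whitelist: one well-formed anchor with a double-quoted
# http(s) URL and link text that contains no markup at all.
ANCHOR = re.compile(
    r'<a\s+href\s*=\s*"(https?://[^"<>\s]+)"\s*>([^<>]*)</a>',
    re.IGNORECASE,
)

def sanitize(post: str) -> str:
    """Keep anchors matching ANCHOR; turn everything else into entities."""
    out = []
    pos = 0
    for m in ANCHOR.finditer(post):
        # Text between recognised anchors is escaped, never passed through.
        out.append(html.escape(post[pos:m.start()]))
        url, text = m.group(1), m.group(2)
        # Re-emit the anchor in one canonical form.
        out.append('<a href="%s">%s</a>' % (html.escape(url), html.escape(text)))
        pos = m.end()
    out.append(html.escape(post[pos:]))
    return "".join(out)

if __name__ == "__main__":
    print(sanitize('see <a href="http://example.com/">here</a> & <b>'))
    print(sanitize('<A HREF = link>FOO</A>'))
    print(sanitize('<a href="http://example.com/"><!-- -->FOO</a>'))
```

An unquoted href or a comment inside the anchor simply fails the pattern, so the whole thing is shown literally instead of being interpreted. That mangles some legal HTML, which is the trade-off I was describing, but an unclosed or malformed tag can never leak into the rest of the page.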
LAI
:eof
The first is definitely not valid, under either HTML 4.01 or XHTML 1.0.
However, it will still display correctly in browsers. A better breaking example might be:
<a href = li'nk>FOO</a>