Re: let's have valid html

by diotalevi (Canon)
on Nov 17, 2004 at 19:29 UTC


in reply to let's have valid html

PM will never be xhtml compliant because its users don't write their nodes in xhtml. If you declare that a page is xhtml then browsers tend to hold you to it and will throw actual parsing errors. I'm just noting this so that when we finish getting PM's code into W3C compliance we don't also make the mistake of telling the browser to expect anything but highly suspect markup.

Replies are listed 'Best First'.
Re^2: let's have valid html
by Aristotle (Chancellor) on Dec 19, 2004 at 17:01 UTC

    No, they don't. It depends on the MIME type. The spec says application/xhtml+xml pages MUST be parsed with draconian error handling, but pages with text/html need not. Note that text/html is still allowed for XHTML 1.0 (per the Appendix C compatibility guidelines), though discouraged for XHTML 1.1.

    In other words, if we use XHTML 1.0 Transitional served with MIME type text/html, we shouldn't run into problems.
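
    A minimal sketch of that setup (a hypothetical standalone CGI script, not PM's actual code; the charset and page content are just illustrative):

        #!/usr/bin/perl
        use strict;
        use warnings;

        # Declare XHTML 1.0 Transitional in the DOCTYPE, but serve the page as
        # text/html rather than application/xhtml+xml, so browsers fall back to
        # their forgiving HTML parsers instead of draconian XML error handling.
        print "Content-Type: text/html; charset=ISO-8859-1\r\n\r\n";
        print qq{<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"\n};
        print qq{    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n};
        print qq{<html xmlns="http://www.w3.org/1999/xhtml">\n};
        # ... head and body of the node ...
        print qq{</html>\n};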

    Of course, we're already scrubbing users' HTML, so I don't see why it would be too difficult to clean it using, f.ex., the tagsoup algorithm.

    Makeshifts last the longest.

      I prefer the current PM algorithm for normalizing user HTML over tagsoup's. We could add more knowledge about allowed parent tags but such would be used to escape tags that aren't in the proper parent rather than to close and reopen tags.
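
      A rough sketch of that distinction (a hypothetical filter fragment, not PM's actual code; the handle_tag helper and the %allowed_parent table are made up for illustration):

          use strict;
          use warnings;
          use HTML::Entities qw(encode_entities);

          # Which parent tags a given tag may legally appear inside (illustrative).
          my %allowed_parent = ( td => { tr => 1 }, li => { ul => 1, ol => 1 } );

          sub handle_tag {
              my ($tag, $raw, $current_parent) = @_;
              if ( $allowed_parent{$tag} && !$allowed_parent{$tag}{$current_parent} ) {
                  # PM-style: escape the misplaced tag so it shows up as literal text...
                  return encode_entities($raw);
              }
              # ...rather than synthesizing close/reopen tags around it (tagsoup-style).
              return $raw;
          }

          print handle_tag( 'td', '<td>', 'p' );   # prints &lt;td&gt;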

      And I feel quite strongly that the priority of goals should weigh practical matters much, much higher than technical milestones like strict compliance with a standard.

      For example, <p> tags will probably never be forced to be strictly nested because there is no practical way to accomplish this given the current state of users and browsers.

      The disadvantage that this prevents using a (compliant) XML parser on some filtered user HTML w/o first filtering <p> tags is a practical disadvantage, but it is of less importance than the practical advantage of allowing people to easily enter their own HTML mark-up that displays well for most of our audience.

      The disadvantage of <p> tags not strictly complying with any particular standard, on the other hand, is not a practical matter. Strict compliance can lead to practical benefits, but compliance itself is not a practical benefit. So strict compliance can be desirable for many reasons, such as setting a good example, geeky pride, pedantic intolerance, etc., but such concerns don't even cast a shadow, IMO, compared to even relatively minor practical advantages when the two conflict.

      - tye        

        Hmm… what's the problem in simply dropping in a closing </p> as soon as an opening one is encountered, while another <p> is already open?
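
        Something along these lines (an illustrative sketch only, not PM's code; force_flat_p is a made-up name):

            use strict;
            use warnings;

            # Whenever a new <p> appears while one is still open, emit </p> first.
            sub force_flat_p {
                my ($html) = @_;
                my $open = 0;
                my $out  = '';
                for my $piece ( split /(<\/?p\b[^>]*>)/i, $html ) {
                    if ( $piece =~ /^<p\b/i ) {
                        $out .= '</p>' if $open;    # auto-close the previous paragraph
                        $open = 1;
                    }
                    elsif ( $piece =~ /^<\/p/i ) {
                        $open = 0;
                    }
                    $out .= $piece;
                }
                $out .= '</p>' if $open;            # close any dangling paragraph
                return $out;
            }

            print force_flat_p('<p>one<p>two</p>');   # prints <p>one</p><p>two</p>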

        Tagsoup was just an example anyway.

        Compliance is no goal in itself, but I also don't think it should be dismissed offhand. Rather, we should do all we can to get as close to compliance as is realistically doable within the constraints of those practical goals.

        Makeshifts last the longest.
