http://www.perlmonks.org?node_id=126653


in reply to Re: Larry vs. Joel vs. Ovid
in thread Larry vs. Joel vs. Ovid

I wouldn't want the browser to pop up with a big freaky modal dialog every time I loaded a page with somewhat suspect HTML, but I would like to see some indication that the page isn't well-formed -- perhaps a message in the ubiquitous status bar to the effect of "This page is not valid HTML, and may not be displayed properly". As a user, I'd like to be told if the page I'm looking at is garbled (often one can tell immediately if a page hasn't rendered the way the designer would have liked, but I think it's reasonable to believe that some pages would be mis-rendered with subtle, important errors), and as a developer I'd sure like to know. (Of course, as a developer, I have plenty of HTML validators at my fingertips.)

And bringing this back on topic... this point generalizes fairly well to programming. That's why compilers have warnings as well as error messages. I see no reason why all programs, especially those that talk to people, should insist on conservatively correct input. If you get suspicious input, emit a warning and do the best you can. That way, if you're relying on vaguely bogus input from someone else's software (for instance), you can still get work done. The difference between this model and the web browser problem is that Our Favourite Web Browsers(tm) don't give any (easily accessible) warnings about malformed HTML, so Joe Luser has no idea that they've just written awful markup.

That said, within your own code strict adherence to contracts (for instance) is an excellent idea. If you're generating bogus data, you're going to want to know about it, not fudge it and hope for the best, and it's much more difficult to ignore a confess than a carp.
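
A minimal sketch of that split, written in Python rather than Perl so that it runs on its own: suspicious input from outside gets a warning and a best-effort result, while a broken internal contract is fatal. The names parse_age and make_record are hypothetical, made up for this illustration.

```python
import re
import warnings


def parse_age(text):
    """Lenient: outside input gets a warning and a best-effort answer."""
    cleaned = text.strip()
    if not re.fullmatch(r"[0-9]+", cleaned):
        # Suspicious input from someone else: complain, then salvage the digits.
        warnings.warn(f"suspicious age {text!r}; keeping digits only")
        cleaned = re.sub(r"[^0-9]", "", cleaned)
    return int(cleaned) if cleaned else None


def make_record(name, age):
    """Strict: data we generated ourselves must honour the contract."""
    if not isinstance(age, int) or age < 0:
        # Hard to ignore, like a confess rather than a carp.
        raise ValueError(f"contract violated: bad age {age!r}")
    return {"name": name, "age": age}
```

In Perl terms, warnings.warn plays roughly the role of carp, and the raised exception plays roughly the role of confess, which dies with a stack trace.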

--
:wq
  • Comment on Re(2): Larry vs. Joel vs. Ovid vs. Masem vs. Web browsers

Re: (FoxUni) Re(2): Larry vs. Joel vs. Ovid vs. Masem vs. Web browsers
by merlyn (Sage) on Nov 21, 2001 at 05:19 UTC
    I would like to see some indication that the page isn't well-formed
    Yes, and iCab does that, by putting a little smiley-face/frowny-face icon on the location panel. Of course, it's always frownyface on PerlMonks, so you can press it to get the errors. Here are the errors for this page as I type this:
    http://www.perlmonks.org/index.pl?title=%28FoxUni%29%20Re%282%29%3A%20Larry%20vs.%20Joel%20vs.%20Ovid%20vs.%20Masem%20vs.%20Web%20browsers&parent=126653&lastnode_id=126653&node=Offer%20your%20reply&parent_node=126653
    Altogether 79 errors found. Only 25 errors are listed below.
    Error (9/4): The tag <layer> is not part of HTML 4.0.
    Error (9/93): The end tag </layer> is not part of HTML.
    Warning (9/162): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
    Error (9/285): In the tag <IFRAME> the attribute "FRAMESPACING" is not allowed.
    Warning (9/606): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
    Warning (9/665): In the tag <TD> the value of the attribute "WIDTH" must be enclosed in quotes.
    Warning (9/665): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
    Error (10/166): In the tag <INPUT> the attribute "BORDER" is not allowed.
    Warning (16/106): The tag <FONT> should no longer be used since HTML 4.0.
    Error (16/120): The character '&' must be written as '&amp;'.
    Error (16/120): The character '&' must be written as '&amp;'.
    Error (16/200): The character '&' must be written as '&amp;'.
    Error (16/263): The character '&' must be written as '&amp;'.
    Error (16/1175): The character '&' must be written as '&amp;'.
    Warning (21/1): The tag <CENTER> should no longer be used since HTML 4.0.
    Warning (24/3): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
    Error (27/78): The character '&' must be written as '&amp;'.
    Warning (84/3): In the tag <TD> the attribute "WIDTH" should only contain absolute pixel values.
    Error (85/32): The color name "eedddd" is not valid.
    Error (86/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
    Error (86/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
    Error (86/1): The attribute "000000" is not part of HTML.
    Error (86/1): In the tag <TR> the value of attribute "" is missing.
    Warning (88/8): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (95/4): The tag <FONT> should no longer be used since HTML 4.0.
    Error (102/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
    Error (102/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
    Error (102/1): The attribute "000000" is not part of HTML.
    Error (102/1): In the tag <TR> the value of attribute "" is missing.
    Warning (104/8): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (111/4): The tag <FONT> should no longer be used since HTML 4.0.
    Error (112/13): The character '&' must be written as '&amp;'.
    Warning (112/79): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (113/1): The tag <FONT> should no longer be used since HTML 4.0.
    Error (114/286): The character '&' must be written as '&amp;'.
    Error (121/1): In the tag <TR> white space is missing as separator after the attribute "BGCOLOR".
    Error (121/1): In the tag <TR> the value of attribute "BGCOLOR" is missing.
    Error (121/1): The attribute "000000" is not part of HTML.
    Error (121/1): In the tag <TR> the value of attribute "" is missing.
    Warning (123/8): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (133/1): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (135/254): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (135/1488): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (147/8): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (154/4): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (164/8): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (168/31): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (168/148): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (168/277): The tag <FONT> should no longer be used since HTML 4.0.
    Warning (168/438): The tag <FONT> should no longer be used since HTML 4.0.
    I'm still trying to work with the Everything-engine people to get them to put the right ampersand escaping in URLs. Most of the rest of that goes away if you start using CSS instead of explicit tags. But there are still the odd things, like unquoted parameters, for which all we can say is "sloppy coding, please fix up!".
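
    For the ampersand errors in particular, the fix is mechanical: escape the finished URL before it goes into the href attribute. Below is a small sketch in Python (not the Everything engine's own code, which isn't shown here); the helper name link is hypothetical, while escape and urlencode are standard library functions.

```python
from html import escape
from urllib.parse import urlencode


def link(base, params, text):
    """Build an <a> tag whose href writes '&' as '&amp;'."""
    # urlencode joins the query parameters with a raw '&'.
    url = f"{base}?{urlencode(params)}"
    # escape() rewrites &, < and > and, by default, quote characters too,
    # so the result is safe inside a double-quoted attribute value.
    return f'<a href="{escape(url)}">{escape(text)}</a>'


print(link("/index.pl", {"node_id": 126653, "lastnode_id": 126653}, "reply"))
```

    Quoting the attribute value at the same time also takes care of the "must be enclosed in quotes" class of warning.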

    -- Randal L. Schwartz, Perl hacker

      Erm. Not sure if you really needed to post all 25 errors. But anyway, it seems like a nice feature - is there an option to change the level of errors it detects? For instance, what if I don't care about anything implemented in HTML 3 or above (say I'm on a contract for a company that has Netscape 2 as the default browser company-wide or something)? I'll probably be able to answer this myself soon though, as I'm downloading iCab now.
Re: (FoxUni) Re(2): Larry vs. Joel vs. Ovid vs. Masem vs. Web browsers
by Masem (Monsignor) on Nov 21, 2001 at 02:03 UTC
    I was going to comment on this to tye's post, but it's just as valid here.

    "What if" Mosaic 0.9 had a pop up dialog that warned of invalid HTML, from day one? Where would we be now? Let me extrapolate:

    Since web page designers would be expected to check their own pages using the first generation browsers, they would discover their page errors early on and fix them. The typical end user would never have seen these errors, save on pages from sloppy HTML writers who didn't test.

    When the first generation HTML editors were introduced, they would have been careful to produce clean, valid HTML code, so as to make less work for the end writer who would otherwise have to clean up this code before it was put on the web.

    From that point on, you'd get the same circle of dependency as we had in reality, but this time with adherence to strictness. All HTML that would be published today, save by those that lack any QA, would be clean and well-formed...

    ...however, there is the Microsoft factor to consider here. It can be easily suggested that MS would have been the first to turn this pop up dialog into an option that could be switched off after the first instance, or to disable it completely. The implications of this are hard to determine; it could have sped up their 'domination' of the web by offering a solution that allowed 'bad' code through mostly unnoticeably, or it could have caused them to be shunned by the community for trying to hide bad code. But once someone did that, others would have followed, and we might be back exactly where we are today, save for the lack of some HTML atrocities like BLINK and FRAME.

    In the today of this history, we would never have accepted that pop-up message during casual browsing, but if history were slightly different, we might have been upset not to find it there when it was needed.

    That's why I refer to the strength of XML; we as a collective computer community are not treating XML as tag soup, the way HTML was treated originally, but as a well-structured document of opening and closing tags with attributes. Assuming that you can properly write out this format (which is easy) and read it in without accepting flaws (difficult, but people have already put solutions in place for this, so there is no need to reinvent said wheel), the rest of XML, which allows for free format of data items, is in place. And people do realize that, and are making sure that while the XML documents they send may have extra or missing data, the XML is well-formed and does not fail in parsing. This is a very good first step toward more adaptable and usable data formats.
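
    As a rough illustration of "read in this format without accepting flaws", here is a sketch using the XML parser in Python's standard library as one example of the existing wheel. The function names and the sample documents are made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_strict(document):
    """Return the root element; raises ET.ParseError if not well-formed."""
    return ET.fromstring(document)


def is_well_formed(document):
    """True if the document parses, False on any well-formedness error."""
    try:
        parse_strict(document)
    except ET.ParseError:
        return False
    return True


# Extra or missing data items are fine as long as the tags balance.
print(is_well_formed("<order><item>tea</item><gift>yes</gift></order>"))
# A missing end tag or an unquoted attribute value is rejected outright.
print(is_well_formed("<order><item>tea</order>"))
print(is_well_formed("<order id=1/>"))
```

    The parser refuses to guess at what a broken document meant, which is the opposite of the tag-soup handling browsers apply to HTML.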

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important