PerlMonks  

Re: Fixing Bad HTML

by pg (Canon)
on Nov 17, 2002 at 00:02 UTC


in reply to Fixing Bad HTML

Use a stack. When you see an opening tag, push it onto the stack. When you see a closing tag, compare it with the top element of the stack: if they match, pop it off; otherwise, deal with the error. If the tag is self-closing, either don't push it at all, or push it and pop it straight away, depending on how you treat the content.
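
A minimal sketch of that stack check, written in Python purely for illustration. The `TAG_RE` pattern and the `check_nesting` name are my own assumptions, not anything from the original post; the regex is a rough tag matcher rather than a real HTML parser, and void tags written without a trailing slash (a bare `<br>`) would be treated as openers here.

```python
import re

# Hypothetical, deliberately rough tag pattern: captures an optional leading
# slash, the tag name, and an optional trailing slash (self-closing form).
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

def check_nesting(html):
    """Return a list of error messages; an empty list means the tags nest properly."""
    stack = []
    errors = []
    for m in TAG_RE.finditer(html):
        closing, name, self_closed = m.group(1), m.group(2).lower(), m.group(3)
        if self_closed:
            continue                      # self-closing: never pushed
        if not closing:
            stack.append(name)            # opening tag: push
        elif stack and stack[-1] == name:
            stack.pop()                   # closing tag matches the top: pop
        else:
            errors.append(f"unexpected </{name}>")  # "deal with the error"
    # Anything left on the stack was never closed.
    errors.extend(f"unclosed <{name}>" for name in reversed(stack))
    return errors
```

Here "deal with the error" just means recording a message; rejecting the input outright would be the strict alternative.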

Re: Re: Fixing Bad HTML
by Cody Pendant (Prior) on Nov 17, 2002 at 01:25 UTC
    Thanks for that. That's a structure at least. But what if the thing to be closed isn't the last item in the stack, like if someone's crossed over tags:
    blah blah <B>blah blah<I> blah blah</B></I>
    which is bad HTML, but not problematic in this context?
    --
    ($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;

      Either do as sauoq says and just don't create a mis-feature, or jump right in and do the beastly thing yourself (you do no one any favors by enabling bad behaviour). If you were to actually do this, you could also consider keeping track of which tags have been opened and making sure to close them before ending your user-accessible section.
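
One way the "close them before ending the section" idea might look, as a rough Python sketch. `TAG_RE` and `close_open_tags` are hypothetical names, and the regex is only an approximation of real tag syntax. It tolerates crossed-over tags and only appends closers for whatever is still open at the end of the fragment.

```python
import re

# Hypothetical, rough tag pattern (not a real HTML parser).
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

def close_open_tags(fragment):
    """Append closing tags for anything the fragment left open."""
    open_tags = []
    for m in TAG_RE.finditer(fragment):
        closing, name, self_closed = m.group(1), m.group(2).lower(), m.group(3)
        if self_closed:
            continue
        if not closing:
            open_tags.append(name)
        elif name in open_tags:
            # Forget the most recent matching opener, wherever it sits,
            # so crossed-over tags are accepted as they are.
            idx = len(open_tags) - 1 - open_tags[::-1].index(name)
            del open_tags[idx]
    # Close the leftovers, innermost first.
    return fragment + "".join(f"</{name}>" for name in reversed(open_tags))
```

For example, `close_open_tags("<b>bold <i>both")` gives `<b>bold <i>both</i></b>`, so an unclosed tag cannot leak into the rest of the page.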

      __SIG__ use B; printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE;
      which is bad HTML, but not problematic in this context?

      Opt for being strict. Disallow such crappy markup.

      -sauoq
      "My two cents aren't worth a dime.";
      
      Though I'd be inclined to disallow sloppy markup like this (as others have suggested), one option I've used in the past is to backtrack up the stack looking for a matching tag and autoclosing any open tags I pass along the way.

      In this case that would proceed something like this. You get to the </B> and look at the tag at the top of the stack. It's not a <B>, it's an <I>, so you generate a </I> yourself and pop that off the stack, then try again. This time it is a <B> so you can just pop it off the top and you move on.

      The next closing tag is </I>. Since there's no matching open tag on the stack, you simply remove it.
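
The backtracking steps above can be sketched roughly as follows, in Python for illustration. `TAG_RE` and `repair` are assumed names and the regex is a crude tag matcher, so treat this as a sketch of the idea rather than the code actually used. Generated closing tags come out lowercase because tag names are compared case-insensitively.

```python
import re

# Hypothetical, rough tag pattern (not a real HTML parser).
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

def repair(html):
    """Autoclose tags passed while backtracking; drop stray closing tags."""
    stack = []
    out = []
    pos = 0
    for m in TAG_RE.finditer(html):
        out.append(html[pos:m.start()])   # copy text between tags unchanged
        pos = m.end()
        closing, name, self_closed = m.group(1), m.group(2).lower(), m.group(3)
        if self_closed:
            out.append(m.group(0))
        elif not closing:
            stack.append(name)
            out.append(m.group(0))
        elif name in stack:
            # Backtrack up the stack, generating a closer for each open
            # tag passed on the way to the matching opener.
            while stack[-1] != name:
                out.append(f"</{stack.pop()}>")
            stack.pop()
            out.append(m.group(0))
        # Otherwise there is no matching opener on the stack: drop the tag.
    out.append(html[pos:])
    return "".join(out)
```

Run on the crossed-over example from earlier in the thread, `repair("blah blah <B>blah blah<I> blah blah</B></I>")` returns `blah blah <B>blah blah<I> blah blah</i></B>`: the `</i>` is generated when `</B>` is reached, and the original `</I>` is removed.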

              $perlmonks{seattlejohn} = 'John Clyman';
