PerlMonks
Unbalanced Tags
by lzcd (Pilgrim)
on Jan 19, 2001 at 01:50 UTC ( [id://52917]=perlquestion )
lzcd has asked for the wisdom of the Perl Monks concerning the following question:
Howdy,

I'm in the process of writing what could be considered a poor man's version of Everything and have finally wandered around to the section dealing with submitted HTML stuff (a.k.a. Node editing).

While I'm fairly sure I can fiddle around with modules such as HTML:: to sift out any tags outside of the relatively safe ones (e.g. br, hr, p, b, strong and & chars), I am unsure how to proceed with the whole issue of unmatched tags.

I know there are some HTML tricks that'll allow the later series of browsers to "overlook" such nasties as unbalanced tags, but I would prefer to keep it safe and produce nice clean HTML 1.+ type code.

My current thinking on the subject is along the lines of creating a small hash to hold a "level count" for each tag, adding to or subtracting from the count during the parse, and then dumping a series of close tags for any tags that still appear "open".

This approach, AFAIK, will work fine for the simpler tags, where overlapping is okay, but I'm worried about what happens if I ever decide to progress to more complex tags, such as table handling, where the order of closing is important.

Is there a super dooper HTML::Parse->CloseAllDemTagsProperly call that I've missed? In the process of producing the PM site and the like, has somebody refined a handstrung routine to the point where it does everything short of writing the legal notice?

Thank you for your Infomercial time.
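
The worry about closing order can be made concrete by swapping the per-tag counters for a stack of open tags. What follows is a minimal sketch in plain Perl, not a vetted sanitizer. The names `balance_tags`, `%ALLOWED` and `%VOID` are hypothetical, the whitelist is just the tags named above, attributes are discarded, and the regex tokenizer is an assumption standing in for a real parser.

```perl
use strict;
use warnings;

# Hypothetical whitelist, taken from the tags named in the post.
my %ALLOWED = map { $_ => 1 } qw(br hr p b strong);
my %VOID    = map { $_ => 1 } qw(br hr);    # tags that never get a close tag

sub balance_tags {
    my ($html) = @_;
    my @open;        # stack of currently open tags, innermost last
    my $out = '';

    # Split into tags and text; the capturing group keeps the tags in the list.
    for my $chunk (split /(<[^>]*>)/, $html) {
        next if $chunk eq '';

        # Plain text is copied through untouched (no entity escaping here).
        if ($chunk !~ /^<[^>]*>$/) {
            $out .= $chunk;
            next;
        }

        my ($closing, $tag) = $chunk =~ m{^<\s*(/?)\s*([A-Za-z][A-Za-z0-9]*)};
        next unless defined $tag && $ALLOWED{ lc $tag };   # drop everything else
        $tag = lc $tag;

        if ($VOID{$tag}) {
            $out .= "<$tag>" unless $closing;
        }
        elsif (!$closing) {
            push @open, $tag;
            $out .= "<$tag>";       # attributes are deliberately not copied
        }
        elsif (grep { $_ eq $tag } @open) {
            # Close any inner tags first, then the one that was asked for.
            while (defined(my $top = pop @open)) {
                $out .= "</$top>";
                last if $top eq $tag;
            }
        }
        # A close tag with no matching open tag is silently dropped.
    }

    # Whatever is still open gets closed innermost-first.
    $out .= "</$_>" for reverse @open;
    return $out;
}

print balance_tags('<b>bold <p>para</b> tail'), "\n";
```

A counter hash only records that one `b` and one `p` are still open. The stack also records that `p` was opened inside `b`, which is the information needed once nesting order matters, as it would for tables.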