http://www.perlmonks.org?node_id=43852
Category: Web Stuff
Description:

This Perl filter fixes bad HTML comments, such as <!----------- really ---------bad ------ comments ---------->. (Such comments are bad because, according to the spec, each -- -- pair within <! > delimits one comment. In <!-- -- -->, for example, the first two -- open and close a comment and the third -- opens a second one that is never closed, so it is not a complete comment.)

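The code reads in the entire file (the -0 switch makes NUL the record separator, so a text file arrives as a single record), then finds each occurrence of <!-- ... --> and uses tr///s to squash each run of hyphens inside it to a single hyphen. The outermost hyphen on each side is matched outside the capture, so the result still begins with <!-- and ends with -->. The assignment to $x is necessary because $1 is read-only.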

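#!perl -p0
# -0: NUL is the record separator, so the whole file lands in $_ at once.

# Squash hyphen runs inside each comment; /e runs the replacement as code.
s/<!-(-.*?-)->/ (my $x = $1) =~ tr,-,,s; $x = '--' if $x eq '-'; "<!-$x->" /gse;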
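
For anyone who wants to try the substitution without the -p0 switches, here is a minimal sketch that wraps it in a sub and runs it on a few strings. The sub name fix_comments and the third sample string are made up for illustration; the regex is the one from the post.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# fix_comments is a hypothetical helper name. It applies the substitution
# from the post to a string argument instead of to $_ under -p0.
sub fix_comments {
    my ($html) = @_;
    $html =~ s/<!-(-.*?-)->/ (my $x = $1) =~ tr,-,,s; $x = '--' if $x eq '-'; "<!-$x->" /gse;
    return $html;
}

for my $in (
    '<!----------- really ---------bad ------ comments ---------->',
    '<!---->',                    # the empty comment from the first UPDATE
    '<p>a--b</p><!-- ok -->',     # made-up sample: hyphens outside a comment
) {
    print fix_comments($in), "\n";
}
# prints:
# <!-- really -bad - comments -->
# <!---->
# <p>a--b</p><!-- ok -->
```

Only hyphen runs inside a matched <!-- ... --> are touched; text outside comments passes through unchanged.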

UPDATE: Fixed so it no longer turns <!----> into <!--->. Thanks to extremely for pointing that out.

UPDATE: Caveat: don't use this code on files where Perl code may be embedded in an HTML comment, as in this HTML::Mason example: <!-- <% $x-- %> -->. Thanks to extremely for pointing this out too.
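
To see what the caveat looks like in practice, this small sketch runs the same substitution on the HTML::Mason example. The variable names $html and $out are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Single quotes keep $x-- literal; this is the Mason example from the post.
my $html = '<!-- <% $x-- %> -->';

# Copy first, then apply the substitution from the post to the copy.
(my $out = $html) =~ s/<!-(-.*?-)->/ (my $x = $1) =~ tr,-,,s; $x = '--' if $x eq '-'; "<!-$x->" /gse;

print "$out\n";   # prints <!-- <% $x- %> -->
```

The decrement $x-- has been squashed to $x-, which is no longer valid Perl, so the embedded code breaks even though the comment delimiters are intact.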

Re: fix bad HTML comments
by extremely (Priest) on Nov 29, 2000 at 09:56 UTC
    Great, so when I put an HTML::Mason tag in a comment for testing porpoises (yeah, fishing for errors) like this: <!-- <% $x-- %> -->, you're gonna blow it down? Oh yeah, and what if they do <!---->?

    I shouldn't pick on you, but you should really search this site a bit: run Super Search on the keywords qw( regex HTML fix ) and see what you get...

    --
    $you = new YOU;
    honk() if $you->love(perl)