I admit I don't fully understand the scope of the whole problem, but for allowing only certain tags you could start with:
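#!/usr/local/bin/perl -l -w
use strict;
my $str="<bad tag><a good tag> hello there<br></bad tag></a>";
# Whitelist of tag names; stored as hash keys for fast lookup
my @good_tags = qw(p a font br h1 h2 h3 h4 h5 h6);
my %good_tags;
@good_tags{@good_tags} = ();
# $1 is the whole tag, $2 its name; keep the tag only if the name is whitelisted
$str =~ s!(</?(\w*).*?>)!exists $good_tags{lc($2)} ? $1 : ''!eg;
print $str;   # prints: <a good tag> hello there<br></a>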
You can replace the '$1' with a call to a function that rewrites characters as you see fit, or capture the '.*?' in $3 and pass it along with $2 to a function that checks which extra attributes you allow on that particular tag. Either way, I'd do it in more than one step.
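As a rough sketch of that second idea (untested against real-world HTML): the names %allowed_attrs, check_tag and filter_html below are made up for illustration, and so is the attribute policy. It assumes attributes are written as name="value" with double quotes and silently drops anything else.

```perl
#!/usr/local/bin/perl -w
use strict;

# Hypothetical policy: tag name => hash of attribute names allowed on it.
my %allowed_attrs = (
    a    => { href  => 1 },
    font => { color => 1, size => 1 },
);
# The remaining good tags are allowed, but with no attributes at all.
$allowed_attrs{$_} ||= {} for qw(p br h1 h2 h3 h4 h5 h6);

# Decide what to emit for one tag.
# $whole is the full tag text, $name the tag name, $rest everything after it.
sub check_tag {
    my ($whole, $name, $rest) = @_;
    $name = lc $name;
    my $allowed = $allowed_attrs{$name} or return '';   # unknown tag: drop it
    return "</$name>" if $whole =~ m{^</};              # closing tag: no attributes

    my $kept = '';
    # Only name="value" pairs are recognised; anything else is discarded.
    while ($rest =~ /([\w-]+)\s*=\s*"([^"]*)"/g) {
        my ($attr, $value) = (lc $1, $2);
        $kept .= qq{ $attr="$value"} if $allowed->{$attr};
    }
    return "<$name$kept>";
}

# Same regex as above, with the '.*?' captured into $3.
sub filter_html {
    my ($str) = @_;
    $str =~ s!(</?(\w*)(.*?)>)!check_tag($1, $2, $3)!eg;
    return $str;
}

print filter_html('<a href="http://example.com" onclick="evil()">hi</a><script>x</script><br>'), "\n";
# prints: <a href="http://example.com">hi</a>x<br>
```

Note that this only strips tags and attributes; it does not look inside attribute values (a javascript: URL in href would pass), and since '.' does not match a newline, tags split across lines are left untouched. For anything exposed to untrusted input, a proper HTML parser is the safer route.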