
Re: Regex help

by sauoq (Abbot)
on Oct 01, 2002 at 21:32 UTC (#202121)

in reply to Regex help

If you are using this to strip potentially malicious code, you should be more liberal in what you match.

1. Use /i because tags can be upper, lower, or mixed cases.

<script>CODE</script> <SCRIPT>CODE</SCRIPT> <ScRiPt>CODE</ScRiPt>

2. Be careful of whitespace in tags.

<script >CODE</script> <script>CODE</script >

3. Be careful of what gets left behind after you strip it. (The following example is a good reason not to use a non-greedy match.)
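To illustrate point 3, here is a minimal sketch. The input string is made up for this illustration and is not necessarily the example the post had in mind; the non-greedy pattern is an assumed counterpart to a greedy one. The idea is that a hostile author can split one tag around another, so that removing the inner pair reassembles a working outer tag.

```perl
use strict;
use warnings;

# Hypothetical hostile input: an empty script pair wedged inside "<scr" . "ipt>".
my $input = '<scr<script></script>ipt>alert(1)</script>';

# Non-greedy: removes only the inner, empty pair, and the pieces on either
# side of it join up into a complete script element.
(my $lazy = $input) =~ s#<script.*?</script\s*>##gis;

# Greedy: removes everything from the first "<script" to the last "script>".
(my $greedy = $input) =~ s#<script.*script\s*>##gis;

print "non-greedy: $lazy\n";    # non-greedy: <script>alert(1)</script>
print "greedy:     $greedy\n";  # greedy:     <scr
```

Note that s///g does not rescan text it has already passed over, which is why the reassembled tag survives the non-greedy pass.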


I'd use something like jeffa's and eliminate as much as possible. I don't see any immediate problems with this: s#<script.*script\s*>##gis; but I didn't test it very thoroughly, and there may be some. You might consider substituting repeatedly until nothing matches, in order to be sure you've avoided the third issue above, but that may well be overkill.
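A rough sketch of the "substitute until nothing matches" idea, run against the case and whitespace variants from points 1 and 2. The strip_scripts name and the sample strings are made up for illustration, and this is no more thoroughly tested than the pattern itself.

```perl
use strict;
use warnings;

# Hypothetical helper: keep applying the substitution until a pass removes
# nothing. s/// returns the number of substitutions made, so the loop stops
# on the first pass that finds no match.
sub strip_scripts {
    my ($html) = @_;
    1 while $html =~ s#<script.*script\s*>##gis;
    return $html;
}

my @samples = (
    'a<script>CODE</script>b',        # lower case
    'a<SCRIPT>CODE</SCRIPT>b',        # upper case, handled by /i
    'a<ScRiPt >CODE</ScRiPt >b',      # mixed case plus whitespace in the tags
    "a<script>\nCODE\n</script>b",    # newlines inside, handled by /s
);

print strip_scripts($_), "\n" for @samples;   # prints "ab" four times
```

Because the match is greedy, anything sitting between two separate script blocks goes too, which fits "eliminate as much as possible" but is worth knowing before pointing this at a whole page.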

"My two cents aren't worth a dime.";
