
Re: Regex help

by sauoq (Abbot)
on Oct 01, 2002 at 21:32 UTC (#202121)

in reply to Regex help

If you are using this to strip potentially malicious code, you should be more liberal in what you match.

1. Use /i because tags can be upper, lower, or mixed case.

<script>CODE</script>
<SCRIPT>CODE</SCRIPT>
<ScRiPt>CODE</ScRiPt>

2. Be careful of whitespace in tags.

<script >CODE</script>
<script>CODE</script >

3. Be careful of what gets left behind after you strip it. (The following example is a good reason not to use a non-greedy match.)
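One input of this kind nests a complete script element inside the name of another tag. This is an assumed illustration, not necessarily the example originally posted, and the input string is made up. A minimal sketch comparing the non-greedy and greedy forms of the substitution:

```perl
use strict;
use warnings;

# Hypothetical input: a script element nested inside another tag's name.
my $input = '<scr<script>x</script>ipt>alert(1)</script>';

# Non-greedy: only the inner pair is removed, and the pieces on either
# side join up into a brand new script tag.
(my $lazy = $input) =~ s#<script.*?script\s*>##gis;

# Greedy: removes everything from the first "<script" to the last "script>".
(my $greedy = $input) =~ s#<script.*script\s*>##gis;

print "non-greedy: $lazy\n";    # non-greedy: <script>alert(1)</script>
print "greedy:     $greedy\n";  # greedy:     <scr
```

The non-greedy version hands back exactly what it was supposed to strip; the greedy one leaves only a harmless fragment for this input.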


I'd use something like jeffa's and eliminate as much as possible. I don't see any immediate problems with this: s#<script.*script\s*>##gis; but I didn't test it very thoroughly, so there may be some. You might consider substituting repeatedly until nothing matches, to be sure you've avoided the third issue above, but that may well be overkill.
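The repeated substitution could be sketched as below. The helper name strip_scripts and the test strings are made up for illustration, and like the regex itself this is lightly tested, not a vetted sanitizer:

```perl
use strict;
use warnings;

# Hypothetical helper: apply the greedy substitution until a pass
# removes nothing.
sub strip_scripts {
    my ($html) = @_;
    1 while $html =~ s#<script.*script\s*>##gis;
    return $html;
}

# Mixed case and stray whitespace are both covered by /i and \s*.
print strip_scripts('a<ScRiPt >CODE</SCRIPT >b'), "\n";    # prints "ab"

# A lone opening tag can still be left behind, because the pattern
# needs a second "script" to match.
print strip_scripts('<scr<script></script>ipt>'), "\n";    # prints "<script>"
```

The second call shows that looping does not catch everything: what is left there is an unclosed opening tag, which the pattern cannot match on its own.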

"My two cents aren't worth a dime.";
