Re: Regex help

I see a few potential problems here.

First, the regex you show won't work, because you didn't escape the / in </script>, so Perl treats that slash as the end of the pattern.
Second, the dot (.) doesn't match _anything_; by default it doesn't match newlines. The /s modifier at the end of the regex changes that behavior to what you want (see perlre).

So s/<script>(.*?)<\/script>//sg; should do what you want. I can't speak to the validity of what you're trying to do, but that should make the perl work :)
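To see what /s changes, here is a minimal sketch. The sample string in $html is made up for illustration and is not taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical sample input: a script block that spans several lines.
my $html = "before<script>\nalert('hi');\n</script>after";

# Without /s the dot stops at newlines, so the pattern never matches
# and nothing is removed.
(my $without_s = $html) =~ s/<script>(.*?)<\/script>//g;

# With /s the dot also matches newlines, so the whole block is removed.
(my $with_s = $html) =~ s/<script>(.*?)<\/script>//sg;

print "without /s: $without_s\n";
print "with /s:    $with_s\n";
```

With this input, the version without /s leaves the string untouched, while the version with /s leaves just "beforeafter".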

Update: Paren typo corrected per fglock below. (It was (.*)?, which is a greedy match; the ? is essentially pointless there, since it only makes a group that can already match nothing optional.) I left the parens in to show a capture, but fglock is completely correct that you don't need them.
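The difference between the two spellings shows up as soon as there is more than one script block. A small sketch, using a made-up input string:

```perl
use strict;
use warnings;

# Hypothetical sample input with two script blocks and text between them.
my $html = "a<script>one</script>b<script>two</script>c";

# (.*)? is a greedy group made optional: it runs on to the LAST </script>.
(my $greedy = $html) =~ s/<script>(.*)?<\/script>//sg;

# (.*?) is non-greedy: it stops at the FIRST </script> it can reach.
(my $lazy = $html) =~ s/<script>(.*?)<\/script>//sg;

print "greedy: $greedy\n";   # the "b" between the two blocks is lost
print "lazy:   $lazy\n";     # only the script blocks are removed
```

Here the greedy form swallows everything from the first <script> to the last </script>, leaving "ac", while the non-greedy form leaves "abc".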

Re: Re: Regex help by fglock (Vicar) on Oct 01, 2002 at 20:34 UTC
You mean `(.*?)`. Actually, you don't need parentheses: `s/<script>.*?<\/script>//sg;`
Re: Regex help by Anonymous Monk on Oct 01, 2002 at 23:40 UTC
Nope. The parentheses are optional, but can be VERY useful. For example, say you want to remove the <script> and </script>, but be able to give some sort of warning about the script tags. Say you filter out:

`<script> malicious_code_to_do_something_nasty </script>`

If you use your regex as <script>(.*?)<\/script>, it saves the smallest amount (the ?) of anything (the .*) into a variable. That variable's name depends on which set of parentheses it is. The first set gets saved into $1, the second into $2, and so forth. You can use it for something like this:

`$text = "my name is john q user";`
`$text =~ s/^my name is (.*?) .*$/$1/;  # removes "my name is ", saves the next word, essentially, into $1, removes the rest`
`print "hello, $text!\n";  # prints "hello, john!\n"`

This is VERY useful in extracting information from strings.

-dingoStick.com
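As a rough sketch of the warning idea above: collect what the parentheses captured, then strip the blocks. The input string and the variable names @removed and $clean are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical input; the script body is a placeholder, not real code.
my $html = "<p>hi</p><script> malicious_code_to_do_something_nasty </script><p>bye</p>";

# In list context with /g, a match returns every captured group.
my @removed = $html =~ /<script>(.*?)<\/script>/sg;

# Strip the script blocks themselves from a copy of the input.
(my $clean = $html) =~ s/<script>(.*?)<\/script>//sg;

# Report each removed script body on STDERR, then print the cleaned text.
warn "removed script block: $_\n" for @removed;
print "$clean\n";
```

This matches in one pass and substitutes in another, which keeps each step easy to read; it is one possible approach, not the only one.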
