Re: Can't match negated words.
by Abigail-II (Bishop) on Jun 24, 2004 at 12:44 UTC
|
If you just want to match a line in which a particular word doesn't appear, !~ does the trick. But if "not this word" is only part of a larger regular expression, !~ will not do it. Instead, you basically have to "progress carefully", looking ahead at each step on your way. That is, match a character (any character) only after you've concluded that it doesn't start a forbidden word. If no character in the (sub)string you match starts a forbidden word, no forbidden word will be in the matched string. How do you check? Use negative lookahead:
/^(?:(?!throw).)*$/s
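To see the pattern above at work, here is a minimal sketch. The helper name lacks_throw and the sample lines are made up for illustration; only the regex comes from the post.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $text contains no "throw" anywhere.
# Each "." is consumed only after (?!throw) confirms that "throw"
# does not start at that position.
sub lacks_throw {
    my ($text) = @_;
    return $text =~ /^(?:(?!throw).)*$/s ? 1 : 0;
}

for my $line ('int x = 1;', 'throw new Exception();', '// rethrow later') {
    printf "%-24s %s\n", $line, lacks_throw($line) ? 'no throw' : 'has throw';
}
```

On whole lines like these it should agree with a plain $line !~ /throw/; the lookahead form earns its keep once it is embedded in a larger expression.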
Abigail
|
Hi folks,
Thanks for all the replies!! This is great. I'm used to forums that take ages or never get back to you..
Abigail, ye've nailed it. That's exactly what I'm looking for, where "not throw" is part of the expression.
I'm not familiar with the look ahead stuff but I've a couple of places with examples and I'll look at them.
Incidentally are there any online resources you guys could recommend?
Thanks again everyone,
Mark.
Re: Can't match negated words.
by Fletch (Bishop) on Jun 24, 2004 at 12:26 UTC
|
You need to reread perldoc perlre. You've got a negated character class that's looking for one character that's not one of the ones listed. What you're really wanting is !~ to negate the sense of the search.
Re: Can't match negated words.
by rinceWind (Monsignor) on Jun 24, 2004 at 12:28 UTC
|
[^throw] matches a single character other than t, h, r, o or w. I don't think this is what you mean.
You are better off using !~, as regexp matching is positive matching rather than negative matching.
-- I'm Not Just Another Perl Hacker
Re: Can't match negated words.
by kesterkester (Hermit) on Jun 24, 2004 at 12:29 UTC
|
Hi Mark--
The /[^throw]/ regexp isn't doing what you think it's doing-- you've created a character class (with the square brackets) containing the letters 't', 'h', 'r', 'o', and 'w', and are matching any single character that isn't one of those (the ^ negates a character class).
So if the throw line in your Java contains ANY characters that aren't t, h, r, o, or w, it'll match. This is why you're getting unexpected behavior.
You're on the right track with using !~. That should do what you mean.
Try running "perldoc perlre" on your local system for a good intro to this type of thing.
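A small sketch of the difference, with made-up helper names and sample strings (none of them are from the original question):

```perl
use strict;
use warnings;

# Hypothetical helpers contrasting the two tests.
# class_match: true if some single character is not one of t, h, r, o, w.
sub class_match { my ($s) = @_; return $s =~ /[^throw]/ ? 1 : 0 }
# lacks_word: true if the substring "throw" does not occur at all.
sub lacks_word  { my ($s) = @_; return $s !~ /throw/    ? 1 : 0 }

for my $s ('throw new Exception();', 'throw', 'worth', 'int x;') {
    printf "%-24s class=%d lacks=%d\n", $s, class_match($s), lacks_word($s);
}
```

The first string shows the surprise: it contains "throw", yet the character class still matches because of the space and the other letters. "worth" shows the opposite case, where the class fails although the word "throw" is absent.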
Re: Can't match negated words.
by perlgags78 (Acolyte) on Jun 24, 2004 at 14:58 UTC
|
Hi folks,
Thanks for pointing that out Hugo.
I'm currently trying to extend it so that for the line
/* sdfthrow OtherException
prints 'Opening extended comment'
but does not for the line
throw new Exception() /* asdfasdf
Basically so long as the '/*' hasn't been preceded by
a throw then it should print out opening the comment.
I've got the following expression in place.
if ($line =~ /^.*\/\*((?:(?<!throw))).*$/ )
{
debug("Opening extended comment");
}
Is my understanding of this expression correct? Would folks mind if I explain what I think is going on from left to right?
/^ matches the start of the $line string
.* says that the start can be followed by any number of characters
\/\* matches the '/*' string
(?:(?<!throw)) means so long as /* isn't preceded in the string by the word throw $line still matches
.*$ means that any number of characters can follow the /* up to the end of the line
My understanding's obviously incorrect cos eh.. it don't work for a lad.
Any help would be greatly appreciated,
Mark.
|
/* throw /*
you might want to use something like:
m!^(?:(?!throw).)*/[*]!s
although that can probably be optimized (and the more you know about where you are going to match against, the more possibilities for optimizing there are).
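As a rough check of the pattern above against the lines discussed in this thread (the helper name opens_comment is hypothetical, and the last sample line is made up):

```perl
use strict;
use warnings;

# Hypothetical helper: true when a "/*" occurs with no "throw" before it.
sub opens_comment {
    my ($line) = @_;
    return $line =~ m!^(?:(?!throw).)*/[*]!s ? 1 : 0;
}

for my $line (
    '/* sdfthrow OtherException',
    'throw new Exception() /* asdfasdf',
    '/* throw /*',
    'x = 1; // no comment here',
) {
    printf "%-36s %s\n", $line, opens_comment($line) ? 'opens comment' : 'no';
}
```

Because the pattern is anchored with ^, the engine cannot skip past a leading "throw" to find a later "/*", which is what makes the second line fail.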
Abigail
|
Hi Abigail,
I'd like to match the first '/*' and check that there's no throw declared before it.
I think it should look something like this?
m!^/[*].*?(?:(?!throw).)*/[*]!s
I can't really understand what the '.)*' after the throw means.
Could you explain it briefly?
Thanks,
Mark.
|
|
Putting dot-star next to an anchor is pointless. Just throw out the anchor and the dot-star. That leaves you with:
/\/\*((?:(?<!throw)))/
You've got capturing parentheses around non-capturing parentheses, around a negative lookbehind. You only need the parens for the negative lookbehind:
/\/\*(?<!throw)/
Ok, now you've matched "/*", and at that point, you're looking back to ensure that what comes before you isn't "throw". It can't be, because it ends in "/*". You can't really check everything up to the "/*" with a negative lookbehind, because negative lookbehinds can't be variable-length, and your line can be. You can do it with negative lookahead:
if ($line =~ /^(?:(?!throw).)*?\/\*/)
That will be any number of characters that isn't the start of "throw", followed by "/*". The *? makes it take the first "/*" rather than the last.
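Here is a small sketch of both points: the lookbehind that can never fail, and the effect of *? versus *. The helper end_of_match and the string in $two are invented for illustration.

```perl
use strict;
use warnings;

my $line = 'throw new Exception() /* asdfasdf';

# The lookbehind inspects the five characters just before the current
# position. Those always end in "/*", so they can never spell "throw".
my $lookbehind = $line =~ /\/\*(?<!throw)/ ? 1 : 0;

# The anchored lookahead version checks every character before the "/*".
my $lookahead = $line =~ /^(?:(?!throw).)*?\/\*/ ? 1 : 0;

# Hypothetical helper: offset just past the "/*" the pattern settles on,
# or -1 when there is no match. $+[0] is the end offset of the whole match.
sub end_of_match {
    my ($text, $re) = @_;
    return $text =~ $re ? $+[0] : -1;
}

my $two    = 'a /* b /* c';
my $lazy   = end_of_match($two, qr/^(?:(?!throw).)*?\/\*/);
my $greedy = end_of_match($two, qr/^(?:(?!throw).)*\/\*/);

print "lookbehind=$lookbehind lookahead=$lookahead lazy=$lazy greedy=$greedy\n";
```

The lazy form should stop just after the first "/*" and the greedy form just after the second, which only matters if you go on to match more of the line after the "/*".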
Please see this node about YAPE::Regex::Explain for a helpful module.
We're not really tightening our belts, it just feels that way because we're getting fatter.
Re: Can't match negated words.
by perlgags78 (Acolyte) on Jun 24, 2004 at 13:28 UTC
|
Hi folks,
I may have screwed my original post up.
I wanted to simply reply but instead I seem to have deleted the original message.
I amended my code as follows
if ($line =~ /(?:(?!throw).)/ )
{
debug ("Doesn't contain throw");
}
yet it still printed the statement for the line
//throw OtherException
I've put '.*' at the start and end of the reg exp but it still doesn't see the throw in the throw line.
Thanks,
Mark.
|
if ($line =~ /^(?:(?!throw).)*$/s) {
debug ("Doesn't contain throw");
}
An alternative formulation, which I find slightly cleaner, is to recast it as a single negative lookahead: if ($line =~ /^(?!.*?throw)/) { ... }
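A quick side-by-side of the two formulations, as a sketch. The helper names are made up; the first sample line is the one from the question.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two formulations.
# Stepwise: consume one character at a time, checking before each one.
sub no_throw_stepwise { my ($s) = @_; return $s =~ /^(?:(?!throw).)*$/s ? 1 : 0 }
# Single lookahead: assert once, at the start, that no "throw" lies ahead.
sub no_throw_single   { my ($s) = @_; return $s =~ /^(?!.*?throw)/     ? 1 : 0 }

for my $s ('//throw OtherException', 'int x = 1;', '') {
    printf "%-26s stepwise=%d single=%d\n",
        "'$s'", no_throw_stepwise($s), no_throw_single($s);
}
```

On single lines the two agree. On a string containing embedded newlines they can differ, since the second pattern has no /s modifier and its dot stops at the first newline.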
Hugo
|
hi Hugo,
I see that you've a ? before the throw. What does it do? Is the '?' associated with the .* or the throw?
Also I'm having hassle getting return characters to appear in my posts so they kinda look like one line posts. Any ideas?
Thanks,
Mark.
Re: Can't match negated words.
by perlgags78 (Acolyte) on Jun 24, 2004 at 13:50 UTC
|
Hi folks,
I misinterpreted Abigail's original code suggestion
and as soon as I swapped it in it worked fine. Thanks again
for that Abigail.
Thanks,
Mark.
Re: Can't match negated words.
by perlgags78 (Acolyte) on Jun 24, 2004 at 12:53 UTC
|
Hi folks,
Thanks for all the replies!! This is great. I'm used to forums that take ages or never get back to you..
Abigail, ye've nailed it. That's exactly what I'm looking for, where "not throw" is part of the expression.
I'm not familiar with the look ahead stuff but I've a couple of places with examples and I'll look at them.
Incidentally are there any online resources you guys could recommend?
Thanks again everyone,
Mark.
Re: Can't match negated words.
by perlgags78 (Acolyte) on Jun 24, 2004 at 17:17 UTC
|
Can anyone explain what is meant by clustering and capturing?
Or even the difference between them?
I'm reading the docs and came across this.
This is for clustering, not capturing; it groups subexpressions like "()", but doesn't make backreferences as "()" does. So
@fields = split(/\b(?:a|b|c)\b/)
is like
@fields = split(/\b(a|b|c)\b/)
but doesn't spit out extra fields.
Thanks,
Mark.
|
Clustering is grouping, like in an algebraic expression. Parentheses limit how far back and forward an alternator (vertical bar) applies:
/foo|bar/;    # Matches "foo" or "bar"
/fo(o|b)ar/;  # Matches "fooar" or "fobar"
Grouping also allows quantifiers to apply to more than one atom:
/foo{3}/ # Matches "foooo"
/(foo){3}/ # Matches "foofoofoo"
Capturing is storing the parenthesized portion of the match somewhere that you can refer back to it (as $1, or as an element of the list returned by a match, for example). Ordinary parentheses are capturing parentheses. Special parentheses (any that have a ? after the opening paren) are non-capturing. All parentheses group their contents.
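A minimal sketch of the difference, using the two split patterns from the question. The sample string in $text is made up.

```perl
use strict;
use warnings;

my $text = 'x a y b z';

# Non-capturing (clustering) group: the separators are discarded.
my @cluster = split /\b(?:a|b|c)\b/, $text;

# Capturing group: split also returns each captured separator.
my @capture = split /\b(a|b|c)\b/, $text;

print scalar(@cluster), " fields: ", join('|', @cluster), "\n";
print scalar(@capture), " fields: ", join('|', @capture), "\n";

# A capture is also available as $1 after a successful match.
if ('fobar' =~ /fo(o|b)ar/) {
    print "captured: $1\n";
}
```

Both groups restrict the alternation in the same way; only the capturing one leaves the matched separator behind, either as extra list elements from split or in $1.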
We're not really tightening our belts, it just feels that way because we're getting fatter.