Negative Lookahead Assertion Problem

by Limbic~Region (Chancellor)
on Jul 18, 2005 at 23:05 UTC ( #475910=perlquestion )
Limbic~Region has asked for the wisdom of the Perl Monks concerning the following question:

In #perl tonight on IRC, someone asked how to match 'Foo::Bar' but not 'Foo::Bar\s*('. Not having very strong fu, I looked up perlre and found the negative lookahead assertion.
I was told that it was inadequate because it didn't work in the general case (where Foo::Bar is replaced by \w+::\w+). I found that Foo could change to \w+ without a problem but that Bar couldn't. I believe this has to do with the zero-width nature of assertions, but why, in plain English, doesn't the following work?
And how would I properly answer this question in the future? While I have never claimed to have more regex knowledge than I have, I find not knowing the answer to this a bit embarrassing after using the language for 3 years.

Cheers - L~R

Update: Thanks all! I really should spend some time with The Owl and TFM.

Replies are listed 'Best First'.
Re: Negative Lookahead Assertion Problem
by itub (Priest) on Jul 18, 2005 at 23:13 UTC
    Let's say you are trying to match /\w+::\w+(?!\s*\()/ against 'Foo::Bar ('. Initially the \w+::\w+ will match "Foo::Bar"; then the engine notices that the assertion fails (because what follows is " ("), so it backtracks to "Foo::Ba", where it tries the assertion again, and this time it succeeds (because "r (" isn't matched by \s*\().
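The backtracking described above can be observed by capturing what the pattern actually consumed. This is a minimal sketch that assumes the pattern from this reply; the capture group and the variable names are added only for illustration.

```perl
use strict;
use warnings;

my $text = 'Foo::Bar (';

# Capture group added for illustration, to show what was consumed.
my ($matched) = $text =~ /(\w+::\w+)(?!\s*\()/;

# The second \w+ gives back the final "r"; the lookahead then sees "r (",
# which \s*\( cannot match, so the negative assertion succeeds.
print defined $matched ? "matched: '$matched'\n" : "no match\n";
# prints: matched: 'Foo::Ba'
```

The match succeeds, but on a truncated name, which is why the pattern appears to "match" strings it was meant to reject.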

    To get what you want, you could try to ensure that you reached the end of the word, with something like /\w+::\w+\b(?!\s*\()/ .

      /\w+::(?>\w+)(?!\s*\()/ might be better for the person that has to read your code later
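Both suggestions above stop the second \w+ from giving characters back. The following sketch runs the two patterns side by side; the sample strings, the hash names, and the output layout are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# The two fixes proposed in this subthread.
my %fixed = (
    boundary => qr/\w+::\w+\b(?!\s*\()/,     # \b pins the end of the word
    atomic   => qr/\w+::(?>\w+)(?!\s*\()/,   # (?>...) forbids backtracking into \w+
);

# Illustrative inputs: only the first and last should match.
my @samples = ('Foo::Bar', 'Foo::Bar(', 'Foo::Bar (', 'Foo::Bar is a module');

my %result;
for my $name (sort keys %fixed) {
    for my $s (@samples) {
        $result{$name}{$s} = $s =~ $fixed{$name} ? 1 : 0;
        printf "%-8s %-22s %s\n", $name, "'$s'",
            $result{$name}{$s} ? 'match' : 'no match';
    }
}
```

On these inputs the two variants agree: the bare name and the name followed by ordinary text match, while both forms followed by an opening parenthesis do not.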
Re: Negative Lookahead Assertion Problem
by Enlil (Parson) on Jul 18, 2005 at 23:40 UTC
    Your question as to why it doesn't work has already been answered. (Basically, the engine backtracks until the assertion succeeds, and giving back one character is enough to satisfy the regex.)

    This should work: /\w+::(?>\w+)(?!\s*\()/

    #!/usr/bin/perl
    use strict;
    use warnings;

    my @strings = (
        '1. Foo::Bar',
        '2. Foo::Bar(',
        '3. Foo::Bar (',
        '4. Foo::Bar '
    );

    for ( @strings ) {
        print $_,$/ if /\w+::(?>\w+)(?!\s*\()/;
    }
    __END__
    1. Foo::Bar
    4. Foo::Bar
    Have a look at perlre where it reads about (?>pattern)


Re: Negative Lookahead Assertion Problem
by BrowserUk (Pope) on Jul 18, 2005 at 23:41 UTC

    How about

    /Foo::Bar(?!\s*\()(?=\W|$)/
    #or
    /\w+::\w+(?!\s*\()(?=\W|$)/
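A hedged sketch of why the second lookahead helps: (?=\W|$) requires the match to end at a non-word character or at the end of the string, so the backtracked "Foo::Ba" is rejected because "r" follows it. The capture group, the sample strings, and the %got hash are assumptions added for illustration.

```perl
use strict;
use warnings;

# Capture group added for illustration; the lookaheads are as posted.
my $re = qr/(\w+::\w+)(?!\s*\()(?=\W|$)/;

my %got;
for my $s ('Foo::Bar', 'Foo::Bar (', 'Foo::Bar is my favorite module') {
    # On failure the match returns an empty list, leaving the value undef.
    ($got{$s}) = $s =~ $re;
    printf "%-34s %s\n", "'$s'",
        defined $got{$s} ? "captured '$got{$s}'" : 'no match';
}
```

When a match does occur, the capture is always the whole name rather than a truncated prefix.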

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
Re: Negative Lookahead Assertion Problem
by monarch (Priest) on Jul 18, 2005 at 23:58 UTC
    Surely trying to match \s* is futile in a negative look-ahead? Because \s* can match nothing it will always succeed, thereby never matching? I tried using \s+.

    The following code:

    #!/usr/bin/perl -w
    use strict;

    my @strings = (
        "Foo::Bar()",
        "Foo::Bar ()",
        "Foo::Bar ()",
        " Foo::Bar()",
        " Foo::Bar ()"
    );

    my %test = (
        1 => qr/Foo::Bar(?!\s*\()/,
        2 => qr/Foo::Bar(?!\s+\()/
    );

    # Reporting loop (assumed; only the data and patterns appear in the post)
    for my $string ( @strings ) {
        print "---$string---\n";
        for my $n ( sort keys %test ) {
            print "test $n ", ( $string =~ $test{$n} ? "matches" : "does not match" ), "\n";
        }
    }

    produces the following output:

    ---Foo::Bar()---
    test 1 does not match
    test 2 matches
    ---Foo::Bar ()---
    test 1 does not match
    test 2 does not match
    ---Foo::Bar ()---
    test 1 does not match
    test 2 does not match
    --- Foo::Bar()---
    test 1 does not match
    test 2 matches
    --- Foo::Bar ()---
    test 1 does not match
    test 2 does not match
      You missed the literal ( after the \s*. Together, \s*( does specify a meaningful assertion. In your tests, all your sample data is supposed to not match by my understanding of the OP. Only something such as "Foo::Bar is my favorite module" should match.
        I'm confused by your post. Breaking down qr/Foo::Bar(?!\s+\()/ we get:
        Foo::Bar   # literal 'Foo::Bar'
        (?!        # start of negative look-ahead
        \s+        # one or more white space chars
        \(         # literal opening bracket
        )          # end of negative look-ahead
        so to answer your reply, I believe I provided the literal ( after the \s*.

        To answer your other statement, that something such as "Foo::Bar is my favourite module" should match, I agree - that is how I read the initial requirement from the thread. In which case my answer of using \s+ kept within the spirit of the original statements made by the thread author. However I wanted to keep my example in line with the example that the author had tried, hence the use of the literal (. Hope this clears up my answer.

      I have to agree with ysth. The \s* is not meaningless, because it is followed immediately by a literal '(' (which is what his comment was referring to). For the assertion's sub-pattern to match, it has to match zero or more whitespace characters and then an open paren. In your solution there has to be at least one space, which doesn't fit the bill.

      Cheers - L~R
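The difference between the two lookaheads discussed in this subthread can be checked directly. This is a sketch only; the labels 'star' and 'plus', the %verdict hash, and the choice of sample strings are assumptions made for illustration.

```perl
use strict;
use warnings;

my %lookahead = (
    star => qr/Foo::Bar(?!\s*\()/,   # rejects "(" after zero or more spaces
    plus => qr/Foo::Bar(?!\s+\()/,   # rejects "(" only after at least one space
);

my @samples = ('Foo::Bar is my favorite module', 'Foo::Bar()', 'Foo::Bar ()');

my %verdict;
for my $name (sort keys %lookahead) {
    for my $s (@samples) {
        $verdict{$name}{$s} = $s =~ $lookahead{$name} ? 1 : 0;
        printf "%-5s %-34s %s\n", $name, "'$s'",
            $verdict{$name}{$s} ? 'match' : 'no match';
    }
}
```

The \s* form still matches ordinary text after the name, so the assertion is not vacuous, while the \s+ form lets 'Foo::Bar()' through, which is the case the original question wanted to exclude.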

Re: Negative Lookahead Assertion Problem
by Anonymous Monk on Jul 18, 2005 at 23:19 UTC
    someone on IRC pointed out: /(\w+?::\w+)\b(?!\s*\()/ It's not the answer to your question, though
      you have an extra ?
Re: Negative Lookahead Assertion Problem (Perl 6)
by kelan (Deacon) on Jul 19, 2005 at 12:10 UTC

    Here's how to do it in Perl 6 (I believe):

    rx/ \w+ \:\: \w+ :: <!before \s*(> /
    #                ^^ important
    This is basically your example with a small addition. The double colons pointed out above tell the engine not to backtrack once it reaches that point, so it will work this way where it wouldn't in Perl 5.

Node Type: perlquestion [id://475910]
Approved by blokhead