PerlMonks  

Regex AND NOT with zero-width negative lookahead assertion

by mldvx4 (Scribe)
on Mar 25, 2020 at 11:39 UTC (#11114633=perlquestion)

mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use zero-width negative lookahead assertions to add an AND NOT logical clause to my pattern. I want to match /aaa/ and /aaa\/aaa/ but not /aaa/aaa/, that is, a single span delimited by unescaped slashes. What I have so far matches too much:

#!/usr/bin/perl
use strict;
use warnings;

while (<DATA>) {
    # print $+{pattern},qq(\n) if (
    # for now, show the whole line
    print if (
        m,
            ^
            (?!\x23)
            (?:.*?)
            (?<pattern>
                m?
                (?<delimiter>[/])
                (?:(?:\\?+.)*?)*?
                \g{delimiter}
            )
        ,x
    );
}

__DATA__
#!/usr/bin/perl
foo bar
foo/bar
/src/bin/oops/otehnoes
if(/ok .*$/) {
    print "not OK\n";
}
# skip a comment
if(m/a good match/( {
    print "not ok\n";
}
# do not /print/ this line either
$string =~ s/[a-f]//g; # but print this line
( $string ) = ( $string =~ /(show this)/));
/but show\/this, too/
my $butdontprintthis = "/var/cache/dictionaries-common";

However, I have not found a successful way to match /aaa/ while at the same time rejecting /aaa/aaa/

What should I append towards the end of the pattern to ensure that /aaa/aaa/aaa/ and other similar strings are not accepted by the pattern, yet /aaa/ alone would be? I have tried many scores of permutations of what to tack on, but no success yet. The script above produces the following result:

/src/bin/oops/otehnoes
if(/ok .*$/) {
if(m/a good match/( {
$string =~ s/[a-f]//g; # but print this line
( $string ) = ( $string =~ /(show this)/));
/but show\/this, too/
my $butdontprintthis = "/var/cache/dictionaries-common";

But it should produce the following instead:

if(/ok .*$/) {
if(m/a good match/( {
$string =~ s/[a-f]//g; # but print this line
( $string ) = ( $string =~ /(show this)/));
/but show\/this, too/
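
One way to see why the pattern above accepts too much is to print the named capture rather than the whole line. The sketch below reuses the pattern from the script unchanged; the helper name first_span and the two sample lines are only for illustration. It suggests that the pattern stops at the first pair of slashes and never looks at what follows the closing one, so /src/ alone is enough for the line to be printed.

```perl
use strict;
use warnings;

# Hypothetical helper: return the text captured by the named group
# "pattern" from the script above, or undef when the line does not match.
sub first_span {
    my ($line) = @_;
    return unless $line =~ m,
        ^
        (?!\x23)
        (?:.*?)
        (?<pattern>
            m?
            (?<delimiter>[/])
            (?:(?:\\?+.)*?)*?
            \g{delimiter}
        )
    ,x;
    return $+{pattern};
}

# Show which part of each sample line the pattern actually captured.
for my $line ('/src/bin/oops/otehnoes', 'if(/ok .*$/) {') {
    my $span = first_span($line);
    print "$line => ", (defined $span ? $span : '(no match)'), "\n";
}
```

Any fix therefore has to constrain what comes before the opening slash or after the closing one, not only the span itself.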

Replies are listed 'Best First'.
Re: Regex AND NOT with zero-width negative lookahead assertion
by hippo (Chancellor) on Mar 25, 2020 at 12:13 UTC

    I think this might do what you want.

    use strict;
    use warnings;
    use Test::More tests => 1;

    my $want = <<'EOT';
    if(/ok .*$/) {
    if(m/a good match/( {
    $string =~ s/[a-f]//g; # but print this line
    ( $string ) = ( $string =~ /(show this)/));
    /but show\/this, too/
    EOT

    my $have = '';
    while (<DATA>) {
        $have .= $_ if m~^(?!#)[^/]*/[^/]+(\\/|/(?!\w+/?))~;
    }

    is $have, $want;

    __DATA__
    #!/usr/bin/perl
    foo bar
    foo/bar
    /src/bin/oops/otehnoes
    if(/ok .*$/) {
        print "not OK\n";
    }
    # skip a comment
    if(m/a good match/( {
        print "not ok\n";
    }
    # do not /print/ this line either
    $string =~ s/[a-f]//g; # but print this line
    ( $string ) = ( $string =~ /(show this)/));
    /but show\/this, too/
    my $butdontprintthis = "/var/cache/dictionaries-common";
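
For anyone reading along, here is one way to read that pattern piece by piece. This is only a sketch: it is the same pattern spread out with /x and commented, and the sample lines are taken from the question's data. Because /x is used here, the # has to be written as \#. The trailing /? in the lookahead does not seem to change anything, since \w+ already matches as soon as the next character is a word character.

```perl
use strict;
use warnings;

# The pattern from the reply above, laid out with /x for readability.
# Under /x a bare '#' would start a comment, so it is escaped here.
my $re = qr~
    ^ (?! \# )           # the line must not start with a hash sign
    [^/]* /              # everything up to the first slash, the opening delimiter
    [^/]+                # one or more non-slash characters
    (
        \\ /             # either a backslash-escaped slash
      |
        / (?! \w+ /? )   # or a slash that is not followed by a word character
    )
~x;

my @lines = (
    'if(/ok .*$/) {',
    '/src/bin/oops/otehnoes',
    '# do not /print/ this line either',
    '/but show\/this, too/',
);

# Report which sample lines the pattern keeps.
for my $line (@lines) {
    printf "%-40s %s\n", $line, ($line =~ $re ? 'kept' : 'skipped');
}
```

The key difference from the question's pattern appears to be [^/]* at the front: the first slash on the line is forced to be the opening delimiter, so the match cannot slide along a path such as /src/bin/oops/otehnoes.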

      Thanks. (And thanks to those below, too.)

      That seems to achieve the desired results, if I escape the pound sign (#) there. Otherwise that gets interpreted as a comment.

        How odd. It works fine exactly as posted for me in Perl 5.20.3. Which Perl version are you using which requires the escape?

        Update: also tested successfully on v5.10.1, v5.16.3, v5.26.1 and v5.30.0.
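
The thread does not say how the pattern was being run when the escape turned out to be needed. One possible explanation, offered only as an assumption, is that it was pasted into a regex carrying the /x modifier, as in the original script. Under /x an unescaped # outside a character class starts a comment, which would swallow the rest of (?!#). The sketch below compiles a shortened stand-in pattern both ways; compiles_ok is a made-up helper, not anything from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the pattern source compiles,
# 0 when the regex compiler rejects it. $use_x selects the /x modifier.
sub compiles_ok {
    my ($source, $use_x) = @_;
    my $re = eval { $use_x ? qr/$source/x : qr/$source/ };
    return defined $re ? 1 : 0;
}

my $plain   = '^(?!#)\w+';     # '#' is an ordinary character without /x
my $escaped = '^(?!\#)\w+';    # escaped form, usable with or without /x

print "plain,   no /x: ", compiles_ok($plain,   0), "\n";
print "plain,   /x:    ", compiles_ok($plain,   1), "\n";
print "escaped, /x:    ", compiles_ok($escaped, 1), "\n";
```

If that is what happened, the Perl version would not matter; writing \# or \x23 is harmless either way.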

Re: Regex AND NOT with zero-width negative lookahead assertion
by Veltro (Hermit) on Mar 25, 2020 at 13:07 UTC

    I haven't looked at hippo's answer yet, but in the meantime I tried:

    #!/usr/bin/perl
    use strict;
    use warnings;

    while( <DATA> ) {
        chomp ;
        print "$_ -> " ;
        if ( m/^\/(?:[^\/]|(?<=\\)\/)+\/$/ ) {
            print " matched\n" ;
        } else {
            print " not matched\n" ;
        }
    }

    __DATA__
    /aaa/
    /\a\a\a\/\b\b\b/
    /aaa\/bbb/
    /aaa/bbb/

    Output

    /aaa/ -> matched
    /\a\a\a\/\b\b\b/ -> matched
    /aaa\/bbb/ -> matched
    /aaa/bbb/ -> not matched
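
A possible refinement, not part of the reply above: the lookbehind only checks the single character before a slash, so a string such as /aaa\\/bbb/ is accepted even though the backslash before the slash is itself escaped. Assuming a doubled backslash is meant as an escaped backslash, the sketch below consumes each backslash together with the character it escapes instead. The name is_single_span and the extra test string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the whole string is one
# slash-delimited span, 0 otherwise. A backslash and the character
# after it are always consumed together as a unit.
sub is_single_span {
    my ($text) = @_;
    return $text =~ m{^/(?:[^\\/]|\\.)+/$} ? 1 : 0;
}

# In single quotes, \\\\ yields two literal backslashes and \/ stays as is.
for my $text ('/aaa/', '/aaa\/bbb/', '/aaa/bbb/', '/aaa\\\\/bbb/') {
    print "$text -> ", (is_single_span($text) ? 'matched' : 'not matched'), "\n";
}
```

For the four strings tested in the reply both patterns agree; they differ only when a backslash is itself escaped.
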
Re: Regex AND NOT with zero-width negative lookahead assertion
by Anonymous Monk on Mar 25, 2020 at 12:41 UTC
      Specifically, this might serve -
      use Regexp::Common qw /delimited/;
      # m{...} is used so that the '/' in -delim does not end the match operator early
      $str =~ m{^$RE{delimited}{-delim=>'/'}$};
      The $RE{delimited} regex accepts backslash escapes by default.
