http://www.perlmonks.org?node_id=606495

lokiloki has asked for the wisdom of the Perl Monks concerning the following question:

The following presumably means match anything that does not contain either "a" or "b":

[^ab]+

How can I write code that means match anything that does not contain, for instance, "ah" or "bh"? Would this work:

[^(ah|bh)]+

Or this:

(^(ah|bh))+

Re: RegEx: How to negate more than one character?
by Sidhekin (Priest) on Mar 26, 2007 at 00:15 UTC

    How can I write code that means match anything that does not contain, for instance, "ah" or "bh"? Would this work:
    [^(ah|bh)]+

    Your question is not as simple as it seems. Strings of characters aren't quite interchangeable with characters. There are several semantics available, all of which reduce to the same case for single-character strings.
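
A minimal sketch of why the bracketed attempt from the question does not do what was hoped: inside a character class, `(`, `|` and `)` are ordinary characters, so `[^(ah|bh)]` simply excludes the six characters `(`, `a`, `h`, `|`, `b` and `)`. The sample strings and the `\A`/`\z` anchors are assumptions added for illustration.

```perl
use strict;
use warnings;

# Inside [...], the characters ( | ) are literals, not grouping or alternation.
my $class = qr/\A[^(ah|bh)]+\z/;

# Illustrative sample strings (not from the thread).
my %result;
for my $s ('xyz', 'hello', 'x|y', 'cab') {
    $result{$s} = $s =~ $class ? 'match' : 'no match';
}

printf "%-6s %s\n", $_, $result{$_} for sort keys %result;
```

Note that 'hello' and 'cab' are rejected even though neither contains "ah" or "bh", because single characters from the class appear in them.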

    This one may match, as the first character of the match, an "h" from "ah" or "bh":

    qr/(?:(?!ah|bh).)+/s;

    This one may match, as the last character of the match, an "a" or "b" from "ah" or "bh":

    qr/(?:.(?<!ah)(?<!bh))+/s;

    This one may match neither of the above; for instance, only "xx" will be matched in "bhxxah":

    qr/(?:(?!ah|bh).(?<!ah)(?<!bh))+/s;

    This one may match either or both (or, with /g, all) of the above, for instance "b", "hxxa", and/or "h" from "bhxxah":

    qr/(?!ah|bh).(?:.(?<!ah)(?<!bh))*|(?:a|b)/s;

    And that's just for simple two-character strings ...
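
The four patterns above can be compared side by side with a short script. This is only a sketch: the hash keys are labels invented here, and 'bhxxah' is the sample string mentioned in the reply.

```perl
use strict;
use warnings;

# The four patterns from the reply; the key names are hypothetical labels.
my %pattern = (
    lookahead_only  => qr/(?:(?!ah|bh).)+/s,
    lookbehind_only => qr/(?:.(?<!ah)(?<!bh))+/s,
    both            => qr/(?:(?!ah|bh).(?<!ah)(?<!bh))+/s,
    either          => qr/(?!ah|bh).(?:.(?<!ah)(?<!bh))*|(?:a|b)/s,
);

my $text = 'bhxxah';
my %found;
for my $name (sort keys %pattern) {
    my $re = $pattern{$name};
    # No capture groups are used, so /g in list context returns whole matches.
    $found{$name} = [ $text =~ /$re/g ];
    print "$name: @{ $found{$name} }\n";
}
```

On this input the "both" pattern should yield only "xx", and the "either" pattern should yield "b", "hxxa" and "h", as described in the reply.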

    print "Just another Perl ${\(trickster and hacker)},"
    The Sidhekin proves Sidhe did it!

Re: RegEx: How to negate more than one character? ((?!...).)
by tye (Sage) on Mar 26, 2007 at 00:54 UTC

Re: RegEx: How to negate more than one character?
by shigetsu (Hermit) on Mar 25, 2007 at 23:07 UTC

    I recommend reading perlrequick and perlretut to gather some insight on how regular expressions work.

    Nevertheless, here's a possible implementation:

    my $re = qr/(?:a|b)h/;
    print 'abcdef' !~ $re;
    print 'ahcdef' !~ $re;
    print 'abbhef' !~ $re;
    yields
    1 (0) (0)
      Thanks, but I am looking for the negation to be part of a much more complex overall regular expression. That is, how can I do the negation within the expression itself, rather than relying on !~?

      What I am trying to do is match outermost <% ... %> "brackets" while also handling possible internal pairs of my own unique bracketing system:

      $re = qr{\<\%(?:(?>[^(\<\%|\%\>)]+)|(??{$re}))\%\>};

      I guess what I am looking for is a replacement to the [^(\<\%|\%\>)]+ part above (unless that is correct).

        You can probably use a negative lookahead assertion: $str =~ m/$pattern(?!ab|cd)$morepattern/; It is a zero-width assertion, but it might do what you want. It is described about halfway through perlre.

        -Paul
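
One way the lookahead suggestion might be applied to the nested <% ... %> pattern is sketched below. It replaces the character class with a "consume one character as long as neither delimiter starts here" group. The `*` after the alternation, the /x layout, the sample text and the `@blocks` array are assumptions added here, not part of the original post.

```perl
use strict;
use warnings;

my $re;
$re = qr{
    <%
    (?:
        (?> (?: (?! <% | %> ) . )+ )   # run of characters where no delimiter starts
      |
        (??{ $re })                    # a nested, balanced pair
    )*
    %>
}xs;

# Hypothetical sample text with one nested block and one flat block.
my $text = 'before <% outer <% inner %> tail %> after <% second %>';

my @blocks;
while ($text =~ /$re/g) {
    push @blocks, $&;                  # the whole outermost block
}
print "$_\n" for @blocks;
```

The atomic group `(?>...)` is kept from the original attempt so that a failed match does not retry every way of splitting the inner text.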

Re: RegEx: How to negate more than one character?
by Zaxo (Archbishop) on Mar 26, 2007 at 05:28 UTC

    I think

    my $re = qr/(?:[^ab][^h])|(?:[^ab][h])|(?:[ab][^h])/;
    does what you want. That technique will get awkward quickly as more excluded combinations are added.

    Matching a longer string will nearly always succeed unless you anchor the match to some expected position or embed it in a larger expression. You may want to consider negative matching,

    $string !~ /[ab]h/;
    for detecting strings which contain no example of 'ah' or 'bh'.
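
The caveat about unanchored matching can be seen in a small sketch. The sample strings and the `pair`/`clean` labels are made up for illustration.

```perl
use strict;
use warnings;

# The two-character pattern from the reply, unanchored.
my $pair = qr/(?:[^ab][^h])|(?:[^ab][h])|(?:[ab][^h])/;

my @report;
for my $s ('ah', 'xah', 'xyz') {
    my $pair_hit = $s =~ $pair   ? 1 : 0;   # some allowed pair occurs somewhere
    my $clean    = $s !~ /[ab]h/ ? 1 : 0;   # no "ah" or "bh" anywhere
    push @report, "$s pair=$pair_hit clean=$clean";
}
print "$_\n" for @report;
```

Here 'xah' is accepted by the unanchored pair pattern (it finds "xa") even though the string contains "ah", while the negated match rejects it.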

    After Compline,
    Zaxo

Re: RegEx: How to negate more than one character?
by gube (Parson) on Mar 26, 2007 at 00:56 UTC
    Hi,
    #!/usr/local/bin/perl
    use strict;
    use warnings;

    my $match = "sfajdsllllllllllbhafdssss";
    print "Yes...Working.." if ($match !~ m#(ah|bh)#gi);