http://www.perlmonks.org?node_id=899785


in reply to grep trouble

Passing an empty string in $_ creates an empty regexp pattern, as in "m//". This is a special case discussed in perlop, here:

The empty pattern //

If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. ... If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match).

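Your first match happens before any other match has succeeded, so the empty pattern acts as a genuinely empty pattern, which always matches. By the final test a different pattern has matched successfully, so the empty pattern reuses it: that test is like asking whether "" =~ m/foo/, which doesn't match.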
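
To make that concrete, here is a minimal sketch. It is not the original poster's code: the variable names and the strings 'abc' and 'foo' are made up for illustration, and it assumes a standalone script in which nothing has matched before the first test.

```perl
use strict;
use warnings;

my $pattern = '';   # an empty string, as if it had arrived in $_

# Nothing has matched successfully yet in this script, so the empty
# pattern behaves as a genuine empty pattern here.
my $before = ('abc' =~ /$pattern/) ? 1 : 0;

# A successful match against some other pattern.
my $seed = ('foo' =~ /foo/) ? 1 : 0;

# The interpolated pattern is still empty, but it now silently reuses
# /foo/, the last successfully matched pattern, and 'abc' has no "foo".
my $after = ('abc' =~ /$pattern/) ? 1 : 0;

# Guard in the style suggested further down the thread: test for the
# empty string explicitly before letting the regex engine see it.
my $guarded = ($pattern eq '' or 'abc' =~ /$pattern/) ? 1 : 0;

print "before=$before after=$after guarded=$guarded\n";
```

The point of the sketch is only that the same expression, 'abc' =~ /$pattern/, gives different answers depending on what matched earlier.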


Dave

Re^2: grep trouble
by LanX (Saint) on Apr 17, 2011 at 14:03 UTC
    Are there any reasonable use cases/idioms for this feature?

    If not, shouldn't there be a pragma to disable these edge cases?

    Cheers Rolf

      It could be a convenience for a switch with fall-through or, along similar (but more oddball) lines, possibly an implementation of Duff's Device.


      Dave

        Thanks!

        IMHO it sounds like far fewer than 1‰ of all CPAN modules rely on this.


        UPDATE:

        OK, found another use case. I recently needed my ORG-Parser to distinguish the range delimiters from the range "body" when using a flip-flop operator, as for ORG's "BEGIN/END" blocks.

        This can be simplified (i.e. more DRY), if one has access to the last successful pattern:

        DB<114> for(0..99) {print if (/10/../20/ and not //)}
        111213141516171819
        DB<115> for(0..99) {print if (/10/../20/)}
        1011121314151617181920
        DB<116> for(0..99) {print if (/10/../20/ and //)}
        1020

        I think that's a more frequent application; I even vaguely remember seeing it in Friedl's book.

        But I'd prefer an explicit special var¹, something like $PATTERN or $&& (by analogy with $MATCH and $&, respectively).

        Cheers Rolf

        1) A special var has the advantage that the regex itself can be accessed, e.g. printed.

        update

        for limitations see update in Re^2: Extract table from a block of text (updated)

      It furthermore won't match at the same place twice, being a special case for a zero-length match. So // is just perfect for splitting a string into individual characters, using split. I did that just yesterday.
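
      A small sketch of that use of split, with a made-up input string. One thing worth adding, going by the split documentation rather than anything said in this thread: in split an empty pattern always means the genuine empty pattern and never the last successful match, which is why the earlier match on /b/ below makes no difference.

      ```perl
      use strict;
      use warnings;

      # A successful match first, to show that split does not reuse it.
      my $seen = ('abc' =~ /b/) ? 1 : 0;

      # Inside split, // matches the empty string between characters.
      my @chars = split //, 'hello';

      print join(',', @chars), "\n";   # prints h,e,l,l,o
      ```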

Re^2: grep trouble
by LogMiner (Novice) on Apr 17, 2011 at 14:04 UTC
    Thanks davido. I guess the grep-evaluated block should look like:

    {$_ eq "" or "" =~ /$_/}

    (of course the actual logic in my script is different, as there are easier ways to do the above)
      I'm still not sure what you're trying to achieve here, because your grep returns the PATTERNs which matched.

      Are those patterns simple words? If so, you could consider constructing an or-regex (an alternation):

      DB<100> @patterns=qw#one two three#
      DB<101> $str="one two"
      DB<102> $re=join "|",@patterns
      DB<103> print $str =~ m/($re)/g
      onetwo

      UPDATE:

      just noticed there are still subtle differences:

      DB<106> $str="two one two"
      DB<107> print $str =~ m/($re)/g
      twoonetwo
      DB<108> print grep {$str=~/$_/} @patterns
      onetwo

      Cheers Rolf

      you also have to check for undef:

      DB<120> print scalar grep {"a"=~/$_/} ("a","",undef)
      3
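
      One possible way to do that check, as a sketch only: it reuses the list from the debugger line above, and the variable names and the choice to filter in a separate step are my own.

      ```perl
      use strict;
      use warnings;

      my @candidates = ('a', '', undef);   # same list as in the debugger line

      # Keep only defined, non-empty patterns before using them in a match.
      my @usable = grep { defined($_) && length($_) } @candidates;

      # Only real patterns reach the regex engine now.
      my @hits = grep { 'a' =~ /$_/ } @usable;

      print scalar(@hits), "\n";   # prints 1
      ```

      Filtering first also avoids the "uninitialized value" warning that undef triggers when interpolated into a pattern under "use warnings".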

      Cheers Rolf