http://www.perlmonks.org?node_id=62493


in reply to Re (tilly) 1: Perl is psychic?!
in thread Perl is psychic?!

Woo. This one's got me interested.

I've tested this on perl 5.004_04 for sun-solaris, perls 5.004_05 and 5.6 for i686-linux (redhat) and even ActiveState's 5.6.0 for Win32 and _all_ of them show the same behaviour.
What causes the difference between the two variations on this bit of code is whether or not the pattern is plain text (as it says above, /blah/ may be optimized to an analogue of index()). If the match is settled by that optimization, without the regex engine proper being run, then $& causes a segmentation fault.

Using
use re 'debug';
shows that the regex isn't re-evaluated when the $& is entered on STDIN, but it does state explicitly "Omitting $` $& $' support." I must say I'm at a bit of a loss as to where the value does come from.

If I were to go out on a limb a bit, I would guess that the penalty from using $&, etc. in your code comes from perl having to hook them into plain text matches as well as compiled regexes. That is, $&, etc. are always there for fully compiled regexes, but index() doesn't normally return the pre-match, match and post-match strings, so the "analogue of index()" requires a bit more work to produce them.
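To make that guess a bit more concrete, here is a minimal sketch of the extra work an index()-style match would need to do to come up with the three strings. The helper name split_around is made up for illustration, and this is ordinary Perl code, not a description of what the perl internals actually do.

```perl
use strict;

# Hypothetical helper: emulate $`, $& and $' for a plain-text pattern
# using index() and substr() instead of the regex engine.
sub split_around {
    my ($string, $literal) = @_;

    my $pos = index($string, $literal);
    return if $pos < 0;    # no match: return an empty list

    my $pre   = substr($string, 0, $pos);
    my $match = substr($string, $pos, length($literal));
    my $post  = substr($string, $pos + length($literal));

    return ($pre, $match, $post);
}

my ($pre, $match, $post) = split_around('foo', 'o');
print "pre=[$pre] match=[$match] post=[$post]\n";
```

index() alone only hands back the offset; the three substr() calls are the "bit more work" needed on top of it.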

Where's japhy? I get the feeling he'll know :o)

There's a bunch of tests and re 'debug' output below if you're interested: <READMORE>
use re 'debug'; 'foo' =~ m/.*/; print eval <STDIN>;
This gives the following output:
Compiling REx `.*'
size 3 first at 2
   1: STAR(3)
   2:   REG_ANY(0)
   3: END(0)
anchored(MBOL) implicit minlen 0
Omitting $` $& $' support.

EXECUTING...

Matching REx `.*' against `foo'
  Setting an EVAL scope, savestack=3
   0 <> <foo>             |  1:  STAR
                           REG_ANY can match 3 times out of 32767...
  Setting an EVAL scope, savestack=3
   3 <foo> <>             |  3:    END
Match successful!
All of that is printed before perl waits for the input. It explicitly says that it's omitting $&, etc. support, yet when you do enter $& it still gives the expected answer:
Freeing REx: `.*'
foo
If you use a plain text match (like tilly suggested with /ri/ in 'string'), you don't get this result at all. perl doesn't handle the match in the same way: it "guesses" the result, presumably using a more index()-like way of making the match:
use re 'debug'; 'foo' =~ m/o/; print eval <STDIN>;
gives the output:
$ perl reg
Compiling REx `o'
size 3 first at 1
rarest char o at 0
   1: EXACT <o>(3)
   3: END(0) 
anchored `o' at 0 (checking anchored isall) minlen 1
Omitting $` $& $' support.

EXECUTING...

Guessing start of match, REx `o' against `foo'...
Found anchored substr `o' at offset 1...
Guessed: match at offset 1
$&
Segmentation fault (core dumped)
$` and $' don't have quite such drastic effects; they simply print blank.
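If all you need is the three strings, one way to stay clear of $`, $& and $' altogether is to rebuild them from the match offsets in @- and @+, which exist in perl 5.6.0 and later (so not in the 5.004 builds above). This is only a sketch: the helper name match_parts is made up, and I haven't checked whether it sidesteps the segfault on the perls listed earlier.

```perl
use strict;

# Hypothetical helper: rebuild prematch, match and postmatch from the
# offsets in @- and @+ rather than reading $`, $& and $'.
sub match_parts {
    my ($string, $pattern) = @_;

    return unless $string =~ /$pattern/;    # no match: empty list

    # $-[0] is the offset where the whole match starts,
    # $+[0] is the offset just past its end.
    my ($start, $end) = ($-[0], $+[0]);

    return (
        substr($string, 0, $start),
        substr($string, $start, $end - $start),
        substr($string, $end),
    );
}

print join('|', match_parts('foo', 'o')), "\n";
print join('|', match_parts('foo', '(?<=f)o(?=o)')), "\n";
```

Both calls should report the same split, since the plain text pattern and the lookaround pattern match the same 'o' at offset 1.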
The extra level of compilation that look(ahead|behind)s give the regex also allows $& to produce the required result:
use re 'debug'; 'foo' =~ m/(?<=f)o(?=o)/; print eval <STDIN>;
Giving:
$ perl reg
Compiling REx `(?<=f)o(?=o)'
size 15 first at 1
rarest char o at 0
   1: IFMATCH[-1](7)
   3:   EXACT <f>(5)
   5:   SUCCEED(0)
   6:   TAIL(7)
   7: EXACT <o>(9)
   9: IFMATCH[-0](15)
  11:   EXACT <o>(13)
  13:   SUCCEED(0)
  14:   TAIL(15)
  15: END(0)
anchored `o' at 0 (checking anchored) minlen 1
Omitting $` $& $' support.

EXECUTING...

Guessing start of match, REx `(?<=f)o(?=o)' against `foo'...
Found anchored substr `o' at offset 1...
Guessed: match at offset 1
Matching REx `(?<=f)o(?=o)' against `oo'
  Setting an EVAL scope, savestack=3
   1 <f> <oo>             |  1:  IFMATCH[-1]
   0 <> <foo>             |  3:    EXACT <f>
   1 <f> <oo>             |  5:    SUCCEED
                              could match...
   1 <f> <oo>             |  7:  EXACT <o>
   2 <fo> <o>             |  9:  IFMATCH[-0]
   2 <fo> <o>             | 11:    EXACT <o>
   3 <foo> <>             | 13:    SUCCEED
                              could match...
   2 <fo> <o>             | 15:  END
Match successful!
$&
Freeing REx: `(?<=f)o(?=o)'
o