Woo. This one's got me interested.

I've tested this on perl 5.004_04 for sun-solaris, perls 5.004_05 and 5.6 for i686-linux (redhat) and even ActiveState's 5.6.0 for Win32 and _all_ of them show the same behaviour.
What causes the difference between the two variations on this bit of code is whether or not the pattern is plain text (as it says above, /blah/ may be optimized to an analogue of index()). If the match is found that way, without the full regex matcher being run, then $& causes a segmentation fault.

use re 'debug';
shows that the regex isn't re-evaluated when $& is entered on STDIN, but it does explicitly state "Omitting $` $& $' support." I must say I'm at a bit of a loss as to where the value does come from.
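For anyone who wants to poke at this without typing into STDIN, here's a minimal sketch. It assumes that a string eval hides $& from the compiler in the same way that reading it from STDIN does; the variable names are just mine. I'm deliberately not claiming what it prints, since that seems to depend on the perl version and on the kind of pattern.

```perl
use strict;

# The compiler never sees the match variable in this file's source;
# it only appears inside a string that is compiled at run time,
# after the match has already been performed.
my $late_code = '$&';

'foo' =~ m/.*/;
my $seen = eval $late_code;

print defined $seen ? "late eval gave: '$seen'\n" : "late eval gave undef\n";
```

Swapping m/.*/ for m/o/ should exercise the plain text case, which is the one that crashed for me.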

If I were to go out on a limb a bit, I'd guess that the penalty from using $&, etc. in your code arises because perl has to hook them into plain text matches as well as compiled regexes. I.e. $&, etc. are always there for fully compiled regexes, but index() doesn't normally return the pre-match, match and post-match strings, so the "analogue of index()" requires a bit more work to produce them.
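To illustrate what I mean by "a bit more work", here's a rough sketch of what an index()-style match would have to do to come up with the three strings. This is only my guess at the shape of the work, not what perl does internally, and literal_match_parts is a made-up helper name.

```perl
use strict;

# Hypothetical helper: emulate the pre-match, match and post-match
# strings for a literal (plain text) pattern using index() and substr().
sub literal_match_parts {
    my ($string, $literal) = @_;

    my $pos = index($string, $literal);
    return if $pos < 0;    # no match: empty list

    my $pre   = substr($string, 0, $pos);
    my $match = substr($string, $pos, length $literal);
    my $post  = substr($string, $pos + length $literal);
    return ($pre, $match, $post);
}

my ($pre, $match, $post) = literal_match_parts('foo', 'o');
print "pre='$pre' match='$match' post='$post'\n";
# prints: pre='f' match='o' post='o'
```

index() on its own only hands back the offset; the three substr() calls are the extra copying that has to happen on top of it.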

Where's japhy? I get the feeling he'll know :o)

There's a bunch of tests and re 'debug' output below if you're interested:
use re 'debug'; 'foo' =~ m/.*/; print eval <STDIN>;
This gives the following output:
Compiling REx `.*'
size 3 first at 2
   1: STAR(3)
   2:   REG_ANY(0)
   3: END(0)
anchored(MBOL) implicit minlen 0
Omitting $` $& $' support.


Matching REx `.*' against `foo'
  Setting an EVAL scope, savestack=3
   0 <> <foo>             |  1:  STAR
                           REG_ANY can match 3 times out of 32767...
  Setting an EVAL scope, savestack=3
   3 <foo> <>             |  3:    END
Match successful!
All of that is printed before the script waits for input. It explicitly says that it's omitting $&, etc. support, yet when you do enter $& it still gives the expected answer:
Freeing REx: `.*'
If you use a plain text match (like tilly suggested with /ri/ in 'string'), you don't get this result at all, as perl doesn't handle the match in the same way. It "guesses" the result, presumably using a more index()-like way of making the match:
use re 'debug'; 'foo' =~ m/o/; print eval <STDIN>;
gives the output:
$ perl reg
Compiling REx `o'
size 3 first at 1
rarest char o at 0
   1: EXACT <o>(3)
   3: END(0) 
anchored `o' at 0 (checking anchored isall) minlen 1
Omitting $` $& $' support.


Guessing start of match, REx `o' against `foo'...
Found anchored substr `o' at offset 1...
Guessed: match at offset 1
Segmentation fault (core dumped)
$` and $' don't have quite such drastic effects; they simply print blank.
The extra level of compilation that look(ahead|behind)s give the regex also allows $& to produce the required result:
use re 'debug'; 'foo' =~ m/(?<=f)o(?=o)/; print eval <STDIN>;
$ perl reg
Compiling REx `(?<=f)o(?=o)'
size 15 first at 1
rarest char o at 0
   1: IFMATCH[-1](7)
   3:   EXACT <f>(5)
   5:   SUCCEED(0)
   6:   TAIL(7)
   7: EXACT <o>(9)
   9: IFMATCH[-0](15)
  11:   EXACT <o>(13)
  13:   SUCCEED(0)
  14:   TAIL(15)
  15: END(0)
anchored `o' at 0 (checking anchored) minlen 1
Omitting $` $& $' support.


Guessing start of match, REx `(?<=f)o(?=o)' against `foo'...
Found anchored substr `o' at offset 1...
Guessed: match at offset 1
Matching REx `(?<=f)o(?=o)' against `oo'
  Setting an EVAL scope, savestack=3
   1 <f> <oo>             |  1:  IFMATCH[-1]
   0 <> <foo>             |  3:    EXACT <f>
   1 <f> <oo>             |  5:    SUCCEED
                              could match...
   1 <f> <oo>             |  7:  EXACT <o>
   2 <fo> <o>             |  9:  IFMATCH[-0]
   2 <fo> <o>             | 11:    EXACT <o>
   3 <foo> <>             | 13:    SUCCEED
                              could match...
   2 <fo> <o>             | 15:  END
Match successful!
Freeing REx: `(?<=f)o(?=o)'
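As an aside, if you just need the three strings and don't want to depend on $& at all, the match offsets in @- and @+ can be used instead. A minimal sketch, assuming a perl that has those arrays (5.6 and later, as far as I know); the variable names are mine:

```perl
use strict;

my $string = 'foo';
my ($before, $matched, $after);

if ($string =~ m/o/) {
    # $-[0] and $+[0] are the start and end offsets of the whole match
    $before  = substr($string, 0, $-[0]);
    $matched = substr($string, $-[0], $+[0] - $-[0]);
    $after   = substr($string, $+[0]);
}

print "before='$before' matched='$matched' after='$after'\n";
# prints: before='f' matched='o' after='o'
```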

In reply to Re: Re (tilly) 1: Perl is psychic?! by pileswasp
in thread Perl is psychic?! by MrNobo1024
