PerlMonks  

my versus our in nested regex

by Len (Friar)
on Oct 18, 2003 at 16:51 UTC (#300297)
Len has asked for the wisdom of the Perl Monks concerning the following question:

I can't explain this behaviour:
    use strict;
    our $regex = qr /(          # Start capture
        \(                      # Start with '(',
        (?:                     # Followed by
            (?>[^()]+)          # Non-parenthesis
            |(??{ $regex })     # Or a balanced () block
        )*                      # zero or more times
        \)                      # Ending with ')'
    )/x;                        # Close capture
    my $text = '(outer(inner(most inner)))';
    $text =~ /$regex/gs;
    print "$1\n";

Why do I have to use our $regex instead of my $regex to get the expected result?

our $regex prints (outer(inner(most inner)))
my $regex prints (most inner)

Len

Replies are listed 'Best First'.
Re: my versus our in nested regex
by BrowserUk (Pope) on Oct 18, 2003 at 17:08 UTC

    my variables don't come into existence until the end of the statement in which they are declared, which means that when the regex is being compiled, the lexical scalar $regex doesn't yet exist.

    Using our bypasses this problem.
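The scoping rule can be seen without any regex at all. The following is a minimal sketch; the variable names are made up for illustration. Inside the block, the $label on the right-hand side of the my statement still resolves to the package variable, because the new lexical only becomes visible once the statement has ended.

```perl
use strict;
use warnings;

our $label = 'package';    # hypothetical package variable

my $seen_on_rhs;
{
    # The new lexical $label is not visible until this statement ends,
    # so the $label on the right-hand side is still the package variable.
    my $label = "copy of $label";
    $seen_on_rhs = $label;
}

print "$seen_on_rhs\n";    # prints "copy of package"
```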

    If you want to avoid using a global, then pre-declare the lexical.

        use strict;
        my $regex;
        $regex = qr /(              # Start capture
            \(                      # Start with '(',
            (?:                     # Followed by
                (?>[^()]+)          # Non-parenthesis
                |(??{ $regex })     # Or a balanced () block
            )*                      # zero or more times
            \)                      # Ending with ')'
        )/x;                        # Close capture
        my $text = '(outer(inner(most inner)))';
        $text =~ /$regex/gs;
        print "$1\n";
        __END__
        P:\test>junk2
        (outer(inner(most inner)))

    Updated description in the light of liz's observation below.
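For readers on Perl 5.10 or later, a different technique sidesteps the self-reference entirely: the recursive subpattern (?1), which re-enters capture group 1. This is not the (??{ }) construct discussed in this thread, only a sketch of an alternative, and the variable names are illustrative.

```perl
use strict;
use warnings;

my $balanced = qr/
    (                   # group 1: one balanced block
        \(
        (?:
            (?>[^()]+)  # a run of non-parenthesis characters
          |
            (?1)        # or recurse into group 1
        )*
        \)
    )
/x;

my $text = '(outer(inner(most inner)))';
my ($match) = $text =~ $balanced;
print "$match\n";    # prints "(outer(inner(most inner)))"
```

Because the pattern never mentions a Perl variable, the my versus our question does not arise.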


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Hooray!

      ...which means that when the regex is being compiled, $regex doesn't yet exist.

      Actually, this is incorrect. $regex does exist, but at that stage it refers to an (undefined) global variable of the same name, as the following shows under strict:

          $ perl -Mstrict -e 'my $foo = $foo'
          Global symbol "$foo" requires explicit package name at -e line 1.
          Execution of -e aborted due to compilation errors.

      Liz

      It is interesting that strict doesn't complain in the following:

          use strict;
          my $regex = qr/(?{{$regex}})/;

      and it isn't just because the parser has seen that variable:

          use strict;
          my $regex = qr/(?{{$undeclared}})/;

      doesn't complain either.
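Whether strict reaches into a regex code block may depend on the Perl version; the observation above dates from 2003, and later releases (5.18 onward, as far as I know) compile code blocks together with the surrounding code. A small probe, sketched here with illustrative names, lets you check what your own perl does without aborting the script:

```perl
use strict;
use warnings;

# Compile the snippet in a string eval so that a compile-time failure
# is trapped in $@ instead of aborting this script.
my $compiled = eval q{
    use strict;
    my $re = qr/(??{ $undeclared })/;
    1;
};

if ($compiled) {
    print "strict did not complain about the code block\n";
}
else {
    print "strict rejected the code block: $@";
}
```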

        Agreed. Quite why strictness doesn't propagate to regex code blocks is a good question.

        You can always enable it yourself:)

            P:\test>perl -le"my $re = qr[(??{ use strict; $re })];"
            Global symbol "$re" requires explicit package name at (re_eval 1) line 2.
            Compilation failed in regexp at -e line 1.


Re: my versus our in nested regex
by shenme (Priest) on Oct 18, 2003 at 17:02 UTC
    If you had turned on warnings you would have seen the message "Use of uninitialized value in pattern match (m//) at len01.pl line 14.".

    I do know that my variables are not _really_ in the symbol table hash, and that might be tripping things up while constructing the compiled regex and then executing it later.
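The symbol-table point can be checked directly. This is a minimal sketch with made-up variable names: a package variable declared with our has an entry in the package's symbol table hash (stash), while a my variable does not.

```perl
use strict;
use warnings;

our $package_scalar = 'visible through the stash';
my  $lexical_scalar = 'lives in a pad, not the stash';

# %{"PACKAGE::"} is the symbol table hash (stash) of the current package.
my $stash = do { no strict 'refs'; \%{ __PACKAGE__ . '::' } };

my $package_in_stash = exists $stash->{package_scalar} ? 1 : 0;
my $lexical_in_stash = exists $stash->{lexical_scalar} ? 1 : 0;

print "package_scalar in stash: $package_in_stash\n";    # prints 1
print "lexical_scalar in stash: $lexical_in_stash\n";    # prints 0
```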

Re: my versus our in nested regex
by pernod (Chaplain) on Oct 21, 2003 at 10:33 UTC

    Jeffrey Friedl talks a bit about this in Mastering Regular Expressions in the section called "A Warning About Embedded Code and my Variables" (pages 338-339). His conclusion on the matter is that an embedded code construct is in fact a closure.
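For anyone unfamiliar with the term, here is what a closure looks like with an ordinary anonymous sub. It is only an analogy for the regex case, and make_counter is a hypothetical helper: each anonymous sub stays bound to the particular $count that existed when the sub was created.

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper for illustration.
sub make_counter {
    my $count = shift;
    # The returned sub keeps its own private $count alive.
    return sub { return ++$count };
}

my $from_ten     = make_counter(10);
my $from_hundred = make_counter(100);

my @seen = ( $from_ten->(), $from_ten->(), $from_hundred->() );
print "@seen\n";    # prints "11 12 101"
```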

    This means that using a lexical variable inside an embedded code construct in a regular expression binds the instance of the lexical variable in existence at the moment the regex is compiled to the regex. As far as I understand, this means that:

        my $regex = qr /(           # Start capture
            \(                      # Start with '(',
            (?:                     # Followed by
                (?>[^()]+)          # Non-parenthesis
                |(??{ $regex })     # Or a balanced () block
            )*                      # zero or more times
            \)                      # Ending with ')'
        )/x;                        # Close capture

    with regard to liz's remark about lexicals at compile time, is "interpreted" (pardon the hand-waving) as:

        my $regex = qr /(           # Start capture
            \(                      # Start with '(',
            (?:                     # Followed by
                (?>[^()]+)          # Non-parenthesis
                |(??{ undef })      # <-- Note! 'or undef'
            )*                      # zero or more times
            \)                      # Ending with ')'
        )/x;                        # Close capture

    which will match the innermost parentheses. We don't have any undefs in the target string, so how can this part of the construct match?
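One possible reading, offered as a sketch rather than a definitive account: the undefined alternative never has to match anything in the string. Assuming an undefined result from (??{ }) behaves like an empty subpattern, the pattern reduces to the one below, where an empty group (?:) stands in for it. An empty alternative cannot consume an opening parenthesis, so the only place the whole pattern can succeed is the innermost pair.

```perl
use strict;
use warnings;

my $no_recursion = qr/
    (
        \(
        (?:
            (?>[^()]+)  # a run of non-parenthesis characters
          |
            (?:)        # stand-in: matches only the empty string
        )*
        \)
    )
/x;

my $text = '(outer(inner(most inner)))';
my ($innermost) = $text =~ $no_recursion;
print "$innermost\n";    # prints "(most inner)"
```

Attempts starting at the first two opening parentheses fail as soon as a nested "(" appears where ")" is required, which matches the (most inner) result reported for my $regex.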

    Trying to write this down in a sensible manner proved to be quite a challenge, so I apologize if the preceding section is hard to understand. But to quote Friedl from the aforementioned section: "Warning: this section is not light reading." :o)

    Hope this helps.

    pernod
    --
    Mischief. Mayhem. Soap.
