Re: Regex fun

I would assume that quantifier cannot be a '\x' variable, but don't really know.

I think it's important to note that \1 is not a variable ^* (which is why you can't use it outside of a regex); the variable that contains the contents of the first capture group is $1, but that ~~'s empty~~ doesn't take on its new value ^*** until the capture has completed ^**.

I think that the reason that ($rx){\1} isn't allowed is that the regex engine wants to compile the regex before running it. Since the contents of \1, hence the number of times that $rx is supposed to be captured, aren't known until run-time, this interferes with the compilation. For example, /\+32767.{32767}/ is rejected at compile time, but a '+32767' =~ /\+([0-9]*).{\1}/ construct would circumvent this restriction. (“Why, then,” you ask, “is something like /(.)\1/, which suffers from the same compilation problem, OK?” I dunno. :-) )

^* Not a Perl variable, anyway. See Re^3: Regex fun, and probably Re^2: Regex fun as well.
^** Except that (?{ print $1 }) works correctly, which is somewhat miraculous to me and very very helpful for debugging regexes.
UPDATE: ^*** Still false (see Re^6: Regex fun for where realisation finally dawns). It takes on its new value as soon as the capture completes (which explains the miracle referenced above); it's just that the interpolation in the text of the regex has already happened, so that the quantifier doesn't ‘see’ the new value.
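
The corrected picture in the UPDATE can be checked with a short script. This is only a sketch: the sample strings and the variable names (`$str`, `$digit`, `$run`, `$seen`) are made up for illustration and come from nowhere in the thread. It assumes a prior successful match has already set `$1`, so that there is an "old" value to interpolate.

```perl
use strict;
use warnings;

# Prime $1 with an earlier, unrelated match.
'7' =~ /(\d)/ or die "priming match failed";

# $1 is interpolated into the pattern text *before* matching starts,
# so the quantifier below is built from the old value (7), not from
# the digit that this match is about to capture.
my $str = '2aaaaaaaaa';            # a '2' followed by nine 'a's
my ($digit, $run);
if ($str =~ /(\d)(a{$1})/) {
    ($digit, $run) = ($1, $2);
}
print "captured digit: $digit, run length: ", length($run), "\n";

# Inside the running match, though, $1 is up to date as soon as its
# group closes, which is why a (?{ }) code block can see it.
my $seen;
'xy' =~ /(x)(?{ $seen = $1 })y/;
print "code block saw: $seen\n";
```

The first match captures `2` but consumes seven `a`s, because the quantifier was fixed as `{7}` when the pattern text was built. The code block in the second match sees the freshly captured `x`.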


Re^2: Regex fun by JavaFan (Canon) on Dec 15, 2009 at 19:40 UTC
> I think it's important to note that \1 is not a variable (which is why you can't use it outside of a regex);

But you can, sometimes, use it in the replacement part.

> the variable that contains the contents of the first capture group is $1, but that's empty until the capture has completed.

But in `/([0-9]+){$1}/`, the first capture is completed before the quantifier. So, that's not the reason.

> For example, `/\+32767.{32767}/` is rejected at compile time

Yes, but that's considered a bug. It's a restriction that should have been removed after the regexp engine was no longer recursive.

> “Why, then,” you ask, “is something like /(.)\1/, which suffers from the same compilation problem, OK?”

That's not the same problem. `{...}` is one of the mini-languages inside regular expressions. Compare it with `[...]`. `[\1]` doesn't refer back to something else either.

But one can defer a subpattern. The syntax is `(??{ })`. This is what the OP wants, and this is what the OP ought to use.
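
For anyone following along, the `(??{ })` suggestion can be sketched roughly as below. The helper name `counted_chunk` and the sample input are hypothetical, not taken from the OP's code; the only thing assumed about the OP's goal is "read a number, then match that many characters".

```perl
use strict;
use warnings;

# Hypothetical helper: read a count after '+', then match exactly that
# many following characters.
sub counted_chunk {
    my ($text) = @_;
    # (??{ ... }) runs its code at match time and uses the returned
    # string as a subpattern, so $1 here is the digits just captured.
    if ($text =~ /\+([0-9]+)((??{ '.{' . $1 . '}' }))/) {
        return ($1, $2);
    }
    return;
}

my ($count, $chunk) = counted_chunk('+3abcdef');
print "count=$count chunk=$chunk\n";
```

One thing to watch: `[0-9]+` can backtrack, so an input such as `'+12ab'` still matches, with a count of 1 and the chunk `2`. If that is not wanted, a possessive `[0-9]++` (available from 5.10) should prevent it.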
Re^3: Regex fun by JadeNB (Chaplain) on Dec 15, 2009 at 20:22 UTC
> But you can, sometimes, use it in the replacement part.

Sure, but you're not supposed to: see “Warning on \1 Instead of $1” in perlre.

> But in `/([0-9]+){$1}/`, the first capture is completed before the quantifier. So, that's not the reason.

Sorry, I don't understand: not the reason for what?

> It's a restriction that should have been removed after the regexp engine was no longer recursive.

Sorry, I don't understand this, either. Do you mean ‘re-entrant’? (UPDATE: Nope, just my internals-ignorance revealed. Thanks, ikegami!)
Re^4: Regex fun by ikegami (Patriarch) on Dec 15, 2009 at 20:30 UTC
Regarding the last point, the engine was re-engineered for 5.10. It used to use the C stack, so limits were imposed to prevent stack overflows. Now, the stack it uses is on the heap. The implementation moved away from a recursive model as part of the change.
Re^4: Regex fun by JavaFan (Canon) on Dec 15, 2009 at 22:13 UTC
> Sorry, I don't understand: not the reason for what?

Quoting myself where I am quoting you:

> the variable that contains the contents of the first capture group is $1, but that's empty until the capture has completed.

You're claiming $1 is "empty" until the capture has completed. I'm pointing out that, in the case of the OP, said first capture has completed.

> Do you mean ‘re-entrant’?

No, I don't. The current regexp-engine isn't re-entrant.
Re^5: Regex fun by JadeNB (Chaplain) on Dec 15, 2009 at 22:41 UTC
Re^6: Regex fun by JavaFan (Canon) on Dec 16, 2009 at 09:00 UTC
