MrNobo1024 has asked for the wisdom of the Perl Monks concerning the following question:
From perlre:
WARNING: Once Perl sees that you need one of $&, $`, or $' anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce $1, $2, etc, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression (?: ... ) instead.) But if you never use $&, $` or $', then patterns without capturing parentheses will not be penalized. So avoid $&, $', and $` if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price.
I don't see why they had to make it set them on *every* regex match just because one uses them. Why couldn't they just add a new modifier (like /i, /x, /g, etc), and only set them if the modifier is used?
--MrNobo1024 s]]HrLfbfe|EbBibmv]e|s}w}ciZx^RYhL}e^print
Re: Why can't $` $& $' be enabled on a per-regex basis?
by chromatic (Archbishop) on Aug 10, 2002 at 07:05 UTC
Because we're jerks. Patches welcome.
Seriously, it's a hard problem. Consider:
my @options = qw( foo bar );
my $string = $options[ rand @options ];
$string =~ /(foo)/;
$string =~ /(bar)/;
print $1;
Since there's no way of telling which option $string will contain, there's no way of telling which regex will match successfully until one actually does match successfully. (See perlvar, and note the phrase "last successful match".) The regex engine therefore has to keep track of all of the data to populate the magic variables in every regular expression.
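A deterministic variant of the snippet above may make the "last successful match" point easier to see. This is only a sketch; the variable names `$after_failure` and `$hit` are made up for illustration. A failed match leaves `$1` from the earlier successful match untouched, which is why capture variables are normally read only after checking that the match succeeded.

```perl
use strict;
use warnings;

my $string = 'foo';

$string =~ /(foo)/;          # succeeds: $1 is now 'foo'
$string =~ /(bar)/;          # fails: $1 is left as it was

# $1 still refers to the last *successful* match.
my $after_failure = $1;

# Capturing in list context avoids the stale value:
# a failed match returns an empty list, so $hit stays undef.
my ($hit) = $string =~ /(bar)/;

print "after failed match, \$1 is: $after_failure\n";
print "list-context capture is: ", (defined $hit ? $hit : 'undef'), "\n";
```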
japhy had some ideas to make this block scoped, and they may find their way into 5.10, but the problem remains hard.
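For readers on later Perls: 5.10 did add a per-regex opt-in, the /p modifier, which makes ${^PREMATCH}, ${^MATCH} and ${^POSTMATCH} available for that match without using $`, $& or $' anywhere in the program. A minimal sketch, assuming Perl 5.10 or newer (the variable names are only illustrative):

```perl
use strict;
use warnings;
use 5.010;

my $foo = 'Hello, world!';
my ($pre, $match, $post);

# /p asks Perl to preserve the matched string for this regex only.
if ($foo =~ /o, w../p) {
    ($pre, $match, $post) = (${^PREMATCH}, ${^MATCH}, ${^POSTMATCH});
}

print join('|', $pre, $match, $post), "\n";   # Hell|o, wor|ld!
```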
Re: Why can't $` $& $' be enabled on a per-regex basis?
by Juerd (Abbot) on Aug 10, 2002 at 09:57 UTC
$` is the same as "substr($var, 0, $-[0])"
$& is the same as "substr($var, $-[0], $+[0] - $-[0])"
$' is the same as "substr($var, $+[0])"
So:
sub prematch (;$) { substr((@_ ? $_[0] : $_), 0, $-[0]) }
sub match (;$) { substr((@_ ? $_[0] : $_), $-[0], $+[0] - $-[0]) }
sub postmatch (;$) { substr((@_ ? $_[0] : $_), $+[0]) }
my $foo = 'Hello, world!';
$foo =~ /o, w../;
$\ = "\n";
print prematch($foo);
print match($foo);
print postmatch($foo);
=head2 C<prematch>, C<match>, C<postmatch>
Like C<$`>, C<$&> and C<$'>, but without the slowdown.
Each takes a single argument: the string on which the
most recent regex was performed. Uses C<$_> when no
argument is given.
=cut
Update: s/the variable on which/the string on which/
- Yes, I reinvent wheels.
- Spam: Visit eurotraQ.
sub prematch (;$) { substr((@_ ? $_[0] : $_), 0, $-[0]) }
sub match (;$) { substr((@_ ? $_[0] : $_), $-[0], $+[0] - $-[0]) }
sub postmatch (;$) { substr((@_ ? $_[0] : $_), $+[0]) }
my $foo = 'Hello, world!';
$foo =~ s/l+/r/;
print "prematch(",prematch($foo),")\n";
print "match(",match($foo),")\n";
print "postmatch(",postmatch($foo),")\n";
print "`(",$`,")\n";
print "&(",$&,")\n";
print "'(",$',")\n";
produces
prematch(He)
match(ro)
postmatch(, world!)
`(He)
&(ll)
'(o, world!)
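The mismatch comes from s/// changing $foo: the offsets in @- and @+ describe the string as it was when the match ran, but the helper subs apply them to the already-modified string. One possible workaround, sketched here with a hypothetical copy named $orig, is to keep the original string and pass that to the helpers instead:

```perl
use strict;
use warnings;

sub prematch  (;$) { substr((@_ ? $_[0] : $_), 0, $-[0]) }
sub match     (;$) { substr((@_ ? $_[0] : $_), $-[0], $+[0] - $-[0]) }
sub postmatch (;$) { substr((@_ ? $_[0] : $_), $+[0]) }

my $foo  = 'Hello, world!';
my $orig = $foo;             # copy taken before the substitution

$foo =~ s/l+/r/;             # $foo is now 'Hero, world!'

# The offsets refer to the pre-substitution string, so use the copy.
my @parts = (prematch($orig), match($orig), postmatch($orig));
print "prematch($parts[0])\n";
print "match($parts[1])\n";
print "postmatch($parts[2])\n";
```

The copy costs one extra string per substitution, so this only makes sense where the pieces are actually needed afterwards.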
- tye (but my friends call me "Tye")
Re: Why can't $` $& $' be enabled on a per-regex basis?
by Zaxo (Archbishop) on Aug 10, 2002 at 06:35 UTC
They're globals, so local has an influence:
perl -e'$_="fafbfc";/fc/;{local $`;/bf/;print $`,$/} print $`, $/'
prints:
faf
fafb
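As far as I can tell from perlvar, the match variables are already dynamically scoped to the enclosing block, so the same restoring should happen without the local. A sketch under that assumption, with $inner and $outer as illustrative names only:

```perl
use strict;
use warnings;

$_ = "fafbfc";
/fc/;                        # outer match: prematch is "fafb"

my ($inner, $outer);
{
    /bf/;                    # inner match, no local involved
    $inner = $`;             # prematch of the inner match
}
$outer = $`;                 # back to the outer match's prematch

print "$inner\n$outer\n";
```

Note that this scoping only controls which match the variables refer to; it does not limit the performance cost described in perlre.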
After Compline, Zaxo
Actually, the point of the question was to find a way to provide those variables only on demand, unlike the current behaviour: once they are seen anywhere in a script, they are provided for every regex, which slows the program down significantly.
This question sometimes arises on p5p and, as I remember, the answer is probably that all the bits for modifiers are already occupied, so there is no way to add another one without enlarging the corresponding C struct.
Courage, the Cowardly Dog