Perl Bug in Regex Code Block?

PerlMonks

by Hofmator (Curate) on Sep 03, 2001 at 15:45 UTC ( [id://109847]=perlquestion )

Hofmator has asked for the wisdom of the Perl Monks concerning the following question:

Playing around with regexes (abusing them :) on the weekend I came across the following (on Perl 5.6.1, ActiveState Build 626). Executing the same regex on the same string in a loop multiple times yields different results for the first run and the remaining runs. I'm running on Win2K, but I don't think this plays a role here. The code:

#!/usr/bin/perl
use strict;
use warnings;
use re qw/eval /;

my $pattern = q/(.)(?{
        print ++$counts[0];
})^/;

my $line = 'ab';

for (0..2) {
    my @counts = (0);

    print "$_: ";
    # $pattern .= '(?=.)';
    $line =~ /$pattern/;
    print "; \@counts = (", join(', ', @counts), ")\n";
}
print "\@main::counts = (", join(', ', @::counts), ")\n";

This prints - apart from the warning about the last line:

0: 12; @counts = (2)
1: 34; @counts = (0)
2: 56; @counts = (0)
@main::counts = ()

which means it works as expected the first time, but on the following iterations my @counts doesn't get modified by the regex. Inside the regex, however, the variable seems to retain its value from one execution to the next.

When using a package variable by changing my @counts to our @counts the program works as expected and prints:

0: 12; @counts = (2)
1: 12; @counts = (2)
2: 12; @counts = (2)
@main::counts = (2)
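A minimal sketch of the package-variable variant described above, for trying it on other Perl versions. The `@seen` array is a hypothetical helper added here so the result can be inspected after the loop; the code block counts instead of printing, and `@counts` is declared once at file scope rather than inside the loop. It is only an illustration under those assumptions, not the exact program that was run.

```perl
use strict;
use warnings;
use re 'eval';   # needed because a pattern containing a code block is interpolated at run time

our @counts;     # package variable, looked up by name inside the code block

my $pattern = q/(.)(?{ ++$counts[0] })^/;
my $line    = 'ab';
my @seen;        # hypothetical helper: the counter value after each iteration

for (0..2) {
    @counts = (0);          # reset the counter for this iteration
    $line =~ /$pattern/;    # never matches, but the code block still runs
    push @seen, $counts[0];
}

print "counter after each iteration: @seen\n";
```

If the package variable behaves as reported in this thread, the three recorded values should be identical.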

When uncommenting the $pattern .= line (and going back to my), effectively changing the pattern on every iteration (remark: this does not affect how the regex works!), the code also works as expected, printing:

0: 12; @counts = (2)
1: 12; @counts = (2)
2: 12; @counts = (2)
@main::counts = ()

My question - is this a known bug? Is it a bug at all or might I have overlooked a (well) documented feature ;-) and how does this behave in other versions of perl?

-- Hofmator

Re: Perl Bug in Regex Code Block?
by japhy (Canon) on Sep 03, 2001 at 17:22 UTC

### update: fixed
### thanks Hof -- I condensed working code poorly :(

use re 'eval';

my @r;
my $p = q/.(?{ ++$x[0] })^/;
for (0..2) {
  my @x = (0);
  "ab" =~ $p;
  push @r, \@x;
}
print "$_->[0]" for @r;

@x

@x

If you were to use qr// instead, you'd be changing the global array.

You're doing some funny-looking scope-crufting. I'd stay away from it if I were you. This situation is the sort of thing I fear having to write about and explain in my book.

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re2: Perl Bug in Regex Code Block?

by Hofmator (Curate) on Sep 03, 2001 at 19:41 UTC

Thanks for the explanation, japhy++! Playing around with your code I think I understand it now and how I accidentally created a closure. I still have some questions, though.

Why is the regex not recompiled? I'm not using the /o modifier. I thought perl recompiles a regex /$p/ which contains a variable interpolation every time. And is there a way to force a recompile?
I was not trying to do anything funny with the different scopes. What I want is to execute some code in a regex which manipulates a variable. And I'd like to use a lexical variable so that I don't pollute the global namespace. Is there a way to do that? Taking the my @x declaration out of the loop like this
```
my @x;
for (0..2) {
  @x = (0);
  "ab" =~ $p;
  push @r, \@x;
}
```
fixes it here but what if the whole thing is in a subroutine, then I can't call it more than once, can I?
I think I'm not wrong in saying that this is slightly underdocumented ... especially since 5.6.0 seems to behave differently as others have posted here.

-- Hofmator

Re: Re2: Perl Bug in Regex Code Block?

by japhy (Canon) on Sep 03, 2001 at 20:13 UTC

experimental

Second thing second: use a local array, and copy its contents to a lexical one. I know you don't want to use a global array, but I'm telling you that you should. This is an example from my book:

"12180" =~ m{
  (?{ local @n = () })
  (?: (\d) (?{ local @n = (@n, $1) }) )+
  \d
  (?{ @d = @n })
}x;
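The advice above (do the work in a localized package array, then copy the result into a lexical) can be sketched as a subroutine that is safe to call more than once. The sub name `digits_before_last` and the `@digits` lexical are hypothetical, and `our (@n, @d)` is added so the snippet runs under strict; the pattern itself is the one shown above.

```perl
use strict;
use warnings;

our (@n, @d);   # package arrays used only by the regex code blocks

# Hypothetical helper: for the first run of two or more digits in $str,
# returns every digit of that run except the last one.
sub digits_before_last {
    my ($str) = @_;
    local (@n, @d);          # fresh package arrays for this call, restored on return

    $str =~ m{
        (?{ local @n = () })
        (?: (\d) (?{ local @n = (@n, $1) }) )+
        \d
        (?{ @d = @n })
    }x or return;

    my @digits = @d;         # copy into a lexical before the localized arrays are restored
    return @digits;
}

print join(',', digits_before_last("12180")), "\n";
print join(',', digits_before_last("42")), "\n";
```

Because each `local` inside the pattern is undone when the engine backtracks, `@n` only ever holds the digits of the path that finally matched, so the two calls should print 1,2,1,8 and 4. No lexical is referenced inside the code blocks, which sidesteps the closure question raised in this thread.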

local @n;
/(.)(?{ ++$n[0] })^/;
@d = @n;

$p = '\w+-\d+';
/$p/;
/$p/;

$p = '\w+-\d+';
for $i (1,2) { /$p/ }

Now, if you've heard "if you have a regex, and it has variables in it, and the variables change, the regex has to be recompiled", that's technically incorrect:

($x,$y) = ('a+', 'b');
for (1,2) {
  /$x$y/;                 # interpolates to the string 'a+b' on both passes
  ($x,$y) = ('a', '+b');  # the variables change, but the resulting pattern string does not
}

I can't take credit for figuring this out on my own -- a couple months ago, Dominus gave me the hint about the string representation. Now I understand.

So that answers your question, I think.

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Perl Bug in Regex Code Block?
by stefan k (Curate) on Sep 03, 2001 at 16:53 UTC

counts

my

use strict

Name "main::counts" used only once: possible typo at ./re-code.pl line 20.
0: 12; @counts = (0)
1: 34; @counts = (0)
2: 56; @counts = (0)
@main::counts = (6)

my

our

blblblblblblblblblblb

You know what? I'm even more confused than before I started studying the code. At least I could present results from another Perl version, as you wished.

Regards... Stefan

you begin bashing the string with a +42 regexp of confusion

Re2: Perl Bug in Regex Code Block?

by Hofmator (Curate) on Sep 03, 2001 at 17:32 UTC

OK, to your first problem

It [@counts] is first used outside the for-loop

Name "main::counts" used only once: possible typo at ./re-code.pl line 20.
0: 12; @counts = (0)
1: 34; @counts = (0)
2: 56; @counts = (0)
@main::counts = (6)

{
  my $num = 0;
  $main::num = 5; # this instead of the regex
  print $num;     # prints 0
}
print $num;       # prints 5

# or under use strict
print $main::num; # prints 5 as well

btw, the warning can be ignored in this case

-- Hofmator

Re: Perl Bug in Regex Code Block?
by demerphq (Chancellor) on Sep 03, 2001 at 17:38 UTC

0: 12; @counts = (0)
1: 34; @counts = (0)
2: 56; @counts = (0)
@main::counts = (6)

#!/usr/bin/perl
use strict;
use warnings;
use re qw/eval /;

my $line = 'ab';
my $pattern = q/(.)(?{print ++$counts[0]})^/;

for (0..2) {
    my @counts = (0);

    print "$_: ";
    $line =~ /$pattern/;

    print "; my \@counts = (@counts)\n";
}

{
no strict; no warnings;
print "our \@counts = (".join(",",@counts).")\n";
#print "our \@counts = (@counts)\n";
}

In string, @counts now must be written as \@counts at .\counts.pl line 22, near "our \@counts = (@counts"
Execution of .\counts.pl aborted due to compilation errors.

And I have another point of weirdness to note: in the regex you are using, you have placed a '^' caret at the END of the regex, which for some reason makes your print statement fire twice. If I remove the ^ it prints once. Either way, I don't see what is going on here at all....

Yves
--
You are not ready to use symrefs unless you already know why they are bad. -- tadmc (CLPM)

Re2: Perl Bug in Regex Code Block?

by Hofmator (Curate) on Sep 03, 2001 at 20:07 UTC

Some answers to your questions

Concerning the error message in connection with the last print statement ... I cannot reproduce that, it works fine both ways (commented and uncommented). With strict and warnings I get
```
Possible unintended interpolation of @counts in string at bug line 20.
Global symbol "@counts" requires explicit package name at bug line 19.
Global symbol "@counts" requires explicit package name at bug line 20.
```
and it dies as expected. Adding the explicit @main:: solves the problem altogether.
And I have another point of weirdness to note in the regex you are using you have placed a '^' caret at the END of the regex
This is weirdness, you are right, and it is actually not necessary for the thing in question here. It is a left-over from the code where I originally encountered the problem. But I can explain the behaviour ... consider this simpler regex "ab" =~ /.^/; it matches any character and after that the beginning of the string, so it can never match! Nevertheless the regex engine tries to match. First it tries the a, sees that this doesn't work out, then tries the b, after which it fails. If we now sneak in a code block like this "ab" =~ /.(?{print 'hello!'})^/; the regex passes this block twice! And you can do very nice things with that (see e.g. my twiddle code) ... the original code came from a nonogram solver which I will post here in a couple of weeks (I have to find time to clean up the code a bit :)
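The backtracking explanation above can be checked with a small counter. This is a sketch only: `$runs` and `$matched` are names introduced here for illustration, and a package variable is used so the code block does not depend on how lexicals are captured.

```perl
use strict;
use warnings;

our $runs = 0;   # package variable, incremented by the code block

# '.' can match at offset 0 and again at offset 1, but '^' can never
# succeed after it, so each attempt fails right after the code block ran.
my $matched = ("ab" =~ /.(?{ $runs++ })^/) ? 1 : 0;

print "matched: $matched, code block ran $runs time(s)\n";
```

The match itself fails, yet the counter is non-zero afterwards. On the Perl versions reported in this thread the block ran twice, once per starting position.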

Update: I forgot to mention use re 'debug'. It is always helpful when you don't understand a pattern match.
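A short sketch of how the pragma might be applied to just one pattern. The enclosing bare block and the `$matched` variable are illustrative choices; the trace itself goes to STDERR and its format differs between Perl versions, so none is shown here.

```perl
use strict;
use warnings;

my $matched;
{
    # On reasonably recent Perls the pragma is lexically scoped, so only
    # patterns inside this block are traced (compilation and execution).
    use re 'debug';
    $matched = ("ab" =~ /.^/) ? 1 : 0;
}

print "matched: $matched\n";
```

Keeping the pragma in a small block limits the amount of trace output to the pattern under investigation.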

-- Hofmator

Re: Perl Bug in Regex Code Block?
by MZSanford (Curate) on Sep 03, 2001 at 16:38 UTC

my @counts = (0);

my

my

for (0..2) {}

@counts

my

_{can't sleep clowns will eat me}

Re2: Perl Bug in Regex Code Block?

by Hofmator (Curate) on Sep 03, 2001 at 16:59 UTC

When you use my, you are creating a variable which will disappear when it goes out of scope.

inside

Maybe you have misunderstood my question. I'm not confused that the last line of my code doesn't print anything; it was only included for the (working) run with our instead of my. I want to know why the behaviour changes inside the loop.

I hope this clarifies my problem ...

-- Hofmator
