PerlMonks  

Bug in substitution operator?

by polettix (Vicar)
on Jan 10, 2009 at 20:15 UTC ( [id://735426]=perlquestion )

polettix has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I'm bothering you only to ask if this behaviour is what's actually expected (and why, of course):

#!/usr/bin/perl
use strict;
use warnings;

my @lines = split /\n/, <<'END';
begin{verbatim}
hello all
end{verbatim}
END

s{((?:begin|end))\{verbatim\}}{$1\{whatever\}} for @lines;
print "$_\n" for @lines;

__END__
poletti@Polebian:~/sviluppo/perl/pwc/tmp$ perl bug.pl
Use of uninitialized value in substitution iterator at bug.pl line 11.
Use of uninitialized value in substitution iterator at bug.pl line 11.

hello all

The fact is that:
$1\{whatever\}
is being interpreted as $1{whatever}, i.e. the value stored under the key whatever in the hash %1, instead of the value of $1 followed by the string {whatever}.

This particular behaviour is a consequence of an unlucky choice of delimiters for the replacement part, i.e. {} braces. With a different pair of delimiters it is not triggered; for example, the following works like a charm (note the square brackets instead of braces around the replacement part):

#!/usr/bin/perl
use strict;
use warnings;

my @lines = split /\n/, <<'END';
begin{verbatim}
hello all
end{verbatim}
END

s{((?:begin|end))\{verbatim\}}[$1\{whatever\}] for @lines;
print "$_\n" for @lines;

__END__
poletti@Polebian:~/sviluppo/perl/pwc/tmp$ perl nobug.pl
begin{whatever}
hello all
end{whatever}

So, it seems that the escape character \ is being used to allow for embedding braces into braces. But this is not needed at all, as we can see from the following:

#!/usr/bin/perl
use strict;
use warnings;

my @lines = split /\n/, <<'END';
begin{verbatim}
hello all
end{verbatim}
END

my %hash = (whatever => 'world');
s{((?:begin|end))\{verbatim\}}{$hash{whatever}} for @lines;
print "$_\n" for @lines;

__END__
poletti@Polebian:~/sviluppo/perl/pwc/tmp$ perl bug2.pl
world
hello all
world

Perl is perfectly happy with the unescaped braces inside the substitution part, even when this very part is delimited by braces.

Both 5.8.8 and 5.10.0 show the same behaviour. In my view it seems to be a bug; thoughts?

perl -ple'$_=reverse' <<<ti.xittelop@oivalf

I understood... but what did you say?

Replies are listed 'Best First'.
Re: Bug in substitution operator?
by rhesa (Vicar) on Jan 10, 2009 at 20:24 UTC
    Just disambiguate the $1 with braces as well: ${1}:
    #!/usr/bin/perl
    use strict;
    use warnings;

    my @lines = split /\n/, <<'END';
    begin{verbatim}
    hello all
    end{verbatim}
    END

    s{((?:begin|end)){verbatim}}{${1}{whatever}} for @lines;
    print "$_\n" for @lines;

    __END__
    begin{whatever}
    hello all
    end{whatever}

      Hi rhesa, thanks for your answer. As a matter of fact, this is exactly what I put in my code to solve this problem in the first place.

      Anyway, the point of this post is not how to work around this situation, but to understand whether this is a bug or not, i.e. whether it is something that should need a workaround at all.

        update: my example went the wrong way. This is the issue:
        use strict;
        my $v = 'x';
        print qq"$v{a}";
        __END__
        xa
        versus
        use strict;
        my $v = 'x';
        print qq{$v{a}};
        __END__
        Global symbol "%v" requires explicit package name at -e line 3.

        I think the following quote from perlop under "Gory details of parsing quoted constructs" sheds some light on this quirk:

        Note also that the interpolation code needs to make a decision on where the interpolated scalar ends. For instance, whether "a $b -> {c}" really means:
        "a " . $b . " -> {c}";
        or:
        "a " . $b -> {c};
        Most of the time, the longest possible text that does not include spaces between components and which contains matching braces or brackets. because the outcome may be determined by voting based on heuristic estimators, the result is not strictly predictable. Fortunately, it’s usually correct for ambiguous cases.
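
A minimal sketch of the "no spaces between components" rule quoted above. The hash reference `$ref` and its contents are made up for illustration; the point is only that whitespace around the arrow ends the interpolated expression.

```perl
use strict;
use warnings;

my $ref = { c => 'value' };   # hypothetical data, not from the thread

# No spaces: the arrow and the subscript belong to the interpolated expression.
my $tight = "a $ref->{c}";

# Spaces around the arrow: only $ref is interpolated (it stringifies to
# something like HASH(0x...)), and " -> {c}" stays literal text.
my $loose = "a $ref -> {c}";

print "$tight\n$loose\n";
```

The first string comes out as "a value"; the second keeps " -> {c}" verbatim after the stringified reference.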

        I don't see this as a bug. Remember that the right-hand-side of s/// is just a double-quoted string, subject to the usual interpolation rules.

        The same disambiguation is needed in regular interpolated strings. Consider this snippet:

        use strict;
        my $var = "this";
        print "$var{whatever}";
        __END__
        Global symbol "%var" requires explicit package name at -e line 3.

        To fix the syntax error, and get the desired output, you do this:
        use strict;
        my $var = "this";
        print "${var}{whatever}";
        __END__
        this{whatever}
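
For comparison, a small sketch of three ways to keep `{whatever}` literal in an ordinary double-quoted string. The `${var}` form is the one from the post above; the backslash and concatenation variants are additions for illustration, as are the variable names `$braced`, `$escaped` and `$joined`.

```perl
use strict;
use warnings;

my $var = "this";

# 1. Braces around the variable name end the name explicitly.
my $braced = "${var}{whatever}";

# 2. With a non-brace string delimiter, a backslash keeps "{" from being
#    read as the start of a hash subscript.
my $escaped = "$var\{whatever}";

# 3. Concatenation keeps the literal text out of the interpolated string.
my $joined = $var . '{whatever}';

print "$_\n" for $braced, $escaped, $joined;   # this{whatever}, three times
```

Note that variant 2 relies on the string delimiter not being a brace, which is exactly the condition that fails in the original question.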
Re: Bug in substitution operator?
by setebos (Beadle) on Jan 10, 2009 at 20:50 UTC
    Don't you mean the following?
    s/\{verbatim\}/\{$hash{whatever}\}/ for @lines;
      The actual substitution isn't important in the original post; the fact is that using braces to enclose the substitution part leads to some behaviour that's not what I expected.

      As you may see in the OP, I already found a solution/workaround to get what I wanted (i.e. using square brackets instead of braces); what I'm asking here is if this behaviour is buggy or not.

Re: Bug in substitution operator?
by fullermd (Vicar) on Jan 11, 2009 at 08:22 UTC

    Interesting. It does sound buggy.

    And it does the same thing with brackets too, notice. Simpler example:

    @s = ("foo bar baz") x 4;
    $bar = "rab";
    %bar = (1 => "eek");
    @bar = (undef, 'oof');
    $s[0] =~ s{(bar)} {$bar\{1\}};
    $s[1] =~ s{(bar)} [$bar\{1\}];
    $s[2] =~ s{(bar)} {$bar\[1\]};
    $s[3] =~ s{(bar)} [$bar\[1\]];
    for $i (0..3) { print "$i -> '$s[$i]'\n"; }

    Yields

    0 -> 'foo eek baz'
    1 -> 'foo rab{1} baz'
    2 -> 'foo rab[1] baz'
    3 -> 'foo oof baz'

    (5.8 here)

    So something about having the delimiter be the same as your subscripting character makes it real hard to escape.
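
One possible explanation, offered as a sketch rather than a verdict: perlop's "Gory details of parsing quoted constructs" describes a step in which a backslash in front of a delimiter character is removed while the quoted text is being extracted, before any interpolation happens. The single-quoted forms below isolate that step, since `q` does no interpolation; the variable names are hypothetical.

```perl
use strict;
use warnings;

# Single-quoted forms: no interpolation, only delimiter handling is visible.
my $same_delims  = q{$bar\{1\}};   # braces are the delimiters: backslashes are dropped
my $other_delims = q[$bar\{1\}];   # braces are not delimiters here: backslashes stay

print "$same_delims\n";    # $bar{1}
print "$other_delims\n";   # $bar\{1\}
```

If the replacement part of s{}{} goes through the same step, the interpolation code only ever sees `$bar{1}`, which would account for results 0 and 3 above.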

    For extra fun, notice that using two \'s in the sub fails as "replacement not terminated", and using 3 leaves one sitting in your string:

    4 -> 'foo rab\[1] baz'

    So it doesn't seem like you can all that easily work around it except by changing the delimiters. Maybe playing with escaping... yes, that seems to work:

    $s[4] =~ s{(bar)} [$bar\E\[1\]];

    4 -> 'foo rab[1] baz'

    Wackiness.
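
A short sketch of two ways to keep brace delimiters and still get the literal text, using the data from the original question. The `${1}` form is the one suggested earlier in the thread; the /e variant is an assumption of this sketch and was not proposed by any poster. The array names `@braced` and `@evaluated` are made up.

```perl
use strict;
use warnings;

my @lines = ('begin{verbatim}', 'hello all', 'end{verbatim}');

# Workaround 1 (from the thread): ${1} ends the variable name, so the
# following {whatever} stays literal text.
my @braced = @lines;
s{(begin|end)\{verbatim\}}{${1}{whatever}} for @braced;

# Workaround 2 (an assumption of this sketch): /e evaluates the replacement
# as Perl code, so the literal part lives in a single-quoted string and
# string interpolation is not involved at all.
my @evaluated = @lines;
s{(begin|end)\{verbatim\}}{ $1 . '{whatever}' }e for @evaluated;

print "$_\n" for @braced, @evaluated;
```

Both arrays end up as begin{whatever}, hello all, end{whatever}. The /e form works with brace delimiters because the braces inside the replacement are balanced.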

Approved by marto