http://www.perlmonks.org?node_id=207511

v_thunder has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have some code that does this:

s/\[\[(\w+)\]\]/$helper->($1)/eg;

Where $helper is an anonymous function that returns the text to be substituted. This works great with perl 5.6, but breaks in 5.8 with:

panic: sv_pos_b2u: bad byte offset at [...].

I'm not sure what the problem is. I've also tried:

s/\[\[(\w+)\]\]/@{[$helper->($1)]}/g;

which fails in the same way. I briefly thought about doing something sort of like:

while (m/\[\[(\w+)\]\]/) { my $foo = $helper->($1); s/\[\[(\w+)\]\]/$foo/; }

But the problem with that (besides it being inefficient) is that m// and s/// seem to use the same pointer to mark where in the string the last search ended--so the substitution changes the *next* occurrence of the pattern, not the one the m// matched.

So I'd like to know:

I await enlightenment :-)

-Dan

Replies are listed 'Best First'.
Re: Functions in substitutions (s///) and Perl 5.8.
by kvale (Monsignor) on Oct 23, 2002 at 22:47 UTC
     perldoc -f pos yields
    pos SCALAR
    pos     Returns the offset of where the last "m//g" search left off for the variable in question ($_ is used when the variable is not specified). May be modified to change that offset. Such modification will also influence the "\G" zero-width assertion in regular expressions. See perlre and perlop.
    so yes, you can use  pos() to reset the position.
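
A small sketch of what that looks like in practice, assuming a made-up sample string (the variable names here are illustrative, not from the original program). It records where each m//g match left off, then sets pos() by hand and resumes matching from there with \G:

```perl
use strict;
use warnings;

# Made-up sample input, only for illustration.
my $text = "aa [[one]] bb [[two]] cc";

# pos() reports the offset just past each successful m//g match.
my @offsets;
while ($text =~ /\[\[(\w+)\]\]/g) {
    push @offsets, pos($text);
}
print "@offsets\n";    # prints "10 21"

# A failed m//g resets pos(), so set it explicitly before resuming.
pos($text) = $offsets[0];
my $word;
if ($text =~ /\G\s*(\w+)/g) {
    $word = $1;        # the word right after the first [[...]]
}
print "$word\n";       # prints "bb"
```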

    The error

    panic: sv_pos_b2u: bad byte offset at [...].
    looks like a perl bug to me. This routine converts a byte position to a Unicode character position, and the translation should be transparent to the user.

    -Mark

Re: Functions in substitutions (s///) and Perl 5.8.
by teichman (Novice) on Oct 24, 2002 at 16:14 UTC
    Rewriting it to use pos() like this works for me:
    while (m/\[\[\w+\]\]/) {
        my $oldpos = pos;
        m/\[\[(\w+)\]\]/;
        my $expansion = $helper->($1);
        pos = $oldpos;
        s/[[$1]]/$expansion/;
    }
    It's a little nasty due to matching more times than I'd like, but it works. :)
      Grr, that'll teach me to make changes without testing. The substitution needs to have those square brackets escaped:
      while (m/\[\[\w+\]\]/) {
          my $oldpos = pos;
          m/\[\[(\w+)\]\]/;
          my $expansion = $helper->($1);
          pos = $oldpos;
          s/\[\[$1\]\]/$expansion/;
      }
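
If matching twice per macro is a concern, one possible alternative is to match once with /g and splice the replacement in at the offsets perl records in @- and @+. This is only a sketch under assumptions: %table and expand_macros are hypothetical stand-ins for $helper, and unknown names expand to the empty string here.

```perl
use strict;
use warnings;

# Hypothetical lookup table standing in for whatever $helper consults.
my %table = (a => 'AAA', b => 'B');

sub expand_macros {
    my ($text) = @_;
    while ($text =~ /\[\[(\w+)\]\]/g) {
        # $-[0] and $+[0] are the start and end offsets of the whole match.
        my ($start, $end) = ($-[0], $+[0]);
        my $expansion = exists $table{$1} ? $table{$1} : '';
        # Replace exactly the span that was just matched.
        substr($text, $start, $end - $start, $expansion);
        # Resume the search right after the inserted text.
        pos($text) = $start + length $expansion;
    }
    return $text;
}

print expand_macros("x [[a]] y [[b]] z"), "\n";    # prints "x AAA y B z"
```

Because pos() is set past the inserted text, an expansion that itself contains [[...]] is not expanded again.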
Re: Functions in substitutions (s///) and Perl 5.8.
by Anonymous Monk on Oct 24, 2002 at 19:35 UTC
    That panic error sure sounds like a perl bug; I couldn't reproduce it in 5.8.0 in a quick test. Can you provide a very short but complete example that exhibits the error?

    Off the top of my head, you might try introducing some interpolation to see if that would make a difference:

    s/\[\[(\w+)\]\]/$helper->("$1")/eg;
    or maybe:
    s/\[\[(\w+)\]\]/"@{[$helper->($1)]}"/eg;
    or even an explicit copy:
    s/\[\[(\w+)\]\]/my $match = $1; $helper->($match)/eg;
    Functionally, these should be no different, but if this is an internal perl bug, these alternatives may take a different code path and might not trip over the bug... (If you're lucky!)

      You're right - a simple test case doesn't have any problems. It turns out that our (I'm working with v_thunder on this project) helper function was trampling over $_, which caused the failed assertion only in combination with some other buggy code.

      Rewriting the helper to use a different local variable makes the original code work as expected.

      I'm having trouble producing a simple case that shows the problem. I think we just happened to hit a case where buggy code happened to work on earlier perls, but doesn't in 5.8. :)
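
For reference, a minimal sketch of that kind of fix, where the helper copies its argument into a lexical instead of assigning to $_. The %data table and the choice to leave unknown keys untouched are assumptions for illustration, not taken from the real program:

```perl
use strict;
use warnings;

# Hypothetical data table, for illustration only.
my %data = (key1 => 'value1', key2 => 'value2');

# The helper keeps its argument in a lexical, so the caller's $_ is untouched.
my $helper = sub {
    my ($key) = @_;
    # Assumption: unknown keys are put back unchanged.
    return exists $data{$key} ? $data{$key} : "[[$key]]";
};

sub macro_replace {
    my ($text) = @_;
    # Operate on a lexical copy rather than on the global $_.
    $text =~ s/\[\[(\w+)\]\]/$helper->($1)/eg;
    return $text;
}

print macro_replace("blah [[key1]] [[key2]] [[other]] blah\n");
# prints "blah value1 value2 [[other]] blah"
```

If a helper really does need $_, declaring "local $_;" at the top of it should also keep the outer value from being overwritten.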

Re: Functions in substitutions (s///) and Perl 5.8.
by mce (Curate) on Oct 25, 2002 at 09:39 UTC
    Hi,

    This code works fine for me.

    #!/usr/local/bin/perl
    require 5.008;
    use strict;
    my $helper=sub { print @_ };
    $_="[[This]] [[is]] [[a]] [[test]]";
    s/(\w+)/$helper->($1)/eg;
    On which platform are you running?
    (mine is RH 7.1)
    ---------------------------
    Dr. Mark Ceulemans
    Senior Consultant
    IT Masters, Belgium
Re: Functions in substitutions (s///) and Perl 5.8.
by v_thunder (Scribe) on Oct 25, 2002 at 17:13 UTC

    Hello again,

    After some testing, I am even more baffled than before. After attempting to create some small test cases, I'm sure there is a perl bug lurking here. Here are our (teichman's and my) findings:

    This code will break:

    #!/usr/bin/perl -w

    $data = {
        key1 => 'value1',
        key2 => 'value2',
    };

    sub macro_replace {
        $_ = shift;
        my $helper = sub {
            $_ = shift;
            return $data->{$_} if $data->{$_};
            return undef;
        };
        chomp (my $pwd = `pwd`);
        s/\[\[(pwd)\]\]/$pwd/g;
        s/\[\[(\w+)\]\]/@{[$helper->($1)]}/g;
        return $_;
    }

    print macro_replace ("blah [[key1]] [[key2]] [[pwd]] blah blah\n");

    Note the use of $_ in the helper function. Changing this to use another variable makes it all work as expected, as teichman said.

    Strangely, however, taking out the pwd substitution (i.e., commenting out lines 15 and 16) seems to stop it from crashing as well, but only in this test case. I haven't been able to reproduce the exact behavior we see in our program yet.

    This is being tested on Red Hat Linux 8.0 and SuSE Linux 8.1, both of which come with perl 5.8.0 (+/- a few patches of their own--which I believe are all unrelated to this).

    Thanks for the useful tips!

    -Dan

      I have a similar problem on RH8. I'm trying to port a running application, and the XML parser returns the same panic error when my handler returns. I tried the variable substitution you mentioned, without success, so I was wondering if you're any further along.
      s{ <\?nm(               # opening angle bracket
           (?:                # Non-backreffing grouping paren
               [^>'"] *       # 0 or more things that are neither > nor ' nor "
                  |           # or else
               ".*?" |        # a section between double quotes (stingy match)
                  |           # or else
               '.*?' |        # a section between single quotes (stingy match)
           ) +                # repetire ad libitum
                              # hm.... are null tags <> legal? XXX
         )\?>                 # closing angle bracket
      }{my $var = $1; my_handler($r, $var)}geisx;   # mutate