
Progressive matching w/substitutions

by argv (Pilgrim)
on Aug 09, 2008 at 03:35 UTC ( [id://703225]=perlquestion )

argv has asked for the wisdom of the Perl Monks concerning the following question:

The ORA Programming Perl book discusses progressive matching, but only with m// matching, which leaves open the question of whether s/// works, too. One would assume so, but if you take the sample code and modify it to use s/// substitutions, none of the examples work anymore.

I also tried combinations that start with m// in the "while" loop, use s/// inside the block to modify the string, and then reset the regex position using pos, but again, nothing works. Here's sample code:

my $string = "abc abc abc abc";
my $x = 1;
while ($string =~ /abc /ig) {
    my $i = pos($string);
    $string =~ s/(abc) /$1 def($x)/;
    print "$string\n";
    pos($string) = $i + 6;
}
print "\nDone. string = '$string'\n";

The desired result should look like this:

abc def(1) abc def(2) abc def(3) abc def(4)

Instead, it produces "abc def(3)def(2)def(1)abc abc abc" and the reason is obvious: the regex always starts back at 0, and it does this no matter what I do. (The above sample code is but one of a variety of attempts.)

In addition to someone solving the specific problem with the code, I'd like a more specific explanation of what the Perl rules are concerning progressive matching and the use of substitutions. Is it different than m//? Is it undefined? Is there a specific method documented that I'm not seeing? And why doesn't the above pos call set the position accordingly?

UPDATE: Incidentally, I know I can get what I want using this code:

while ($string =~ s/(abc) ([^d])/$1 def($x) $2/i) ...

The point is to understand the proper way of doing progressive matching with substitutions... my so-called 'workaround' won't work in the actual program I'm using, as I can't be sure what the next letter(pattern) would be after the 'abc' match.

Replies are listed 'Best First'.
Re: Progressive matching w/substitutions
by ikegami (Patriarch) on Aug 09, 2008 at 04:35 UTC

    \G is like ^, but uses pos. And apparently, it works even without the "g" modifier present.

    while ($string =~ /abc/ig) {
        my $ins = sprintf(" def(%s)", $x++);
        my $pos = $+[0];
        $string =~ s/\G/$ins/;
        pos($string) = $pos + length($ins);
    }
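The reason the loop saves $+[0] and then assigns to pos again can be shown in isolation. This is only a minimal sketch, assuming the documented behaviour that modifying the target string resets its search position; the variable names are made up for the illustration.

```perl
use strict;
use warnings;

my $string = "abc abc abc abc";

# A successful m//g leaves pos() at the end of the match.
$string =~ /abc/g;
my $pos_after_match = pos($string);

# \G anchors the substitution at pos(), so the text goes in right after "abc".
my $ins = " def(1)";
$string =~ s/\G/$ins/;

# The substitution modified the string, which resets pos() to undef.
my $pos_after_edit = pos($string);

# Put the position back by hand, just past the inserted text.
pos($string) = $pos_after_match + length($ins);

# The next m//g now resumes from there instead of from offset 0.
$string =~ /abc/g;
my $second_start = $-[0];

printf "pos after match: %d\n", $pos_after_match;
printf "pos after edit:  %s\n", defined $pos_after_edit ? $pos_after_edit : 'undef';
printf "second match starts at offset %d\n", $second_start;
```

Without the explicit pos assignment, the next m//g would start over at offset 0 and find the first "abc" again.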

    But do you need the substitution operator at all?

    while ($string =~ /abc/ig) {
        my $ins = sprintf(" def(%s)", $x++);
        my $pos = $+[0];
        substr($string, $pos, 0, $ins);
        pos($string) = $pos + length($ins);
    }

    Finally, this is really the task of the "e" modifier.

    $string =~ s/(abc)/ sprintf("%s def(%s)", $1, $x++) /eig;
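As a rough check, the three variants above can be wrapped in subs and compared against the output the question asks for. The sub names here are hypothetical and exist only for this sketch; the loop bodies are the ones posted above.

```perl
use strict;
use warnings;

# Variant 1: m//g to find the match, s/\G/.../ to insert at pos().
sub insert_with_g_anchor {
    my ($string) = @_;
    my $x = 1;
    while ($string =~ /abc/ig) {
        my $ins = sprintf(" def(%s)", $x++);
        my $pos = $+[0];                      # end of the match just made
        $string =~ s/\G/$ins/;                # \G anchors at pos($string)
        pos($string) = $pos + length($ins);   # restore pos past the insert
    }
    return $string;
}

# Variant 2: same loop, but insert with 4-argument substr instead of s///.
sub insert_with_substr {
    my ($string) = @_;
    my $x = 1;
    while ($string =~ /abc/ig) {
        my $ins = sprintf(" def(%s)", $x++);
        my $pos = $+[0];
        substr($string, $pos, 0, $ins);
        pos($string) = $pos + length($ins);
    }
    return $string;
}

# Variant 3: a single s///eg; the replacement is evaluated once per match.
sub insert_with_e_modifier {
    my ($string) = @_;
    my $x = 1;
    $string =~ s/(abc)/ sprintf("%s def(%s)", $1, $x++) /eig;
    return $string;
}

my $input  = "abc abc abc abc";
my @result = (
    insert_with_g_anchor($input),
    insert_with_substr($input),
    insert_with_e_modifier($input),
);
print "$_\n" for @result;
```

All three should print abc def(1) abc def(2) abc def(3) abc def(4).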
      You said:
      Finally, this is really the task of the "e" modifier.
      $string =~ s/(abc)/ sprintf("%s def(%s)", $1, $x++) /eig;
      Now you're getting directly to my point. Let's now rewrite this to using a while loop, as in:

      while ($string =~ s/(abc)/ sprintf("%s def(%s)", $1, $x) /eig) {
          die "oops" if $x++ > 10;
      }

      Why doesn't this work? If you remove the 'g' modifier, it still doesn't work -- it just runs in an infinite loop.

      I think what I'm getting at is that I assumed the "progressive" aspect of regexes applied to both matching and substitution, since both have the common root of matching (i.e., substitution requires at least a match in order to do the substitution). If that were the case, then the above should work. Because it doesn't, the code has to be downgraded to multiple lines that use pos and such.

      In the live code where this applies, calls to other functions have to be made, which is why there has to be a loop, not just a simple s/// expression as your example illustrates. Granted, the "substitution" aspect can be as simple as you have it, but the value of $x is derived from other code and various tests and manipulations. That's why my "simple" example uses $x++ inside the while loop, to illustrate that it has to be separate from the s/// part.

        Why doesn't this work?

        Read up on the return value of s/// in scalar context.
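A small sketch of what that return value looks like, assuming the usual perlop behaviour: s/// returns the number of substitutions made, or a false value if none were made, and it does not resume from where a previous s/// left off. The variable names are illustrative only.

```perl
use strict;
use warnings;

# With /g, one call replaces every match and returns the count.
my $string = "abc abc abc abc";
my $count  = ($string =~ s/abc/xyz/g);

# Nothing left to replace, so the second call returns false.
my $again = ($string =~ s/abc/xyz/g);

# Without /g, each call starts from the beginning of the string, so a loop
# whose replacement still contains the pattern keeps hitting the same spot.
# The pass counter is only here to stop the sketch after three rounds.
my $other  = "abc abc";
my $passes = 0;
while ($other =~ s/abc/abc!/) {
    last if ++$passes >= 3;
}

print "count = $count\n";
print "again is ", ($again ? "true" : "false"), "\n";
print "other = $other\n";
```

So a while loop around s///g sees a true count, runs its body, and then redoes the whole substitution on the already-modified string; dropping /g only changes that to re-replacing the first match every time.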

        calls to other functions have to be made

        What do you think sprintf is? The body of the loop is the replace expression. Put the die in there.
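To make that concrete, here is a sketch in which the per-match work lives in a sub called from the replacement expression. next_label() is a hypothetical stand-in for whatever tests and function calls the real program needs; it is not from the original code.

```perl
use strict;
use warnings;

my $x = 0;

# Hypothetical helper: computes the text to insert for the current match.
# The safety check that used to sit in the loop body goes here instead.
sub next_label {
    $x++;
    die "oops" if $x > 10;
    return "def($x)";
}

my $string = "abc abc abc abc";

# With /e the replacement is Perl code, evaluated once for every match,
# so it plays the role of the loop body.
$string =~ s/(abc)/ "$1 " . next_label() /eig;

print "$string\n";
```

Because the regex engine keeps track of where it is during a single s///g, there is no pos bookkeeping to do by hand.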

Re: Progressive matching w/substitutions
by dreadpiratepeter (Priest) on Aug 09, 2008 at 04:19 UTC
    Wouldn't:
    my $string = "abc abc abc abc";
    my $x = 1;
    $string =~ s/abc/"abc def(" . $x++ . ")"/ge;
    print "\nDone. string = '$string'\n";
    
    work just fine, without the looping?


    -pete
    "Worry is like a rocking chair. It gives you something to do, but it doesn't get you anywhere."
Re: Progressive matching w/substitutions
by JStrom (Pilgrim) on Aug 09, 2008 at 04:23 UTC
    1:
    my $string = "abc abc abc abc";
    my $x = 1;
    while ( $string =~ /abc /ig ) {
        my $i = pos($string);
        substr( $string, $-[0], $+[0]-$-[0]) =~ s/(abc) /$1 def($x)/;
        print "$string\n";
        pos($string) = $i + 6;
    }
    print "\nDone. string = '$string'\n";
    2:
    my $string = "abc abc abc abc";
    my $x = 1;
    while ( $string =~ /(?=abc )/ig ) {
        my $i = pos($string);
        $string =~ s/\G(abc) /$1 def($x)/;
        print "$string\n";
        pos($string) = $i + 6;
    }
    print "\nDone. string = '$string'\n";
      Neither of those produces the desired output for me. $x is never incremented. Since you require a space at the end of the string-to-be-substituted, there is no substitution after the last abc, and the desired spacing in the output is wrong:
      abc def(1)abc abc abc
      abc def(1)abc def(1)abc abc
      abc def(1)abc def(1)abc def(1)abc

      Done. string = 'abc def(1)abc def(1)abc def(1)abc'

      Putting $x++; in your while loop after the substitution will fix the former, but not the latter.

      Following on the other examples in this thread, one possibility is the following:

      1:

      my $string = "abc abc abc abc";
      my $x = 1;
      # while ( $string =~ /abc /ig ) {
      while ( $string =~ /abc/ig ) {
          my $i = pos($string);
          # substr( $string, $-[0], $+[0]-$-[0]) =~ s/(abc) /$1 def($x)/;
          substr( $string, $-[0], $+[0]-$-[0]) =~ s/(abc)/"$1 def(" . $x++ . ")"/e;
          print "$string\n";
          pos($string) = $i + 6;
      }

      2:

      my $string = "abc abc abc abc";
      my $x = 1;
      # while ( $string =~ /(?=abc )/ig ) {
      while ( $string =~ /(?=abc)/ig ) {
          my $i = pos($string);
          # $string =~ s/\G(abc) /$1 def($x)/;
          $string =~ s/\G(abc)/"$1 def(" . $x++ . ")"/e;
          print "$string\n";
          pos($string) = $i + 6;
      }
      print "\nDone. string = '$string'\n";

        Neither of those produces the desired output for me. $x is never incremented. Since you require a space at the end of the string-to-be-substituted, there is no substitution after the last abc, and the desired spacing in the output is wrong:

        Those problems were present in the OP's posted code. They weren't problems with the post to which you replied. Presumably, the OP's code is different than what he posted, since the output he gave doesn't match the code he posted. Since the OP will have to retrofit the solution anyway, these details aren't important. That's why I took a few small liberties in my own solution.
