http://www.perlmonks.org?node_id=647250


in reply to Re: Can you assign to pos() in a s/foo/bar/g
in thread Can you assign to pos() in a s/foo/bar/g

What's wrong with: ... s/(?<=__)([A-Z]_)+//g;

Nothing (other than not needing the g modifier :-), but as I said, there are a lot of ways to tackle the problem and this particular question piqued my interest. I have tried accessing pos() inside the while loop of a substitution, but I get "Use of uninitialized value" warnings. Doing the same with a match rather than a substitution works fine, and I can even assign to pos() to change where the next match starts.

#!/usr/bin/perl -l
#
use strict;
use warnings;

$_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
print;

while ( s{(?<=__)[A-Z]_}{} )
{
    print pos();
}
print;
print q{-} x 25;
# ---------------------

$_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
print;

while ( m{_}g )
{
    print pos();
}
print q{-} x 25;
# ---------------------

$_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
print;

while ( m{_}g )
{
    print pos();
    pos() = 23 if pos() >= 12 && pos() < 26;
}
print q{-} x 25;
# ---------------------

produces

DFR7234C__A_B_C_Bonzo_Dog_D_B
Use of uninitialized value in print at ./spw644148G line 11.
Use of uninitialized value in print at ./spw644148G line 11.
Use of uninitialized value in print at ./spw644148G line 11.
DFR7234C__Bonzo_Dog_D_B
-------------------------
DFR7234C__A_B_C_Bonzo_Dog_D_B
9
10
12
14
16
22
26
28
-------------------------
DFR7234C__A_B_C_Bonzo_Dog_D_B
9
10
12
26
28
-------------------------
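Since pos() is maintained by m//g in scalar context and can be assigned to there, one way to get substitution-like behaviour with a movable start position is to match with m//g, edit the string with substr(), and then set pos() by hand. The following is only a minimal sketch of that idea, not oha's solution; the sub name strip_after_double_underscore is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every "X_" pair that directly follows "__",
# using m//g plus substr() instead of s///.
sub strip_after_double_underscore {
    my ($str) = @_;
    while ( $str =~ m{(?<=__)[A-Z]_}g ) {
        # Save the match offsets before touching the string.
        my ( $start, $end ) = ( $-[0], $+[0] );
        # Delete the matched text in place.
        substr( $str, $start, $end - $start, q{} );
        # Resume matching where the deleted text used to begin.
        pos($str) = $start;
    }
    return $str;
}

print strip_after_double_underscore('DFR7234C__A_B_C_Bonzo_Dog_D_B'), "\n";
# prints DFR7234C__Bonzo_Dog_D_B
```

The explicit assignment to pos($str) is what lets the next match start at the same offset again; the loop ends when no further match is found.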

I agree with you, oha's solution is very nice indeed.

Thank you for your reply,

JohnGG

Re^3: Can you assign to pos() in a s/foo/bar/g
by Jaikov (Initiate) on Feb 25, 2010 at 12:53 UTC

    I have a similar problem but could not get oha's solution to work...

    Problem: I have big files in which every run of consecutive class attributes should be merged into one, like so:
    Bla class="x1" class="x2" class="x3" Blabla
    ==> Bla class="x1 x2 x3" Blabla

    With the substitution

    s/\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/ class="$1 $2" /sg;

    the result is: ==> Bla class="x1 x2" class="x3" Blabla

    A loop delivers of course the correct result:

    1 while s/class=\"([^\"]*)\"\s*class=\"([^\"]*)\"/class="$1 $2"/sg;

    But because this needs several passes over the string, it is not my preferred way.
    I would like an option that makes the substitution resume at the start of the replacement text rather than after it.
    I have tried several times to use \G in the substitution, for example:

    s/\G?class=\"([^\"]*)\"\s*class=\"([^\"]*)\"/class="$1 $2"/sg;

    or

    s/\G\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/ class="$1 $2" / while /\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/g;

    but without success.
    Any idea how to do this substitution in one pass?
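The difference between the single substitution and the loop described above can be reproduced with a short script. This is only an illustrative sketch: the sub names merge_once and merge_looped are made up, and the unneeded backslashes before the double quotes are left out. With /g, each new match starts after the end of the previous match in the original string, so the text produced by a replacement is never rescanned within the same pass.

```perl
use strict;
use warnings;

my $input = 'Bla class="x1" class="x2" class="x3" Blabla';

# One s///g pass: after merging x1 and x2, matching resumes behind the
# second attribute, where only class="x3" is left to look at.
sub merge_once {
    my ($s) = @_;
    $s =~ s/class="([^"]*)"\s*class="([^"]*)"/class="$1 $2"/g;
    return $s;
}

# Repeat the pass until nothing changes and count the successful passes.
sub merge_looped {
    my ($s) = @_;
    my $passes = 0;
    $passes++ while $s =~ s/class="([^"]*)"\s*class="([^"]*)"/class="$1 $2"/g;
    return ( $s, $passes );
}

print merge_once($input), "\n";
# prints Bla class="x1 x2" class="x3" Blabla

my ( $merged, $passes ) = merge_looped($input);
print "$merged (after $passes passes)\n";
# prints Bla class="x1 x2 x3" Blabla (after 2 passes)
```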

    PS: oha's example has a small error if the string starts with _, e.g. $_ = q{_FR7234C__A_B_C_Bonzo_Dog_D_B};

      I'm not sure you need the complication of oha's method here, as it depends on the left side of the alternation still matching once you have done the first substitution, whereas in your problem it won't. Using just a capture group seems to work.

      $ perl -E '
      > $_ = q{Bla class="x1" class="x2" class="x3" Blabla};
      > say;
      > s{" class="([^"]+)}{ $1}g;
      > say;'
      Bla class="x1" class="x2" class="x3" Blabla
      Bla class="x1 x2 x3" Blabla

      Attacking the problem from (literally) a different direction would be another way of achieving the same result without a capture group.

      $ perl -E '
      > $_ = q{Bla class="x1" class="x2" class="x3" Blabla};
      > say;
      > $r = reverse $_;
      > $r =~ s{"=ssalc "}{ }g;
      > $_ = reverse $r;
      > say;'
      Bla class="x1" class="x2" class="x3" Blabla
      Bla class="x1 x2 x3" Blabla
      $

      I hope this is helpful.

      Cheers,

      JohnGG

      Update: I forgot to mention, double-quotes are not regex metacharacters so they don't need to be escaped.

        Thx John, your tip was very helpful, as it gave me the right idea...

        Perhaps I did not explain the scenario well enough, but both of your solutions have a small error. Example:
        Bla class="x1" class="x2" otheratr="y0" class="x3" Blabla
        ==> Bla class="x1 x2" otheratr="y0 x3" Blabla
        correct would be:
        Bla class="x1 x2" otheratr="y0" class="x3" Blabla
        (perfect would be:
        Bla class="x1 x2 x3" otheratr="y0" Blabla)

        one-pass solution:

        s/(?<=class=")([^"]*)"\s*class="/$1 /g;
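To see why this works in one pass, it may help to run the same substitution on both example strings. The sketch below simply wraps the regex above in a sub (the name merge_adjacent_classes is made up). Each match ends right after the next class=" and the following match starts at that point; because the lookbehind can still see the preceding class=" in the original string, the next attribute value is picked up in the same pass.

```perl
use strict;
use warnings;

# Hypothetical wrapper around the one-pass substitution shown above.
sub merge_adjacent_classes {
    my ($s) = @_;
    # Match an attribute value that follows class=" and is itself directly
    # followed by another class="; keep the value and replace the rest
    # with a single space.
    $s =~ s/(?<=class=")([^"]*)"\s*class="/$1 /g;
    return $s;
}

print merge_adjacent_classes('Bla class="x1" class="x2" class="x3" Blabla'), "\n";
# prints Bla class="x1 x2 x3" Blabla

print merge_adjacent_classes('Bla class="x1" class="x2" otheratr="y0" class="x3" Blabla'), "\n";
# prints Bla class="x1 x2" otheratr="y0" class="x3" Blabla
```

One caveat: the lookbehind as written would also accept an attribute whose name merely ends in class. If that can occur in the input, anchoring it with \b, as in (?<=\bclass="), is one possible tightening.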

        For smaller files with few iterations, however, the while-loop solution is faster, probably because of the lookbehind assertion:

        1 while $s1 =~ s/class=\"([^\"]*)\"\s*class=\"([^\"]*)\"/class="$1 $2"/g;
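Which variant is faster will depend on the input, so it is worth measuring on your own data. The following is a rough sketch using the core Benchmark module; the sample string, the iteration count and the sub names with_lookbehind and with_loop are assumptions chosen only for illustration.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical sample input: 50 copies of the example line. Real timings
# depend on file size and on how many adjacent attributes there are.
my $sample = join ' ', ('Bla class="x1" class="x2" class="x3" Blabla') x 50;

# One-pass variant with a lookbehind.
sub with_lookbehind {
    my ($s) = @_;
    $s =~ s/(?<=class=")([^"]*)"\s*class="/$1 /g;
    return $s;
}

# Multi-pass variant: repeat the substitution until nothing matches.
sub with_loop {
    my ($s) = @_;
    1 while $s =~ s/class="([^"]*)"\s*class="([^"]*)"/class="$1 $2"/g;
    return $s;
}

# Run each variant a fixed number of times and print a comparison table.
cmpthese( 20_000, {
    lookbehind => sub { with_lookbehind($sample) },
    loop       => sub { with_loop($sample) },
} );
```

Both subs work on a copy of the string, so every iteration starts from the same unmerged input.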

        Thx again,
        Jaikov