PerlMonks  

Can you assign to pos() in a s/foo/bar/g

by johngg (Canon)
on Oct 25, 2007 at 11:01 UTC ( [id://647133]=perlquestion )

johngg has asked for the wisdom of the Perl Monks concerning the following question:

There were several suggested methods in answer to this thread. My solution repeatedly substituted in a while loop. I got to wondering if you could do a global substitution in one fell swoop instead by assigning to pos in a regex code block to set the match back to the start of the string. This doesn't seem to work although the assignment to pos() does seem to register. Here's a short script to show what I'm trying, firstly the loop variant then the global substitution.

    #!/usr/local/bin/perl -l
    #
    use strict;
    use warnings;

    $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
    print;
    1 while s
        {
            (?<=__)
            [A-Z]_
        }
        {}x;
    print;
    print q{-} x 25;

    $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
    print;
    s
        {
            (?<=__)
            [A-Z]_
            (?{ print pos(); pos() = 0; print pos() })
        }
        {}gx;
    print;
    print q{-} x 25;

Here's the output

    DFR7234C__A_B_C_Bonzo_Dog_D_B
    DFR7234C__Bonzo_Dog_D_B
    -------------------------
    DFR7234C__A_B_C_Bonzo_Dog_D_B
    12
    0
    DFR7234C__B_C_Bonzo_Dog_D_B
    -------------------------

As you can see, it doesn't look as if matching is reset so only one substitution is done. Am I trying to do something impossible here?

Cheers,

JohnGG
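As far as I can tell from the pos entry in perlfunc, an assignment to pos() only takes effect for the next match, so it cannot redirect a match (or an s///g pass) that is already in progress. What does work is resetting pos() between separate m//g matches. The following is only a minimal sketch of that idea; the helper name strip_with_pos_reset is made up for illustration and is not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: remove each "X_" that follows
# "__", restarting the search from offset 0 after every removal.
sub strip_with_pos_reset {
    my ($text) = @_;
    while ( $text =~ m{(?<=__)[A-Z]_}g ) {
        my ( $start, $end ) = ( $-[0], $+[0] );         # span of this match
        substr( $text, $start, $end - $start, q{} );    # delete it
        pos($text) = 0;    # applies to the next m//g, not the current one
    }
    return $text;
}

print strip_with_pos_reset(q{DFR7234C__A_B_C_Bonzo_Dog_D_B}), "\n";
```

This is still a loop, of course, so it is no improvement on "1 while s///"; it only shows where an assignment to pos() can make a difference.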

Replies are listed 'Best First'.
Re: Can you assign to pos() in a s/foo/bar/g
by oha (Friar) on Oct 25, 2007 at 12:14 UTC
    possible if:
    s/(__|\G)[A-Z]_/$1/g

    Oha

      Ace! That's exactly what I was missing. I had tried \G without success but hadn't thought to put it in an alternation. Adapting your solution to use a look-behind so that I'm not replacing something with itself gives us

          #!/usr/local/bin/perl -l
          #
          use strict;
          use warnings;

          $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
          print;
          s{(?:(?<=__)|\G)[A-Z]_}{}g;
          print;

      which produces the desired

          DFR7234C__A_B_C_Bonzo_Dog_D_B
          DFR7234C__Bonzo_Dog_D_B

      Your solution doesn't require any messing about with pos() which is a plus, but I am still wondering whether such tinkering is possible.

      Thank you,

      JohnGG
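To see why the \G alternation works in a single pass, it may help to record where each match starts. The sketch below assumes that @- inside an s///ge replacement reports offsets into the original string; the helper name strip_with_offsets is invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run the \G-alternation substitution and collect the
# start offset of every match along the way.
sub strip_with_offsets {
    my ($text) = @_;
    my @starts;
    $text =~ s{(?:(?<=__)|\G)[A-Z]_}{ push @starts, $-[0]; '' }ge;
    return ( $text, @starts );
}

my ( $result, @starts ) = strip_with_offsets(q{DFR7234C__A_B_C_Bonzo_Dog_D_B});
print "$result\n";
print "match starts: @starts\n";
```

The first match is admitted by the look-behind; each later one should begin exactly where the previous match ended, which is the position \G asserts. The chain stops at the first place where [A-Z]_ no longer follows directly.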

Re: Can you assign to pos() in a s/foo/bar/g
by mwah (Hermit) on Oct 25, 2007 at 12:48 UTC

    What's wrong with:

        my $str=q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
        $str =~ s/(?<=__)([A-Z]_)+//g;

    I'd guess that pos() is localized within the code assertion. pos() should be changeable within the while loop driven by the regex (1st example). Maybe one of the gods can shed some light on this.

    BTW, oha has already presented a very nice solution here.

    Regards

    <mwa

      What's wrong with: ... s/(?<=__)([A-Z]_)+//g;

      Nothing (other than not needing the g modifier :-) but as I said, there are a lot of ways to tackle the problem and this particular question piqued my interest. I have tried accessing pos() inside the while loop but I get "Use of uninitialized value" warnings. Doing the same with a match rather than a substitution seems fine and I can even assign to pos() to affect where it matches.

          #!/usr/bin/perl -l
          #
          use strict;
          use warnings;

          $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
          print;

          while ( s{(?<=__)[A-Z]_}{} )
          {
              print pos();
          }
          print;
          print q{-} x 25;
          # ---------------------

          $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
          print;
          while ( m{_}g )
          {
              print pos();
          }
          print q{-} x 25;
          # ---------------------

          $_ = q{DFR7234C__A_B_C_Bonzo_Dog_D_B};
          print;
          while ( m{_}g )
          {
              print pos();
              pos() = 23 if pos() >= 12 && pos() < 26;
          }
          print q{-} x 25;
          # ---------------------

      produces

          DFR7234C__A_B_C_Bonzo_Dog_D_B
          Use of uninitialized value in print at ./spw644148G line 11.
          Use of uninitialized value in print at ./spw644148G line 11.
          Use of uninitialized value in print at ./spw644148G line 11.
          DFR7234C__Bonzo_Dog_D_B
          -------------------------
          DFR7234C__A_B_C_Bonzo_Dog_D_B
          9
          10
          12
          14
          16
          22
          26
          28
          -------------------------
          DFR7234C__A_B_C_Bonzo_Dog_D_B
          9
          10
          12
          26
          28
          -------------------------

      I agree with you, oha's solution is very nice indeed.

      Thank you for your reply,

      JohnGG
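The "Use of uninitialized value" warnings above are consistent with my understanding that pos() is only maintained by m//g matches and is reset when the target string is modified, so after a plain s/// it is undefined. If the goal is just to see where each substitution happened, the @- array is one alternative. The sketch below assumes $-[0] still holds the match start once s/// has returned; the helper name strip_and_report is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: repeat the single substitution from the thread and
# note the start offset of each match via @- instead of pos().
sub strip_and_report {
    my ($text) = @_;
    my @offsets;
    while ( $text =~ s{(?<=__)[A-Z]_}{} ) {
        push @offsets, $-[0];    # start of the match just replaced
    }
    return ( $text, @offsets );
}

my ( $result, @offsets ) = strip_and_report(q{DFR7234C__A_B_C_Bonzo_Dog_D_B});
print "$result\n";
print "offsets: @offsets\n";
```

Because every pass starts again from the beginning of the shortened string, each substitution should be found at the same offset, just after the double underscore.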

        I have a similar problem but could not get oha's solution to work...

        Problem: I have big files in which every run of consecutive class attributes should be merged into one, like so:
        Bla class="x1" class="x2" class="x3" Blabla
        ==> Bla class="x1 x2 x3" Blabla

        With the substitution

        s/\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/ class="$1 $2" /sg;

        the result is: ==> Bla class="x1 x2" class="x3" Blabla

        A loop delivers of course the correct result:

        1 while s/class=\"([^\"]*)\"\s*class=\"([^\"]*)\"/class="$1 $2"/sg;

        But because the substitution needs several passes to get the result, this is not my preferred way.
        I would like an option that makes the substitution resume at the position right before the replacement rather than after it.
        I have tried several times to use \G in the substitution, like this:

        s/\G?class=\"([^\"]*)\"\s*class=\"([^\"]*)\"/class="$1 $2"/sg;

        or

        s/\G\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/ class="$1 $2" / while /\s*class=\"([^\"]*)\"\s*class=\"([^\"]*)\"\s*/g;

        but to no success.
        Any idea how to manage this substitution in one pass?
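One possible way to get a single pass, offered only as a sketch: rather than restarting the match, let one match consume the whole run of class attributes and rebuild it in an /e replacement. The helper names join_class_run and merge_classes are invented here, and the pattern assumes double-quoted values separated by nothing but whitespace, as in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: pull every value out of a run of class attributes
# and join them into a single attribute.
sub join_class_run {
    my ($run) = @_;
    my @values = $run =~ /class="([^"]*)"/g;
    return 'class="' . join( ' ', @values ) . '"';
}

# Hypothetical helper: replace each run of two or more class attributes
# with one merged attribute, in a single s///g pass.
sub merge_classes {
    my ($text) = @_;
    $text =~ s{
        (                               # capture the whole run
            class="[^"]*"               # first attribute
            (?: \s* class="[^"]*" )+    # one or more further attributes
        )
    }{ join_class_run($1) }gex;
    return $text;
}

print merge_classes(q{Bla class="x1" class="x2" class="x3" Blabla}), "\n";
```

This is not an HTML parser: it would also fire on an attribute whose name merely ends in "class", so treat it as a starting point rather than a finished tool.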

        PS: in oha's example there is a little error if the string starts with _, e.g. $_ = q{_FR7234C__A_B_C_Bonzo_Dog_D_B};
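A related edge case that can be traced by hand (this is an assumption about what the PS is pointing at, not a confirmed reading of it): \G also matches at offset 0 on the first attempt, so a string that begins with a capital letter and an underscore can lose that pair even though no double underscore precedes it. The sketch below compares the loop, the \G alternation from the thread, and a guarded variant; the guard \G(?!\A) and all three helper names are my own additions.

```perl
use strict;
use warnings;

# Reference behaviour: repeat a single substitution until nothing changes.
sub strip_loop {
    my ($text) = @_;
    1 while $text =~ s{(?<=__)[A-Z]_}{};
    return $text;
}

# \G alternation as posted in the thread.
sub strip_g {
    my ($text) = @_;
    $text =~ s{(?:(?<=__)|\G)[A-Z]_}{}g;
    return $text;
}

# Hypothetical guard: do not let \G succeed at the very start of the string.
sub strip_g_guarded {
    my ($text) = @_;
    $text =~ s{(?:(?<=__)|\G(?!\A))[A-Z]_}{}g;
    return $text;
}

for my $input ( q{A_B__C_D}, q{_FR7234C__A_B_C_Bonzo_Dog_D_B} ) {
    print join( ' | ',
        $input, strip_loop($input), strip_g($input), strip_g_guarded($input) ),
        "\n";
}
```

If the guard behaves as intended, the guarded variant should agree with the loop on both inputs, while the unguarded one should differ on the first.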
