PerlMonks  

Manipulating the Capture(s) of Regular Expressions

by monarch (Priest)
on Jan 12, 2009 at 04:29 UTC ( [id://735601]=perlmeditation )

A couple of recent newbie-class queries (see "Replace part of a regex match" and "replace text with function ?") have reminded me of something I've considered in the past but dismissed, since I knew alternative ways to solve the problem.

Often text substitution is performed using the s/// operator (see perlop). E.g.

    my $text = "The cow jumped over the moon";
    $text =~ s/(\w+\s+)jumped/$1tripped/;

A more natural way of expressing this might be:

    if ( $text =~ m/cow\s+(jumped)\s+over/ ) {
        $1 = "tripped";
    }
but of course this is not possible in Perl 5: the capture variables are read-only.
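
For anyone who wants to see the failure for themselves, here is a minimal sketch. It wraps the assignment in eval so the script keeps running; the variable name $error is only for illustration.

```perl
use strict;
use warnings;

my $text  = "The cow jumped over the moon";
my $error = '';

if ( $text =~ m/cow\s+(jumped)\s+over/ ) {
    # Capture variables are read-only, so this assignment dies at run time.
    # eval traps the exception so we can inspect the message.
    eval { $1 = "tripped"; 1 } or $error = $@;
}

# The message contains "Modification of a read-only value attempted".
print "error: $error";
# $text is left untouched.
print "text:  $text\n";
```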

In fact I've often found myself wishing I could write such match-then-replace code inside an if, not least because I might have other actions that need to be triggered when the match succeeds, e.g.:

    if ( $text =~ m/cow\s+(\w+)\s+over/ ) {
        do_action( $1 );
        $1 = calculate_new_action( $1 );
    }

rather than

    if ( $text =~ m/cow\s+(\w+)\s+over/ ) {
        do_action( $1 );
        my $new_action = calculate_new_action( $1 );
        $text =~ s/(cow\s+)(\w+)(\s+over)/$1$new_action$3/;
    }

I wonder whether others consider it possible and/or pragmatic to allow the capture variables to be modified?

Replies are listed 'Best First'.
Re: Manipulating the Capture(s) of Regular Expressions
by ysth (Canon) on Jan 12, 2009 at 04:58 UTC
      And by extension,
          sub transform {
              my ($ref) = @_;
              $$ref = 'tripped';
          }

          $text = 'The brown cow jumped over the moon';
          if ( $text =~ m/cow\s+(jumped)\s+over/ ) {
              my $ref = \substr($text, $-[1], $+[1] - $-[1]);
              transform($ref);
          }
          print("$text\n");   # The brown cow tripped over the moon

      Update: Expanded the example to be runnable.
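
The reference trick above builds on the simpler case of assigning straight to substr using the offsets in @- and @+. A minimal sketch of that simpler case follows, as an illustration rather than a quote from any reply; the variables $start, $length and $other are made up for the example.

```perl
use strict;
use warnings;

my $text = 'The brown cow jumped over the moon';

if ( $text =~ m/cow\s+(jumped)\s+over/ ) {
    # $-[1] is the start offset of capture 1; $+[1] is the offset just past its end.
    my ( $start, $length ) = ( $-[1], $+[1] - $-[1] );

    # substr is an lvalue, so assigning to it edits $text in place.
    substr( $text, $start, $length ) = 'tripped';
}
print "$text\n";    # The brown cow tripped over the moon

my $other = 'The brown cow jumped over the moon';

if ( $other =~ m/cow\s+(\w+)\s+over/ ) {
    # The four-argument form of substr does the same replacement without an lvalue.
    substr( $other, $-[1], $+[1] - $-[1], uc $1 );
}
print "$other\n";   # The brown cow JUMPED over the moon
```

Both forms rely on the offsets still describing the string being edited, so they are best used immediately after the match.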

        You two just shot to the top of my mental 'monks who really know Perl and should be listened to' list... :)

        (Edit: well, maybe just after TimToady... :)


        Life is denied by lack of attention,
        whether it be to cleaning windows
        or trying to write a masterpiece...
        -- Nadia Boulanger
      Wow; this is perfect, and I thought I'd read the Camel book (I was wrong). I just looked through it again now (3rd edition), and indeed it's documented in the Special Variables section.
Re: Manipulating the Capture(s) of Regular Expressions
by JavaFan (Canon) on Jan 12, 2009 at 09:59 UTC
    I'd write that as:
        $text =~ s{cow\s+\K(\w+)(?=\s+over)}
                  {do_action($1); calculate_new_action($1)}e;
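
To try this out, here is a self-contained sketch. The bodies of do_action and calculate_new_action are hypothetical stubs, since the thread never defines them, and @seen exists only to record what do_action was given. Note that \K needs Perl 5.10 or later.

```perl
use strict;
use warnings;
use 5.010;    # \K is available from Perl 5.10 onwards

my @seen;

# Hypothetical stub: just records the word it was called with.
sub do_action { push @seen, $_[0] }

# Hypothetical stub: maps 'jumped' to 'tripped', leaves anything else alone.
sub calculate_new_action { return $_[0] eq 'jumped' ? 'tripped' : $_[0] }

my $text = "The cow jumped over the moon";

# \K drops "cow\s+" from the matched span and the lookahead never consumes
# "\s+over", so only the captured word is replaced. With /e the replacement
# is evaluated as code and the value of its last statement is substituted.
$text =~ s{cow\s+\K(\w+)(?=\s+over)}
          {do_action($1); calculate_new_action($1)}e;

say $text;      # The cow tripped over the moon
say "@seen";    # jumped
```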
Re: Manipulating the Capture(s) of Regular Expressions
by Jenda (Abbot) on Jan 12, 2009 at 15:09 UTC

    I would rather not. What if the string was modified in the meantime? What if the $1 doesn't reference the string you think it does? The latter is dangerous even now, but if you can modify some other variable by modifying $1, the errors will be much harder to debug.
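
Both hazards can already be reproduced with today's read-only captures and the @- / @+ offsets. This is a minimal sketch with made-up variable names ($other, $leftover, $stale), not code from the thread.

```perl
use strict;
use warnings;

my $text = "The cow jumped over the moon";
$text =~ m/cow\s+(\w+)\s+over/ or die "no match";
my ( $start, $length ) = ( $-[1], $+[1] - $-[1] );

# Hazard 1: a failed match leaves $1 untouched, so it still
# refers to the earlier match against a different string.
my $other    = "The dog slept";
my $matched  = $other =~ m/cow\s+(\w+)\s+over/ ? 1 : 0;
my $leftover = $1;
print "matched=$matched, \$1 is still '$leftover'\n";   # matched=0, $1 is still 'jumped'

# Hazard 2: if the string changes after the match, the saved
# offsets no longer line up with the captured word.
$text = "A " . $text;
my $stale = substr( $text, $start, $length );
print "stale slice: '$stale'\n";                        # stale slice: 'w jump'
```

A writable $1 would turn either mistake into a silent edit of the wrong text.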

    The only place in which I would consider modifying the $1,$2,.. variables would be inside the replacement code of something like

    $text =~ s{cow\s+(\w+)\s+over}{
        do_action( $1 );
        $1 = calculate_new_action( $1 );
    }E;

    That is something similar to s///e, but letting you replace the individual captures instead of the whole matched substring. Sometimes it might be more convenient. But I don't think it's worth implementing.
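
Since no such /E modifier exists, the closest thing available with plain s///e is to capture the surrounding text as well and put the pieces back together. A minimal sketch, with do_action and calculate_new_action again as hypothetical stubs and @seen only there to show what was passed:

```perl
use strict;
use warnings;

my @seen;

# Hypothetical stubs standing in for the functions named in the thread.
sub do_action            { push @seen, $_[0] }
sub calculate_new_action { return $_[0] eq 'jumped' ? 'tripped' : $_[0] }

my $text = "The cow jumped over the moon";

# Capture the text on either side of the word and reassemble it, so that
# effectively only capture 2 changes. Works without \K.
$text =~ s{(cow\s+)(\w+)(\s+over)}{
    do_action($2);
    $1 . calculate_new_action($2) . $3;
}e;

print "$text\n";    # The cow tripped over the moon
```

The cost is that the surrounding captures have to be repeated by hand in the replacement, which is the inconvenience the proposed modifier would remove.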
