http://www.perlmonks.org?node_id=408346

bobf has asked for the wisdom of the Perl Monks concerning the following question:

Oh wise ones,

I am trying to write a function that returns a string that is assembled from a regex, but the order that the captured substrings are assembled in is determined by one of the passed arguments. Here is an overly simplified example of what I'm trying to do:

my $newstring = munge_string( 'one_two_three', '312' );

sub munge_string {
    my ( $string, $patternkey ) = @_;

    # patterns can also be 'text$1$2$3', etc
    my %patterns = ( 123 => '$1$2$3', 312 => '$3$1$2' );

    # in reality the regex is also dynamic (based on $patternkey)
    $string =~ m/(\w+)_(\w+)_(\w+)/;

    return $patterns{$patternkey};    # want 'threeonetwo', got '$3$1$2'
}

How can I get the function to return a double-interpolated value (once to get the hash value, then again to get the captured substrings)?

I tried using eval (to no avail), and then s///e instead of m//, but I only got single interpolation:

$string =~ s/(\w+)_(\w+)_(\w+)/$patterns{$patternkey}/e;
print $string;    # '$3$1$2'

If I put the replacement pattern in directly, though, it works fine:

$string =~ s/(\w+)_(\w+)_(\w+)/$3$1$2/;
print $string;    # 'threeonetwo'

or

$string =~ s/(\w+)_(\w+)_(\w+)/join( '', $3, $1, $2 )/e;
print $string;    # 'threeonetwo'

Is there a way to do this cleanly, or am I going about this the wrong way? Thanks in advance.

Re: Double interpolation of captured substrings
by gaal (Parson) on Nov 17, 2004 at 09:59 UTC
    You might consider assigning your matches into an array, and subscripting that programmatically.

    my %patterns = ( 123 => [ 1, 2, 3 ], 312 => [ 3, 1, 2 ] );
    # Pad slot 0 so that index N lines up with $N.
    my @matches = ( undef, $string =~ /regexp/ );
    $string = join "", @matches[ @{ $patterns{$patternkey} } ];
    (You need error handling here, of course; and it probably makes sense to promote the scope of %patterns a bit so that it doesn't get constructed every time you enter the sub.)
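
A minimal sketch of the array-slice idea with the error handling and the hoisted hash mentioned above. The name %ORDER_FOR, the fixed regex, and the choice to return nothing on failure are assumptions made for illustration; they are not from the original posts.

```perl
use strict;
use warnings;

# Hoisted out of the sub so that it is built only once (hypothetical name).
# Values are zero-based indices into the list of captures: $1 is index 0.
my %ORDER_FOR = (
    123 => [ 0, 1, 2 ],
    312 => [ 2, 0, 1 ],
);

sub munge_string {
    my ( $string, $patternkey ) = @_;

    # Unknown key: return undef (or an empty list in list context).
    my $order = $ORDER_FOR{$patternkey}
        or return;

    # A list assignment in boolean context yields the number of captures,
    # so a failed match falls through to the return.
    my @matches = $string =~ m/(\w+)_(\w+)_(\w+)/
        or return;

    return join '', @matches[ @$order ];
}

print munge_string( 'one_two_three', '312' ), "\n";
```

Callers can then tell "no such format" or "no match" apart from a valid empty result by testing the return value with defined.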
Re: Double interpolation of captured substrings
by Arunbear (Prior) on Nov 17, 2004 at 10:05 UTC
    Use array slicing:
    my %patterns = ( 123 => [0,1,2], 312 => [2,0,1] );    # zero-based: $1 is index 0
    my @matches = $string =~ m/$some_pattern/;
    return join '', @matches[@{$patterns{$patternkey}}];
Re: Double interpolation of captured substrings
by ihb (Deacon) on Nov 17, 2004 at 10:21 UTC

    I would recommend doing this some way other than interpolation, but it can still be a fun exercise.

    If you don't want to exercise and really need the functionality, look at String::Interpolate.

    To show how to do what you tried to do, for educational purposes:

    # Use something more sophisticated which escapes properly.
    sub quote { qq!"$_[0]"! }

    my ($x, $y, $z) = qw/ X Y Z /;
    my $str = 'Foo $x bar $y baz $z';

    print $str;              # Foo $x bar $y baz $z
    print quote($str);       # "Foo $x bar $y baz $z"
    print eval quote($str);  # Foo X bar Y baz Z

    First generate a string that is double-quoted. This string is then what you would write in Perl if you had a chance to type it yourself. Since it is now Perl code, use eval to evaluate it, i.e. to interpolate the variables.
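
Applied to the s/// attempt from the original question, the quote-then-eval steps above can be sketched with the /ee modifier, which evaluates the replacement twice. This is only an illustration: the variable $replacement is a hypothetical name, and the naive quoting assumes the templates are trusted and contain no double quotes, since whatever they hold is run as Perl code.

```perl
use strict;
use warnings;

my %patterns = (
    123 => '$1$2$3',
    312 => '$3$1$2',
);

sub munge_string {
    my ( $string, $patternkey ) = @_;

    # Wrap the template in double quotes so that it becomes the source text
    # of a Perl string literal, e.g. "$3$1$2" (quotes included).
    my $replacement = '"' . $patterns{$patternkey} . '"';

    # The first /e evaluates the expression $replacement, giving that source
    # text; the second /e evals the text, interpolating the captures.
    $string =~ s/(\w+)_(\w+)_(\w+)/$replacement/ee;

    return $string;
}

print munge_string( 'one_two_three', '312' ), "\n";
```

With a single /e the replacement is only evaluated once, which is why the original attempt produced the literal text of the template.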

    ihb


Re: Double interpolation of captured substrings
by BrowserUk (Patriarch) on Nov 17, 2004 at 10:26 UTC

    You can do away with the hash completely by passing the order as a list rather than a string.

    #! perl -slw
    use strict;

    sub munge {
        my( $str, $regex ) = ( shift, shift );
        my @matches = ( undef, $str =~ $regex );
        return join'', @matches[ @_ ];
    }

    print munge( 'one_two_three', '^([^_]+)_([^_]+)_([^_]+)$', 2, 1, 3 );

    __END__
    [10:24:12.59] P:\test>junk
    twoonethree


      You could also allow for referencing the whole match as 0 with:

      $str =~ $regex;
      my @matches = map substr( $str, $-[$_], $+[$_] - $-[$_] ), 0..$#-;
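
A rough sketch of how the @- and @+ variant could slot into the munge sub above, so that index 0 selects the whole match. The early return on a failed match is an added assumption, and a group that takes no part in the match (for example one branch of an alternation) would have undefined offsets and is not handled here.

```perl
use strict;
use warnings;

sub munge {
    my ( $str, $regex, @order ) = @_;

    # Bail out on a failed match; @- and @+ would otherwise be stale.
    $str =~ $regex or return;

    # $-[N] and $+[N] are the start and end offsets of group N in $str;
    # N == 0 describes the whole match, and $#- is the last matched group.
    my @matches = map { substr( $str, $-[$_], $+[$_] - $-[$_] ) } 0 .. $#-;

    return join '', @matches[@order];
}

# Index 0 is the whole match, 1..3 are the individual captures.
print munge( 'one_two_three', qr/([^_]+)_([^_]+)_([^_]+)/, 3, 1, 2 ), "\n";
print munge( 'one_two_three', qr/([^_]+)_([^_]+)_([^_]+)/, 0 ), "\n";
```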
Re: Double interpolation of captured substrings
by TedPride (Priest) on Nov 17, 2004 at 10:26 UTC
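    my $newstring = munge_string( 'one_two_three', '312' );
    print $newstring;

    sub munge_string {
        my ( $string, $order ) = @_;
        # Split into an explicit array: the implicit split to @_ in void
        # context no longer works on newer Perls (5.12 and later).
        my @words = split /[^a-zA-Z]+/, $string;
        $order =~ s/(\d)/$words[$1-1]/g;
        return $order;
    }
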
    I'm assuming you don't want to modify the original strings. I'm also assuming that your words are divided on non-letter boundaries.
Re: Double interpolation of captured substrings
by bobf (Monsignor) on Nov 17, 2004 at 23:26 UTC

    Thank you all for your responses. gaal and Arunbear's suggestion to save the matches in an array and then stringify them later looks like the best option (at least for now). Some of the patterns contain other (non-captured) text, so I'll have to use something a bit trickier than just taking a slice, but the overall idea should work fine.

    ihb's recommendation of String::Interpolate is definitely one I'll file away for later, and the quote/eval example clearly showed what I was missing in my initial attempts. I like the code provided by BrowserUk and Fletch, as it is more generalizable than what I have now, but one of my goals is to keep the formats of the returned strings encapsulated in the function itself. That way if one of the formats is changed I only have to update the values in the hash, rather than track down every call that uses that specific format and update each one individually. TedPride's solution is a clever use of split, but unfortunately the variety of input data is such that coming up with a generalizable split would be difficult.

    In summary, I think I'll start by throwing the captured values into an array and then using indices to access them, rather than trying to interpolate $1, etc directly. Thanks again for all of the creative (and better yet, functional!) examples. I've definitely got a good start now, but if any alternative ideas come to mind, I'd love to hear them!
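
One possible shape for the plan described above, covering templates that mix literal text with captures: keep the '$1$2$3' style templates in the hash, save the captures in an array, and expand each $N with a plain substitution rather than eval. This is a sketch under assumptions, not the code bobf ended up with; %TEMPLATE_FOR, the 'text' key and its template are made up for illustration.

```perl
use strict;
use warnings;

# Formats stay encapsulated here (hypothetical names and sample templates).
my %TEMPLATE_FOR = (
    123  => '$1$2$3',
    312  => '$3$1$2',
    text => 'text-$2/$1',
);

sub munge_string {
    my ( $string, $patternkey ) = @_;

    my $template = $TEMPLATE_FOR{$patternkey};
    return unless defined $template;

    # Slot 0 is padded so that $captures[N] lines up with $N.
    my @captures = ( undef, $string =~ m/(\w+)_(\w+)_(\w+)/ );
    return unless @captures > 1;    # the match failed

    # Replace each $N in the template with capture N (empty if missing).
    # Inside the replacement, $1 is the digit string matched by this s{}{}.
    $template =~ s{\$(\d+)}{ defined $captures[$1] ? $captures[$1] : '' }ge;

    return $template;
}

print munge_string( 'one_two_three', '312' ),  "\n";
print munge_string( 'one_two_three', 'text' ), "\n";
```

Because nothing is eval'd, a template can only ever pull in captured text, so changing a format means editing one hash value.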