Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Re: Define regex substitution $1,$2,... from a string

by afoken (Chancellor)
on May 24, 2016 at 06:03 UTC ( [id://1163926]=note: print w/replies, xml ) Need Help??


in reply to Define regex substitution $1,$2,... from a string

s/// has an /e option that can be used twice:

Update: example

>perl -E '$_="h:/p/f.zip";$replace=q[qq<x?f=$1>];s|([^/]+)$|$replace|e +e;say' h:/p/x?f=f.zip

(End of update)

Make sure that your strings are properly validated, because the double /e option is nothing else than a string eval. So if the replacement string contains qx|rm -rf /|, you'll have a bad day:

>perl -E '$_="ABC";$replace="qx|head -5 /etc/passwd|";s/B/$replace/ee; +say' Aroot:x:0:0::/root:/bin/bash bin:x:1:1:bin:/bin:/bin/false daemon:x:2:2:daemon:/sbin:/bin/false adm:x:3:4:adm:/var/log:/bin/false lp:x:4:7:lp:/var/spool/lpd:/bin/false C

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Replies are listed 'Best First'.
Re^2: Define regex substitution $1,$2,... from a string
by Athanasius (Archbishop) on May 24, 2016 at 07:40 UTC

    Hello afoken,

    ++ for the s///ee solution. In sylph001’s original problem, your version of the replacement string would become:

    my $to = '"file_publish?file=$1"';

    FWIW, an explicit concatenation could also be used:

    my $to = '"file_publish?file=" . $1';

    I always have trouble coming up with the correct syntax for the replacement part (RHS) of an s///ee expression. It seems the first /e just de-stringifies the contents of $to, and then the second /e evaluates the resulting expression. (In this case, the evaluation consists in a variable interpolation and a string concatenation.)

    Hope that helps (someone),

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^2: Define regex substitution $1,$2,... from a string
by sylph001 (Sexton) on May 24, 2016 at 06:46 UTC

    I spent some more time and am able to reproduce the example now.

    Thank you again for this.

     

    On the other hand, you are right that the evaluated string may expose some risk especially I'm planning to let other people define this regex string from command parameters.

    So, just limit it in the perl program scope, do you have any suggetions how we could easily check through these regex strings to make sure they do no harm?

      On the other hand, you are right that the evaluated string may expose some risk especially I'm planning to let other people define this regex string from command parameters.

      So, just limit it in the perl program scope, do you have any suggetions how we could easily check through these regex strings to make sure they do no harm?

      Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. ;-)
      -- attributed to jwz

      Yes, string eval is problematic with arbiatary input. But: How much harm can be done?

      If the script runs with user privileges (not root privileges) from a login shell, the worst thing that can happen is that the user executes some perl code with his/her own privileges. Nothing worse can happen than when you allow the user to execute perl -e SOMETHING. "Here's some rope, try not to hang yourself."

      If the script runs with user privileges, but from a network context (started from a webserver, from a mail handler, from an IRC bot, from a restricted shell invoked by some networked program), s///ee might be a bad idea. It allows an attacker to execute arbitary perl code, and from there, arbitary code on the machine, with user privileges.

      If the script runs with root privileges, perhaps using a setuid wrapper, s///ee is a plain stupid idea. Any user could run arbitary perl code, and from there, arbitary code on the machine, with root privileges.

      Root privileges + network + s///ee is the digital equivalent of the Darwin award.


      Now what?

      You could try a two-step aproach. First extract all interesting parts of the input, collect that in an array. Then use a template string and replace markers in the template with array elements.

      #!/usr/bin/perl use strict; use warnings; # user input: my $regexp='^http://www.example.com/(\d+)/([a-z]+.zip)$'; my $template='https://secure.foobar.lan/fetch?file=$2&id=$1&missingVal +ue=$3'; # data my $url='http://www.example.com/12345/foo.zip'; # actual program my @array=($url=~/$regexp/); my $result=$template; do { no warnings 'uninitialized'; $result=~s/\$(\d+)/$array[$1]/ge; }; print "'$url' rewritten to '$result'\n";

      Output:

      'http://www.example.com/12345/foo.zip' rewritten to 'https://secure.fo +obar.lan/fetch?file=&id=foo.zip&missingValue='

      <Update>Note that @array is empty if nothing was matched (to be more precise: nothing was captured). You may want to abort with an error if @array is empty.</Update>

      So, we can do that with just one /e. Is it secure now?

      NO, sorry. Regular expression patterns may contain arbitary perl code:

      my $regexp='^(?{system "uname -a"})http://www.example.com/(\d+)/([a-z] ++.zip)$'; # everything else unchanged

      My relatively new perls (5.14.1, 5.18.1) refuses to execute the code, even the old 5.8.8 from a fresh Slackware 12.1 (dated 2008) installation in a virtual machine:

      Eval-group not allowed at runtime, use re 'eval' in regex m/^(?{system + "uname -a"})http://www.example.com/(\d+)/([a-z]+.zip)$/ at pmre.pl l +ine 15.

      The really ancient 5.005_03 from Slackware 4.0 (released in 1999) has a different error message:

      /^(?{system "uname -a"})http://www.example.com/(\d+)/([a-z]+.zip)$/: E +val-group not allowed at runtime, use re 'eval' at pmre.pl line 15.

      BUT: Once you use re 'eval';, or someone manages to trick perl into importing eval from re, this little bit of protection is gone and all perls back to at least 5.005_03 start executing arbitary code from the rexexp:

      $ perl pmre.pl Linux slack121 2.6.24.5-smp #2 SMP Wed Apr 30 13:41:38 CDT 2008 i686 A +MD FX(tm)-8320 Eight-Core Processor AuthenticAMD GNU/Linux 'http://www.example.com/12345/foo.zip' rewritten to 'https://secure.fo +obar.lan/fetch?file=&id=foo.zip&missingValue=' $

      Access to perl without s///ee, and thus the ability to execute arbitary code. Ouch.

      <Update>You may want to use Safe for executing the critical code. But then, Safe is trying to prevent "evil" things instead of only allowing "secure" things. One bug in Safe and you loose again. And even Safe can't prevent everything, see RISKS in Safe.</Update>

      Alexander

      --
      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re^2: Define regex substitution $1,$2,... from a string
by sylph001 (Sexton) on May 24, 2016 at 06:37 UTC

    Thank you for your suggestion.

    Could you kindly let me know what's wrong with my commands when I'm trying reproduce your example as below?

    @$> perl -e '$_="h:/p/f.zip";$replace=q[qq<x?f=$1>];s/([^/]+)$/$replac +e/e+e;say' Scalar found where operator expected at -e line 1, near "s/([^/]+)$/$r +eplace" Final $ should be \$ or $name at -e line 1, within string syntax error at -e line 1, near ";s/([^/]+)$/" Execution of -e aborted due to compilation errors. @mnsdev10> perl -v This is perl, v5.8.8 built for sun4-solaris Copyright 1987-2006, Larry Wall Perl may be copied only under the terms of either the Artistic License + or the GNU General Public License, which may be found in the Perl 5 source ki +t. Complete documentation for Perl, including FAQ lists, should be found +on this system using "man perl" or "perldoc perl". If you have access to + the Internet, point your browser at http://www.perl.org/, the Perl Home Pa +ge. @$> perl -e '$_="h:/p/f.zip";$replace=q[qq<x?f=$1>];s|([^/]+)$|$replac +e|ee;say' @$>
      There's no e+e . Also, if you want to use s/// with slashes, you need to backslash literal slashes. Moreover, there's no say in 5.8.8. Use perl -l and print instead.
      perl -le '$_="h:/p/f.zip";$replace=q[qq<x?f=$1>];s=([^/]+)$=$replace=e +e;print'

      ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1163926]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others musing on the Monastery: (2)
As of 2024-04-25 19:26 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found