Beefy Boxes and Bandwidth Generously Provided by pair Networks
No such thing as a small change
 
PerlMonks  

Re^2: Define regex substitution $1,$2,... from a string

by sylph001 (Sexton)
on May 24, 2016 at 06:46 UTC ( [id://1163932]=note: print w/replies, xml ) Need Help??


in reply to Re: Define regex substitution $1,$2,... from a string
in thread Define regex substitution $1,$2,... from a string

I spent some more time and am able to reproduce the example now.

Thank you again for this.

 

On the other hand, you are right that the evaluated string may expose some risk especially I'm planning to let other people define this regex string from command parameters.

So, just limit it in the perl program scope, do you have any suggetions how we could easily check through these regex strings to make sure they do no harm?

  • Comment on Re^2: Define regex substitution $1,$2,... from a string

Replies are listed 'Best First'.
Re^3: Define regex substitution $1,$2,... from a string
by afoken (Chancellor) on May 25, 2016 at 21:06 UTC

    On the other hand, you are right that the evaluated string may expose some risk especially I'm planning to let other people define this regex string from command parameters.

    So, just limit it in the perl program scope, do you have any suggetions how we could easily check through these regex strings to make sure they do no harm?

    Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems. ;-)
    -- attributed to jwz

    Yes, string eval is problematic with arbiatary input. But: How much harm can be done?

    If the script runs with user privileges (not root privileges) from a login shell, the worst thing that can happen is that the user executes some perl code with his/her own privileges. Nothing worse can happen than when you allow the user to execute perl -e SOMETHING. "Here's some rope, try not to hang yourself."

    If the script runs with user privileges, but from a network context (started from a webserver, from a mail handler, from an IRC bot, from a restricted shell invoked by some networked program), s///ee might be a bad idea. It allows an attacker to execute arbitary perl code, and from there, arbitary code on the machine, with user privileges.

    If the script runs with root privileges, perhaps using a setuid wrapper, s///ee is a plain stupid idea. Any user could run arbitary perl code, and from there, arbitary code on the machine, with root privileges.

    Root privileges + network + s///ee is the digital equivalent of the Darwin award.


    Now what?

    You could try a two-step aproach. First extract all interesting parts of the input, collect that in an array. Then use a template string and replace markers in the template with array elements.

    #!/usr/bin/perl use strict; use warnings; # user input: my $regexp='^http://www.example.com/(\d+)/([a-z]+.zip)$'; my $template='https://secure.foobar.lan/fetch?file=$2&id=$1&missingVal +ue=$3'; # data my $url='http://www.example.com/12345/foo.zip'; # actual program my @array=($url=~/$regexp/); my $result=$template; do { no warnings 'uninitialized'; $result=~s/\$(\d+)/$array[$1]/ge; }; print "'$url' rewritten to '$result'\n";

    Output:

    'http://www.example.com/12345/foo.zip' rewritten to 'https://secure.fo +obar.lan/fetch?file=&id=foo.zip&missingValue='

    <Update>Note that @array is empty if nothing was matched (to be more precise: nothing was captured). You may want to abort with an error if @array is empty.</Update>

    So, we can do that with just one /e. Is it secure now?

    NO, sorry. Regular expression patterns may contain arbitary perl code:

    my $regexp='^(?{system "uname -a"})http://www.example.com/(\d+)/([a-z] ++.zip)$'; # everything else unchanged

    My relatively new perls (5.14.1, 5.18.1) refuses to execute the code, even the old 5.8.8 from a fresh Slackware 12.1 (dated 2008) installation in a virtual machine:

    Eval-group not allowed at runtime, use re 'eval' in regex m/^(?{system + "uname -a"})http://www.example.com/(\d+)/([a-z]+.zip)$/ at pmre.pl l +ine 15.

    The really ancient 5.005_03 from Slackware 4.0 (released in 1999) has a different error message:

    /^(?{system "uname -a"})http://www.example.com/(\d+)/([a-z]+.zip)$/: E +val-group not allowed at runtime, use re 'eval' at pmre.pl line 15.

    BUT: Once you use re 'eval';, or someone manages to trick perl into importing eval from re, this little bit of protection is gone and all perls back to at least 5.005_03 start executing arbitary code from the rexexp:

    $ perl pmre.pl Linux slack121 2.6.24.5-smp #2 SMP Wed Apr 30 13:41:38 CDT 2008 i686 A +MD FX(tm)-8320 Eight-Core Processor AuthenticAMD GNU/Linux 'http://www.example.com/12345/foo.zip' rewritten to 'https://secure.fo +obar.lan/fetch?file=&id=foo.zip&missingValue=' $

    Access to perl without s///ee, and thus the ability to execute arbitary code. Ouch.

    <Update>You may want to use Safe for executing the critical code. But then, Safe is trying to prevent "evil" things instead of only allowing "secure" things. One bug in Safe and you loose again. And even Safe can't prevent everything, see RISKS in Safe.</Update>

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1163932]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others examining the Monastery: (4)
As of 2024-04-25 12:53 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found