
quoting for system() and friends

by pileofrogs (Priest)
on Mar 06, 2007 at 01:42 UTC (#603334=perlquestion)
pileofrogs has asked for the wisdom of the Perl Monks concerning the following question:

In DBI there's a 'quote' method on the database handle object. It takes whatever string you pass in and escapes any quotes, so you can pass the result on to your database server knowing it's a little harder for baddies to tack on extra SQL commands.

My question is, is there a similar method/function somewhere to escape data before sending it on to the shell? I imagine it would have to be shell specific...

(Ok, I know, I'm supposed to use system in list mode or some other things so I never have to call the shell. Well, I'm curious anyway...)
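
For reference, the list form mentioned above looks roughly like the sketch below. It assumes a Unix-like system; capture_no_shell is a hypothetical helper name, and $^X (the running perl) stands in for whatever program you actually want to run.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command given as a list and return its output.
# The list form of a piped open, like the list form of system(), hands the
# arguments straight to the program, so no shell ever parses them.
sub capture_no_shell {
    my @cmd = @_;
    open( my $fh, '-|', @cmd ) or die "cannot run $cmd[0]: $!";
    local $/;    # slurp mode: read all output at once
    my $out = <$fh>;
    close $fh;
    return $out;
}

# A string full of characters a shell would treat specially.
my $suspect = q{"; echo pwned; `id` $HOME |};

# The child perl simply prints its first argument back.
my $echoed = capture_no_shell( $^X, '-e', 'print $ARGV[0]', $suspect );
print "argument arrived intact\n" if $echoed eq $suspect;

# system() in list form takes the same kind of list; the child exits 0
# only if it received exactly one argument.
my $status = system( $^X, '-e', 'exit( @ARGV == 1 ? 0 : 1 )', $suspect );
```

Because no shell is involved, nothing in $suspect needs escaping at all.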

Re: quoting for system() and friends
by xdg (Monsignor) on Mar 06, 2007 at 02:45 UTC

    If you pass arguments as a list, then Perl takes care of reassembling them appropriately escaped if it needs to on certain platforms (e.g. Win32) -- that's one reason why the recommendation is to pass a list to system. That said, I did see some logic for this buried in Shell, but it's not easily available for external use.

    But that only speaks to ensuring that quotes, spaces, etc. make it through to the shell as intended. If you're trying to make a potentially malicious string safe before passing it to the shell, getting it wrong is a tremendously large security risk, so I'd think about whether there's another way, particularly if you want it to be cross-platform, cross-shell, etc.


    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: quoting for system() and friends
by kyle (Abbot) on Mar 06, 2007 at 02:57 UTC

    I'd encourage you to look at the system documentation and pass a list rather than try to quote things.

    That having been said, I've used this at times:

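    sub shell_escape {
        my ( $string ) = @_;
        $string =~ s/\\/\\\\/g;    # backslash first, so the escapes added below are not doubled
        $string =~ s/\"/\\\"/g;    # double quote
        $string =~ s/\$/\\\$/g;    # dollar sign
        $string =~ s/\`/\\\`/g;    # backquote
        return $string;
    }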

    I use this to quote a string that I'm going to pass to the shell in double quotes.

    my $suspect = shift;
    my $quoted_suspect = shell_escape( $suspect );
    system( qq{echo "$quoted_suspect"} );

    As I recall, I got the list of characters to quote from the bash man page somewhere, but I don't recall where. I'm comfortable using it to pass my arguments as I like, but I'm not sure I'd trust it to correctly escape a string created by a malicious attacker. In that case, use at your own risk.
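
    One way to gain some confidence in a function like this is a round-trip check: send the escaped string through the shell and compare what comes back. The sketch below assumes a POSIX sh with a printf command is what Perl invokes for qx{}; the test string is made up for illustration, and a passing check for one string is not a proof of safety.

```perl
use strict;
use warnings;

# The escaping function from the post above, reproduced so the sketch is self-contained.
sub shell_escape {
    my ( $string ) = @_;
    $string =~ s/\\/\\\\/g;
    $string =~ s/\"/\\\"/g;
    $string =~ s/\$/\\\$/g;
    $string =~ s/\`/\\\`/g;
    return $string;
}

# Made-up input containing each character that stays special inside double quotes.
my $suspect = q{say "hi" to $HOME and `date` with a back\slash};
my $quoted  = shell_escape( $suspect );

# qx{} hands the command to the shell; printf '%s' echoes the argument
# without adding a newline or interpreting escapes.
my $out = qx{printf '%s' "$quoted"};

print $out eq $suspect ? "round trip ok\n" : "round trip FAILED\n";
```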

      You might not be able to inject anything using it, but a NUL can confuse things:
      $ perl -e '$qs=chr(0); system( qq{echo "$qs"} );'
      Syntax error: Unterminated quoted string

      I wonder if there's a UNICODE character other than " and \ whose UTF-8 encoding contains the byte 34 (") or 92 (\)... ( No, doesn't appear to be. I bet that if I looked into how UNICODE points are transformed into UTF-8, I'd find the presence of those bytes impossible. The 8th bit seems to always be set. )

        I bet that if I looked into how UNICODE points are transformed into UTF-8, I'd find the presence of those bytes impossible.

        Definitely a safe bet. In UTF8, it's either ASCII, or it's part of a wide (multi-byte) character. All bytes of every UTF8 wide character always have their 8th bit set (i.e. are not ASCII). There's a section of the perlunicode manual that explains UTF8 encoding, and here is the very helpful chart that summarizes (slightly modified):

        Scalar Value                 UTF-8
        (UTF-16 range)               1st byte    2nd byte    3rd byte    4th byte

        0000.0000,0xxx.xxxx          0xxx.xxxx
        (U0000 - U007F)              (00 - 7F)

        0000.0yyy,yyxx.xxxx          110y.yyyy   10xx.xxxx
        (U0080 - U07FF)              (C2 - DF)   (80 - BF)

        zzzz.yyyy,yyxx.xxxx          1110.zzzz   10yy.yyyy   10xx.xxxx
        (U0800 - UFFFF)              (E0 - EF)   (A0 - BF)   (80 - BF)

        u.uuuu,zzzz.yyyy,yyxx.xxxx   1111.0uuu   10uu.zzzz   10yy.yyyy   10xx.xxxx
        (U010000 - U10FFFF)          (F0 - F7)   (90 - BF)   (80 - BF)   (80 - BF)
        1101.10ww,wwzz.zzyy *
        1101.11yy,yyxx.xxxx *
          uuuuu = wwww + 1 (i.e. uuuuu - 1 = wwww, given 10000(b) >= uuuuu >= 1)

        (The rows that match /^[10xyz., ]+$/ are showing the bit patterns that demonstrate how the 16-bit character code point value is distributed over one or more bytes in UTF8; the rows containing hex numbers show the ranges implied by the bit patterns.)

        Note that unicode defines characters with code points beyond the 16-bit range, and these are cleanly stored as 4-byte characters in utf8; they're a bit messy in utf-16 (involving 16-bit code points in the evil "surrogate range").
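
        The claim about the 8th bit can also be checked by brute force. The sketch below is only an illustration, not part of the original posts: it encodes every code point above U+007F (skipping the surrogate range) with the core utf8::encode function and counts any encoding that contains a byte in the ASCII range.

```perl
use strict;
use warnings;

my $offenders = 0;

for my $cp ( 0x80 .. 0x10FFFF ) {
    # Surrogates are not characters in their own right, so skip them.
    next if $cp >= 0xD800 && $cp <= 0xDFFF;

    my $bytes = chr($cp);
    utf8::encode($bytes);    # in place: character string -> UTF-8 byte string

    # Any byte below 0x80 (which would include 34 and 92) counts as an offender.
    $offenders++ if $bytes =~ /[\x00-\x7F]/;
}

print "multi-byte characters containing an ASCII byte: $offenders\n";
```

        The loop covers a little over a million code points, so it may take a second or so to finish.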

      As I recall, I got the list of characters to quote from the bash man page somewhere

      On systems with sh, sh (not bash) is used (even if the user's default shell is bash). You need to consult the sh man page. On FreeBSD, the man page for sh says:

      Enclosing characters within double quotes preserves the literal meaning of all characters except dollarsign (`$'), backquote (``'), and backslash (`\'). The backslash inside double quotes is historically weird. It remains literal unless it precedes the following characters, which it serves to quote: $  `  "  \  \n
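
      Since double quotes leave those characters special, a commonly used alternative is single quotes: inside them sh treats every character literally, and the only thing that needs handling is the single quote itself. The sketch below assumes a POSIX sh; sh_single_quote is a hypothetical helper name, not something from this thread or from a module.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a string in single quotes for a POSIX sh.
# Each embedded single quote becomes '\'' (close the quotes, emit an
# escaped quote, reopen the quotes).
sub sh_single_quote {
    my ( $string ) = @_;
    # A NUL byte cannot be carried in a command-line argument at all.
    die "NUL byte cannot be passed in an argument\n" if $string =~ /\0/;
    $string =~ s/'/'\\''/g;
    return "'$string'";
}

# Made-up input with quotes, a variable, backquotes and other metacharacters.
my $suspect = q{it's "$HOME" `id` \ | ; &};
my $word    = sh_single_quote( $suspect );

# Note that $word already carries its own quotes, so none are added here.
my $out = qx{printf '%s' $word};

print $out eq $suspect ? "round trip ok\n" : "round trip FAILED\n";
```

      Passing a list to system remains the simpler option whenever it is available.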

Re: quoting for system() and friends
by graff (Chancellor) on Mar 06, 2007 at 03:29 UTC
    If you're taking a string from some source that could be supplied by "baddies", I think your best bet is to have a finite set of shell operations that are possible, a finite set of args for those operations, etc, and only execute a shell command if the input matches the various available choices. Using character quoting and escapes to limit particular shell facilities such as compound commands, redirection and variable substitution is still inadequate -- there may be potentially ruinous commands that don't involve any of the special non-alphabetic characters.

    On the other hand, if the purpose of your script is to facilitate some shell operations for "trusted" (and qualified) users -- e.g. those who have shell login access on this machine (and perhaps are members of a specified group or access-control list) -- maybe you don't want to be so restrictive: go ahead and let these people use as wide a range of shell facilities as may be helpful to them. The use of semicolons, pipes, ampersands and angle-brackets should be okay for these people, because they would be able to use these things anyway, without the benefit of your perl script. Why limit the potential utility of your script in this case?

    So, my point is, you're asking the wrong question. It's not a matter of figuring out how to "sanitize" an arbitrary shell command line string. Either it's a matter of how to specify the exact range of shell operations permitted for strangers (constructing a command line from choices you make available), or else it's a matter of making your perl script as transparent and flexible as possible in assisting qualified/trusted users when they need to do shell operations.

    (updated to fix grammar)
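
    A minimal sketch of the "finite set of operations" idea might look like the following. The command names and the %ALLOWED table are made up for illustration; the point is only that untrusted input selects a key and never becomes part of a command line.

```perl
use strict;
use warnings;

# Hypothetical whitelist: each permitted choice maps to a fixed argument list.
my %ALLOWED = (
    uptime => [ 'uptime' ],
    disk   => [ 'df', '-h' ],
    who    => [ 'who' ],
);

# Return the command list for a permitted choice, or an empty list otherwise.
sub command_for {
    my ( $choice ) = @_;
    return unless defined $choice && exists $ALLOWED{$choice};
    return @{ $ALLOWED{$choice} };
}

my @cmd = command_for('disk');
my @bad = command_for('disk; rm -rf /');

print "would run: @cmd\n";
print "rejected\n" unless @bad;

# To actually run a permitted command, the list form keeps the shell out of it:
# system(@cmd) == 0 or die "command failed: $?";
```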

Re: quoting for system() and friends
by snoopy (Deacon) on Mar 06, 2007 at 06:18 UTC
    There's a function quotemeta:

    To quote from the doco:

    Returns the value of EXPR with all non-alphanumeric characters backslashed. (That is, all characters not matching /[A-Za-z_0-9]/ will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the \Q escape in double-quoted strings.

      Forgot to check the requirements? quotemeta is useless here.

      $ perl -e '$qs=quotemeta("|"); system( qq{echo "$qs"} );'
      \|

        If you're constructing SQL statements, you avoid putting quotation marks around a string that has already been formatted with DBI's quote method.

        This is analogous to how shell arguments constructed using the quotemeta function should be treated, i.e.:

        $ perl -e '$qs=quotemeta("|"); system( qq{echo $qs} );'
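
        A quick round-trip check, assuming a POSIX sh, suggests where the unquoted quotemeta approach holds and where it does not: a pipe character survives, but a backslash followed by a newline is a line continuation to sh, so the newline is dropped. via_shell_unquoted is a hypothetical helper name used only for this sketch.

```perl
use strict;
use warnings;

# Hypothetical helper: send a string through the shell as an unquoted,
# quotemeta-escaped word and return what the shell passed on.
sub via_shell_unquoted {
    my ( $string ) = @_;
    my $qs = quotemeta($string);
    return qx{printf '%s' $qs};
}

# The pipe is backslash-escaped and arrives as a literal character.
my $pipe_ok = via_shell_unquoted('a|b') eq 'a|b';

# quotemeta puts a backslash in front of the newline, which sh then
# removes together with the newline, so the two strings differ.
my $newline_ok = via_shell_unquoted("a\nb") eq "a\nb";

print "pipe survives:    ", ( $pipe_ok    ? "yes" : "no" ), "\n";
print "newline survives: ", ( $newline_ok ? "yes" : "no" ), "\n";
```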

Node Type: perlquestion [id://603334]
Approved by GrandFather