Dangerous Characters for system calls

by derekstucki (Acolyte)
on Oct 15, 2013 at 21:48 UTC (#1058359=perlquestion)
derekstucki has asked for the wisdom of the Perl Monks concerning the following question:

I'm writing an email form whose content has to be passed through various Linux system calls. Which characters do I need to escape in the string so that it does not pose a risk in system()? The only ones I can think of are ';' and '|', but I'm sure there are more that I'm missing.

Re: Dangerous Characters for system calls
by hippo (Curate) on Oct 15, 2013 at 22:03 UTC

    Don't forget the ampersand, brackets, braces, quotes, dollars and all that nasty whitespace.

    I sincerely hope you are running in taint mode.

Re: Dangerous Characters for system calls (sendmail)
by Anonymous Monk on Oct 15, 2013 at 22:53 UTC

    Um, do it in such a way as to not require that :)

    For example see TFMail,

    or use the list form of open, which bypasses the shell: open $fh, '|-', 'sendmail', $arg, $arg, $arg, $arg (use '|-' to write to the command, '-|' to read from it)

    or use String::ShellQuote
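    A minimal sketch of the list form of open mentioned above. The untrusted string is made up for illustration, and $^X (the running perl) merely stands in for a real program such as sendmail so that the example runs anywhere. Because the program and its arguments are passed as separate list elements, no shell ever parses the string.

```perl
use strict;
use warnings;

# Hypothetical untrusted value, as it might arrive from a web form.
my $untrusted = q{hello; echo "injected" | cat $HOME};

# List-form pipe open: program and arguments are separate list elements,
# so the string is handed to the child as-is, with no shell in between.
# The '--' tells the child perl that no more switches follow, so a value
# starting with '-' is not mistaken for an option.
open(my $from_child, '-|', $^X, '-e', 'print $ARGV[0]', '--', $untrusted)
    or die "cannot start child: $!";
my $echoed = do { local $/; <$from_child> };
close($from_child) or die "child failed, status $?";

print "child received: $echoed\n";
```

    The child gets the string back byte for byte, metacharacters included. Bypassing the shell does not stop the called program itself from interpreting its arguments, so option-like values still deserve a check.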

Re: Dangerous Characters for system calls
by graff (Chancellor) on Oct 15, 2013 at 23:53 UTC
    Following up on the 2nd reply (++ on that one), I think it's hard to imagine a situation where the content from an email form "has to be passed through various Linux system calls." Maybe you think it has to, but I suspect you're wrong.

    Whatever Linux processes you're talking about, there are bound to be ways to do what you intend to do without exposing untrusted text to a shell command line.

    As for what the "risky" characters are, it's likely that all ASCII characters that match [^^/%@+\w-] are able to invoke "non-literal meanings" in a bash command line. Some (like ~ or #) might only do this if they occur in certain positions.

    As for any non-ASCII characters that might happen to show up from a web form, well, who knows... I'd rather not have to experiment with that.

Re: Dangerous Characters for system calls
by CountZero (Bishop) on Oct 16, 2013 at 06:02 UTC
    Run it in taint mode, accept only what is explicitly allowed, and reject everything else.

    [a-zA-Z0-9] seems a safe set, but ultimately it will depend on what system commands you want to run.
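    A small sketch of the allow-list idea, assuming the strict [a-zA-Z0-9] set is acceptable for the value in question. The helper name validate_token is hypothetical. Under taint mode (-T), a regex capture like the one below is also how a value gets untainted.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value if every character is in the
# allow-list, otherwise undef.
sub validate_token {
    my ($input) = @_;
    return undef unless defined $input;
    # \A and \z anchor the whole string. Unlike $, \z does not tolerate
    # a trailing newline.
    return $input =~ /\A([a-zA-Z0-9]+)\z/ ? $1 : undef;
}

my @examples = (
    [ 'plain word'       => 'report2013' ],
    [ 'shell injection'  => 'report; rm -rf /' ],
    [ 'trailing newline' => "name\n" ],
);

for my $example (@examples) {
    my ($label, $value) = @$example;
    my $clean = validate_token($value);
    printf "%-17s %s\n", "$label:", defined $clean ? "accepted" : "rejected";
}
```

    Only the first example is accepted. Whether the set needs widening (spaces, punctuation) depends on the field, as noted above.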

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

    My blog: Imperial Deltronics
      Perl has a built-in function for [A-Za-z_0-9]: quotemeta()

        Perl has a built-in function for A-Za-z_0-9: quotemeta()

        Nearly there. Too bad Anonymous Monk cannot fix his own typos. Stated correctly, and taken from perldoc -f quotemeta:

        quotemeta

        Returns the value of EXPR with all the ASCII non-"word" characters backslashed. (That is, all ASCII characters not matching "/[A-Za-z_0-9]/" will be preceded by a backslash in the returned string, regardless of any locale settings.) This is the internal function implementing the "\Q" escape in double-quoted strings. (See below for the behavior on non- ASCII code points.)

        ...

        Cheers, Sören

        Créateur des bugs mobiles - let loose once, run everywhere.
        (hooked on the Perl Programming language)

        Not at all! That will escape "dangerous" characters for Perl but does not at all guarantee that the resulting string is safe (or will even work at all) for anything else but Perl.
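        To illustrate the difference, here is a sketch comparing quotemeta with a hand-written POSIX-shell quoting helper. The helper shell_single_quote is hypothetical and assumes a POSIX-style shell; String::ShellQuote, mentioned earlier in the thread, is the maintained alternative.

```perl
use strict;
use warnings;

# Hypothetical helper: POSIX-shell single-quote quoting. Inside single
# quotes the shell takes every character literally, so only the single
# quote itself needs handling: close the quotes, emit \', reopen them.
sub shell_single_quote {
    my ($text) = @_;
    (my $quoted = $text) =~ s/'/'\\''/g;
    return "'$quoted'";
}

my $input = "two\nlines";

# quotemeta puts a backslash in front of the newline. That is fine in a
# Perl regex, but a POSIX shell reads backslash-newline as a line
# continuation and drops both characters.
my $for_regex = quotemeta($input);
my $for_shell = shell_single_quote($input);

print "quotemeta:          $for_regex\n";
print "shell_single_quote: $for_shell\n";
print "with a quote:       ", shell_single_quote("it's"), "\n";
```

        Even correct quoting only helps for the one shell it was written for, which is one more reason to bypass the shell where possible.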

        CountZero

Re: Dangerous Characters for system calls
by soonix (Curate) on Oct 16, 2013 at 08:11 UTC

    If really "various" system calls are necessary, chances are that there are characters that need to be escaped in one case and shouldn't be escaped in another. For example, ISTR different regex engines have different conventions for grouping parentheses (escaped for literal, unescaped for groups, or the other way around for "older" engines).

    And depending on how various the called systems are, escaping is not always by backslash, and the treatment of "unnecessary" escapes also depends on the called system.

    Of course, you have to draw a line somewhere before becoming paranoid. CountZero's reply (Re: Dangerous Characters for system calls) seems to be a good start, although I'd add some whitespace :-)
    [a-zA-Z0-9 ]

Re: Dangerous Characters for system calls
by QM (Vicar) on Oct 16, 2013 at 09:18 UTC
    Another problem not mentioned above is that when calls are chained together, each one will eat some of the escapes. So the number of backwhacks needed will depend on the length of the chain.

    It's almost better if you were on Windoze, as that would make escaping correctly nigh impossible, and certainly masochistic. Unfortunately there is a chance of getting something to work on *nix.

    -QM
    --
    Quantum Mechanics: The dreams stuff is made of

Re: Dangerous Characters for system calls
by derekstucki (Acolyte) on Oct 30, 2013 at 22:07 UTC
    Thanks for all the good advice. Perhaps I should clarify what I'm doing to help narrow down the advice I need. I'm opening a filehandle that goes something like "| commanda | commandb | sendmail", and then need to feed the form input, after some header information, into this filehandle. Is this a safer or more dangerous way than doing some other thing? Is there even a need to escape characters that are being piped into a command filehandle?

      Briefly: Avoid the shell! Use the form of system with multiple arguments that bypasses the shell (see also exec). Your security worries about anything [^A-Za-z0-9] will go away. Get your piping needs a different way:

      system() won't let you capture the command's STDOUT; you can get that with the LIST form of open (see also Safe Pipe Opens) or the easier-to-use capturex from IPC::System::Simple. Or, the commands you are calling may have switches that get them to write their output to a (temporary) file.

      Do the commands you are calling really require their input on STDIN? If not, write your input for each command into a temporary file and pass the filename to the command via its command line.

      There's also IPC::Run3, which can avoid the shell (pass an arrayref as the first argument) and which allows you to redirect STDIN, STDOUT and STDERR. (One small downside being there's no support for piping stuff directly from one command to the next, so as above you'll have to keep things in memory or in temp files in between commands.)

      As for any other modules, carefully read their documentation to see whether they allow you to bypass the shell; in my experience many of them don't.

      Lastly, there are usually lots of system commands that can be emulated in Perl directly (often with CPAN modules), such as sendmail, so you might not need all of those external programs in the first place.

      Yes, it's a little more work and a couple of extra temp files compared with just using good old shell pipes, but I've learned to love having fewer potential issues to worry about :-)

      Note: File::Temp's UNLINK option is useful.
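      A rough sketch of the approach described above, using only core modules. The two filters are stand-ins: $^X (the running perl) plays the part of "commanda" and "commandb", and the form text is made up. The real commands would need to accept file names on their command lines for this to carry over.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical form text; nothing in it is ever parsed by a shell.
my $form_text = "Subject: hi\n\nbody with `backticks` and \$(subshell)\n";

# Step 1: write the untrusted text to a temp file (removed at exit).
my ($in_fh, $in_name) = tempfile(UNLINK => 1);
print {$in_fh} $form_text;
close($in_fh) or die "cannot write $in_name: $!";

# Step 2: a second temp file takes the place of the first pipe.
my ($mid_fh, $mid_name) = tempfile(UNLINK => 1);
close($mid_fh) or die "cannot close $mid_name: $!";

# Stand-in for "commanda": upper-cases file $ARGV[0] into file $ARGV[1].
my $commanda = q{
    open my $in,  '<', $ARGV[0] or die "read: $!";
    open my $out, '>', $ARGV[1] or die "write: $!";
    print {$out} uc($_) while <$in>;
    close $out or die "close: $!";
};

# List-form system: more than one argument, so no shell is involved.
system($^X, '-e', $commanda, '--', $in_name, $mid_name) == 0
    or die "commanda failed, status $?";

# Step 3: stand-in for "commandb" prefixes each line; its STDOUT is
# captured with the list form of open, again without a shell.
open(my $result_fh, '-|', $^X, '-ne', 'print "> $_"', '--', $mid_name)
    or die "cannot start commandb: $!";
my $result = do { local $/; <$result_fh> };
close($result_fh) or die "commandb failed, status $?";

print $result;
```

      The backticks and $(...) come out the other end untouched, only upper-cased and prefixed by the stand-in filters.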

      Sorry if this reply is coming too late to be helpful. The situation you describe, of printing the text from the form directly into a pipeline of chained commands, could be relatively risk-free, provided that the commands behave reasonably well when presented with untrusted input on stdin.

      That is, if "commanda" and "commandb" really are just (fairly robust) stdin-stdout filters - and if the strings to run those commands are fully defined in your code (i.e. do not contain untrusted strings from the web form), then you won't really be exposing any untrusted data as part of a command line.

      Obviously, if "commanda" or "commandb" are not robust when given untrusted input (e.g. if they assume line-oriented input but don't know how to handle really long input lines, or they assume ASCII-only input and do unpredictable things with non-ASCII data), then your process is still facing risks, unless you filter the data appropriately before writing it to the pipe.
