http://www.perlmonks.org?node_id=36306


in reply to RE: Warning our Fellow Monks
in thread Warning our Fellow Monks

Would a regular expression around the query string solve the problem?
Basically, what a regex does to untaint a variable is to ensure that only valid characters are in the variable. The variable in question, in this example, is $siteinfo[1]. Unfortunately, it's very, very difficult to find an appropriate regex that can perfectly secure user-supplied data that is passed to the shell. What I would probably do in this case is creates a hash that has the user data as the key and the hash value as the actual file to be opened. I say "probably" because if you have hundreds of files that the user could open, you would want to take a different solution.
my %files = ( colors => 'colors.dat', tests => 'test.txt', names => 'names.dat', bribes => 'politicians.txt' ); open IN, "<$filepath/$files{$siteinfo[1]}" or someErrorRoutine( "Can't open $filepath/$siteinfo[1] for reading: +$!" );
In the above example, the user data is used to pull a value from a hash. Since you have created those hash values, you know they are safe. If the hash value doesn't exist, then someErrorRoutine() is called. Depending upon how your site is set up, you might want someErrorRoutine to log the details of the failure. The key point to remember here is that user data never gets close to the shell (and that you should always check to see if your open statement failed).

Also, taint checking should still be used here. The following regex is interesting to me:

$siteinfo[1] =~ /^[^.]+/(\w+\.\w+)$/ or someErrorRoutine( "Regex failed" ); my $fileToOpen = $1; # $fileToOpen untainted
In this case, we are making sure that there are NO periods prior to the final slash and only one period in the actual filename. Note that the error subroutine is called on failure. If it's not and that fails, $1 might contain an undesireable value that gets assigned to $fileToOpen.

The Moral: When untainting data, make sure you check for failure of your regex to match.

For a little more information about exploits like this, read this article (thanks to tilly for that link).

Cheers,
Ovid

Join the Perlmonks Setiathome Group or just go the the link and check out our stats.

Replies are listed 'Best First'.
RE (tilly) 3 (untaint): Warning our Fellow Monks
by tilly (Archbishop) on Oct 12, 2000 at 00:27 UTC
    Yes, check for failure. But regular expressions to perfectly secure data passed to the shell are actually pretty easily come by. You just need to remember the cardinal rule, deny everything that is not explicitly allowed. Trying to trap everything that can go wrong is hard. Allowing very limited input, is not. Try the following:
    /^( (?:\w+\/)* # Directory components? \w+ # Start of filename (?:\.\w+)? # Extension? )$/x
    This will work for most filenames you have a reason to allow, and I guarantee you that on a standard Unix system with a standard directory structure (unless someone has placed poor symlinks, etc), there is no way to name anything that passes this which has shell problems.

    You could test length($1) if you are afraid of buffer overflows. :-)

    What you will find though is that at some point some developer wants to break the rule. That is fine, just become more lenient (eg allow an optional period in a directory) but step by step keep the philosophy the same. Only that which you have guaranteed safe shall pass.

re: why extra slash in regex? (was "Warning our Fellow Monks")
by t'mo (Pilgrim) on Mar 16, 2001 at 23:07 UTC

    The line:

    $siteinfo[1] =~ /^[^.]+/(\w+\.\w+)$/ or ...

    in Ovid's example has me confused. The extra slash (/) in the middle is throwing me off. At first glance, it looks like he meant to use the s/// operator. At second glance, it becomes apparent that it's meant as a directory separator (to exclude things like "../../". However, doesn't placing a '/' in a regexp delimited by '/' characters require an escape character (\)?

    (I must be getting a mild case of LTS)

      Yes, you need to escape the / (with a \) or change the regex delimiter.

              - tye (but my friends call me "Tye")
      10 Print "I am not worthy" + CHR(10); 20 GOTO 10
      ----
      Would be interested to see if the panel thought there's mileage in suggesting that it's unlikely that the user should want to open *any* type of file - and hence we could either assume an extension (and add it on explicitly before passing the filename to the ope() ?) - or offer a list, which would map to extensions (not just add on the field value (which could be faked to blank or a pipe)) Also - could the script not be forceably chrooted ? JAPHN (JAPH Novice)