comment on

So, you wrote a bunch of code without considering security; and now you want to 'fix' Perl; rather than fix your own code.

If I was infallible and always remembered to sanitize user inputs against malicious characters, then I wouldn't need taint mode in the first place would I? So yeah, I do expect taint mode to catch as many of my mistakes as it possibly can. But as it stands, it lets way too many things go by, many of which could easily have been stopped if it had been designed to be less lenient.

Consider the following scenario involving a reasonably qualified programmer (let’s call him Alain for the sake of argument ;-)) who makes one honest mistakes and ends up with several security threats, none of which are being caught by taint mode (although they could easily have been caught by a better designed taint mode). It goes like this…

Alain adds a new input to his CGI application, and for some reason, he forgets to sanitize it. Yet Perl doesn’t complain because for now, Alain is not using this tainted input to do anything dangerous. You might say that this isn’t a big deal because no actual vulnerability has been introduced yet. Still, it would be nice if, upon exit, Perl told Alain that this particular variable became tainted and was never untainted afterwards. This would allow Alain to address the problem right away, before that tainted variable has any chance to cause damage. But nevermind.

Fast forward a few months. Alain has used the original tainted input to derive other variables, which have themselves been used to derive other variables, and so on. So he now has a dozen tainted variables floating around his system. But Perl is not complaining, because none of them is used to do anything dangerous. But that’s about to change. ..

Today, Alain uses one of those tainted variables to compose a HTML string which he prints to his CGI script’s standard output. As a result, he is now exposing all his users to cross site scripting attacks. Yet, Perl still isn’t complaining because it doesn’t consider printing a tainted variable to STDOUT as being dangerous.

Now, you might say that this is Alain’s fault. He should know that taint mode doesn't protect him against printing tainted variables to STDOUT. So he should never print a string to STDOUT without first checking it for taintedness with the tainted() function. But this seems very error prone. If Alain forgets to do this even once, he will be exposing his users to xss attacks. Wouldn’t it be simpler and safer if taint mode did what I am suggesting above, namely, report any tainted variable that never got untainted by the end of the process?

Fast forward another couple of months. The tainted variables have spread even more, and several of them are now being printed to STDOUT, resulting in a growing number of vulnerabilites, and increasing the odds that one of them may be actually exploitable. But Perl still isn’t complaining because none of the tainted variable is being used in a system call. So, while Alain’s users are exposed to xss attacks, his own server is still unaffected by these tainted variables. But this too is about to change.

Today, Alain takes one of those tainted variables and executes a regexp match with group capture on it, to carry out a task that he never intended as a security sanitization operation. This was bound to happen at some point, given that Alain (like all Perl programmers) is deeply in love with regexps and uses them for all sorts of things (most of which have nothing to do with sanitizing malicious inputs). The net result of this action is that Alain now has a variable which is considered to be untainted, eventhough it was derived from a tainted variable and nothing was ever done to sanitize it against malicious content. Let’s call this an “inadvertently untainted malicious variable”. Later on, Alain uses one of the captured group values to compose a shell command that he passes to system() and Bam! His server has now been compromised.

Here too, you might say that this is Alain’s fault. He should know that invoking a regexp on a variable can result in inadvertently untainting the malicious captured groups. Before doing any regexp match, he should always check the string for taintedness. So now, we are saying that Alain should be doing this before every print to STDOUT and every regexp with group capture. So, even more opportunities for Alain to slip.

But wait, that’s not all. At some point Alain passes one of tainted variables to a CPAN function which, unbeknownst to him, does a regexp match. Again here, the developers of the CPAN function never meant this match to act as a security sanitization operation. The function uses a captured group to compose the return value, and as a result, the return value of the function is now an inadvertently untainted malicious variable. Alain uses the return value to compose another system() command, resulting in another vulnerability for his server

Maybe that too is Alain’s fault. He should know that third party functions can inadvertently untaint malicious variables. So before making any such call, he should always check all arguments for taintedness.

Or maybe Alain’s problem is that he doesn’t think before invoking a system call. When he is about to issue a system call, he shouldn’t trust that tainted mode did its job of identifying tainted variables that were never explicitly sanitized. Instead, he should manually trace back everything that went into composing the system command. Hum… that sounds pretty hard and error prone, but I guess that's why Alain is paid the bick bucks.

But wait… what if Alain passes one of those inadvertently untainted malicious variables to a CPAN function that, unbeknownst to him, uses it to compose a system() command? Is Alain supposed to think about that too and inspect every third party method he invokes to make sure it can’t ever result in a system command that includes one of the inputs that he passed to it? Hum... that too sounds incredibly error prone. So, maybe our last two recommendations aren't good and the best thing after all is for Alain to always check argument for taintedness before passing them to a third party function for taintedness.

So, all in all, we are saying that Alain should do this kind of check:

before every print to STDOUT
before every regexp with group capture
before every call to a third party library

Does this sound like Alain is doing all the hard work and taint mode is hardly providing any protection at all? Yes. Is Alain confident that he will always remember to do this before every print, regexp or call to a third party library? No. Could taint do a better job at helping Alain? Yes. In particular, I believe that what I propose on this page would prevent all the problems described above: http://www.perlmonks.org/?node_id=1002107.

That proposal would protect Alain against most (if not all) accidental slips like: adding 5 new inputs and forgetting to explicitly sanitize one of them. But of course, it can’t protect Alain against conscious, “premeditated” acts of negligence like: labeling a new input with untainted(), just to get rid of Perl’s error message, without actually doing anything to sanitize it (or doing a quick, half-ass job of if). As many people have pointed out, nothing can protect Alain against this kind of negligence. If Alain does something that stupid, then THAT is definitely his fault. But everything else should be caught by taint mode (and I can’t see a technical reason why it can’t be done).

In reply to Re^6: Taint mode limitations by Anonymous Monk
in thread Taint mode limitations by alain_desilets

Are you posting in the right place? Check out Where do I post X? to know for sure.
Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
<code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
Want more info? How to link or How to display code and escape characters are good places to start.


Perl-Sensitive Sunglasses
	PerlMonks