So, you wrote a bunch of code without considering security; and now you want to 'fix' Perl; rather than fix your own code.
If I were infallible and always remembered to sanitize user inputs against malicious characters, then I wouldn't need taint mode in the first place, would I? So yeah, I do expect taint mode to catch as many of my mistakes as it possibly can. But as it stands, it lets way too many things go by, many of which could easily have been stopped if it had been designed to be less lenient.
Consider the following scenario involving a reasonably qualified programmer (let's call him Alain for the sake of argument ;-)) who makes one honest mistake and ends up with several security threats, none of which is caught by taint mode (although they could easily have been caught by a better-designed taint mode). It goes like this...
Alain adds a new input to his CGI application, and for some reason, he forgets to sanitize it. Yet Perl doesn't complain because, for now, Alain is not using this tainted input to do anything dangerous. You might say that this isn't a big deal because no actual vulnerability has been introduced yet. Still, it would be nice if, upon exit, Perl told Alain that this particular variable became tainted and was never untainted afterwards. This would allow Alain to address the problem right away, before that tainted variable has any chance to cause damage. But never mind.
Fast forward a few months. Alain has used the original tainted input to derive other variables, which have themselves been used to derive other variables, and so on. So he now has a dozen tainted variables floating around his system. But Perl is not complaining, because none of them is used to do anything dangerous. But that's about to change...
Today, Alain uses one of those tainted variables to compose an HTML string which he prints to his CGI script's standard output. As a result, he is now exposing all his users to cross-site scripting attacks. Yet Perl still isn't complaining, because it doesn't consider printing a tainted variable to STDOUT to be dangerous.
Now, you might say that this is Alain's fault. He should know that taint mode doesn't protect him against printing tainted variables to STDOUT. So he should never print a string to STDOUT without first checking it for taintedness with the tainted() function. But this seems very error-prone. If Alain forgets to do this even once, he will be exposing his users to XSS attacks. Wouldn't it be simpler and safer if taint mode did what I am suggesting above, namely, report any tainted variable that never got untainted by the end of the process?
Fast forward another couple of months. The tainted variables have spread even more, and several of them are now being printed to STDOUT, resulting in a growing number of vulnerabilities and increasing the odds that one of them is actually exploitable. But Perl still isn't complaining, because none of the tainted variables is being used in a system call. So, while Alain's users are exposed to XSS attacks, his own server is still unaffected by these tainted variables. But this too is about to change.
Today, Alain takes one of those tainted variables and executes a regexp match with group capture on it, to carry out a task that he never intended as a security sanitization operation. This was bound to happen at some point, given that Alain (like all Perl programmers) is deeply in love with regexps and uses them for all sorts of things (most of which have nothing to do with sanitizing malicious inputs). The net result of this action is that Alain now has a variable which is considered to be untainted, even though it was derived from a tainted variable and nothing was ever done to sanitize it against malicious content. Let's call this an "inadvertently untainted malicious variable". Later on, Alain uses one of the captured group values to compose a shell command that he passes to system() and Bam! His server has now been compromised.
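To make that step concrete, here is a minimal sketch of the inadvertent untainting. It assumes the script is run with `perl -T`; the input string and variable names are made up for illustration, and taint is borrowed from %ENV to stand in for real CGI input. `tainted` is the function exported by the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Run as:  perl -T demo.pl
# Under -T every value in %ENV is tainted, so we borrow that taint to
# simulate unsanitized CGI input. Without -T nothing is tainted and
# tainted() simply reports false everywhere.
my $taint = substr(join('', values %ENV), 0, 0);   # empty string, tainted under -T
my $input = 'report.txt; echo pwned' . $taint;     # hypothetical malicious input

# An ordinary parsing regexp, never meant as a security filter.
my $name;
if ($input =~ /^(.*)$/) {
    $name = $1;    # the capture is NOT tainted, even though $input is
}

printf "taint mode on:   %s\n", ${^TAINT}       ? 'yes' : 'no';
printf "input tainted:   %s\n", tainted($input) ? 'yes' : 'no';
printf "capture tainted: %s\n", tainted($name)  ? 'yes' : 'no';

# At this point system("cat $name") would not be stopped by taint mode
# on account of $name, although $name still contains "; echo pwned".
```

Perl does ship a `use re 'taint'` pragma that keeps captures tainted within its lexical scope, but it is opt-in, so it only helps where someone remembered to enable it.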
Here too, you might say that this is Alain's fault. He should know that invoking a regexp on a variable can result in inadvertently untainting the malicious captured groups. Before doing any regexp match, he should always check the string for taintedness. So now, we are saying that Alain should be doing this before every print to STDOUT and every regexp with group capture. So, even more opportunities for Alain to slip.
But wait, that's not all. At some point Alain passes one of the tainted variables to a CPAN function which, unbeknownst to him, does a regexp match. Here again, the developers of the CPAN function never meant this match to act as a security sanitization operation. The function uses a captured group to compose the return value, and as a result, the return value of the function is now an inadvertently untainted malicious variable. Alain uses the return value to compose another system() command, resulting in another vulnerability for his server.
Maybe that too is Alain's fault. He should know that third-party functions can inadvertently untaint malicious variables. So before making any such call, he should always check all arguments for taintedness.
Or maybe Alain's problem is that he doesn't think before invoking a system call. When he is about to issue a system call, he shouldn't trust that taint mode did its job of identifying tainted variables that were never explicitly sanitized. Instead, he should manually trace back everything that went into composing the system command. Hum... that sounds pretty hard and error-prone, but I guess that's why Alain is paid the big bucks.
But wait... what if Alain passes one of those inadvertently untainted malicious variables to a CPAN function that, unbeknownst to him, uses it to compose a system() command? Is Alain supposed to think about that too, and inspect every third-party method he invokes to make sure it can't ever result in a system command that includes one of the inputs that he passed to it? Hum... that too sounds incredibly error-prone. So maybe our last two recommendations aren't good, and the best thing after all is for Alain to always check arguments for taintedness before passing them to a third-party function.
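So, all in all, we are saying that Alain should check for taintedness before every print to STDOUT, every regexp with group capture, and every call to a third-party function.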
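Something along these lines, as a rough sketch rather than anyone's actual code: `assert_untainted` is a hypothetical helper written for this post, `tainted` comes from the core Scalar::Util module, and the commented-out CPAN call is only a placeholder.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper (not part of Perl or any module): refuse to go on
# if any of the given values is still tainted.
sub assert_untainted {
    my ($where, @values) = @_;
    for my $value (@values) {
        die "tainted value about to reach $where\n" if tainted($value);
    }
    return 1;
}

my $title = 'Monthly report';   # stand-in for a value derived from user input

# 1. Before every print to STDOUT
assert_untainted('print', $title);
print "<h1>$title</h1>\n";

# 2. Before every regexp with group capture
assert_untainted('regexp capture', $title);
my ($first_word) = $title =~ /^(\w+)/;

# 3. Before every call into third-party code
assert_untainted('third-party call', $title, $first_word);
# Some::CPAN::Module::process($title, $first_word);   # hypothetical call
```

That is three manual checks for a single variable, and every one of them is something Alain has to remember to write.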
Does this sound like Alain is doing all the hard work and taint mode is hardly providing any protection at all? Yes. Is Alain confident that he will always remember to do this before every print, regexp or call to a third-party library? No. Could taint mode do a better job of helping Alain? Yes. In particular, I believe that what I propose on this page would prevent all the problems described above: http://www.perlmonks.org/?node_id=1002107.
That proposal would protect Alain against most (if not all) accidental slips like: adding 5 new inputs and forgetting to explicitly sanitize one of them. But of course, it can't protect Alain against conscious, "premeditated" acts of negligence like: labeling a new input with untainted(), just to get rid of Perl's error message, without actually doing anything to sanitize it (or doing a quick, half-ass job of it). As many people have pointed out, nothing can protect Alain against this kind of negligence. If Alain does something that stupid, then THAT is definitely his fault. But everything else should be caught by taint mode (and I can't see a technical reason why it can't be done).