Beefy Boxes and Bandwidth Generously Provided by pair Networks
Think about Loose Coupling
 
PerlMonks  

Re^3: Taint mode limitations

by BrowserUk (Pope)
on Nov 03, 2012 at 14:51 UTC ( #1002106=note: print w/ replies, xml ) Need Help??


in reply to Re^2: Taint mode limitations
in thread Taint mode limitations

My issue is that Perl tries to "guess" when I have looked at the the input ("gee, the programmer captured some match groups from a regexp match on that input, so it MUST mean that he sanitized it"), instead of letting me tell it when I think I have looked at it closely enough (for example, but invoking a method untainted() on a variable).

Perl isn't "guessing". It is following the clearly laid out rule for 'detainting'. That is:

Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern.

And it goes on to say:

That means using a bit of thought--don't just blindly untaint anything, or you defeat the entire mechanism.

That may not be how you think it should work; but it is the way it does work. For better or worse.

You can try putting forwards your arguments for a different -- presumably better in your eyes -- way of working; but given how long the current mechanism has been in place; that the mechanism is -- has to be -- deeply embedded within the Perl core; and the historic convention that says Perl does not break backward compatibility; and the net result is that you will have to learn to live with what is; because it is very unlikely to change at this point in time.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

RIP Neil Armstrong


Comment on Re^3: Taint mode limitations
Re^4: Taint mode limitations
by alain_desilets (Beadle) on Nov 03, 2012 at 17:37 UTC
    Perl isn't "guessing". It is following the clearly laid out rule for 'detainting'. That is: "Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern."

    The problem with this is that the "clearly laid out rule for 'detainting'" is too ambiguous. In perl, Regexp matches are used to do a lot of different things, and removing malicious characters is only one of them. So for perl to assume that a variable derived from a tainted variable through a regexp match is "clean" is dangerous.

    See what I wrote here: http://www.perlmonks.org/?node_id=1002125

      In perl, Regexp matches are used to do a lot of different things, and removing malicious characters is only one of them. So for perl to assume that a variable derived from a tainted variable through a regexp match is "clean" is dangerous.

      No. You have that backwards. Perl is not "assuming" anything. Perl is not a living entity. It does not make assumptions; nor can it take circumstances into account.

      Perl gives you a simple mechanism, which you can either use correctly; or not.

      It is like speed limits. They may be set at 70mph (or whatever prevails in your part of the world), but that does not absolve you from responsibility.

      If you try and drive your car at 70 in torrential driving rain; thick fog; or when there is likely to be black ice about; don't go blaming the result on the speed limit.

      See what I wrote here...

      So, you wrote a bunch of code without considering security; and now you want to 'fix' Perl; rather than fix your own code.

      I have no say or influence in these matters; but it is a pretty safe bet to assume that Perl tainting isn't going to change any time soon, so you'd best expend your effort fixing your code.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      RIP Neil Armstrong

        So, you wrote a bunch of code without considering security; and now you want to 'fix' Perl; rather than fix your own code.

        If I was infallible and always remembered to sanitize user inputs against malicious characters, then I wouldn't need taint mode in the first place would I? So yeah, I do expect taint mode to catch as many of my mistakes as it possibly can. But as it stands, it lets way too many things go by, many of which could easily have been stopped if it had been designed to be less lenient.

        Consider the following scenario involving a reasonably qualified programmer (letís call him Alain for the sake of argument ;-)) who makes one honest mistakes and ends up with several security threats, none of which are being caught by taint mode (although they could easily have been caught by a better designed taint mode). It goes like thisÖ

        Alain adds a new input to his CGI application, and for some reason, he forgets to sanitize it. Yet Perl doesnít complain because for now, Alain is not using this tainted input to do anything dangerous. You might say that this isnít a big deal because no actual vulnerability has been introduced yet. Still, it would be nice if, upon exit, Perl told Alain that this particular variable became tainted and was never untainted afterwards. This would allow Alain to address the problem right away, before that tainted variable has any chance to cause damage. But nevermind.

        Fast forward a few months. Alain has used the original tainted input to derive other variables, which have themselves been used to derive other variables, and so on. So he now has a dozen tainted variables floating around his system. But Perl is not complaining, because none of them is used to do anything dangerous. But thatís about to change. ..

        Today, Alain uses one of those tainted variables to compose a HTML string which he prints to his CGI scriptís standard output. As a result, he is now exposing all his users to cross site scripting attacks. Yet, Perl still isnít complaining because it doesnít consider printing a tainted variable to STDOUT as being dangerous.

        Now, you might say that this is Alainís fault. He should know that taint mode doesn't protect him against printing tainted variables to STDOUT. So he should never print a string to STDOUT without first checking it for taintedness with the tainted() function. But this seems very error prone. If Alain forgets to do this even once, he will be exposing his users to xss attacks. Wouldnít it be simpler and safer if taint mode did what I am suggesting above, namely, report any tainted variable that never got untainted by the end of the process?

        Fast forward another couple of months. The tainted variables have spread even more, and several of them are now being printed to STDOUT, resulting in a growing number of vulnerabilites, and increasing the odds that one of them may be actually exploitable. But Perl still isnít complaining because none of the tainted variable is being used in a system call. So, while Alainís users are exposed to xss attacks, his own server is still unaffected by these tainted variables. But this too is about to change.

        Today, Alain takes one of those tainted variables and executes a regexp match with group capture on it, to carry out a task that he never intended as a security sanitization operation. This was bound to happen at some point, given that Alain (like all Perl programmers) is deeply in love with regexps and uses them for all sorts of things (most of which have nothing to do with sanitizing malicious inputs). The net result of this action is that Alain now has a variable which is considered to be untainted, eventhough it was derived from a tainted variable and nothing was ever done to sanitize it against malicious content. Letís call this an ďinadvertently untainted malicious variableĒ. Later on, Alain uses one of the captured group values to compose a shell command that he passes to system() and Bam! His server has now been compromised.

        Here too, you might say that this is Alainís fault. He should know that invoking a regexp on a variable can result in inadvertently untainting the malicious captured groups. Before doing any regexp match, he should always check the string for taintedness. So now, we are saying that Alain should be doing this before every print to STDOUT and every regexp with group capture. So, even more opportunities for Alain to slip.

        But wait, thatís not all. At some point Alain passes one of tainted variables to a CPAN function which, unbeknownst to him, does a regexp match. Again here, the developers of the CPAN function never meant this match to act as a security sanitization operation. The function uses a captured group to compose the return value, and as a result, the return value of the function is now an inadvertently untainted malicious variable. Alain uses the return value to compose another system() command, resulting in another vulnerability for his server

        Maybe that too is Alainís fault. He should know that third party functions can inadvertently untaint malicious variables. So before making any such call, he should always check all arguments for taintedness.

        Or maybe Alainís problem is that he doesnít think before invoking a system call. When he is about to issue a system call, he shouldnít trust that tainted mode did its job of identifying tainted variables that were never explicitly sanitized. Instead, he should manually trace back everything that went into composing the system command. HumÖ that sounds pretty hard and error prone, but I guess that's why Alain is paid the bick bucks.

        But waitÖ what if Alain passes one of those inadvertently untainted malicious variables to a CPAN function that, unbeknownst to him, uses it to compose a system() command? Is Alain supposed to think about that too and inspect every third party method he invokes to make sure it canít ever result in a system command that includes one of the inputs that he passed to it? Hum... that too sounds incredibly error prone. So, maybe our last two recommendations aren't good and the best thing after all is for Alain to always check argument for taintedness before passing them to a third party function for taintedness.

        So, all in all, we are saying that Alain should do this kind of check:

        • before every print to STDOUT
        • before every regexp with group capture
        • before every call to a third party library

        Does this sound like Alain is doing all the hard work and taint mode is hardly providing any protection at all? Yes. Is Alain confident that he will always remember to do this before every print, regexp or call to a third party library? No. Could taint do a better job at helping Alain? Yes. In particular, I believe that what I propose on this page would prevent all the problems described above: http://www.perlmonks.org/?node_id=1002107.

        That proposal would protect Alain against most (if not all) accidental slips like: adding 5 new inputs and forgetting to explicitly sanitize one of them. But of course, it canít protect Alain against conscious, ďpremeditatedĒ acts of negligence like: labeling a new input with untainted(), just to get rid of Perlís error message, without actually doing anything to sanitize it (or doing a quick, half-ass job of if). As many people have pointed out, nothing can protect Alain against this kind of negligence. If Alain does something that stupid, then THAT is definitely his fault. But everything else should be caught by taint mode (and I canít see a technical reason why it canít be done).

      If you think about "removing malicious characters" you do not understand security! You should never remove the bad, you should always take just the good!

      Jenda
      Enoch was right!
      Enjoy the last years of Rome.

        If you think about "removing malicious characters" you do not understand security! You should never remove the bad, you should always take just the good!

        It appears you don't understand security either :) Consider

        my $good = join '', $bad =~ m/(\w+)/g; my $good = $bad =~ s/\W+//gr;

        Sure, only the m// version untaints successfully in perl, but both versions "remove malicious characters" and both versions "take just the good"

        If you think about "removing malicious characters" you do not understand security! You should never remove the bad, you should always take just the good!
        Good advice, thanks.
Re^4: Taint mode limitations
by alain_desilets (Beadle) on Nov 03, 2012 at 18:02 UTC
    You can try putting forwards your arguments for a different -- presumably better in your eyes -- way of working; but given how long the current mechanism has been in place; that the mechanism is -- has to be -- deeply embedded within the Perl core; and the historic convention that says Perl does not break backward compatibility; and the net result is that you will have to learn to live with what is; because it is very unlikely to change at this point in time.

    Forgot to reply to that bit. I already outlined what I (probably arrogantly) believe would be a better solution in the middle of this page:

    http://www.perlmonks.org/?node_id=1002107

    But sadly, I think you are probably right that we are stuck with the current taint mode implementation. I'm just surprised that it wasn't done that way in the first place.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1002106]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others imbibing at the Monastery: (7)
As of 2014-12-25 09:35 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (159 votes), past polls