PerlMonks  

Re^2: Taint mode limitations

by alain_desilets (Beadle)
on Nov 03, 2012 at 13:10 UTC ( [id://1002102]=note )


in reply to Re: Taint mode limitations
in thread Taint mode limitations

When I first started reading about taint mode, I expected that it would identify every single instance of a tainted variable and force me to look at it explicitly.
It does! What it cannot do -- which you seem to be expecting -- is decide whether you looked closely enough.

I understand that it's my responsibility to make sure I have looked at the input closely enough. My issue is that Perl tries to "guess" when I have looked at the input ("gee, the programmer captured some match groups from a regexp match on that input, so it MUST mean that he sanitized it"), instead of letting me tell it when I think I have looked at it closely enough (for example, by invoking a method untainted() on a variable).

Using your front desk metaphor, suppose I am a security guard patrolling the corridors of a building. As I go through the front gate in the morning, I notice my front desk colleague making eye contact with a visitor. Later on, I see this visitor wandering the corridors without a pass. Can I assume that this visitor is authorized just because my colleague made eye contact with him? No, of course not!
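
The objection above can be reproduced with a small script. This is a minimal sketch, not code from the thread: trim_with_capture is a hypothetical helper, and the first command-line argument stands in for user input. It uses Scalar::Util's tainted() to report taint status; without -T nothing is ever tainted, so both lines would report "no".

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: trims surrounding whitespace using a capture group.
# Nothing here checks whether the remaining text is safe to use.
sub trim_with_capture {
    my ($input) = @_;
    my ($trimmed) = $input =~ /\A\s*(.*?)\s*\z/s;
    return $trimmed;
}

# Command-line arguments are tainted under -T; the fallback literal is not.
my $raw     = @ARGV ? $ARGV[0] : '  some; input  ';
my $trimmed = trim_with_capture($raw);

printf "raw tainted:     %s\n", tainted($raw)     ? 'yes' : 'no';
printf "trimmed tainted: %s\n", tainted($trimmed) ? 'yes' : 'no';
```

Run as perl -T with an argument, the raw value should report as tainted while the trimmed copy should not, even though the regex only removed whitespace.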

Replies are listed 'Best First'.
Re^3: Taint mode limitations
by BrowserUk (Patriarch) on Nov 03, 2012 at 14:51 UTC
    My issue is that Perl tries to "guess" when I have looked at the input ("gee, the programmer captured some match groups from a regexp match on that input, so it MUST mean that he sanitized it"), instead of letting me tell it when I think I have looked at it closely enough (for example, by invoking a method untainted() on a variable).

    Perl isn't "guessing". It is following the clearly laid out rule for 'detainting'. That is:

    Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern.

    And it goes on to say:

    That means using a bit of thought--don't just blindly untaint anything, or you defeat the entire mechanism.

    That may not be how you think it should work; but it is the way it does work. For better or worse.
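
The rule quoted above is usually applied as "match against an anchored pattern, then use the capture instead of the input". A minimal sketch, assuming a made-up username rule; validated_username and its pattern are illustrative, not taken from the thread or from perlsec:

```perl
use strict;
use warnings;

# Hypothetical rule: a lowercase letter followed by up to 31 lowercase
# letters, digits, or underscores. Anything else is rejected.
sub validated_username {
    my ($input) = @_;
    return undef unless defined $input;
    # Anchored with \A and \z so nothing can ride along before or after
    # the approved part; the result is the capture, not the original input.
    return $input =~ /\A([a-z][a-z0-9_]{0,31})\z/ ? $1 : undef;
}

for my $candidate ('alice_01', 'alice; rm -rf /', 'Alice') {
    my $name = validated_username($candidate);
    printf "%-18s %s\n", $candidate, defined $name ? 'accepted' : 'rejected';
}
```

Under -T the returned capture is what Perl treats as untainted, so the quality of the pattern is the whole of the check.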

    You can try putting forward your arguments for a different -- presumably better in your eyes -- way of working. But the current mechanism has been in place for a long time; it is -- has to be -- deeply embedded within the Perl core; and historic convention says Perl does not break backward compatibility. The net result is that you will have to learn to live with what is, because it is very unlikely to change at this point.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

      You can try putting forward your arguments for a different -- presumably better in your eyes -- way of working. But the current mechanism has been in place for a long time; it is -- has to be -- deeply embedded within the Perl core; and historic convention says Perl does not break backward compatibility. The net result is that you will have to learn to live with what is, because it is very unlikely to change at this point.

      Forgot to reply to that bit. I already outlined what I (probably arrogantly) believe would be a better solution in the middle of this page:

      http://www.perlmonks.org/?node_id=1002107

      But sadly, I think you are probably right that we are stuck with the current taint mode implementation. I'm just surprised that it wasn't done that way in the first place.

      Perl isn't "guessing". It is following the clearly laid out rule for 'detainting'. That is: "Perl presumes that if you reference a substring using $1, $2, etc., that you knew what you were doing when you wrote the pattern."

      The problem with this is that the "clearly laid out rule for 'detainting'" is too ambiguous. In Perl, regexp matches are used to do a lot of different things, and removing malicious characters is only one of them. So for Perl to assume that a variable derived from a tainted variable through a regexp match is "clean" is dangerous.

      See what I wrote here: http://www.perlmonks.org/?node_id=1002125

        In Perl, regexp matches are used to do a lot of different things, and removing malicious characters is only one of them. So for Perl to assume that a variable derived from a tainted variable through a regexp match is "clean" is dangerous.

        No. You have that backwards. Perl is not "assuming" anything. Perl is not a living entity. It does not make assumptions; nor can it take circumstances into account.

        Perl gives you a simple mechanism, which you can either use correctly; or not.

        It is like speed limits. They may be set at 70mph (or whatever prevails in your part of the world), but that does not absolve you from responsibility.

        If you try and drive your car at 70 in torrential driving rain; thick fog; or when there is likely to be black ice about; don't go blaming the result on the speed limit.

        See what I wrote here...

        So, you wrote a bunch of code without considering security; and now you want to 'fix' Perl; rather than fix your own code.

        I have no say or influence in these matters; but it is a pretty safe bet to assume that Perl tainting isn't going to change any time soon, so you'd best expend your effort fixing your code.


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        RIP Neil Armstrong

        If you think about "removing malicious characters" you do not understand security! You should never remove the bad, you should always take just the good!

        Jenda
        Enoch was right!
        Enjoy the last years of Rome.
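
The difference between removing the bad and taking the good can be shown side by side. A minimal sketch; both function names, the blocklist, and the "report name" rule are invented for illustration:

```perl
use strict;
use warnings;

# Blocklist approach: delete characters someone remembered to list.
sub strip_known_bad {
    my ($input) = @_;
    (my $copy = $input) =~ s/[;|&<>]//g;
    return $copy;
}

# Allowlist approach: accept only input that matches the expected shape.
sub accept_known_good {
    my ($input) = @_;
    return $input =~ /\A([A-Za-z0-9_]+\.txt)\z/ ? $1 : undef;
}

# Single quotes keep the dollar sign literal.
my $input    = 'report.txt$(reboot)';
my $stripped = strip_known_bad($input);
my $accepted = accept_known_good($input);

printf "stripped: %s\n", $stripped;
printf "accepted: %s\n", defined $accepted ? $accepted : '(rejected)';
```

The blocklist passes this input through unchanged because none of its characters were on the list, while the allowlist rejects it without needing to know why it is dangerous.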

Re^3: Taint mode limitations
by AnomalousMonk (Archbishop) on Nov 03, 2012 at 15:50 UTC
    ... [let] me tell it when I think I have looked at it closely enough (for example, [by] invoking a method untainted() on a variable) ...

    But how would you "look at it" in the first place? Almost always by a regex match of some kind. So one would wind up with a statement like
        # untaint() is the hypothetical explicit call under discussion, not a Perl builtin
        untaint($hinky) if my @safe = $hinky =~ m{ \A now (get) some (stuff) here \z }xms;
        then_do_safe_stuff_with($hinky, @safe);  # $hinky now safe, too

    But what is to be gained by making explicitly required an action that is already implicit in the successful regex match? Everything still depends on crafting an effective validation regex.
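
An explicit step of this kind can already be approximated by routing every untainting match through one named function. This is a sketch under assumptions: untaint_matching is a hypothetical helper, not a Perl builtin or a CPAN API, and it still depends entirely on the pattern the caller supplies.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical helper: intended to be the only place where data is
# deliberately laundered. The caller's pattern is anchored at both ends,
# and the capture (not the original value) is returned.
sub untaint_matching {
    my ($value, $pattern) = @_;
    croak 'value is undefined' unless defined $value;
    my ($clean) = $value =~ /\A($pattern)\z/
        or croak "value does not match $pattern";
    return $clean;
}

my $port = untaint_matching('8080', qr/[0-9]{1,5}/);
print "port: $port\n";

my $bad = eval { untaint_matching('8080; reboot', qr/[0-9]{1,5}/) };
print 'second value rejected: ', (defined $bad ? 'no' : 'yes'), "\n";
```

Searching the code base for calls to the helper then lists the intended untainting points, but captures elsewhere in the program would still untaint implicitly.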

      But what is to be gained by making explicitly required an action that is already implicit in the successful regex match? Everything still depends on crafting an effective validation regex.

      The problem is that regexp matches are typically used to do a lot of different things, and removing malicious characters is only one of them. So assuming that a variable derived from a tainted variable through a regexp match is "clean" is dangerous.

      For example, I have a fairly large code base that I wrote before I became concerned about security issues. In this code base, there are plenty of places where I capture regexp groups on user inputs for reasons that have nothing to do whatsoever with removing malicious characters. For example, there are many places where I use regexps to strip out the leading and trailing characters of a user input. As a result, all those strings will be considered kosher by taint mode. In contrast, if taint mode forced me to explicitly label a variable as being untainted, those cases would be correctly identified as being currently tainted.

      I'm not clutching at straws here. This is a real situation, and I am sure there are plenty of folks who have examples of this problem in their code (and I bet this includes a lot of folks who run taint mode).
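
For the trimming case described above, the re pragma has a documented taint mode that keeps captures from tainted strings tainted, so that untainting only happens where it is switched off on purpose. A minimal sketch, assuming the pragma is acceptable for the code base in question; trim and validated_word are hypothetical names, and without -T the tainted() checks all report "no".

```perl
use strict;
use warnings;
use re 'taint';    # captures from tainted strings stay tainted in this scope
use Scalar::Util qw(tainted);

# Ordinary text processing: the capture remains tainted under -T.
sub trim {
    my ($input) = @_;
    my ($trimmed) = $input =~ /\A\s*(.*?)\s*\z/s;
    return $trimmed;
}

# Deliberate validation: the opt-out is lexically scoped to this sub.
sub validated_word {
    my ($input) = @_;
    no re 'taint';
    return $input =~ /\A([A-Za-z0-9_]+)\z/ ? $1 : undef;
}

my $raw     = @ARGV ? $ARGV[0] : '  hello  ';
my $trimmed = trim($raw);
my $word    = validated_word($trimmed);

printf "trimmed tainted: %s\n", tainted($trimmed) ? 'yes' : 'no';
printf "word accepted:   %s\n", defined $word     ? 'yes' : 'no';
printf "word tainted:    %s\n", tainted($word)    ? 'yes' : 'no';
```

With this arrangement each no re 'taint' marks a place where someone decided the data had been looked at closely enough, which is close to the explicit labelling asked for, though the pattern inside it still has to be right.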

        ... I have a fairly large code base that I wrote before I became concerned about security issues. In this code base, there are plenty of places where I capture regexp groups on user inputs for reasons that have nothing to do whatsoever with [validation].

        You have code written without concern for security. The main body of this code operates freely on input, including using regexes. The code must now be re-written to take security into account. Given the nature of the code that I infer from your description, there is no way to avoid a major re-write of some kind.

        Speaking in the most general terms, it seems to me that some new layer of validation code must be interposed between all input and existing operations on that input. Within that layer, input must be tested (presumably with regexes), and then either implicitly or explicitly untainted. If any input is allowed to reach the existing processing code, you have a security problem. The hermeticity of the new validation layer is the main problem; it seems to make little difference if the untainting done within it is implicit or explicit.

        Update: I just went back and reviewed this thread and saw BrowserUk's reply. I seem to be repeating many of the points made therein, and I don't disagree with those I don't repeat. I sympathize with your desire for a mechanism that when activated would 'light up' the application for any input data not explicitly untainted, but that would not address the basic problem, common to both the current taint mechanism and the one you propose, of designing an effective test for each datum within a newly-designed validation layer. Caveat Programmor.
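
The validation layer described above might look roughly like the following. This is only a sketch: the field names, the patterns, and validate_request are all hypothetical, and a real layer would need a rule for every datum the application accepts.

```perl
use strict;
use warnings;

# Hypothetical per-field rules; validate_request anchors each pattern.
my %VALIDATORS = (
    username => qr/[a-z][a-z0-9_]{0,31}/,
    page     => qr/[0-9]{1,4}/,
);

# Returns a hash ref holding only validated captures, or dies naming the
# offending field. Fields without a rule are rejected, not passed through.
sub validate_request {
    my (%raw) = @_;
    my %clean;
    for my $field (sort keys %raw) {
        my $pattern = $VALIDATORS{$field}
            or die "unexpected field '$field'\n";
        my $input = $raw{$field};
        my ($value) = defined $input ? $input =~ /\A($pattern)\z/ : ();
        die "invalid value for '$field'\n" unless defined $value;
        $clean{$field} = $value;
    }
    return \%clean;
}

my $request = validate_request(username => 'alice', page => '3');
print "$_ = $request->{$_}\n" for sort keys %$request;
```

The existing processing code would then receive only the returned hash, never the raw input, which is where the hermeticity concern comes in.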
