PerlMonks  

Intercepting compile time blocks like BEGIN {}

by LanX (Canon)
on Aug 09, 2010 at 11:00 UTC (#853778=perlquestion)
LanX has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I'm still in shock about this post, Re: Vulnerabilities when editing untrusted code... (Komodo), which shows that it's not trivial to find BEGIN, CHECK and UNITCHECK compile-time blocks (hereafter "CTBs") by static parsing before the code is executed.
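To make the problem concrete, here is a rough sketch of why a purely textual search falls short. It assumes a deliberately naive scanner (`naive_scan` is a hypothetical helper, not anything from the linked post) and treats the obfuscated snippet from that post purely as data:

```perl
use strict;
use warnings;

# Hypothetical, deliberately naive scanner: looks for a phase-block keyword
# followed by an opening brace anywhere in the source text.
sub naive_scan {
    my ($source) = @_;
    return $source =~ /\b(?:BEGIN|UNITCHECK|CHECK)\s*\{/ ? 1 : 0;
}

my $plain  = 'BEGIN { print "owned" }';
# The snippet is only held as a string here; it is never compiled or run.
my $hidden = q{''=~('(?{B'.'EGIN{print "owned"}})')};

printf "plain:  %d\n", naive_scan($plain);
printf "hidden: %d\n", naive_scan($hidden);
```

The plain block is flagged and the split-string variant is not, because the keyword never appears contiguously in the source text.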

My understanding of what's happening in ''=~('(?{B'.'EGIN{print "owned"}})') is:

  • in a regex (?{..}) is treated like an eval-block.
  • eval-blocks also allow CTBs
  • the parser/lexer folds the concatenation of the literal strings within the regex into a single constant.
  • While the regex is compiled, the BEGIN block is executed.
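The last point, that compiling is enough to run a BEGIN block, can be reproduced without the regex trick. This is a minimal sketch, assuming a throwaway temp file with a harmless payload and the running perl binary in `$^X`:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Write a tiny script whose only "payload" is a harmless BEGIN block.
my ($fh, $path) = tempfile(SUFFIX => '.pl', UNLINK => 1);
print {$fh} qq{BEGIN { print "owned\\n" }\nprint "run time\\n";\n};
close $fh or die "close: $!";

# Syntax-check it with the same perl that runs this script.
# -c stops before run time, yet BEGIN blocks still execute while compiling.
my $output = qx{"$^X" -c "$path" 2>&1};

print $output;
```

The BEGIN block's output shows up next to the "syntax OK" message, while the ordinary run-time print never fires.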

    After some meditation I think that a mechanism to intercept the execution of CTBs is a necessary feature request.

    It would be beneficial to have something like a command line switch that makes perl print all CTBs instead of evaling them and, by default, stop there.

    Something like perl -cc, maybe extendable by hooking in a function that handles the code string: perl -cc='my ($code, $phase, $file, $line); print $code; 0'.

    The return code of those callbacks could be taken to decide about the further continuation of the process. (e.g. based on a file's ownership, path or certificate) ²

    AFAIK CTBs are evaled¹, so in theory it should be easily possible to intercept the evaling routine to do this.

    The possible benefits are:

  • Debugging of CTBs³
  • automated testing if code can be safely syntax checked without executing code

    I took a look into Safe, but it doesn't seem that this case is covered ... or is it possible to hook into eval to achieve this?

    Or is there already any other possibility I missed???

    Cheers Rolf

    1) well not quite ... from perlmod

    It should be noted that "BEGIN" and "UNITCHECK" code blocks are executed inside string "eval()"’s. The "CHECK" and "INIT" code blocks are not executed inside a string eval, which e.g. can be a problem in a mod_perl environment.

    2) it could also return other code to be used instead, e.g. to wrap the given code into "use Safe;" and "no Safe;" statements.

    3) including tracing and investigating CTBs of alien code.

    Re: Intercepting compile time blocks like BEGIN {}
    by sundialsvc4 (Monsignor) on Aug 09, 2010 at 13:16 UTC

      “Sometimes, cleverness is not a virtue.”

      Sometimes, the products of “cleverness” prove to be quite uncontrollable.

      In my humble opinion, BEGIN blocks are one of those things.   And, if we then try to “intercept” them, so as to prevent them from doing what we don’t want them to do in this-case or that, “well, we have only made matters worse, haven’t we?”

      I prize one characteristic of good source code above all others: clarity.   In such code, I am able to quickly read the code and to ascertain, with a very high degree of confidence, that I actually know what it is actually telling the computer to do, and that the computer will actually interpret it in just that way.   This idea of “intercepts” would, IMHO, unfortunately just serve to make the code even more inscrutable than it already may be.

      Of course I do not mean the foregoing to be “a blanket statement, true in every case as though it were inscribed by a divine hand in some stone tablets.”   Instead, call it a rule-of-thumb, offered by a thumb that has been whacked with a hammer too many times.

        BEGIN blocks are a crucial part of the use mechanism and responsible for much of the flexibility many CPAN modules can offer.

        IMHO other so called "clear" languages/products just offer a multitude of specialized mechanisms which aren't really better controllable when using foreign libraries.

        I doubt that those mechanisms are better suited, because normally they diminish flexibility without really enforcing security.

        What's needed is a mechanism to define and enforce a personal level of trust; that's why I want to be able to hook a call-back into execution at compile time.

        Perl's debugger already offers many ways to hook call-backs into various aspects and phases of execution. This would only complete that set of possibilities for debugging and introspection.

        Cheers Rolf

        UPDATE: > This idea of “intercepts” would, IMHO, unfortunately just serve to make the code even more inscrutable than it already may be.

        Which code are you talking about? I was talking about a command line switch, not about an extension of the Perl syntax. There is no "use intercept" intended!

    Re: Intercepting compile time blocks like BEGIN {}
    by Anonymous Monk on Aug 09, 2010 at 14:11 UTC

      Might the Safe module help in this case? Maybe you could allow init to run, as long as it just does its init in memory... If it keeps its handles to itself and doesn't touch the disk/network, is that ok?

        Well that's one of my questions. :)

        As I said "I took a look into Safe" but couldn't figure out how to achieve this goal.

        Safe seems mainly to intercept dedicated opcodes; are there special opcodes for CTBs?

        AFAIK opcodes are executed after compilation (to opcodes), so it should already be too late.
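        One way to probe this question is to put a BEGIN block with a masked op into a compartment and see what happens. This is only a sketch, assuming the default opmask of the core Safe module (which excludes system) and a harmless echo as payload. As far as I can tell, the mask is applied while the code is being compiled, so the BEGIN body is rejected before it can run. That observation alone does not make Safe a security boundary.

```perl
use strict;
use warnings;
use Safe;

my $compartment = Safe->new;    # default opmask, which excludes system()

# The BEGIN block is compiled inside the compartment, so its body is
# checked against the opmask before it gets a chance to run.
my $result = $compartment->reval('BEGIN { system("echo owned") } 1');
my $error  = $@;

print defined $result ? "ran: $result\n" : "rejected: $error";
```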

        Cheers Rolf

    Re: Intercepting compile time blocks like BEGIN {}
    by ikegami (Pope) on Aug 09, 2010 at 14:34 UTC

      automated testing if code can be safely syntax checked without executing code

      Actually, disabling BEGIN blocks would greatly reduce the value of a syntax check. For example, it would

      • introduce errors due to missing imports or missing prototypes
      • remove warnings and errors from warnings and strict unless you conditionally allow the use of certain modules
      • remove warnings relating to imports
      • add warnings due to missing globals

      Also, it would prevent syntax checking a module as that requires executing require.

      EPIC uses PPI to parse the script without executing anything. It does a great job of finding errors reliably.

      Anyway, I don't see the problem. If you've installed the module, you've already accepted its evilness. I don't see what good a syntax check of an untrusted module would do. Just like you wouldn't execute it, don't do a syntax check on it.

        > Actually, disabling BEGIN blocks would greatly reduce the value of a syntax check. For example, it would

        > ...

        I'm aware of this, but that's exactly why I was describing a call-back function to control the process.

        For instance the filepath could be taken to make a distinction between trusted and new code.
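        The proposed switch does not exist, but a path-based decision of this kind can be sketched with a different, existing mechanism: a code ref in @INC. This only gates modules loaded through require/use, not BEGIN blocks in the main script. The module name Acme::Untrusted::Demo and the "untrusted namespace" policy are assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical policy: anything under this namespace counts as untrusted.
my $untrusted = qr{^Acme/Untrusted/};
my @decisions;

# A code ref in @INC is called as ($self, $filename) for every module that
# is not loaded yet; returning nothing lets the normal search continue.
unshift @INC, sub {
    my ($self, $file) = @_;
    if ($file =~ $untrusted) {
        push @decisions, "blocked $file";
        die "refusing to load untrusted module $file\n";
    }
    push @decisions, "allowed $file";
    return;
};

# Acme::Untrusted::Demo is a made-up name; the hook fires before any file
# of that name would be searched for, let alone compiled.
my $blocked = !eval { require Acme::Untrusted::Demo; 1 };
my $error   = $@;

print "$_\n" for @decisions;
print "error: $error";
```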

        And rurban's suggestion to wrap the code into a Safe environment could be chosen to allow execution of BEGIN blocks in untrusted code.

        > EPIC uses PPI to parse the script without executing anything.

        Tell me, can PPI find BEGIN blocks like the one in ''=~('(?{B'.'EGIN{print "owned"}})') ?

        AFAIK PPI cannot deal with all kinds of syntax-changing mechanisms. So it wouldn't be of much help when searching for evil code, since attackers could exploit these limitations.

        Cheers Rolf

          For instance the filepath could be taken to make a distinction between trusted and new code.

          How does that help? Three of the four examples I gave still stand, and you still can't syntax check a module.

          And rurban's suggestion to wrap the code into a Safe environment

          Safe is considered not safe.

          Tell me, can PPI find BEGIN blocks like the one in ''=~('(?{B'.'EGIN{print "owned"}})') ?

          It shows as a regex literal, which sounds good to me.

          So it wouldn't be of much help when searching for evil code, since attackers could exploit these limitations.

          Using PPI removes the need to detect such attacks. The only reason you need to detect the attacks is that your method is susceptible to them.

    Re: Intercepting compile time blocks like BEGIN {}
    by ikegami (Pope) on Aug 09, 2010 at 23:53 UTC

      AFAIK CTBs are evaled, so in theory it should be easily possible to intercept the evaling routine to do this.

      eval() isn't used. But something is. Does it really matter what that something is?

      well not quite ... from perlmod

      It refers to:

      $ perl -E'eval "BEGIN { say q{foo} }"'
      foo
      $ perl -E'eval "UNITCHECK { say q{foo} }"'
      foo
      $ perl -E'eval "CHECK { say q{foo} }"'
      $ perl -E'eval "INIT { say q{foo} }"'
      $

      But it's not completely true. The problem is that eval would normally be used after the CHECK and INIT blocks have triggered. If you use eval earlier, all four blocks work.

      $ perl -E'BEGIN { eval "CHECK { say q{foo} }" }'
      foo
      $ perl -E'BEGIN { eval "INIT { say q{foo} }" }'
      foo
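      The one-liners above can be gathered into a single script that reports which phase blocks fire in each situation. This is a sketch only: `run_perl` is a hypothetical helper that starts a fresh perl (taken from `$^X`) for every case, so that each snippet gets its own compile and run phases.

```perl
use strict;
use warnings;

# Run a one-liner in a fresh perl process and return what it printed to STDOUT.
sub run_perl {
    my ($code) = @_;
    open my $ph, '-|', $^X, '-e', $code or die "cannot start perl: $!";
    local $/;
    my $out = <$ph>;
    close $ph;
    return defined $out ? $out : '';
}

my %ran;
for my $phase (qw(BEGIN UNITCHECK CHECK INIT)) {
    my $block = $phase . ' { print q{foo} }';
    # String eval at run time, as in the first set of one-liners.
    $ran{"run-time $phase"} = run_perl("eval q[$block]") eq 'foo' ? 1 : 0;
    # String eval inside BEGIN, i.e. before the CHECK and INIT queues have run.
    $ran{"in-BEGIN $phase"} = run_perl("BEGIN { eval q[$block] }") eq 'foo' ? 1 : 0;
}

printf "%-20s %s\n", $_, $ran{$_} ? 'ran' : 'did not run' for sort keys %ran;
```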
        > eval() isn't used. But something is. Does it really matter what that something is?

        Theoretically no, practically yes: extending an eval mechanism shouldn't be difficult, whereas "something" could mean anything much more complicated.

        Cheers Rolf

          Quite the opposite. eval is an op. It's meant to be called from Perl land. This may subject it to limitations and make it a poor choice. Limiting yourself to a specific implementation (without knowing anything about it) is definitely not better.

    Node Type: perlquestion [id://853778]
    Approved by mr_mischief
    Front-paged by Corion