checksum of subroutine

by mnooning (Sexton)
on Aug 10, 2012 at 18:26 UTC
mnooning has asked for the wisdom of the Perl Monks concerning the following question:

I need a way to get at the code of a subroutine, not to execute it but to independently generate a checksum of the subroutine's code. I thought it might be easy using a sub's code ref, but a coderef is only good for executing code, not for seeing the code itself. The end goal is to check each sub to tell whether it has been tampered with by a hacker, independently of an overall file checksum.

Any ideas?

Thanks

Re: checksum of subroutine
by jeffa (Chancellor) on Aug 10, 2012 at 18:34 UTC

    Why would a finer grained inspection of the subroutines be any more suspect than any other part of the code? I would think, once the current state of the file has been blessed, that an overall checksum of the file would be more than sufficient to show that ANY changes have been made when NONE were expected. Once you have a corrupted file identified then you can use something like diff to see what changed.

    Otherwise, there happens to be this dynamic language called Perl that is very good at parsing text. ;) Anything from a simple regex to Parse::RecDescent can be used to extract the bits of text that make up a Perl subroutine.

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    

      A hacker can replace needed bits in a file, then add other bits so that the overall file checksum stays valid. Doing that gets far harder if the hacker has to match both the per-subroutine checksums and the file checksum.

      As for RecDescent, if I could get at the text of a sub I could parse the sub and checksum it myself. The trick is to get at the text of the subroutine. That is where the question lies. Parse::RecDescent needs something like "$text", where $text is the text of the subroutine. You cannot hand it just a coderef. :-)

        Rather than literal checksums (e.g., sum of all bytes) or even CRCs, maybe investigate some modern 'digital signature' technology. Perhaps start with the Cryptographic hash function discussion.

        Use Digest::SHA to calculate a digest over the whole file. Although it is not impossible to make two totally different files with the same digest, it is extremely unlikely that both files will have the same length and both will be working programs. It is not as simple as changing a few instructions and adding a few meaningless bytes at the end to "make up" the checksum.
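
        A minimal sketch of the whole-file digest suggested above, using the core Digest::SHA module. The helper name file_digest and the choice of SHA-256 are assumptions for illustration, not something the thread specifies.

```perl
use strict;
use warnings;
use Digest::SHA;

# Hypothetical helper: hex SHA-256 digest of a file's raw bytes.
sub file_digest {
    my ($path) = @_;
    my $sha = Digest::SHA->new(256);
    $sha->addfile( $path, 'b' );    # 'b' reads the file in binary mode
    return $sha->hexdigest;
}
```

        The result would be compared against a digest recorded when the file was known to be good, for example file_digest('MyModule.pm') eq $recorded_digest, where both names are placeholders.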

        CountZero

        "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

        My blog: Imperial Deltronics

        "A hacker can replace needed bits in a file, then add other bits so that the overall file checksum stays valid."

        True, but it's very, very hard. And would be made exponentially harder -- effectively impossible -- by taking two checksums of the file using different algorithms. For example taking both a SHA-512 and Whirlpool hash of the file, then concatenating them.
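
        The two-algorithm idea can be sketched roughly as follows. To keep the example runnable with core modules only, MD5 stands in for the second algorithm; it is not the Whirlpool hash named above, which would need a CPAN module such as Digest::Whirlpool. The helper name combined_digest is hypothetical.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha512_hex);
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: concatenate hex digests from two different algorithms.
# MD5 is only a core-module stand-in for the second algorithm (assumption);
# a stronger second hash would be the sensible choice in real use.
sub combined_digest {
    my ($bytes) = @_;
    return sha512_hex($bytes) . md5_hex($bytes);
}
```

        An attacker would then have to find a modified file that collides under both algorithms at once, which is the point of the suggestion.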

        perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'

      On second thought, your suggestion contains the answer. Simply use RecDescent to do the parsing just prior to shipping the software, etc.
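
      A rough sketch of that pre-shipping step, under stated assumptions: instead of Parse::RecDescent it uses a naive brace counter (a swapped-in technique, chosen only to keep the example short), and the helper name sub_manifest is hypothetical. It assumes named subs start at the beginning of a line and that no unbalanced brace appears inside a string, regex, or comment.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: map each named sub found in $source to a SHA-256
# digest of its text. Naive: it only counts braces, so a lone brace inside
# a string, regex, or comment will throw it off.
sub sub_manifest {
    my ($source) = @_;
    my %manifest;
    while ( $source =~ /^[ \t]*(sub\s+(\w+)\s*\{)/mg ) {
        my $name  = $2;
        my $start = $-[1];           # offset of the "sub" keyword
        my $pos   = pos($source);    # just past the opening brace
        my $depth = 1;
        while ( $depth > 0 && $pos < length $source ) {
            my $ch = substr( $source, $pos++, 1 );
            $depth++ if $ch eq '{';
            $depth-- if $ch eq '}';
        }
        next if $depth;              # unbalanced: skip rather than guess
        $manifest{$name} = sha256_hex( substr( $source, $start, $pos - $start ) );
        pos($source) = $pos;         # resume scanning after this sub
    }
    return \%manifest;
}
```

      The resulting name-to-digest table could be stored alongside the shipped code and regenerated later for comparison. Source containing non-ASCII characters would need to be encoded to bytes before hashing.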

      Thanks!

Re: checksum of subroutine
by chromatic (Archbishop) on Aug 10, 2012 at 20:18 UTC
    ... but a coderef is only good for executing code, not for seeing the code itself.

    It's enough, with the core module B::Deparse:

    use B::Deparse;
    my $deparse = B::Deparse->new( '-p', '-sC' );
    my $source  = $deparse->coderef2text( \&some_func );
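
    Combining that with a digest gives a per-subroutine checksum. The following is a minimal sketch, assuming SHA-256 via the core Digest::SHA module; sub_digest, add_one, and add_two are hypothetical names used only for illustration.

```perl
use strict;
use warnings;
use B::Deparse;
use Digest::SHA qw(sha256_hex);

# Hypothetical helper: digest of the deparsed text of a code ref.
sub sub_digest {
    my ($code) = @_;
    my $deparse = B::Deparse->new( '-p', '-sC' );
    return sha256_hex( $deparse->coderef2text($code) );
}

# Hypothetical subs, present only to have something to checksum.
sub add_one { return $_[0] + 1 }
sub add_two { return $_[0] + 2 }

my %digest = (
    add_one => sub_digest( \&add_one ),
    add_two => sub_digest( \&add_two ),
);
print "$_ $digest{$_}\n" for sort keys %digest;
```

    Note that B::Deparse reconstructs source from the compiled op tree, so the text is not the original source and can differ between Perl versions. Reference digests would presumably need to be generated with the same perl that later verifies them.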

      This looks like it will serve the case where the software modules are wrapped up in a single Perl PAR executable, wherein I cannot get at the individual files.

      This begs the question "Why would this situation ever arise?" I can only tell you there are reasons.

      I love CPAN. Thanks!

Node Type: perlquestion [id://986798]
Approved by davies