
Analyzing Perl Code

by nitin1704 (Sexton)
on Sep 18, 2012 at 14:39 UTC ( #994294=perlmeditation )

I've written a blog post about analyzing a large number of Perl scripts and modules that interact with each other. I'm fairly new to Perl and still learning different ways to tackle the problems I face, so any feedback is welcome! :)

Basically, I wanted to add trace statements at the beginning of a script, and also when its constituent subroutines are called. I tried three approaches for that:

1. Parsing with regular expressions: The first and most crude approach was to modify all these scripts, by parsing them, finding subroutine definitions, and adding a print statement right after the beginning of each. ...

2. PPI Module: PPI::Document -> find('PPI::Statement::Sub') ...
    for my $child ( $sub->children ) {
        $child->start->add_content($caller)
            if ref $child eq "PPI::Structure::Block";
    }
3. Hook::LexWrap Module: Adding pre-wrappers to subroutines:
    use Hook::LexWrap;
    my @all_subs = qw(sub1 sub2);
    for my $sub (@all_subs) {
        wrap $sub, pre => sub { print "Calling '$sub' in file: $0\n"; };
    }
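
Keeping @all_subs up to date by hand gets tedious across many files. A minimal sketch, assuming Hook::LexWrap is installed and that every sub to be traced lives in package main, reads the names from the symbol table instead. The subs sub1 and sub2 and the @trace array are hypothetical stand-ins, not part of the original scripts.

```perl
use strict;
use warnings;
use Hook::LexWrap;

# Hypothetical subs standing in for the ones in a real script.
sub sub1 { return 'one' }
sub sub2 { return sub1() . ' two' }

my @trace;    # assumption: collected here so the trace can be inspected later

# Walk the main:: symbol table and keep the names that have code attached.
# 'wrap' is skipped because Hook::LexWrap exported it into main.
my @all_subs = do {
    no strict 'refs';
    grep { $_ ne 'wrap' && defined &{"main::$_"} } sort keys %main::;
};

for my $sub (@all_subs) {
    # Called in void context, so the wrapper stays installed.
    wrap "main::$sub", pre => sub {
        push @trace, $sub;
        print "Calling '$sub' in file: $0\n";
    };
}

my $result = sub2();
```

Subs imported from other modules also show up in the symbol table, so a real script would probably want a longer skip list than just 'wrap'.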

Re: Analyzing Perl Code
by toolic (Bishop) on Sep 18, 2012 at 14:50 UTC
    Since your blog mentions tracing several times: CPAN has handy modules for this, such as Devel::Trace and Devel::DumpTrace.

    It might be worth copying and pasting key excerpts from your blog here (in readmore tags) in case your link becomes stale.

      Thanks toolic! These modules seem great and much easier to use, though they produce really detailed traces (one line for every statement). I would like just one line for each subroutine entered, because the scripts I'm working with are really long. Nevertheless, I will try these modules and see how they work out. :)
Re: Analyzing Perl Code
by BrowserUk (Pope) on Sep 19, 2012 at 23:02 UTC

    Put the following code in a file called: /yourperl/site/Devel/

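    package Devel::Calls;

    our $lastsub;

    # Under -d, perl calls DB::DB before each statement it executes.
    sub DB::DB {
        # caller(1) is the frame of the subroutine containing that statement:
        # the package, file and line it was called from, and its name.
        my( $p, $f, $l, $sub ) = caller(1);

        # Skip statements that are still in the same subroutine as the previous one.
        return if $lastsub and $sub eq $lastsub;
        printf STDERR "%s(%u): %s()\n", $f//'unknown', $l//'0', $sub//'unknown';
        $lastsub = $sub;
        return;
    }

    1;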

    And then invoke your scripts with perl -d:Calls

    You'll get a trace showing file, line and subroutine name whenever a subroutine is entered, something like:

    C:\test>perl -d:Calls 1 2 3
    unknown(0): unknown()
    main::aaa()
    main::bbb()
    main::ccc()
    main::ddd()
    main::eee()
    main::fff()
    main::ggg()
    main::hhh()
    main::iii()
    main::jjj()
    main::kkk()
    main::lll()
    main::mmm()
    main::nnn()
    main::ooo()
    main::ppp()
    main::qqq()
    main::rrr()
    main::sss()
    main::ttt()
    main::uuu()
    main::vvv()
    main::www()
    main::xxx()
    main::yyy()
    main::zzz()
    234

    That could obviously be greatly extended, but it may be just what you're looking for.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    RIP Neil Armstrong

Re: Analyzing Perl Code
by talexb (Canon) on Sep 19, 2012 at 12:02 UTC

    Here's the blog post as it stands:

    Recently I came across a bunch of server side code, implemented in the form of Perl scripts and modules – about a hundred of them. Many of them are a few thousand lines long, and they interact with each other in complex ways.

    In order to understand how this system works, I had to figure out the program flow. This was especially difficult because it was not easy to reproduce the input, and I didn’t even know where to start. I decided to take the primitive but effective debugging approach of adding print statements to trace the flow of execution. Trace statements, as they are formally called, are easier to add in smaller programs though.

    In order to automate this process of adding trace statements, I tried 3 ways, one after the other:

    1. Parsing with regular expressions: The first and crudest approach was to modify all these scripts by parsing them, finding subroutine definitions, and adding a print statement right after the beginning of each. However, the regular expression I wrote overlooked a lot of cases: it ended up matching the word “sub” in strings and comments, and the result was disastrous. The code base was ruined beyond hope of manual repair, but of course I had backups on my own machine and also in a repository.
    2. PPI: After some research I found a Perl module to parse Perl code. PPI, originally an acronym for Parse::Perl::Isolated, parses Perl code as documents, breaking it down into tokens in a strict hierarchical fashion. Using this, the task of finding subroutine definitions was simplified and made more reliable. PPI::Document -> find('PPI::Statement::Sub') was all that was needed. Then, finding all the ‘children’ of each sub, and looking for PPI::Structure::Block (by checking their refs) got the beginning of each sub.
      for my $child ( $sub->children ) {
          $child->start->add_content($caller)
              if ref $child eq "PPI::Structure::Block";
      }
      $caller is the print statement passed as a string.
    3. Hook::LexWrap: If you only want to add a trace statement (or any piece of code) at the beginning / end of a subroutine, Hook::LexWrap is a much cleaner way to do this. It doesn’t need you to change the original subroutines in any way. Just adding a few lines of code at the start of each file will suffice. In the following code, @all_subs is the array containing the names of all subroutines in the current file. The “wrap $sub, pre =>” line pre-wraps a subroutine, i.e. executes a piece of code just before the subroutine is executed.
      use Hook::LexWrap;
      my @all_subs = qw(sub1 sub2);
      for my $sub (@all_subs) {
          wrap $sub, pre => sub { print "Calling '$sub' in file: $0\n"; };
      }
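
To make the difference between approaches 1 and 2 concrete, here is a minimal sketch, assuming PPI is installed. The regex is only an assumed example of a naive pattern (the post does not show the one actually used), and the sample source, the sub names greet and farewell, and the trace message are all hypothetical.

```perl
use strict;
use warnings;
use PPI;

# Hypothetical input: two real subs, plus the word "sub" in a string and in a comment.
my $source = <<'END_SOURCE';
sub greet { print "hello\n"; }
my $text = "a sub { in a string";
# sub { in a comment
sub farewell { print "bye\n"; }
END_SOURCE

# Approach 1 (assumed pattern): anything that looks like "sub ... {".
my @regex_hits = $source =~ /\bsub\b[^{]*\{/g;

# Approach 2: let PPI parse the source and ask for subroutine statements.
my $doc  = PPI::Document->new( \$source ) or die "PPI could not parse the source";
my $subs = $doc->find('PPI::Statement::Sub') || [];
my @names = map { $_->name } @$subs;

for my $sub (@$subs) {
    my $block = $sub->block or next;    # forward declarations have no block
    my $name  = $sub->name;
    # Append the trace statement to the opening brace of the sub's block.
    $block->start->add_content(qq{ print STDERR "Entering $name\\n";});
}
my $traced = $doc->serialize;

printf "regex matched %d places, PPI found %d subs (%s)\n",
    scalar @regex_hits, scalar @names, "@names";
print $traced;
```

On this sample the pattern also matches the "sub" inside the string and the comment, while PPI reports only greet and farewell. As far as I know, PPI::Statement::Sub covers named subs only, so anonymous subs would need separate handling.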
    There must be better ways to analyse how a huge and complicated set of Perl scripts works, but this is what I have discovered so far.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Unless you have permission from the author or there is a posted license/copyright to allow it, wholesale reposts aren't kosher. :|
