Optree for entire module

wanna_code_perl has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

With B::Concise, it's possible to get a syntax tree for specific subroutines, CODE refs, or the top "main" program. However, what I need is the syntax tree (or trees) for an entire module. I.e., I need some kind of dump of Perl's parse of a module, which might include "dead" code that is never called. I would also want to recursively do the same thing for any other modules/sources pulled in with use or require (I'd probably ignore core modules and a few others, but haven't decided yet).

This will be part of a sort of static analysis tool I'm working on. This therefore imposes the constraint that the target module is essentially unknown code, and should be considered 'read-only' in the sense that I can't require the developer to add instrumentation to the code itself.

Also, requiring a recompile of perl is not an option since this code will be widely distributed. So, unfortunately all perl -D flags are a no-go.

One rather ugly hack I explored briefly was to extract all sub names and use/require'd modules from the source and run B::Concise on each result. Beyond the fact that it's a terrible idea, there's no way that I know of to run B::Concise on anon subs without their compiled CODE refs handy, which I wouldn't have and, in general, couldn't obtain. (Whenever sub { ... } shows up in the code, the entire sub just shows up as a single anoncode line.)

Is there a way to do what I'm describing? Efficiency is not high on my list of priorities.


Replies are listed 'Best First'.

Re: Optree for entire module
by Corion (Patriarch) on Aug 26, 2013 at 07:34 UTC

In this area of introspection, Perl is quite good. Each package has a hash that contains all global names ("globs"). If you iterate over the code slots of these globs, you find all code that is connected to a name in that package.

This approach will not find anything that has been declared lexically; you can't get at that in a convenient way. For that, you have to look at PadWalker, or hit the module author until they make globally accessible what is in fact a variable with (module) global scope.

#!perl -wl
use strict;
use File::Basename;

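print "Subroutines in File::Basename";

# Note: this lists every symbol name in the package (subs, variables,
# filehandles), not only the ones that have a code slot.
print $_
    for keys %File::Basename::;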

If you want to make this more parametric, the easiest approach is to switch off strict for the section where you go through the namespace:

#!perl -wl
use strict;
use File::Basename;

sub dump_keys {
    my($package)= @_;
    print "Subroutines in $package";

    no strict 'refs';
    print $_
        for keys %{"$package\::"};
}

dump_keys('File::Basename');
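
A minimal sketch of the "iterate over the code slots" idea, under some assumptions: the helper name named_subs is hypothetical, File::Basename is only a stand-in target, and handing a result to B::Concise (through its compile and walk_output functions) is just one way to connect this back to the original question. Imported subs show up as well, since they live in the same globs, and anonymous or lexically held subs are still missed.

```perl
use strict;
use warnings;
use B::Concise ();
use File::Basename ();

# Collect name => CODE ref for every glob in $package with a defined sub.
# (named_subs is a hypothetical helper name, not part of any module.)
sub named_subs {
    my ($package) = @_;
    my %subs;
    no strict 'refs';
    for my $name (keys %{"${package}::"}) {
        next if $name =~ /::\z/;          # nested package stash, not a sub
        my $full = "${package}::$name";
        next unless defined &{$full};     # skip globs without a defined sub
        $subs{$name} = \&{$full};
    }
    return \%subs;
}

my $subs = named_subs('File::Basename');
print "$_\n" for sort keys %$subs;

# Render the optree of one of the subs into a string instead of STDOUT.
B::Concise::walk_output(\my $optree);
B::Concise::compile($subs->{basename})->();
print $optree;
```

Looping over all of %$subs instead of picking basename would dump every named sub in the package, dead code included.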

I think you can also work your way down the namespace hierarchy by starting at the %:: hash and going to the File:: entry and then to the Basename:: entry, but I find this too much hassle compared to switching off strict for a small block.
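
For what it's worth, the walk down from %:: can be sketched as follows. stash_for is a hypothetical helper name and File::Basename is just the example package; the sketch assumes every "Name::" entry on the way down is an ordinary glob, which is the usual case for loaded packages.

```perl
use strict;
use warnings;
use File::Basename ();

# Resolve a package name to its symbol table hash by descending from
# %main:: one "Name::" entry at a time, so strict can stay on throughout.
# (stash_for is a hypothetical helper name.)
sub stash_for {
    my ($package) = @_;
    my $stash = \%main::;
    for my $part (split /::/, $package) {
        my $glob = $stash->{"${part}::"};
        return unless defined $glob;      # no such (nested) package loaded
        $stash = *{$glob}{HASH};          # the glob's hash slot is the next stash
    }
    return $stash;
}

my $stash = stash_for('File::Basename');
print "$_\n" for sort keys %$stash;
```

Whether this is less hassle than a small no strict 'refs' block is a matter of taste; the result is the same hash either way.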
