SmallTalk-like Message browser

by diotalevi (Canon)
on Jul 03, 2007 at 21:12 UTC ( [id://624789]=sourcecode )
Author/Contact Info: Josh.
Description: Examines a code tree and reports on which things are called by which other things.
use strict;
use warnings FATAL => 'all';
use constant EMPTY_ARRAY => [];
use File::Find 'find';
use PPI         ();
use PPI::Dumper ();
use YAML 'Dump';
use Cwd 'abs_path';

my @to_search = map { abs_path($_) } @ARGV ? @ARGV : '.';

# Find all "classes" and the messages they directly accept.
my %subs;
find(
    sub {
        # Only plain files whose names end in ".pm" are parsed.
        return unless -f and /\.pm\z/;
        my $src = read_file($_);

        $subs{$File::Find::name} = extract_subs($src);
    },
    @to_search
);

# Remove redundant information from the keys.
my $common_prefix = find_common_prefix( [ keys %subs ] );
if ( $common_prefix ) {
    for my $key ( keys %subs ) {
        my $new_key = $key;
        $new_key =~ s/\A\Q$common_prefix//
            or next;
        $subs{$new_key} = delete $subs{$key};
    }
}

# Now figure out what each thing is potentially used by.
my %usage;
for my $file ( keys %subs ) {
    my $subs_href = $subs{$file};

    for my $sub ( keys %$subs_href ) {
        my $messages_aref = $subs_href->{$sub};

        # The address of the word list identifies this sub, so that
        # find_others() does not match it against itself.
        my $messages_id = 0 + $messages_aref;
        $usage{$file}{$sub} = find_others( $sub, $messages_id );
    }
}

# Report it.
print Dump( \%usage );

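sub find_others {
    my ( $name, $id ) = @_;

    my %o;
    for my $file ( keys %subs ) {
      SUB:
        for my $sub ( keys %{ $subs{$file} } ) {

            # Skip the sub being looked up; $id is the address of its word list.
            next SUB if $id == $subs{$file}{$sub};

          WORD:
            for my $word ( @{ $subs{$file}{$sub} } ) {
                next WORD unless $name eq $word;

                # Count one potential call from this file and sub.
                ++$o{$file}{$sub};
            }
        }
    }

    return \%o;
}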

sub extract_subs {

    # Accepts perl source and returns a hash reference of subroutines
    # and the messages they might be sending.

    my $doc = PPI::Document->new( \shift @_, readonly => 1 );

    my @uses =
      map {
        my $name = $_->schild(0)->snext_sibling->content;

        # Treat every bareword in the sub as a potential message,
        # keeping only the last component of a qualified name.
        my @words =
          map { ( $_->content =~ /(\w+)/g )[-1] }
          @{ $_->find('PPI::Token::Word') || EMPTY_ARRAY };
        if ( $name =~ /^\w+$/ ) {
            @words = grep { $_ ne $name } @words;
            [ $name => \@words ];
        }
        else {
            [ '???' => \@words ];
        }
      } @{ $doc->find('PPI::Statement::Sub') || EMPTY_ARRAY };

    # It is "possible" that a subroutine might be mentioned more than
    # once so I merge them here. Maybe that is only the ??? sub.
    my %x;
    for (@uses) {
        push @{ $x{ $_->[0] } }, @{ $_->[1] };
    }

    return \%x;
}

sub read_file {

    # Slurps a file.

    my $file = shift @_;
    open my $fh, '<', $file or die "Can't open file $file: $!";
    local $/ = undef;
    return <$fh>;
}

sub find_common_prefix {
    my $everything = join '', map { "$_\n" } sort @{ shift @_ };

    my $parts = 1;
    my %prefixes;
    my @lines;
    my $continue = 1;
    while ( $continue ) {
        $continue = 0;

        my $re = qr{^(/(?:[^/\n]+/){$parts,$parts})}m;
        pos( $everything ) = 0;
        while ( $everything =~ /$re/g ) {
            my $pos = pos $everything;
            $continue = 1;
            ++ $lines[$parts]{$1};
            pos( $everything ) = $pos;
        }

        # Weight each prefix by how many paths share it and how deep it is.
        $prefixes{$_} = $lines[$parts]{$_} * $parts
            for keys %{ $lines[$parts] };
        ++ $parts;
    }

    my ($max) = sort { $prefixes{$b} <=> $prefixes{$a} } keys %prefixes;
    return $max;
}
Re: SmallTalk-like Message browser
by Ovid (Cardinal) on Jul 05, 2007 at 11:05 UTC

    This looks really interesting, but I confess that some of the output I'm receiving doesn't seem to make sense. I see modules which appear to be completely unrelated, even when I dig through the code, but the output suggests otherwise. Could you possibly explain how to interpret the output?


    New address of my CGI Course.

      The output roughly means the following:
      --- <filename>: <subname>: called from: <filename>:<subname>: <this many times>
      The reason you're seeing odd results is a limitation of static analysis. If you have two packages with identically named methods, then their usage gets lumped together. That is, One->new and Two->new are grouped, and count as two calls to new.

      I don't think you can get meaningful results unless you have an interactive environment -- like Squeak -- that can track the class of the invocant. Still, this is a nice approximation. As long as you keep the limitations in mind, it can be quite useful.
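The lumping described above can be reproduced without PPI. The following is a minimal sketch, not the script itself: the file names, sub names, and the callers_of helper are hypothetical, and the hand-written %subs only imitates the shape of the structure the script builds (file, then sub name, then the words mentioned in that sub). Because the lookup matches on the method name alone, One.pm's new and Two.pm's new end up with the same list of callers.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the script's %subs:
# file => { sub name => [ words mentioned in that sub's body ] }.
my %subs = (
    'One.pm'  => { new  => [], run => [ 'new', 'log' ] },
    'Two.pm'  => { new  => [], log => [] },
    'Main.pm' => { main => [ 'new', 'new', 'run' ] },
);

# Name-only lookup: count every sub whose body mentions $name.
# The class of the invocant is never consulted.
sub callers_of {
    my ( $name, $self_aref ) = @_;

    my %callers;
    for my $file ( sort keys %subs ) {
        for my $sub ( sort keys %{ $subs{$file} } ) {
            my $words = $subs{$file}{$sub};
            next if $words == $self_aref;    # skip the sub being looked up
            ++$callers{$file}{$sub} for grep { $_ eq $name } @$words;
        }
    }
    return \%callers;
}

my $one_new = callers_of( 'new', $subs{'One.pm'}{new} );
my $two_new = callers_of( 'new', $subs{'Two.pm'}{new} );

# Print the callers recorded for each of the two "new" subs.
for my $pair ( [ 'One.pm', $one_new ], [ 'Two.pm', $two_new ] ) {
    my ( $target, $callers ) = @$pair;
    for my $file ( sort keys %$callers ) {
        for my $sub ( sort keys %{ $callers->{$file} } ) {
            printf "%s:new called from %s:%s: %d\n",
                $target, $file, $sub, $callers->{$file}{$sub};
        }
    }
}
```

Both targets report Main.pm:main twice and One.pm:run once, even though nothing in the data says which class each call was sent to. That is the sense in which One->new and Two->new are grouped.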

        I haven't seen how I can get Squeak's browser to decide which senders of "new" are the relevant senders. I've been of the opinion that it also punts on the subject of allomorphism. I figure that if ST punts, I'm ok to do so also.

        ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

