Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

'extract' subs from source code

by ceo (Acolyte)
on Oct 24, 2012 at 13:04 UTC ( [id://1000632]=perlquestion: print w/replies, xml ) Need Help??

ceo has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perlmonks, I seek your great wisdom! I need to extract the code and the names of subs from scripts. So I searched CPAN, but when I found nothing, I wrote this quickly:
$moduleText = shift; my %subs = (); my @splitted = split(/([\{\}])/, $moduleText); my %subPrototypes = (); my $brackets = 0; my $pos = 0; while ($pos <= $#splitted) { if($splitted[$pos] =~ /sub (.+?)\s{1,}(\(.+?\)){0,1}\s*/) { $pos += 1; my $this_sub = $1; $subPrototypes{$this_sub} = $2 if $2; while ($pos <= $#splitted) { if($splitted[$pos]) { $brackets++ if($splitted[$pos] eq "{"); $brackets-- if($splitted[$pos] eq "}"); } $subs{$this_sub} .= $splitted[$pos]; if($brackets <= 0) { last; } $pos++; } } else { $pos++; } }
And it works OK in most situations. But there are some special situations like in strings (" and ' could easily been done, but q and qq and ... are really hard to realize) and so on, where you can put opening and closing brackets without effect to the logic of the script itself. So... as I know: Only perl can parse Perl, but ain't there really any other way to do that? Maybe someone has an idea? Any help would be greatly appreciated!

Replies are listed 'Best First'.
Re: 'extract' subs from source code
by toolic (Bishop) on Oct 24, 2012 at 14:29 UTC
Re: 'extract' subs from source code
by 2teez (Vicar) on Oct 24, 2012 at 13:28 UTC

    try this:

    use warnings; use strict; while(<ARGV>){ chomp; print $1,$/ if/^sub\s+?(.+?){/; ## print subroutine name if(/^sub\s+?/.. /^}/){ print $_,$/; } }
    it works for me.
    UPDATE:
    The code above, is based on the assumption that:
    1. the OP's script(s) start every sub on a new line,
    2. the OP's script(s) and question does not cover anonymous subroutine
    However, if these are not so, then all that is needed is to modify the REs used like so:
    ... chomp; print $1, $/ if /^(?:.+)?\bsub\s+?(.+?){/; ## assumption still her +e if ( /^(.+)?\bsub\s+?(.+?)?{/s .. /^(.+)?}/s ) { ## anonymous sub + included print $_, $/; } ...
    Please, it should also be noted that the above may not take care of all subroutrine declaration in perl script.
    rovf thanks for the eagle eye ::)

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
      Hmmmm.... your solution would at best find those where the word sub occurs at the beginning of the line.

      -- 
      Ronald Fischer <ynnor@mm.st>
Re: 'extract' subs from source code
by Anonymous Monk on Oct 24, 2012 at 13:16 UTC
Re: 'extract' subs from source code
by tobyink (Canon) on Oct 24, 2012 at 16:23 UTC

    So let's use perl to parse Perl...

    use v5.10; use strict; use warnings; { package SubLister; use Class::Inspector; sub list_subs { my $number = int rand 999_999_999; our @SUBS = (); eval qq{ package Random$number; }.shift.q{ BEGIN { @SubLister::SUBS = @{Class::Inspector->functions(__PAC +KAGE__)}; die; } }; return @SUBS; } } my $source_code = <<'SRC'; sub foo { 1 } sub bar { 2 } sub baz { 3 } CORE::say("We don't want to execute this!"); SRC say("Got: $_") for SubLister::list_subs($source_code);
    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
Re: 'extract' subs from source code
by NetWallah (Canon) on Oct 24, 2012 at 15:41 UTC
    This is just an idea - not a real solution - but hopefully, more knowledgeable monks can either comment on it's viability, or explain it's unsuitability:

    Since sub names are in the "main" namespace, how about using the perl debugger to load and print the sub info, without actually executing the program ?

                 "By three methods we may learn wisdom: First, by reflection, which is noblest; Second, by imitation, which is easiest; and third by experience, which is the bitterest."           -Confucius

      Since sub names are in the "main" namespace...

      They probably are, but they might not be.

      ... without actually executing the program?

      use or BEGIN or other metaprogramming mechanisms may modify what's available. You'll probably be fine with most scripts, but that's no guarantee.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://1000632]
Front-paged by Corion
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (3)
As of 2024-04-23 23:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found