http://www.perlmonks.org?node_id=1000632

ceo has asked for the wisdom of the Perl Monks concerning the following question:

Hello Perlmonks, I seek your great wisdom! I need to extract the names and the code of subs from Perl scripts. I searched CPAN but found nothing suitable, so I quickly wrote this:
$moduleText = shift;
my %subs = ();
my @splitted = split(/([\{\}])/, $moduleText);
my %subPrototypes = ();
my $brackets = 0;
my $pos = 0;
while ($pos <= $#splitted) {
    if ($splitted[$pos] =~ /sub (.+?)\s{1,}(\(.+?\)){0,1}\s*/) {
        $pos += 1;
        my $this_sub = $1;
        $subPrototypes{$this_sub} = $2 if $2;
        while ($pos <= $#splitted) {
            if ($splitted[$pos]) {
                $brackets++ if ($splitted[$pos] eq "{");
                $brackets-- if ($splitted[$pos] eq "}");
            }
            $subs{$this_sub} .= $splitted[$pos];
            if ($brackets <= 0) {
                last;
            }
            $pos++;
        }
    }
    else {
        $pos++;
    }
}
It works OK in most situations. But there are special cases, such as strings, where opening and closing braces can appear without affecting the logic of the script itself (" and ' could easily be handled, but q, qq and so on are really hard to get right). I know the saying "only perl can parse Perl", but is there really no other way to do this? Maybe someone has an idea? Any help would be greatly appreciated!
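To make the failing case concrete, here is a minimal sketch of the brace-counting idea reduced to its core. The helper name naive_body and both sample subs are made up for this illustration and are not part of my real script; the helper also keeps the sub header in the result, which the script above does not. It only assumes the same split on { and } as above.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): collect text until the
# brace counter returns to zero, counting every { and } it sees,
# including those inside string literals.
sub naive_body {
    my ($text) = @_;
    my @pieces = split /([{}])/, $text;
    my ($depth, $seen_open, $body) = (0, 0, '');
    for my $piece (@pieces) {
        if    ($piece eq '{') { $depth++; $seen_open = 1 }
        elsif ($piece eq '}') { $depth-- }
        $body .= $piece;
        # Stop as soon as the first opened block appears to be closed.
        last if $seen_open && $depth <= 0;
    }
    return $body;
}

# Single-quoted so that $s stays literal text in the sample source.
my $ok     = 'sub ok { return 1; }';
my $broken = 'sub broken { my $s = "}"; return $s; }';

print naive_body($ok), "\n";      # whole sub is captured
print naive_body($broken), "\n";  # cut off at the } inside the string
```

The second call stops at the closing brace inside the double-quoted string, so the captured body is truncated. The same thing happens with braces inside q{...}, qq{...}, regexes, heredocs and comments, which is why a plain counter is not enough.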