PerlMonks  

Programs/Methods for analyzing an existing Perl based system

by Anonymous Monk
on May 29, 2002 at 23:05 UTC ( #170245=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

What tools exist for analyzing a Perl based system? I've been handed a CD with a Perl/CGI system on it and I'm supposed to report back on its quality/maintainability. I've run it through perltidy so it now conforms to the "one true Perl style" and is legible. It's 37 .pl files, 10K+ lines and a couple of comments.

What other tools exist for analyzing a Perl program? In other languages I have used programs that produce:

  • System summaries (lines of code, file statistics...)
  • Variable cross-reference
  • Tree structure of the system
I've checked google, google groups, cpan, perl.com and here with no luck. Am I just using the wrong search terms?

Re: Programs/Methods for analyzing an existing Perl based system
by Ovid (Cardinal) on May 29, 2002 at 23:26 UTC

    There is no real substitute for getting into the system and understanding how it works. Further, you have to know Perl fairly well, including good coding standards. I was recently helping a gentleman in the Netherlands with some Perl issues, when he emailed me a program and asked for feedback. Here's one of the subroutines:

    sub stats {
    unless ($aantalja) {
    &pak_getallen;
    }
    $totaal_stemmen = ($aantalja + $aantalnee);
    $eenstem = (100 / $totaal_stemmen);
    $procentja = ($aantalja * $eenstem);
    $procentnee = ($aantalnee * $eenstem);
    $procentja = int($procentja);
    $procentnee = int($procentnee);
    if (($procentja + $procentnee) < 100) {
    if ($procentja > $procentnee) {
    $procentja+=1;
    }
    elsif ($procentja < $procentnee) {
    $procentnee+=1;
    }
    }
    }

    Right off the bat, I can point to several problems. First, the subroutine refers to variables declared outside of itself, so it's going to have side effects that will be difficult to maintain. The indentation is poor, so it's tough to determine scope. Further, there's no sanity check to avoid a divide-by-zero error in this line:

    $eenstem = (100 / $totaal_stemmen);

    But does it work? Who knows? I don't speak Dutch. While there is plenty of information available in that little snippet, there is no meaning. Ultimately, this means that whatever metrics you want to produce, there is no substitute for understanding the code. Further, whatever metrics someone wants to put down as a standard, I guarantee that I can write code that will hit whatever target they are looking for, but still be an unmaintainable mess. Trust me, you should see some of my production code :)
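    For the three problems listed above, here is a hedged sketch of what a fix might look like: the counts are passed in as arguments instead of being read from globals, the body is indented, and a zero total is handled before dividing. The name vote_percentages and its return convention are illustrative assumptions, not part of the program that was sent to me; the call to pak_getallen is left out, and the rounding rule is copied from the snippet as posted.

```perl
use strict;
use warnings;

# Hypothetical rewrite of the posted sub: takes the yes/no counts as
# arguments and returns the two percentages instead of setting globals.
sub vote_percentages {
    my ($yes, $no) = @_;
    my $total = $yes + $no;

    # Sanity check the original lacked: with no votes there is nothing
    # to divide by, so report 0% for both sides.
    return (0, 0) if $total == 0;

    my $one_vote    = 100 / $total;
    my $percent_yes = int($yes * $one_vote);
    my $percent_no  = int($no  * $one_vote);

    # Same rounding rule as the original: if truncation lost a point,
    # give it to the larger side (a tie is left alone, as before).
    if ($percent_yes + $percent_no < 100) {
        if    ($percent_yes > $percent_no) { $percent_yes += 1 }
        elsif ($percent_yes < $percent_no) { $percent_no  += 1 }
    }
    return ($percent_yes, $percent_no);
}

my ($y, $n) = vote_percentages(2, 1);
print "yes: $y%, no: $n%\n";
```

    Because nothing outside the sub is touched, it can be exercised in isolation, which is exactly what the original makes hard.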

    Cheers,
    Ovid

    Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

      sub stats {
      unless ($numberyes) {
      &grab_numbers;
      }
      $total_votes = ($numberyes + $numberno);
      $onevote = (100 / $total_votes);
      $percentyes = ($numberyes * $onevote);
      $percentno = ($numberno * $onevote);
      $percentyes = int($percentyes);
      $percentno = int($percentno);
      if (($percentyes + $percentno) < 100) {
      if ($percentyes > $percentno) {
      $percentyes+=1;
      }
      elsif ($percentyes < $percentno) {
      $percentno+=1;
      }
      }
      }


      Greetz
      Beatnik
      ... Quidquid perl dictum sit, altum viditur.
Re: Programs/Methods for analyzing an existing Perl based system
by derby (Abbot) on May 30, 2002 at 02:14 UTC
    Well, there are lots of non-free tools to do what you want as well as loads of "theories" (and some of them are loads). One of the more reasonable approaches is the McCabe Complexity Metric. Basically, by looking at each function (method, procedure, etc.):
    • Start with 1 for the "straight" path through the function
    • Add 1 for each for, if, while, and, or.
    • Add 1 for each case in a case statement
    The number you come up with is the "complexity" of the function. Average the complexity of all the functions and you have the complexity of the codebase. The idea here is the lower the number, the less complex the code. The less complex the code, the better chances for higher quality. There's a lot more there (like if the function has a complexity of 1, does it really need to be its own function). I don't know of a tool to do this for perl but there's one on freshmeat for other languages.
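    The counting rules above can be roughed out in a few lines of Perl. This is only a sketch under stated assumptions: it matches keywords with a regex rather than parsing, so keywords inside strings, regexes, comments or heredocs are miscounted; it adds Perl's elsif, unless, until and foreach plus the && and || operators to the list, which is my reading of the rules rather than part of the metric's definition; and the name rough_complexity is made up for the example.

```perl
use strict;
use warnings;

# Rough McCabe-style score for one chunk of Perl source (for example,
# the body of a single sub). Keyword matching only, not a real parser.
sub rough_complexity {
    my ($source) = @_;
    my $score = 1;    # the "straight" path through the function

    # One point per branching or looping keyword.
    $score++
        while $source =~ /\b(?:if|elsif|unless|while|until|for|foreach|and|or)\b/g;

    # Assumption: the short-circuit operators count like "and" / "or".
    $score++ while $source =~ /&&|\|\|/g;

    return $score;
}

my $body = 'if ($x > $y) { $x++ } elsif ($x < $y && $y > 0) { $y++ }';
print rough_complexity($body), "\n";
```

    Feeding it one sub body at a time and averaging the results gives the codebase figure described above, with all the caveats that come with regex-based counting.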

    While strict adherence to complexity metrics can drive you crazy, they actually fit nicely into the programming mantra - high cohesion, low coupling. There's pretty strong evidence that if a function is doing a lot of conditionals it probably has low cohesion.

    -derby

      There's something fundamentally wrong with measuring complexity based on low level analyses of code and using the outcome to judge the quality of code.

      Most people will agree the grammar of the musings of Shakespeare is much more complex than that of Dr. Seuss books. Does that mean the children's books have a higher quality than the plays?

      There are other problems as well. Such analyses can only focus on a particular implementation. They don't cast any judgement on whether the algorithm is a proper one. They will favour a linear search of a sorted array over a binary search, because the linear search requires fewer conditions to implement.
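      To make the search comparison concrete, here is a minimal sketch of both searches over a sorted array of numbers. The sub names are illustrative. Counting one point per loop and per condition, the linear version has one loop and one test, while the binary version has one loop and two tests, so a condition-counting metric rates the binary search as the more complex of the two even though it does far fewer comparisons on large arrays.

```perl
use strict;
use warnings;

# Linear search: one loop, one condition.
sub linear_search {
    my ($sorted, $target) = @_;
    for my $i (0 .. $#$sorted) {
        return $i if $sorted->[$i] == $target;
    }
    return -1;    # not found
}

# Binary search: one loop, two conditions, so it scores higher on a
# condition-counting metric despite halving the range on every pass.
sub binary_search {
    my ($sorted, $target) = @_;
    my ($low, $high) = (0, $#$sorted);
    while ($low <= $high) {
        my $mid = int(($low + $high) / 2);
        if    ($sorted->[$mid] < $target) { $low  = $mid + 1 }
        elsif ($sorted->[$mid] > $target) { $high = $mid - 1 }
        else                              { return $mid }
    }
    return -1;    # not found
}

my @numbers = (2, 3, 5, 7, 11, 13);
print linear_search(\@numbers, 11), " ", binary_search(\@numbers, 11), "\n";
```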

      It doesn't mean you shouldn't use such a tool. It just means that you have to be very careful with what you do with its results.

      Abigail

        A2,

        There's something fundamentally wrong with measuring complexity based on low level analyses of code and using the outcome to judge the quality of code

        Didn't think I was. Everything else++. Except the part about Shakespeare and Seuss - that's just silly.

        -derby

Re: Programs/Methods for analyzing an existing Perl based system
by graff (Chancellor) on May 30, 2002 at 07:13 UTC
    Your situation made me think back to my early days, facing the same sort of problem with masses of C code; back then (before internet connectivity was common), I actually wrote my own code indexer in C to read a bunch of C source files and print a simple table that lists the functions defined in each file, and the subroutines called from within each function. It really helped.

    So, how hard could it be to do that in Perl? Well, not that hard, if you're willing to be content with less than (but usually close to) 100% accuracy in the "precision and recall" measures of subroutine detection.

    It's a real bare-bones, quick-and-dirty, not-too-subtle attempt, but it's here in case you want to try it out.
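    To give a flavour of the approach (this is a separate minimal sketch, not the script referred to above), the idea is to record every "sub name" line and then look for known sub names inside each body. The sketch assumes perltidy-style layout, where "sub name {" opens a sub and a "}" in column one closes it; the sub name index_subs and the output format are made up for the example. Anything reached through eval, symbolic references, AUTOLOAD or method calls is missed.

```perl
use strict;
use warnings;

# Very rough index of one file's source text: which subs are defined,
# and which of those names appear as calls inside each sub body.
sub index_subs {
    my ($source) = @_;
    my (%body, $current);

    # Pass 1: collect the text of each sub body, line by line.
    for my $line (split /\n/, $source) {
        if    ($line =~ /^\s*sub\s+(\w+)/) { $current = $1; $body{$current} = '' }
        elsif ($line =~ /^\}/)             { undef $current }
        elsif (defined $current)           { $body{$current} .= "$line\n" }
    }

    # Pass 2: a "call" is a known sub name followed by "(" or
    # preceded by "&".
    my %calls;
    for my $name (keys %body) {
        $calls{$name} = [
            grep { $body{$name} =~ /(?:&\Q$_\E\b|\b\Q$_\E\s*\()/ }
            sort keys %body
        ];
    }
    return \%calls;
}

# Usage: perl index_subs.pl file1.pl file2.pl ...
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $text  = do { local $/; <$fh> };
    my $index = index_subs($text);
    print "$file\n";
    print "  $_: @{ $index->{$_} }\n" for sort keys %$index;
}
```

    Calls to subs defined in other files are not picked up here; merging the name lists from all files before the second pass would be the obvious next step.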

Re: Programs/Methods for analyzing an existing Perl based system
by Molt (Chaplain) on May 30, 2002 at 09:35 UTC

    I really don't think you're going to find what you're looking for in Perl. The grammatical nature of Perl makes parsing it exceptionally difficult ("Only perl can parse Perl") and so automated programs of this nature are nightmarish to write and even then won't catch everything. Nice things like eval, symbolic references, XS linkage, the ability to tweak so deeply into the engine, and other such things hammer us.

    Yes, it has also been said that 'C doesn't have a grammar, C coders write their own with #define', but C seems a lot more regular than Perl does, and there are a lot more people willing to pay big money to people who can produce tools like this for it. Java obeys a nice simple grammar, which is why you see so many of these kinds of tools for it.

    All this being said, there does seem to be good progress with the Perl Refactoring Browser, so if someone was determined enough it may be possible to stand on their shoulders and produce code metrics from that. The Browser itself may already do so (I've not looked that deeply into it, to be honest), since this would help it detect the 'code smells' that refactoring is meant to solve.

Re: Programs/Methods for analyzing an existing Perl based system
by rinceWind (Monsignor) on May 30, 2002 at 11:57 UTC
    A few months ago, I did post a code counter which does give a measure of density in terms of tokens per line. Although not designed for analysing perl, it can do so. The limitations with perl relate to token counting and the likes of regexes, heredocs and obscure quoting.

    Hopefully this can be of some use to you as a rough measure of code density and scale of effort required.

Re: Programs/Methods for analyzing an existing Perl based system
by chicks (Scribe) on May 30, 2002 at 12:02 UTC
    Perl really needs more tools to support software engineering. First and foremost among those is allowing metrics to be gathered. (Yes, there are bad metrics and there are metrics that can be easily weaseled, but I'm not telling you which metrics to use!) Obviously parsing the raw perl isn't going to be easy enough in perl5. But once perl has parsed it, couldn't we navigate the op tree? That should make it easy to see how many non-local variables are affected and what-not.
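    As a rough illustration of the op-tree idea, the core B module can walk the compiled ops of a sub, and accesses to package (non-lexical) scalars show up as ops named "gvsv". The sketch below only counts those ops; the helper name global_scalar_accesses and the sample sub are assumptions made up for the example, and arrays, hashes and symbolic references are not covered.

```perl
use strict;
use warnings;
use B qw(svref_2object walkoptree);

our ($yes, $no, $total);    # hypothetical package globals

sub stats {                 # hypothetical sub that touches them
    $total = $yes + $no;
    return $total;
}

my $global_scalar_ops = 0;

# walkoptree() calls the named method on every op it visits, so the
# callback has to be defined in the B::OP class.
sub B::OP::count_global_scalars {
    my ($op) = @_;
    $global_scalar_ops++ if $op->name eq 'gvsv';
}

# Count the package-scalar accesses compiled into one sub.
sub global_scalar_accesses {
    my ($code_ref) = @_;
    $global_scalar_ops = 0;
    walkoptree(svref_2object($code_ref)->ROOT, 'count_global_scalars');
    return $global_scalar_ops;
}

print "stats: ", global_scalar_accesses(\&stats), " global scalar accesses\n";
```

    A sub that uses only "my" variables should report zero, which makes this a cheap way to flag subs that depend on state declared elsewhere.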

    If somebody has enough free time to put something into a project like this, let me know. Particularly if you're familiar with the way the B:: modules work.

Re: Programs/Methods for analyzing an existing Perl based system
by samtregar (Abbot) on May 30, 2002 at 16:57 UTC
    You can use a profiler, like Devel::DProf or my module Devel::Profiler, to extract a call-tree for a given request. If there are a small enough number of request types you might consider producing a call-tree for each request. This might give you an idea of how complex the application is.

    Here's a quick example to get you started. First, run the code under a profiler:

    $ perl -MDevel::Profiler -e 'sub foo { bar(); } sub bar { 1 }; print foo();'
    1

    Then use the appropriate tool to generate a call-tree. In this case, dprofpp:

    $ dprofpp -T
    main::foo
       main::bar

    -sam

Re: Programs/Methods for analyzing an existing Perl based system
by dada (Chaplain) on May 31, 2002 at 10:11 UTC
    What other tools exist for analyzing a Perl program?
    [...]
    Variable cross-reference

    strange that no-one mentioned this:

    perl -MO=Xref yourscript.pl
    Update:
    System summaries (lines of code, file statistics...)

    I haven't tried it, but Perl Metrics seems pretty good at statistics, although unfortunately it doesn't appear to be actively maintained.
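    If that tool is not an option, basic system-summary numbers can be approximated with a short script. The following is a minimal sketch, not a replacement for a real metrics tool: it classifies lines as blank, comment-only or code and counts "sub name" definitions. It is purely line-based, so POD, heredocs and "#" inside strings are not treated specially; the name summarize_source and the report format are made up for the example.

```perl
use strict;
use warnings;

# Classify the lines of one source text and count sub definitions.
sub summarize_source {
    my ($source) = @_;
    my %count = (lines => 0, blank => 0, comment => 0, code => 0, subs => 0);
    for my $line (split /\n/, $source) {
        $count{lines}++;
        if    ($line =~ /^\s*$/) { $count{blank}++ }
        elsif ($line =~ /^\s*#/) { $count{comment}++ }   # includes the #! line
        else                     { $count{code}++ }
        $count{subs}++ if $line =~ /^\s*sub\s+\w+/;
    }
    return \%count;
}

# Usage: perl summary.pl *.pl
for my $file (@ARGV) {
    open my $fh, '<', $file or die "Cannot open $file: $!";
    my $text = do { local $/; <$fh> };
    my $c    = summarize_source($text);
    printf "%-30s %6d lines %6d code %6d comment %6d blank %4d subs\n",
        $file, @{$c}{qw(lines code comment blank subs)};
}
```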

    cheers,
    Aldo

    __END__ $_=q,just perl,,s, , another ,,s,$, hacker,,print;

Node Type: perlquestion [id://170245]
Approved by Ovid
Front-paged by IlyaM