Analyzing large Perl code base.

by dmitri (Priest)
on Apr 14, 2005 at 22:02 UTC ( [id://447985]=perlquestion )

dmitri has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I have finally (after over a year of bitching at all the bugs) forced my boss to allocate time for me to audit, analyze, and fix all of our company's Perl modules. This is a great project.

However, I think I will need some tools. Before writing my own, I googled and searched but with little success. I wonder if some of you may know of existing libraries or tools to analyze Perl code, graph it, etc, etc.

Of course, I will use Perl::Tidy and I also found this (no code, though). Are there other recommendations?

Thank you,

        - Dmitri.

Replies are listed 'Best First'.
Re: Analyzing large Perl code base.
by BrowserUk (Patriarch) on Apr 14, 2005 at 22:17 UTC

    Devel::Xref is extremely useful for picking apart complex code.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco.
    Rule 1 has a caveat! -- Who broke the cabal?
Re: Analyzing large Perl code base.
by gam3 (Curate) on Apr 14, 2005 at 22:14 UTC
Re: Analyzing large Perl code base.
by adrianh (Chancellor) on Apr 15, 2005 at 14:51 UTC

    Devel::Cover is a great tool for code exploration in combination with tests.

    Write an end-to-end test for a bit of functionality. Run it under Devel::Cover. Look at the coverage report to see which chunk of your big ball of mud was related to the task. Refactor that chunk.

    Repeat for the next bit of functionality.
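
    A minimal sketch of one pass through that loop, assuming Devel::Cover is installed and its `cover` command is on your PATH. The helper name `cover_one_test` and the test path in the usage line are hypothetical.

```shell
# Hypothetical helper: run one test script under Devel::Cover and build the report.
# Usage: cover_one_test path/to/end_to_end.t
cover_one_test() {
    # Discard results left over from any earlier run.
    cover -delete
    # Run the test with coverage collection; data is written to ./cover_db.
    perl -MDevel::Cover "$1" || return 1
    # Summarise cover_db and generate the report.
    cover
}
```

    Called as `cover_one_test t/end_to_end.t`, this leaves a report under `cover_db` showing which files and subroutines the test touched.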

      I don't think that this can be a reliable code-analysis tool in the general case. Too many programs contain complex initialization routines that use a lot of code and make coverage output not very useful.

      At times enabling or disabling something during a test will only differ in a small, barely visible routine call when looking at the coverage logs.

      Using coverage for analysis has its merits, but it requires some a priori acquaintance with the code.

        At times enabling or disabling something during a test will only differ in a small, barely visible routine call when looking at the coverage logs.

        Luckily for us we have these wonderful computer thingies which are terribly good at looking through large chunks of data for differences :-)
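
        One way to let the computer do that looking, as a rough sketch: it assumes you have already reduced each coverage run to a plain text file with one covered subroutine per line. That reduction step, the file names, and the helper name `only_in_second` are all hypothetical.

```shell
# Hypothetical helper: print entries that appear only in the second list.
# Each input file holds one covered subroutine name per line.
# Usage: only_in_second feature_off.txt feature_on.txt
only_in_second() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1
    # comm needs sorted input; -u also drops duplicate entries.
    sort -u "$1" > "$a_sorted"
    sort -u "$2" > "$b_sorted"
    # -13 hides lines unique to the first file and lines common to both,
    # leaving only what the second run covered and the first did not.
    comm -13 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

        Even if the difference is a single small routine buried in a long report, it ends up as the only line of output.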

Re: Analyzing large Perl code base.
by dave0 (Friar) on Apr 15, 2005 at 15:32 UTC
    Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.

    Writing a testsuite, on any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure it's compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.

    B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph-creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.
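
    The per-file step can be scripted. This is only a sketch, not the exact commands used above: the helper name `xref_all`, the directory layout, and the `.xref` naming scheme are assumptions.

```shell
# Hypothetical helper: write one B::Xref report per Perl source file.
# Usage: xref_all SRC_DIR OUT_DIR
xref_all() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    find "$src_dir" -type f \( -name '*.pm' -o -name '*.pl' \) |
    while IFS= read -r file; do
        # Flatten the path into a report name, e.g. lib/Foo/Bar.pm -> lib_Foo_Bar.pm.xref
        report="$out_dir/$(printf '%s' "$file" | tr '/' '_').xref"
        # -MO=Xref compiles the file and prints its cross-reference report to stdout.
        if ! perl -MO=Xref "$file" > "$report" 2>/dev/null; then
            echo "xref failed: $file" >&2
        fi
    done
}
```

    For example, `xref_all lib xref-reports` leaves one report per file, ready for trimming and printing. Modules that load siblings from the same tree may need the library path set first, for instance through `PERL5LIB`.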

    Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.

      2.5 megs of Perl code across 237 different modules. I will need a lot of coffee :-)

      Thanks for your suggestions.

        At least you have modules. You should be able to organize those modules into logical groupings. Once you do that, focus on one grouping at a time, writing lots and lots and LOTS of tests. Test everything, anything ... if it moves, test it. Heck, test it even if it doesn't move. (You want to make sure it doesn't start moving!)

        Note: you will find that many of your tests will be wrong ... and that's good. :-)

        Update: As adrianh says, you shouldn't write whitebox tests - you should be writing tests for how the rest of the code expects your modules to work. In other words, API tests. Remember - you're planning on ripping the guts out ASAP. You just want to make sure that the rest of the code doesn't die while you're working.

Re: Analyzing large Perl code base.
by eyepopslikeamosquito (Archbishop) on Apr 16, 2005 at 05:52 UTC

    The earlier advice from others re writing tests and refactoring is sound. You are, however, most unlikely to be given enough time to do it all, so you must choose wisely which code to clean up first. How to choose? i) write tests for all recent (and new) bugs; ii) focus on modules you consider to be most vital and highest risk; iii) go through the user manual and write a test (and refactor where appropriate) for each example given there (i.e. focus on the client view of the system). Perhaps more important is to ensure that all new code is developed test-first and with a solid test suite.

    I've been (and am still going) through something similar as mentioned in What is the best way to add tests to existing code?. As expected, and despite earlier assurances, I did not get anywhere near the time and resources I would have liked. Bottom line: this sort of code cleanup, while strategically sound in the longer term, does not bring in immediate revenue.

    Update: You might pick up some good ideas from the book Perl Medic by Peter Scott. Ditto from the node starting to write automated tests.

      I picked up that book several weeks ago, and it's great. I also recommend it to everyone doing Perl code maintenance.

      A stab at reviewing the book is taken here.

Re: Analyzing large Perl code base.
by dragonchild (Archbishop) on Apr 15, 2005 at 00:08 UTC

      To be fair, I think you probably will want some decent documentation first and the analysis tools can help with that :-)


        You don't need to document the dreck you have. You need to document what marketing believes the dreck you have does. The last place you want to look for that is the actual source code.

        Now, I only say this because the OP said (and I paraphrase) "I have a bunch of spaghetti that I can't figure out, so how do I clean it up?" The answer is "Write some tests, then refactor, then write some tests, then refactor, then ..."

        Your testsuite then becomes the basis for your documentation. Obviously, you convert all the ok() calls into English or Swahili or whatever, but it's still the foundation.

        If you do decide to go with documentation tools (before, during, or after, and at the least I would pick after), you might wish to start with these:

        Our own castaway pointed me in the direction of podgen, which will jump-start the process of commenting your monolith.

        DoxyFilt* is a filter that allows the well-known Javadoc-like source-to-documentation tool Doxygen to understand Perl. Once you have your source commented, documentation becomes absurdly easy.

        Using Doxygen before and during the analysis process is often helpful for "getting the lay of the land." There is no reason why you need to limit yourself to using doc tools only once. ;-)


        * Update: 2005-12-28 Kudos to both john_oshea and tfrayner for alerting me to the fact that my link above has been rendered useless by the foul creatures known as spammers... I have found what appears to be a good link to obtain DoxyFilt; the most recent version seems to be from August 24, 2005: Doxygen-0.84.tar.gz. Thanks again, guys!

Re: Analyzing large Perl code base.
by spurperl (Priest) on Apr 15, 2005 at 11:32 UTC
    There are some non-free tools that can help you. For instance, ActiveState has a full-fledged Perl IDE that will organize the code into modules/packages graphically, and from which you can do a lot of code-analyzing (what is called by what, etc).

    There are a few other Perl IDEs online.

Node Type: perlquestion [id://447985]
Approved by tlm
Front-paged by rob_au