
Re: Analyzing large Perl code base.

by dave0 (Friar)
on Apr 15, 2005 at 15:32 UTC (#448222)

in reply to Analyzing large Perl code base.

Having recently done this on a fairly large codebase that grew organically (no design, no refactoring) over the course of four years, I feel your pain.

Writing a test suite, at any level, is nearly essential for this. If you're rewriting an existing module, you'll need to ensure the rewrite is compatible with the old one, and the only sane way to do that is to test. If the old code is monolithic, it might be difficult to test individual units, but don't let that stop you from testing at a higher level.
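A minimal sketch of such a compatibility test, assuming the old routine and its rewrite can both be called with the same arguments. The names old_format_name and new_format_name are hypothetical stand-ins defined inline so the example runs on its own; in a real codebase they would be loaded from the old and new modules.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-ins for the legacy routine and its rewrite.
# In practice these would come from the old and the new module.
sub old_format_name { my ($first, $last) = @_; return $last . ', ' . $first }
sub new_format_name { my ($first, $last) = @_; return join ', ', $last, $first }

# Inputs covering an ordinary case and a couple of edge cases.
my @cases = (
    [ 'Larry', 'Wall' ],
    [ '',      'Wall' ],
    [ 'Larry', ''     ],
);

# The old code is the reference: the rewrite must return the same thing.
for my $case (@cases) {
    is( new_format_name(@$case), old_format_name(@$case),
        "same output for ('$case->[0]', '$case->[1]')" );
}

done_testing();
```

The point of comparing against the old code rather than against hand-written expected values is that the test records what the legacy code actually does, quirks included.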

B::Xref helped me make sense of the interactions in the old codebase. I didn't bother with any visualization tools or graph creation, though. I just took the output of perl -MO=Xref filename for each file, removed some of the cruft with a text editor, ran it through mpage -4 to print, and spent a day with coffee and pencil, figuring out how things worked.
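If hand-editing the report gets tedious, some of the cruft removal can be scripted. This is not what I did, just a rough sketch: it assumes B::Xref's raw mode (perl -MO=Xref,-r filename) and assumes each raw line has seven whitespace-separated columns (file, sub, line, package, type, name, event), with sub calls marked by the event "subused". Check that against the output of your own perl before relying on it. The sub name call_map and the script name callmap.pl are made up for the example.

```perl
use strict;
use warnings;

# Build a map of "calling sub => { called sub => number of calls }"
# from B::Xref raw output lines.
# ASSUMPTION: seven whitespace-separated columns per line:
#   file  sub  line  package  type  name  event
sub call_map {
    my (@lines) = @_;
    my %calls;
    for my $line (@lines) {
        my @fields = split ' ', $line;
        next unless @fields == 7;            # skip anything unexpected
        my ($file, $sub, $lineno, $pkg, $type, $name, $event) = @fields;
        next unless $event eq 'subused';     # keep only sub calls
        $calls{$sub}{"${pkg}::$name"}++;
    }
    return \%calls;
}

# When given saved raw Xref files on the command line, print a summary.
if (@ARGV) {
    my $calls = call_map(<>);
    for my $caller (sort keys %$calls) {
        print "$caller\n";
        for my $callee (sort keys %{ $calls->{$caller} }) {
            print "    $callee ($calls->{$caller}{$callee})\n";
        }
    }
}
```

Usage would be something like perl -MO=Xref,-r filename > filename.xref for each file, then perl callmap.pl *.xref. The result is still something to print and read with a pencil, only shorter.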

Pretty much the same tactic was used on the actual code. Print it out, annotate it away from the computer, and then sit down with the notes to implement the refactoring. If your codebase is huge (mine was about 4-5k lines in several .pl and .pm files, and was still manageable) you might not want to do this, though.

Re^2: Analyzing large Perl code base.
by dmitri (Priest) on Apr 15, 2005 at 18:10 UTC
    With 2.5 megs of Perl code across 237 different modules, I will need a lot of coffee :-)

    Thanks for your suggestions.

      At least you have modules. You should be able to organize those modules into logical groupings. Once you do that, focus on one grouping at a time, writing lots and lots and LOTS of tests. Test everything, anything ... if it moves, test it. Heck, test it even if it doesn't move. (You want to make sure it doesn't start moving!)

      Note: you will find that many of your tests will be wrong ... and that's good. :-)

      Update: As adrianh says, you shouldn't write whitebox tests - you should be writing tests for how the rest of the code expects your modules to work. In other words, API tests. Remember - you're planning on ripping the guts out ASAP. You just want to make sure that the rest of the code doesn't die while you're working.

        Once you do that, focus on one grouping at a time, writing lots and lots and LOTS of tests. Test everything, anything ... if it moves, test it. Heck, test it even if it doesn't move. (You want to make sure it doesn't start moving!)

        While I don't think this is what you're proposing, I think this could be read as "write tests for everything in the legacy code before you change anything", which, IMHO, is bad practice. As I said here:

        A counterproductive practice that I've seen is to go through a large piece of legacy code and add developer tests for everything. Doing this with legacy code not driven by tests will produce a test suite that is brittle in the face of change. When you get to the refactoring, you'll find yourself continually throwing away a lot of the new tests, so you get no benefit from them.

        In my experience it's much more effective to build the test suite around the changes you make to the code. Add tests when you add new features. Add tests around code that you're refactoring. Add tests to demonstrate bugs. Just following those three rules naturally builds a test suite around the most important code.
