
Large-scale code documentation

by jacques (Priest)
on Nov 04, 2004 at 22:24 UTC

jacques has asked for the wisdom of the Perl Monks concerning the following question:

Recently I have inherited thousands of lines of Perl code, all of which is poorly documented. The code comprises multiple applications which I must learn inside-and-out. I want to start documenting the code and figuring out how the different programs work together.

What I would like to do is create pages containing outlines of the software architecture and explanations of subroutines, variables, etc. It seems like a big job, but I think having those pages would be very useful and help me learn the various applications as well.

I prefer that the doc pages be viewable via a web browser, but this isn't a requirement. Perhaps some of the docs would contain graphics (for diagrams and other things). Of course, my first thought is to investigate the various POD tools. But I have never attempted to create so much documentation, and I am hoping to get some pointers before proceeding.

The code itself needs to be cleaned up and further commented. The code lives on Linux and is spread over different cgi scripts and custom modules. Another concern I have is whether to put all of the documentation on the same Linux machine (our production server). I also have a Windows server which is generally used for backup, but doesn't have the Perl code.

Replies are listed 'Best First'.
Re: Large-scale code documentation
by tachyon (Chancellor) on Nov 04, 2004 at 23:55 UTC

    documentation generator? web-enabled perldoc? covers a fair bit of ground. The perldoc project looks like it died but might help you out.

    Before doing anything, I would start by putting it all under CVS (Short tutorial). With CVS you get a nice cohesive location for backups, as well as all the other benefits. Then I would simply use POD and refactor as I went. Pod::Html (the module behind pod2html) will give you some nice docs, and you can use CVS to roll back any misfactors.
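
    As a rough illustration of the POD-to-HTML step, here is a minimal sketch. The module My::Legacy, its total_price sub, and the file names are hypothetical and exist only for this example; the pod2html options used (--infile, --outfile, --cachedir, --quiet) are the documented ones from Pod::Html.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;
use Pod::Html;

# Hypothetical inherited module: the POD sits right above the sub it documents.
my $source = <<'END_SOURCE';
package My::Legacy;

=head1 NAME

My::Legacy - hypothetical example module

=head2 total_price( $quantity, $unit_price )

Returns the order total for one line item.

=cut

sub total_price {
    my ($quantity, $unit_price) = @_;
    return $quantity * $unit_price;
}

1;
END_SOURCE

# Work in a scratch directory so nothing lands next to the real code.
my $dir     = tempdir(CLEANUP => 1);
my $infile  = File::Spec->catfile($dir, 'Legacy.pm');
my $outfile = File::Spec->catfile($dir, 'Legacy.html');

open my $out, '>', $infile or die "Cannot write $infile: $!";
print {$out} $source;
close $out or die "Cannot close $infile: $!";

# Convert the embedded POD to an HTML page; the Perl code itself is skipped.
pod2html(
    "--infile=$infile",
    "--outfile=$outfile",
    "--cachedir=$dir",
    '--quiet',
);

open my $in, '<', $outfile or die "Cannot read $outfile: $!";
my $html = do { local $/; <$in> };
close $in;

print "Generated ", length($html), " bytes of HTML\n";
```

    Only the POD paragraphs end up in the page, so the docs can live in the same files as the code and be regenerated after every refactoring pass.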

    A significant problem with autodoc generation is the parsing stage. B::Xref does a reasonable job if you were looking to roll your own.

    perl -MO=Xref 2>errs 1>output

    Unpatched, it produces a ream of warnings about uninitialized variables, hence the 2>errs. If you examine the output you will see info like the following. I ran it on UNIVERSAL.pm because it is a nice short 10 lines; the more complex the code, the more output you get.

    $ head
    package UNIVERSAL;

    # UNIVERSAL should not contain any extra subs/methods beyond those
    # that it exists to define. The use of Exporter below is a historical
    # accident that should be fixed sometime.
    require Exporter;
    *import = \&Exporter::import;
    @EXPORT_OK = qw(isa can);

    1;
    $ perl -MO=Xref
    File
      Subroutine (definitions)
        Package UNIVERSAL
          &VERSION          s0
          &can              s0
          &isa              s0
        Package attributes
          &bootstrap        s0
      Subroutine (main)
        Package Exporter
          &import           7
        Package UNIVERSAL
          *import           7
          @EXPORT_OK        8
    syntax OK
    $

    It includes a lot of the info you need to automatically generate docs and will show lexical as well as global vars.
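
    If you do roll your own, the report is easy to take apart. The following is a minimal sketch that turns the sample report above into a per-package symbol index. The indentation of the sample is an assumption (the parser ignores leading whitespace), and the %index layout is just one possible choice, not anything B::Xref provides.

```perl
use strict;
use warnings;

# Sample B::Xref report, copied from the listing above (indentation assumed).
my $report = <<'END_REPORT';
File
  Subroutine (definitions)
    Package UNIVERSAL
      &VERSION          s0
      &can              s0
      &isa              s0
    Package attributes
      &bootstrap        s0
  Subroutine (main)
    Package Exporter
      &import           7
    Package UNIVERSAL
      *import           7
      @EXPORT_OK        8
END_REPORT

# %index maps package => symbol => list of "subroutine: line info" entries.
my (%index, $sub, $package);
for my $line (split /\n/, $report) {
    if    ($line =~ /^\s*Subroutine\s+(\S+)/) { $sub     = $1 }
    elsif ($line =~ /^\s*Package\s+(\S+)/)    { $package = $1 }
    elsif ($line =~ /^\s*([\$\@%&*]\S*)\s+(.+)$/) {
        # A symbol line: sigil plus name, then the line information.
        push @{ $index{$package}{$1} }, "$sub: $2";
    }
}

# Print a simple outline that could seed a documentation page.
for my $pkg (sort keys %index) {
    print "Package $pkg\n";
    for my $symbol (sort keys %{ $index{$pkg} }) {
        print "  $symbol  ", join('; ', @{ $index{$pkg}{$symbol} }), "\n";
    }
}
```

    From an index like this you could generate a stub POD entry for every sub and variable, then fill in the descriptions by hand.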



Re: Large-scale code documentation
by perlcapt (Pilgrim) on Nov 04, 2004 at 23:43 UTC
    What I have done when I inherit someone else's code (and that someone is no longer available) is the following:
    1. I print out all of the code with pr -f -l55 | lpr, three-hole punch the pages, and put them in notebooks. The top of each page will have the filename and date, so you should not get these pages mixed up with others that you might create later.
    2. Get out several highlighters and mark the beginning of every package and sub, using different colors. Then highlight with vertical indented arrows the nested loops and blocks, etc.
    3. Make sure that all of the code is in a source code control system so that if I screw up, I can go back.
    4. Then, I sit back down at the computer editing program (I prefer Emacs), and start writing comment lines.
    5. When I think that I understand the how and why of a function or section of code, I write the POD entry right above the beginning of that block. POD can target specific formatters (with =begin and =for blocks), so you can include HTML-specific image references if you wish. This can all be done in a later pass through the code.
    The IDE interfaces facilitate some of the searching for callers and callees. I am now investigating Eclipse with E.P.I.C. This is free stuff. It takes a little disk space and installation effort, but is an elegant way of tying complex projects together. It also has a CVS client built in.
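
    To illustrate the formatter-specific POD mentioned in step 5, here is a minimal sketch. The sub name dispatch_request and the image file are made up for the example, and Pod::Simple is used here only because it can render from a string; any POD formatter should treat the =begin html block the same way.

```perl
use strict;
use warnings;
use Pod::Simple::HTML;
use Pod::Simple::Text;

# Hypothetical POD entry; the image file name is made up for illustration.
my $pod = <<'END_POD';
=head2 dispatch_request

Routes an incoming CGI request to the matching handler.

=begin html

<img src="dispatch-flow.png" alt="Request dispatch flow">

=end html

END_POD

# Render the same POD with the given formatter class and return the result.
sub render {
    my ($class) = @_;
    my $parser = $class->new;
    my $output = '';
    $parser->output_string(\$output);
    $parser->parse_string_document($pod);
    return $output;
}

my $html = render('Pod::Simple::HTML');
my $text = render('Pod::Simple::Text');

# The html-only block should reach the HTML formatter and nobody else.
print "HTML output keeps the image: ", ($html =~ /<img/ ? "yes" : "no"), "\n";
print "Text output keeps the image: ", ($text =~ /<img/ ? "yes" : "no"), "\n";
```

    The diagram shows up only in the browser version, while perldoc-style text output stays clean.
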
      Thanks. I have already printed out some of the code and highlighted different areas. Like you, I enjoy having the code on paper (although that's a lot of paper).

      Why would having the code in a source code control system be vital? In addition to the voluminous printouts, I also save a copy of any files I am working on and then test my changes. I will also be the only person modifying the code... Are there any benefits of a code control system that I am missing?

        A source code control system allows you to go back to previous versions and do comparisons of versions. This is important because it is not uncommon to break one thing while fixing another. The version comparisons help figure out how that might have happened.

        Secondly, source code control systems are an efficient storage method because they store only the changes to files rather than multiple copies of the files.

        Thirdly, despite the extra pain in the neck that CVS and similar systems are to administer, they allow multiple copies of the code to be checked out (usually by different people). When the people commit their additions and revisions, the system looks to see if there are conflicts in the areas that have been worked on. It demands that the conflicts be resolved before a new version is released.

        Another benefit of all of these systems is that they enforce a certain discipline of documenting what you have worked on. Actually, I have found that documentation more useful for billing or job review than I have for benchmarking the progress of a job, which is its original intent.

        You may think that only you will be working on this code, and that may be very true. But 12 months from now, after working on some other branch of this project or some entirely different project, you will appreciate any structure and documentation that exists. It will help you get back into the depths of the code elements.

Re: Large-scale code documentation
by YuckFoo (Abbot) on Nov 05, 2004 at 04:12 UTC
    You might find perltidy useful to get the code in shape. It also might be helpful for documentation.


Re: Large-scale code documentation
by ggg (Scribe) on Nov 05, 2004 at 15:21 UTC
    I don't know enough to give any examples, but from what I've read in other threads, it might be a good idea to develop a test suite that the present code passes. If your updated code still passes, you can have some assurance that you haven't bollixed things up too badly. :-)
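
    A minimal sketch of such a test, using the core Test::More module. The format_name routine is a hypothetical stand-in for an inherited sub; the idea is to record what the code does today, quirks included, before changing anything.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for an inherited, undocumented routine.
sub format_name {
    my ($first, $last) = @_;
    return uc($last) . ', ' . $first;
}

# Characterization tests: pin down the current behavior, quirks and all.
is( format_name('Larry', 'Wall'), 'WALL, Larry',
    'last name is upper-cased and comes first' );
is( format_name('', 'Wall'), 'WALL, ',
    'empty first name leaves a trailing separator' );

done_testing();
```

    Run it before and after each cleanup; a failure means the refactoring changed behavior that something else may depend on.
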
      You can use Class::Contract and Params::Validate to create a syntax of "how stuff should work," from which you can write a lexer to generate documentation. Hardish work, but very rewarding when it works.

      Tilly is my hero.

Node Type: perlquestion [id://405320]
Approved by jfroebe