PerlMonks  

Program structure: subs vs modules vs Selfloader

by bradcathey (Prior)
on Jun 20, 2004 at 13:46 UTC ( [id://368279]=perlquestion )

bradcathey has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,

My program calls a large (200 lines) internal subroutine about 1/3 of the time the script is run. The subroutine requires 4 arguments to run and returns an AoH ref for output to HTML::Template. To complicate matters, it calls other subroutines (5 and 10 liners that are only needed by that large sub) to perform basic formatting tasks, etc. So, things currently look like:

    #!/usr/bin/perl
    use strict;
    ...etc...
    ...initiate variables...
    ...HTML::Template setup...

    my $AoH = &largesub($a, $b, $c, $d) if $run_large_sub;

    $template -> param( todisplay => $AoH );
    print $template->output();
    exit();

    sub largesub {
        my ($e, $f, $g, $h) = @_;
        my $foo = &smallsub($e);
        ...process...
        return \@AoH;
    }

    sub smallsub {
        ...process...
    }

Questions:

1) Do I leave it as is? (seems messy)
2) Do I spin off the large sub and its supporting subs as a module? (modules always get loaded)
3) Do I use SelfLoader to avoid loading all those subs? (subs stay in main script, but don't always get loaded)
4) What is the cleanest, most efficient, most readable solution?

Thanks!


—Brad
"Don't ever take a fence down until you know the reason it was put up. " G. K. Chesterton

Replies are listed 'Best First'.
Re: Program structure: subs vs modules vs Selfloader
by matija (Priest) on Jun 20, 2004 at 15:05 UTC
    You are asking the wrong questions:

    If the 200 line routine seems messy now, do you think it will be any better in a separate module? Perhaps you could refactor it into several routines, and make it more readable in the process?

    How big is the rest of the program? If it's 200 lines of the subroutine and five lines of the rest, does it really make sense to create a module just for the routine (it might, if you think other people (or you at another time) might have a use for that subroutine). If it's a 200 line subroutine in a 1000 line program, it will probably make sense to break it into modules - and probably more than one.

    Unless the module was very general, breaking the program into a tiny script and a large module does not improve readability.

    SelfLoader, of course, does NOTHING for readability - it might improve loading times, sometimes. However, you need to know what you're doing, and I suggest a thorough course of benchmarking to make sure the added complication of SelfLoader really is offset by the reduced code size on initial load. It is in no way a foregone conclusion.
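
    For readers who have not seen it, the layout SelfLoader expects can be sketched as below. This is a minimal sketch, not the original poster's code: the sub bodies are invented stand-ins, and everything after the __DATA__ token is only compiled the first time one of those subs is called.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use SelfLoader;    # installs an AUTOLOAD that reads sub definitions from DATA

# Code above __DATA__ is compiled on every run.
my $result = largesub(2, 3);    # first call triggers loading from DATA
print "largesub returned $result\n";

# Everything below is plain text until SelfLoader is asked for a sub.
# The bodies are hypothetical stand-ins for the real 200-line routine.
__DATA__
sub largesub {
    my ($x, $y) = @_;
    return smallsub($x) + $y;
}

sub smallsub {
    my ($x) = @_;
    return $x * 10;
}
```

    Whether this actually saves time for a given script is exactly what the benchmarking advice above is about.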

      Good points matija, thanks. For the record, my main program is only about 400 lines and, per your point, it will not be called by other scripts. It was the thought of having a "large" sub load every time that got me thinking. But maybe it is much ado about nothing, since neither is really huge. I guess I was asking as a matter of principle, especially if I ever do have a large script with subs.

      —Brad
      "Don't ever take a fence down until you know the reason it was put up. " G. K. Chesterton
Re: Program structure: subs vs modules vs Selfloader
by Arunbear (Prior) on Jun 20, 2004 at 14:18 UTC
    Here's one solution:
    # Give largesub and its entourage their own module
    package ModuleWithLargeSub;

    sub smallsub {
        ...process...
    }

    sub largesub {
        my ($e, $f, $g, $h) = @_;
        my $foo = &smallsub($e);
        ...process...
        return \@AoH;
    }

    1;

    # Then in the cgi script:
    #!/usr/bin/perl
    use strict;
    my $run_large_sub;
    ...etc...
    ...initiate variables...
    ...HTML::Template setup...

    my $AoH;
    if($run_large_sub) {
        require ModuleWithLargeSub;
        $AoH = ModuleWithLargeSub::largesub($a, $b, $c, $d);
    }

    $template -> param( todisplay => $AoH );
    print $template->output();
    exit();

    Also, seriously consider breaking the large sub into smaller ones!
    :-)
      Arunbear, thanks for the example. However, if I'm permitted, I have a follow-up question: does coding it the way you do, by "requiring" the module only if the condition is met, only load the module when it runs?

      Normally I reference the module like this:
      use SomeModule;
      at the start of my program, right after the shebang. Thanks.

      Update: Well, I Googled and found this informative piece on modules, libraries, etc. I see now that Perl doesn't deal with a require until it comes across it in the code.

      —Brad
      "Don't ever take a fence down until you know the reason it was put up. " G. K. Chesterton

        If you use a module, Perl deals with it at compile time. If you require a module, Perl deals with it at run time.
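
        The timing difference can be observed through %INC, which records every file Perl has loaded. The sketch below is only an illustration: File::Basename and Text::Wrap are arbitrary core modules chosen for the demonstration, and the variable names are made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

our %seen_at_compile_time;

use File::Basename ();    # loaded while the script is being compiled

# BEGIN blocks also run during compilation, so this records what has
# been loaded before any ordinary statement executes.
BEGIN {
    for my $file ('File/Basename.pm', 'Text/Wrap.pm') {
        $seen_at_compile_time{$file} = exists $INC{$file} ? 1 : 0;
    }
}

require Text::Wrap;       # loaded only when execution reaches this line

my $wrap_loaded_now = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

printf "File::Basename loaded at compile time: %d\n",
    $seen_at_compile_time{'File/Basename.pm'};
printf "Text::Wrap loaded at compile time:     %d\n",
    $seen_at_compile_time{'Text/Wrap.pm'};
printf "Text::Wrap loaded after require:       %d\n", $wrap_loaded_now;
```

        The module pulled in with use is already present during compilation, while the required one only appears once its statement has run.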

        I typically like to use a module because it allows me to check everything via a perl -c someprog.pl after I've finished coding. If I used require instead, I wouldn't find out that something was wrong until I ran the program for the first time. YMMV, just my own personal preference.

        require is handy though if your program needs to do something based on whether you have a module installed on your system. You can test for this at runtime.

        #!/usr/bin/perl -w
        use strict;

        # do we have FOO::Bar on this system?
        eval { require FOO::Bar };
        if ($@) {
            # FOO::Bar not installed
        }
        else {
            # FOO::Bar installed
        }
        -- vek --
Re: Program structure: subs vs modules vs Selfloader
by Fletch (Bishop) on Jun 20, 2004 at 18:04 UTC

    The difference in compilation overhead between 200 lines and 400 lines is more than likely so negligible that you're probably barking up the wrong tree. If this is in a web context you'd probably get a much better boost by using something like mod_perl or FastCGI.
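
    One rough way to check this on a given machine is to time the compilation of a sub of comparable length. The following is a sketch under stated assumptions: the generated sub is a synthetic stand-in of roughly 200 trivial lines, not the real routine, so the number it prints is only an order-of-magnitude indication.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Build the source text of a hypothetical sub with about 200 lines.
my $body   = join '', map { "    \$total += $_;\n" } 1 .. 200;
my $source = "sub generated_largesub {\n"
           . "    my \$total = 0;\n"
           . $body
           . "    return \$total;\n"
           . "}\n"
           . "1;\n";

# Time how long Perl takes to compile that text.
my $start = time;
eval $source or die "compile failed: $@";
my $elapsed = time - $start;

my $line_count = ($source =~ tr/\n//);
printf "compiled %d lines in %.6f seconds\n", $line_count, $elapsed;
print "generated_largesub() returned ", generated_largesub(), "\n";
```

    If the figure is tiny compared with the cost of starting the interpreter and loading HTML::Template, that supports the point that persistent environments are the better place to look for speed.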

    That aside, yes you probably should see if you can't break things out into smaller subs (at the least). It should be more maintainable in the long run.

Re: Program structure: subs vs modules vs Selfloader
by CountZero (Bishop) on Jun 20, 2004 at 19:03 UTC
    From a more general point of view, when having to decide whether one should "spin off" a subroutine in a module, I always ask myself "What are the chances that this subroutine or subroutines need(s) to be called from another script?"

    If the subroutine is only useful in this script, I leave it in. With modern IDEs you can usually "fold" these subs out of sight, so they don't get in the way.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

      If the subroutine is only useful in this script, I leave it in.

      For any sizeable project, I have a tendency to go the other way and put virtually all of the functionality of the program into Modules.

      The front end program(s) just parse command line options and call the right modules. They also hold the POD documentation for the end user. The Modules have the POD documentation for their API.

      Then it is really easy to have separate front end test scripts that set up certain conditions and call the modules the same way the real front end program does.

      I figure even if the module never gets re-used, it is still nice to have a clean well-documented API to it that can be easily exercised by test scripts.
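
      As a sketch of what such a test script might look like: the package Report::Format and its format_row function are hypothetical, and the package is defined inline here only so the example is self-contained. In a real project it would live in its own .pm file and be loaded with use.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical module; normally this would be Report/Format.pm.
package Report::Format;

# Turn one record (hash ref) into a row hash for the template.
sub format_row {
    my ($record) = @_;
    return {
        name  => ucfirst(lc($record->{name})),
        total => sprintf('%.2f', $record->{total}),
    };
}

package main;
use Test::More tests => 2;

# Call the module the same way the real front end would.
my $row = Report::Format::format_row({ name => 'bRAD', total => 3 });
is($row->{name},  'Brad', 'name is normalised');
is($row->{total}, '3.00', 'total has two decimals');
```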

        Well, that is the power of Perl: TIMTOWTDI!

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Program structure: subs vs modules vs Selfloader
by ihb (Deacon) on Jun 20, 2004 at 18:03 UTC

    If you can determine at compile-time whether you want to use the sub or not you can use the if.pm module. Most likely you can't do that in this case, looking at your comments at the top of the script, but perhaps another time.
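
    A minimal sketch of the if.pm form, assuming the decision really is known at compile time. The two constants are hypothetical stand-ins for whatever compile-time condition applies, and File::Basename and Text::Wrap are arbitrary core modules used for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical compile-time switches.
use constant NEED_BASENAME => 1;
use constant NEED_WRAP     => 0;

# "use if CONDITION, MODULE" loads MODULE only when CONDITION is true.
use if NEED_BASENAME, 'File::Basename';
use if NEED_WRAP,     'Text::Wrap';

for my $file ('File/Basename.pm', 'Text/Wrap.pm') {
    printf "%-18s %s\n", $file, exists $INC{$file} ? 'loaded' : 'not loaded';
}
```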

    ihb

Re: Program structure: subs vs modules vs Selfloader
by adamk (Chaplain) on Jun 22, 2004 at 07:22 UTC
    200 lines is probably way below the threshold for loading things in at runtime.

    On the other hand, I often find myself working in the world of 10 or 20 ( or 50 ) "couple of hundred line" classes. And then loading all of them all of the time becomes more painful, and you use a smaller percentage of them for each call.

    So, to allow for run-time loading when I care, vs load-it-all for development, on a large number of classes, and only changing one line of code, I'd do something like the following (disclosure: Yes, I am the author)

    # In module
    package Foo;

    sub bar {
        print "Hello World!\n";
    }

    1;

    # In code
    use Class::Autouse 'Foo';
    Foo->bar;

    Done!
    Class::Autouse just intercepts the method call, loads the normal looking class in on-the-fly and then executes the method as normal. When developing/debugging, you just
    # The following two lines produce an identical result
    use Class::Autouse ':devel', 'Foo';
    # Is the same as
    use Foo ();

    And it loads in 'Foo' at compile time, just like a normal class. Of course, to get the lovely transparency, there are the following conditions.

    1. use Foo (); # only loaded, no ->import methods get called.
    2. You only get class-level granularity
    3. Class::Autouse has an overhead of about 200k itself

    Of course, when you get to 50 classes, it's a great trade off to be able to do
    use Class::Autouse;
    Class::Autouse->load_recursive('Foo');

    And have all 50 children of Foo:: autoload transparently the first time a method gets called.

    Anyways, that's my two cents.
Re: Program structure: subs vs modules vs Selfloader
by Anonymous Monk on Jun 21, 2004 at 13:10 UTC

    Looks to me like there isn't much reason to have the subs, large or small, in the first place.

    IMO there are generally two reasons to have a subroutine.

    1. The code will be executed multiple times in the script.
    2. You want to modularize the code for re-use

    Adding subroutines where there's no clear reason for one just adds confusion to a script.

    That being said, if you do break things up into subroutines, it's generally good form to ensure that each subroutine does one thing and one thing only. Short and to the point.

    A run-on subroutine suffers from the same disease as a run-on sentence.

    I recommend reading Code Complete by Steve McConnell (M$ press). It's a very good book on coding style and best practices.

    Cheers

      There are many more reasons to separate code into subroutines (and modules) than what you have listed. In fact, it sounds like both of your points are actually the same thing -- the chunk of code in question is going to be called multiple times or somehow used in multiple ways (by "re-use" I'm guessing you mean in other projects... so perhaps describing that as being used in multiple ways is a stretch... but bear with me here). I believe that this might be the reason subs were invented in the first place, but very soon afterward the ancients became aware of yet another purpose for subs. Thus were discovered the offering of Organization and Structure to the Gods.

      A good program reads like a good book. A good book has a nicely structured table of contents, footnotes and references to other works, and references to other sections in the book itself (and a good index!). There is a certain balance to be struck when dealing with references, however. Too many and it will be a dry read with a footnote after every third word. Too few and some connections will be lost upon the reader.

      Sometimes, however, it is not appropriate to refer to other pieces of the book. It may not be appropriate to call some organizational subroutines in your code more than once. The table of contents is still very important, however, and breaking your big sub into many smaller subs (or even modules) will not only appease the gods, but it will also make your program a better read. Your big sub should become the table of contents, and reading the little subs you create from it is the text itself.

      I would say that a run-on subroutine suffers from the same disease as a book with no table-of-contents... something which no number of comments in the code can fix (unless of course the comments themselves break things into sections and lay them out like a table of contents... but as an ancient once said, "Do not say something in a comment which you can say clearly in code.")
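
      To make the table-of-contents idea concrete, here is a small sketch. All sub names, arguments and data are hypothetical; only the shape matters: the top-level sub lists the steps, and each step is a short sub that is called just once.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The "table of contents": each line names one step.
sub largesub {
    my ($rows, $min_total) = @_;
    my @kept      = select_rows($rows, $min_total);
    my @formatted = map { format_row($_) } @kept;
    return \@formatted;    # array-of-hashes ref for the template
}

# Step 1: keep only rows at or above the threshold.
sub select_rows {
    my ($rows, $min_total) = @_;
    return grep { $_->{total} >= $min_total } @$rows;
}

# Step 2: basic formatting for display.
sub format_row {
    my ($row) = @_;
    return {
        name  => uc($row->{name}),
        total => sprintf('%.2f', $row->{total}),
    };
}

my $AoH = largesub(
    [ { name => 'ann', total => 5 }, { name => 'bob', total => 1 } ],
    2,
);
print "$_->{name}: $_->{total}\n" for @$AoH;
```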

Node Type: perlquestion [id://368279]
Approved by Limbic~Region
Front-paged by matija