PerlMonks  

How Many Modules Is Too Many?

by Belgarion (Chaplain)
on May 29, 2004 at 23:42 UTC (#357529, perlmeditation)

I was going through the solutions to the expert Perl Quiz of the Week when I came across Mark Jason Dominus' commentary regarding the Mail::Box::MH module. To wit, he states:

That sounds like a joke, doesn't it? Say "use Mail::Box::MH" and you load *seventy* modules.

I tend to agree that loading seventy modules seems a little excessive; however, where do we define the cut-off point? The common refrain in most programming circles is to re-use a module rather than write your own. The Mail::Box::MH module obviously adheres to this mantra, since it pulls in seventy helper modules.

As another example, I use the Class::DBI module all the time when doing database work. Class::DBI uses thirty-two modules. For what Class::DBI does, I feel it's a fair trade-off.

As an aside, eight of the required modules are actually part of Class::DBI itself, so we should count that as one module. That still leaves twenty-five other required modules. I also do not consider the base, strict, vars, and warnings modules problematic, since virtually every Perl module will include these modules. Therefore, there are still twenty-one other modules required by Class::DBI.

perl -l -MClass::DBI -e 'print join "\n", keys %INC' | sort | uniq
__OUTPUT__
/usr/lib/perl/5.8.2/auto/Storable/autosplit.ix
AutoLoader.pm
Carp.pm
Class/Accessor.pm
Class/DBI.pm
Class/DBI/Column.pm
Class/DBI/ColumnGrouper.pm
Class/DBI/Query.pm
Class/DBI/Relationship.pm
Class/DBI/Relationship/HasA.pm
Class/DBI/Relationship/HasMany.pm
Class/DBI/Relationship/MightHave.pm
Class/Data/Inheritable.pm
Class/Trigger.pm
Config.pm
DBI.pm
DynaLoader.pm
Exporter.pm
Exporter/Heavy.pm
Fcntl.pm
Ima/DBI.pm
List/Util.pm
Scalar/Util.pm
Storable.pm
UNIVERSAL/moniker.pm
XSLoader.pm
base.pm
overload.pm
strict.pm
vars.pm
warnings.pm
warnings/register.pm
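The hand count above could also be done in code. What follows is only a rough sketch: the helper name tally_modules is made up for illustration, the pragma list is the four names mentioned in the post, and non-.pm entries such as autosplit.ix are skipped, so the totals may differ slightly from a count done by eye.

```perl
use strict;
use warnings;

# Pragmas the post chooses to leave out of the count.
my %pragma = map { $_ => 1 } qw(base.pm strict.pm vars.pm warnings.pm);

# Sort %INC-style keys into three buckets: pragmas, files that belong to
# one namespace prefix (e.g. 'Class/DBI'), and everything else.
# tally_modules is a hypothetical helper, not part of any CPAN module.
sub tally_modules {
    my ($prefix, @files) = @_;
    my %count = (pragmas => 0, own => 0, other => 0);
    for my $file (@files) {
        next unless $file =~ /\.pm\z/;    # ignore autosplit.ix and similar
        if ($pragma{$file}) {
            $count{pragmas}++;
        }
        elsif ($file =~ m{\A\Q$prefix\E(?:/|\.pm\z)}) {
            $count{own}++;
        }
        else {
            $count{other}++;
        }
    }
    return \%count;
}

# Tally whatever this script itself has loaded so far.
my $count = tally_modules('Class/DBI', keys %INC);
printf "%-8s %d\n", $_, $count->{$_} for sort keys %$count;
```

To apply it to the list above, load Class::DBI first (assuming it is installed) and then pass keys %INC as shown.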

What do other monks think? How many modules can one module require before it is considered too many? Given the speed of today's computers and the amount of memory they have, is this question academic?

Re: How Many Modules Is Too Many?
by Wassercrats on May 30, 2004 at 00:59 UTC
    CPAN modules are like free software, so I wouldn't say that any of them aren't good enough. I don't think programmers should waste their time perfecting a module as if it were a tool that they were paid to create. Once they create something useful, they should publish it, and if they want, they or others could improve it as they see fit, at their own convenience.

    Developing a module or any software for hire is different. In that case, whoever is hiring you should give you the specifications, and maybe you would need to cut down on the unneeded features and excess file operations.

      Wassercrats:
      CPAN modules are like free software, so I wouldn't say that any of them aren't good enough. I don't think programmers should waste their time perfecting a module as if it were a tool that they were paid to create. Once they create something useful, they should publish it, and if they want, they or others could improve it as they see fit, at their own convenience.
      I disagree completely. CPAN modules are free software, however they're published. People who publish software to a public archive are under a moral obligation to either maintain their code or to remove their code from publication (or, in the case of CPAN, add a prominent note to their documentation that the module is no longer maintained by the author and may be up for adoption).

      ... or others could improve it as they see fit ...
      Unfortunately there is no ability on CPAN for maintenance by anyone other than the publisher. Thus it is the publisher's sole responsibility to keep the module working. It is up to them to adopt functionality changes submitted by users.

      Now, given that, there are too many orphaned modules on CPAN. Personally I don't use a module that hasn't been touched in years. Of course that could mean it's perfectly stable, however I don't feel that is often the case.

      Also, I never require a module in another module if it has dependencies that are orphaned or that don't pass their tests easily. If I have problems installing something, then I can only assume my users will have the same problems.

      "Get real! This is a discussion group, not a helpdesk. You post something, we discuss its implications. If the discussion happens to answer a question you've asked, that's incidental." -- nobull@mail.com in clpm
        I disagree completely. CPAN modules are free software, however they're published. People who publish software to a public archive are under a moral obligation to either maintain their code or to remove their code from publication (or, in the case of CPAN, add a prominent note to their documentation that the module is no longer maintained by the author and may be up for adoption)

        I'm under no moral obligation to maintain my code. I do (as time permits) but I'm certainly not under obligation to. The openness of CPAN is in my opinion the core reason for its success. I'm in total agreement with Jarkko's The Zen of Comprehensive Archive Networks when he said:

        Code quality? Ratings/reviews? Moderation/metamoderation? "Approved" SDKs? These all are hotly debated subjects and will not be addressed here since the CPAN is and will stay an open and free forum, where the authors decide what they upload. Any further selection belongs to different fora. Besides, adding any rating or approval processes creates bottlenecks, and bottlenecks are bad.

        Now, given that, there are too many orphaned modules on CPAN. Personally I don't use a module that hasn't been touched in years. Of course that could mean it's perfectly stable, however I don't feel that is often the case.

        Also, I never require a module in another module if it has dependencies that are orphaned or that don't pass their tests easily. If I have problems installing something, then I can only assume my users will have the same problems.

        This is of course your privilege and CPAN allows you to do this. I'm happy using some older modules because they do the job and CPAN allows me to do this too. I don't want to see them go just because they don't fit your usage pattern of CPAN.

Re: How Many Modules Is Too Many?
by samtregar (Abbot) on May 30, 2004 at 05:47 UTC
    Personally, I think Perl programmers should use (and create!) as many CPAN modules as they possibly can. We used over 75 non-core modules to build Krang and I'm glad we did. They saved us an enormous amount of work.

    When I see that a module has a ton of prerequisites I expect to find a higher quality module. Modules with no prerequisites are the ones to watch out for! Most likely they're reinventing some wheels under the covers.

    -sam

      Modules with no prerequisites are the ones to watch out for! Most likely they're reinventing some wheels under the covers.
      reinventing or just plain stealing the wheels :) I know one monk who eliminates prerequisites by inlining only the functions he uses into his code.

      MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
      I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
      ** The third rule of perl club is a statement of fact: pod is sexy.

        But this doesn't mean that I'm not adding anything new with these chunks of code that come from other modules to avoid prerequisites! Generally these chunks of code are revised, fixed and upgraded.

        But note that our development policy here is always to build systems that don't depend much on others. First, because the environment we develop needs to be easy to install; second, because it needs to be portable to many OSes, and dependencies make this harder.

        For example, XML::Smart has its own parser, XML::Smart::Parser, which is an upgraded and fixed version of XML::Parser::Lite. Why? Because XML::Parser, the main XML parser for Perl, uses 26 modules. But the biggest problem is not loading 26 modules; it is that these modules come from many different distributions (XML::Parser needs URI, HTTP, LWP, libwww...), which in turn need much more to be installed. So one dependency generally means more sub-dependencies.

        History also shows us that big dependency sets raise the probability of bugs and make them harder to fix, but the biggest problem is the probability of incompatibilities with new versions in the future. I know this because, even though I try to use as few dependencies as possible, I had this problem with 2 modules in less than 1 year.

        Graciliano M. P.
        "Creativity is the expression of the liberty".

Re: How Many Modules Is Too Many?
by Zaxo (Archbishop) on May 30, 2004 at 05:54 UTC

    When you check code coverage and find a use-ed module that you don't actually use, that's one too many modules.

    Use all the modules you can. That's just the right number.

    After Compline,
    Zaxo

Re: How Many Modules Is Too Many?
by PodMaster (Abbot) on May 30, 2004 at 06:27 UTC
    I tend to agree that loading seventy modules seems a little excessive; however, where do we define the cut-off point?
    Why does it seem excessive? Is 20,000 lines of code excessive? Now if you said 70 modules just to print "hello world" on the screen, yes, that would be excessive, but 70 modules to walk on water? 70 modules to save yourself years of work at the cost of 7 seconds startup time? Hardly excessive.

    Somebody pointed Dominus to http://simon-cozens.org/draft-articles/email.html, Simon's thoughts on the design of Mail::Box. So Mail::Box is ripe for refactoring, so what? I wouldn't care if it loaded 150 other modules. It does many things and I didn't have to write it.

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

Re: How Many Modules Is Too Many?
by castaway (Parson) on May 30, 2004 at 06:41 UTC
    I would think 70 is too many, but it depends on what exactly the module does. An actual application as a module I would expect to use a bunch more stuff than just another interim module for use by others.

    There have been several occasions when I go to install a module and get stuck in a web of dependencies, so that at some point I'm not even sure why I started. Yes, I install modules by hand the old-fashioned way; I like to know what I actually have installed and what things it uses. At a depth of about 3, or more than 5 or so extra modules at one depth, I give up, more often than not.

    What's more annoying, though, is that there are modules that do essentially the same thing, for whatever reason (better wheels?), and modules on top of these that use only one of them, thus requiring me to have several modules with the same functionality installed. I wish people would research more and attempt to interface with each module that provides the functionality they want to use, and not just use the one they happened to have installed (but I guess I can wish all I want, unless I start pointing out specific cases to authors.. ;)

    Also, look at your list, and deduct any and all CORE modules. How does it look now?

    DBI.pm Ima/DBI.pm
    Plus the actual Class::DBI modules.. Not very many.

    As for the amount of memory these things take up, I find it's usually justified. A module that's being used and developed by several people, or just tested by others, may have a function or two more than you need, but it will do the others better than anything one could put together alone.

    .. How to find these good, well developed modules on CPAN, is another matter.. ;)

    C.

      I would think 70 is too many, but it depends on what exactly the module does.
      I don't think in this case it's relevant what the module does. It depends on how much functionality the program is actually using. If you are pulling in 70 modules to do one little thing, then that could classify as "overkill". OTOH, if your program is using all the functionality of the 70 modules, then it's probably not overkill. (Although with OO programming, it's easy to write a ton of modules to implement one thing, using a long inheritance tree where every subclass does a tiny thing more than its superclass).

      Abigail

How Many Modules Are 'Just Enough'?
by mstone (Deacon) on May 30, 2004 at 08:05 UTC

    As many as you need, and no more.

    There are just too many free variables for any pat answer to be meaningful. The number of acceptable modules depends on the absolute size of your code, the complexity of that code, and the amount of startup latency you can afford.

    On the pro-module side, I happen to be a big fan of using tiny classes. These are basically data structures with a bit of test code thrown in, like arrays with boundary checking, strings which can never be zero length (useful for filenames), timestamps which can never be zero, and so forth. The whole module may contain a dozen lines of code or less, but the module justifies itself by making fifty to a hundred lines of higher level error-checking code unnecessary.
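As an illustration of the "string which can never be zero length" idea, here is a minimal sketch. The class name NonEmptyString and its two methods are invented for this example and do not come from any existing module.

```perl
package NonEmptyString;    # hypothetical tiny class, for illustration only
use strict;
use warnings;
use Carp qw(croak);

# Constructor: refuse undefined or zero-length values at creation time,
# so code that receives the object never has to re-check.
sub new {
    my ($class, $value) = @_;
    croak "value must be a non-empty string"
        unless defined $value && length $value;
    return bless \$value, $class;
}

# Accessor for the wrapped string.
sub value { ${ $_[0] } }

package main;

my $name = NonEmptyString->new('report.txt');
print $name->value, "\n";

# An empty filename is rejected at construction time.
my $bad = eval { NonEmptyString->new('') };
print "rejected empty string\n" unless defined $bad;
```

The check lives in one place, which is the trade described above: a dozen lines here in exchange for not repeating the same test wherever a filename is passed around.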

    On the anti-module side, I've read far too many Perl scripts which could stand a good refactoring. The problem, of course, is that many people who use some module have never actually read the module itself. They've only read the documentation. Therefore, they know that the module contains function X, but they don't know how complicated that function is, or what other dependencies it creates.

    I see that as a kind of cargo-cult programming.

    Merlyn may not agree with me on this, but I think my version is consistent with Feynman's original essay on cargo-cult science. Programmers who use a module because they know what it does, but don't know how it works, can end up getting unexpected results.. like ever-increasing chains of dependencies.

    How much harm does that do? Again, there's no set answer. If you don't care whether your program takes an extra tenth of a second (or as much as 700ms, assuming a 10ms average seek time and 70 extra modules) to open and load all the additional files, there's no problem. If you can't afford that extra startup latency (700ms won't cut it for Google or Slashdot), you'll probably be willing to dig through the code and factor out the parts you really do need. And I haven't heard any complaints about incompatible-version conflicts, but it's worth remembering that Perl isn't any more immune to those problems than Linux was.
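The 700ms figure is just 70 modules times an assumed 10ms seek each. A rough sketch of that estimate, plus one way to time a real require, follows. Both helper names are made up for this example, and the one-seek-per-module cost is the post's assumption, not a measurement.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);    # sub-second wall-clock time

# Estimated extra startup time in milliseconds, assuming every module
# costs about one disk seek (the assumption made in the post).
sub estimated_load_ms {
    my ($modules, $seek_ms) = @_;
    return $modules * $seek_ms;
}

# Time how long it takes to require one module by name. A module that is
# already loaded returns almost immediately, so time it in a fresh process.
sub measure_require {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $start = time;
    require $file;
    return time - $start;
}

printf "%d modules at %d ms each: about %d ms\n",
    70, 10, estimated_load_ms(70, 10);
printf "File::Basename loaded in %.3f ms\n",
    1000 * measure_require('File::Basename');
```

On a warm disk cache the measured number will usually be far below the seek-based estimate, which is one reason the real answer depends on the environment.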

    FWIW, I'm also a big fan of inlining code rather than importing it. It's easy to reinvent the wheel if you have a perfectly good wheel to use as a design reference. You get the advantages of using mature code, you avoid the disadvantages of transitive dependencies, and you can learn a thing or two along the way.

      Merlyn may not agree with me on this, but I think my version is consistent with Feynman's original essay on cargo-cult science. Programmers who use a module because they know what it does, but don't know how it works, can end up getting unexpected results.. like ever-increasing chains of dependencies.
      While you have a point about "not really understanding the module you're using", the problem is that: Where do you stop needing to understand what you're doing?

      Do you need to understand everything perl is doing at the C level? Everything C is doing at the ASM level? Everything ASM is doing at the uh, binary level (whatever the level below that is considered. Machine code?). Where do you stop needing to understand and just use the functionality?

          Where do you stop needing to understand what you're doing?

        Once again, there's no pat answer. It depends on what you do with the code.

        For a script you don't plan to use often, or for anything critical, you don't have to dig very far. Your code can be as sloppy(1) as you want it to be, because you're willing to accept the consequences, and you're not foisting them off on anyone else.

        (1) - I don't mean to suggest that 'sloppy' is inherently bad. All code exists on a scale with 'sloppy' at one end and 'obsessive' at the other. A script that's 90% sloppy is still 10% obsessive, and one that's 90% obsessive is still 10% sloppy. Part of the programmer's job is deciding where on that scale each project should fall.

        If you intend to use your code a lot, or to publish it for others to use, I personally think it's polite to shoot for the 'obsessive' end of the scale. At very least, you should work to avoid problems that are known and understood.

        Implicit dependencies are a known, well-understood problem. Windows users have lived in 'DLL Hell' for years, and Linux developers have reinvented that same, butt-ugly wheel for themselves. Everyone wants to use libraries, but nobody can agree on which version of any given library to use, and the installers will happily replace some item 5 layers deep with another, possibly incompatible version.

        Anyone who knows Open Source knows that large-scale coordination is one of the weak points of OSS development. Anyone who knows game theory knows that environments like that are fertile soil for yet another version of DLL Hell. The way to avoid creating a Perl version of DLL Hell is for each developer to be properly Lazy, and eliminate as many dependency problems as possible themselves, so people down the line won't have to.

Re: How Many Modules Is Too Many?
by liz (Monsignor) on May 30, 2004 at 12:50 UTC
    How many modules can one module require before it is considered too many? Given the speed of today's computers and the amount of memory they have, is this question academic?

    Before answering the question, I think the question is missing at least one other dimension: the environment in which the module is being used. And if we're distinguishing environments, I'm thinking of basically two groups: load, run x 1, quit (basic maintenance scripts, CGI scripts) and load, run x N, quit (persistent environments such as mod_perl).

    Many CORE modules use AutoLoader, because it was felt that huge libraries (such as POSIX) or functionality that you need only in exception handling (e.g. Carp) would not need to be loaded in their entirety, but only when they're really needed. This is all fine for "run x 1" environments. But it is counterproductive for "run x N" environments, as it introduces memory bloat in those environments (because memory shared between child processes becomes unshared).
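The two styles can be sketched with a plain run-time require, as below. This is not how AutoLoader itself works internally; it only illustrates the lazy versus eager trade-off, and both helper names are invented for the example.

```perl
use strict;
use warnings;

# Lazy: POSIX is compiled only when this sub is first called. Good for a
# "run x 1" script that may never reach this code path.
sub lazy_floor {
    my ($number) = @_;
    require POSIX;
    return POSIX::floor($number);
}

# Eager: load everything up front, before any child processes are forked,
# so the compiled code sits in memory the children can share. In a
# persistent "run x N" server this would typically run once at startup.
sub preload {
    my @modules = @_;
    for my $module (@modules) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        require $file;
    }
    return scalar @modules;
}

print lazy_floor(2.7), "\n";
printf "preloaded %d module(s)\n", preload('File::Basename');
```

Code that is compiled after the fork lands in each child's private memory, which is the bloat described above.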

    I think Perl needs a way to optimize for the different environments in which it is being used. That is way more important than a discussion about what the right amount of modules is.

    Liz

Re: How Many Modules Is Too Many?
by Jenda (Abbot) on May 30, 2004 at 12:59 UTC

    I would not count most of the modules. I'd count only about 16 modules. Anyway Mail::Box looks like a result of a heavy OOOverdose to me. Not as heavy as the .Net framework though.

    Jenda
    Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
       -- Rick Osborne

    Edit by castaway: Closed small tag in signature

Re: How Many Modules Is Too Many?
by belg4mit (Prior) on May 30, 2004 at 19:58 UTC
    You make a specious argument. Mail::Box::MH doesn't pull in 70 *external* modules. Mail::Box::MH is part of the Mail::Box suite, developed from scratch with few external dependencies. If you take exception to the Mail::Box design, this I can understand, as I've tried using it as well. However, IMHO the issue is OO gone mad. Now if you are truly simply concerned with the number of use statements anything uses, then I am inclined to agree with what others have said: there is no such thing as too many, especially when the alternative is to reinvent the wheel.

    PS> strict et al. are pragmas and therefore probably oughtn't be considered modules, especially in this sense.

    --
    I'm not belgian but I play one on TV.

Re: How Many Modules Is Too Many?
by andyf (Pilgrim) on May 30, 2004 at 21:28 UTC
    It's an interesting question, and it is 'academic' because there is an infinite supply of problems to solve. Enough is never enough. I get that attitude as a C coder, where there is an ocean of existing code, but it is far more work gluing it together, and sometimes you have to understand it almost as much as if you wrote it. Perl takes modularity to the next level. Maybe it makes it too easy? I occasionally think I have a useful module candidate, but I rarely wrap it up and publish because when I look there are two or three existing ones. The time when enough definitely is enough is when there are too many very similar things. Even with Perl's great handling of versions and the awesome CPAN, it's easily possible to load a bunch of deprecated old modules with overlapping functionality unless you look very closely at what you're requiring. But then you should do that anyway.

    In the context of the Perl Quiz, which looks very interesting btw, you might want to look at the rules viz modules, they may differ from the comments here. For the current puzzle (hangman) I think you could probably crack it with a one liner for style points.
Re: How Many Modules Is Too Many?
by dws (Chancellor) on May 31, 2004 at 15:34 UTC

    One very practical consideration for using as few modules as necessary (while still using as many as you absolutely need to) is bloat. Bloat bites you in two ways. One is size: pull in enough modules, and pretty soon things like the profiler go haywire. The second is startup speed, which doesn't sound like a big problem when you're doing a mod_perl app, unless you're heavy into unit tests, in which case the startup hit for pulling in several tens or hundreds of modules bites you every time you execute a .t file. And when you have several hundred tests (a good thing), the additional startup can add minutes to a full test run (a bad thing).

    Code reuse is not an absolute good that happens in some ideal vacuum. Reuse is a tool, and like any other tool, it has both benefits and costs.

      The second is startup speed, which doesn't sound like a big problem when you're doing a mod_perl app, unless you're heavy into unit tests, in which case the startup hit for pulling in several tens or hundreds of modules bites you every time you execute a .t file. And when you have several hundred tests (a good thing), the additional startup can add minutes to a full test run (a bad thing).

      This problem can (as I'm sure you know :-) be mitigated by building and tearing down your own test fixtures in a single test script rather than using lots of different *.t scripts to isolate your tests.

      I'd probably look towards optimising the test suite before I add the extra overhead of rewriting / inlining modules.
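A rough sketch of the single-script approach, using Test::More. The run_group helper and the @events log are invented for this example; a real suite would put its database setup and cleanup in the two callbacks.

```perl
use strict;
use warnings;
use Test::More;

my @events;    # records the fixture lifecycle, for illustration only

# Run one group of tests between its own setup and teardown, so several
# groups can share a single process (and a single module-loading cost).
# run_group is a hypothetical helper, not part of Test::More.
sub run_group {
    my ($name, $setup, $tests, $teardown) = @_;
    my $fixture = $setup->();
    my $lived   = eval { $tests->($fixture); 1 };
    my $error   = $@;
    $teardown->($fixture);    # runs even if the group died
    ok($lived, "$name ran to completion") or diag($error);
    return $lived;
}

run_group(
    'temporary records',
    sub { push @events, 'setup'; return { rows => [1, 2, 3] } },
    sub { my ($f) = @_; is(scalar @{ $f->{rows} }, 3, 'three rows present') },
    sub { push @events, 'teardown' },
);

done_testing;
```

The cost is weaker isolation: state leaking from one group into the next is no longer prevented by a fresh process, so the teardown callbacks have to be thorough.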

        This problem can (as I'm sure you know :-) be mitigated by building and tearing down your own test fixtures in a single test script rather than using lots of different *.t scripts to isolate your tests.

        Many of our .t files already have over a hundred individual tests (i.e., ok() tests). Each .t does specific setup and teardown, and many use END {} blocks to do useful things like destroy temporary database objects that were inserted for purposes of testing, and which shouldn't be there by the time the next bunch of tests runs. Combining these tests would be possible, but only at the expense of a lot of work. We're getting more mileage by attacking bloat first.

Re: How Many Modules Is Too Many?
by zakzebrowski (Curate) on Jun 01, 2004 at 00:20 UTC
    IMHO, a module solves a problem. Sometimes, in order to solve a larger problem, you need to then pull in additional modules, which solved subsets of the larger problem... Also, if you can prove (via testing) that this module solves this specific piece of the problem, why not use that one?


    ----
    Zak - the office

Node Type: perlmeditation [id://357529]
Approved by Itatsumaki
Front-paged by gmax