How Many Modules Is Too Many?



by Belgarion (Chaplain)

on May 29, 2004 at 23:42 UTC (id://357529, perlmeditation)


I was going through the solutions to the expert Perl Quiz of the Week when I came across Mark Jason Dominus' commentary regarding the Mail::Box::MH module. To wit, he states:

That sounds like a joke, doesn't it? Say "use Mail::Box::MH" and you load *seventy* modules.

I tend to agree that loading seventy modules seems a little excessive; however, where do we define the cut-off point? The common refrain in most programming circles is to re-use a module rather than write your own. The Mail::Box::MH module obviously adheres to this mantra, since it pulls in seventy helper modules.

As another example, I use the Class::DBI module all the time when doing database work. Class::DBI uses thirty-two modules. For what Class::DBI does, I feel it's a fair trade-off.

As an aside, eight of the required modules are actually part of Class::DBI itself, so we should count that as one module. That still leaves twenty-five other required modules. I also do not consider the base, strict, vars, and warnings modules problematic, since virtually every Perl module will include these modules. Therefore, there are still twenty-one other modules required by Class::DBI.

perl -l -MClass::DBI -e 'print join "\n", keys %INC' | sort | uniq

__OUTPUT__
/usr/lib/perl/5.8.2/auto/Storable/autosplit.ix
AutoLoader.pm
Carp.pm
Class/Accessor.pm
Class/DBI.pm
Class/DBI/Column.pm
Class/DBI/ColumnGrouper.pm
Class/DBI/Query.pm
Class/DBI/Relationship.pm
Class/DBI/Relationship/HasA.pm
Class/DBI/Relationship/HasMany.pm
Class/DBI/Relationship/MightHave.pm
Class/Data/Inheritable.pm
Class/Trigger.pm
Config.pm
DBI.pm
DynaLoader.pm
Exporter.pm
Exporter/Heavy.pm
Fcntl.pm
Ima/DBI.pm
List/Util.pm
Scalar/Util.pm
Storable.pm
UNIVERSAL/moniker.pm
XSLoader.pm
base.pm
overload.pm
strict.pm
vars.pm
warnings.pm
warnings/register.pm
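The counting described above can be sketched as a small script. This is only an illustration under stated assumptions: the helper names module_name and count_modules are hypothetical, only .pm entries are counted (so an entry such as autosplit.ix is skipped), and the pragma list is the one named in the post. It is not meant to reproduce the post's exact totals.

```perl
use strict;
use warnings;

# Pragmas the original post chose not to count (assumption: exactly these four).
my %pragma = map { $_ => 1 } qw(base strict vars warnings);

# Turn a %INC key such as "Class/DBI/Column.pm" into "Class::DBI::Column".
# Returns undef for entries that are not .pm files.
sub module_name {
    my ($path) = @_;
    return unless $path =~ /\.pm\z/;
    (my $name = $path) =~ s/\.pm\z//;
    $name =~ s{/}{::}g;
    return $name;
}

# Count distinct modules, folding everything under $family into one entry
# and skipping the pragmas listed above.
sub count_modules {
    my ($family, @paths) = @_;
    my %seen;
    for my $path (@paths) {
        my $name = module_name($path) or next;
        next if $pragma{$name};
        $name = $family if $name =~ /\A\Q$family\E(?:::|\z)/;
        $seen{$name}++;
    }
    return scalar keys %seen;
}

# A short sample in the same form as the keys of %INC.
my @sample = qw(
    Class/DBI.pm Class/DBI/Column.pm Class/DBI/Query.pm
    DBI.pm Carp.pm strict.pm warnings.pm
);
print count_modules('Class::DBI', @sample), "\n";
```

To apply it to a real program, pass keys %INC instead of the sample list after loading the module of interest.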

What do other monks think? How many modules can one module require before it is considered too many? Given the speed of today's computers and the amount of memory they have, is this question academic?


Replies are listed 'Best First'.

Re: How Many Modules Is Too Many?
by PodMaster (Abbot) on May 30, 2004 at 06:27 UTC

I tend to agree that loading seventy modules seems a little excessive; however, where do we define the cut-off point?

Somebody pointed out to Dominus http://simon-cozens.org/draft-articles/email.html, Simon's thoughts on the design of Mail::Box. So Mail::Box is ripe for refactoring, so what? I wouldn't care if it loaded 150 other modules. It does many things and I didn't have to write it.

MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
** The third rule of perl club is a statement of fact: pod is sexy.

Re: How Many Modules Is Too Many?
by Zaxo (Archbishop) on May 30, 2004 at 05:54 UTC

When you check code coverage and find a use'd module that you don't actually use, that's one too many modules.

Use all the modules you can. That's just the right number.

After Compline,
Zaxo

Re: How Many Modules Is Too Many?
by samtregar (Abbot) on May 30, 2004 at 05:47 UTC

When I see that a module has a ton of prerequisites I expect to find a higher quality module. Modules with no prerequisites are the ones to watch out for! Most likely they're reinventing some wheels under the covers.

-sam

Re: Re: How Many Modules Is Too Many?

by PodMaster (Abbot) on May 30, 2004 at 06:30 UTC

Modules with no prerequisites are the ones to watch out for! Most likely they're reinventing some wheels under the covers.


Re: Re: Re: How Many Modules Is Too Many?

by gmpassos (Priest) on May 30, 2004 at 23:03 UTC

But note that our development policy here is to always build systems that don't depend much on others. First, because the environment we develop needs to be easy to install; second, because it needs to be portable across many OSes, and dependencies make that harder.

For example, XML::Smart has its own parser, XML::Smart::Parser, which is an upgraded and fixed version of XML::Parser::Lite. Why? Because XML::Parser, the main XML parser for Perl, uses 26 modules. The biggest problem is not loading 26 modules, though, but that these modules come from many different distributions (XML::Parser needs URI, HTTP, LWP, libwww...), which in turn need much more to be installed. So one dependency generally means more sub-dependencies.

History also shows us that big dependency chains make bugs more probable and harder to fix, but the biggest problem is the likelihood of incompatibilities with new versions in the future. I know this because, even though I try to use as few dependencies as possible, I had this problem with 2 modules in less than 1 year.

Graciliano M. P.
"Creativity is the expression of the liberty".

Re^3: How Many Modules Is Too Many?

by BigLug (Chaplain) on Jun 01, 2004 at 00:03 UTC

How Many Modules Are 'Just Enough'?
by mstone (Deacon) on May 30, 2004 at 08:05 UTC

As many as you need, and no more.

There are just too many free variables for any pat answer to be meaningful. The number of acceptable modules depends on the absolute size of your code, the complexity of that code, and the amount of startup latency you can afford.

On the pro-module side, I happen to be a big fan of using tiny classes. These are basically data structures with a bit of test code thrown in, like arrays with boundary checking, strings which can never be zero length (useful for filenames), timestamps which can never be zero, and so forth. The whole module may contain a dozen lines of code or less, but the module justifies itself by making fifty to a hundred lines of higher level error-checking code unnecessary.
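As an illustration of the kind of tiny class described above, here is a minimal sketch of a string that can never be zero length. The package name NonEmptyString and its methods are hypothetical, chosen for this example only.

```perl
package NonEmptyString;
use strict;
use warnings;
use Carp qw(croak);

# Refuse to construct the object unless the value is a non-empty string,
# so callers never have to re-check it later.
sub new {
    my ($class, $value) = @_;
    croak "value must be a non-empty string"
        unless defined $value && length $value;
    return bless { value => $value }, $class;
}

# Read-only accessor for the validated value.
sub value { $_[0]{value} }

package main;

my $name = NonEmptyString->new('report.txt');
print $name->value, "\n";

# An empty filename is rejected at construction time.
my $ok = eval { NonEmptyString->new(''); 1 };
print "empty filename rejected\n" unless $ok;
```

The check lives in one place, so code that receives a NonEmptyString can use the value without repeating the validation.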

On the anti-module side, I've read far too many Perl scripts which could stand a good refactoring. The problem, of course, is that many people who use some module have never actually read the module itself. They've only read the documentation. Therefore, they know that the module contains function X, but they don't know how complicated that function is, or what other dependencies it creates.

I see that as a kind of cargo-cult programming.

Merlyn may not agree with me on this, but I think my version is consistent with Feynman's original essay on cargo-cult science. Programmers who use a module because they know what it does, but don't know how it works, can end up getting unexpected results... like ever-increasing chains of dependencies.

How much harm does that do? Again, there's no set answer. If you don't care whether your program takes an extra tenth of a second (or 700ms, assuming a 10ms average seek time and 70 extra modules) to open and load all the additional files, there's no problem. If you can't afford that extra startup latency (700ms won't cut it for Google or Slashdot), you'll probably be willing to dig through the code and factor out the parts you really do need. And I haven't heard any complaints about incompatible-version conflicts, but it's worth remembering that Perl isn't any more immune to those problems than Linux was.
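The back-of-envelope figure above (70 modules at 10ms each) can be compared with a measurement on your own machine. The following is a rough sketch: the function names estimated_latency_ms and measured_load_ms are hypothetical, and File::Spec is just an arbitrary core module picked as an example. Timings vary widely with disk cache and hardware, and a module that is already loaded will measure as nearly free.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# The estimate from the post: one disk seek per module file.
sub estimated_latency_ms {
    my ($modules, $seek_ms) = @_;
    return $modules * $seek_ms;
}

# Load one module at runtime and report the elapsed milliseconds plus
# how many new %INC entries (the module and its dependencies) appeared.
sub measured_load_ms {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;
    my $before = scalar keys %INC;
    my $start  = time;
    require $file;
    my $elapsed_ms = (time - $start) * 1000;
    return ($elapsed_ms, scalar(keys %INC) - $before);
}

printf "estimate: %d ms\n", estimated_latency_ms(70, 10);

my ($ms, $files) = measured_load_ms('File::Spec');
printf "File::Spec: %.2f ms, %d new files loaded\n", $ms, $files;
```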

FWIW, I'm also a big fan of inlining code rather than importing it. It's easy to reinvent the wheel if you have a perfectly good wheel to use as a design reference. You get the advantages of using mature code, you avoid the disadvantages of transitive dependencies, and you can learn a thing or two along the way.

Re: How Many Modules Are 'Just Enough'?

by BUU (Prior) on May 30, 2004 at 10:46 UTC

Merlyn may not agree with me on this, but I think my version is consistent with Feynman's original essay on cargo-cult science. Programmers who use a module because they know what it does, but don't know how it works, can end up getting unexpected results... like ever-increasing chains of dependencies.

Re: Re: How Many Modules Are 'Just Enough'?

by mstone (Deacon) on May 30, 2004 at 20:15 UTC

Where do you stop needing to understand what you're doing?

Once again, there's no pat answer. It depends on what you do with the code.

For a script you don't plan to use often or for anything critical, you don't have to dig very far. Your code can be as sloppy(1) as you want it to be, because you're willing to accept the consequences, and you're not foisting them off on anyone else.

(1) - I don't mean to suggest that 'sloppy' is inherently bad. All code exists on a scale with 'sloppy' at one end and 'obsessive' at the other. A script that's 90% sloppy is still 10% obsessive, and one that's 90% obsessive is still 10% sloppy. Part of the programmer's job is deciding where on that scale each project should fall.

If you intend to use your code a lot, or to publish it for others to use, I personally think it's polite to shoot for the 'obsessive' end of the scale. At very least, you should work to avoid problems that are known and understood.

Implicit dependencies are a known, well-understood problem. Windows users have lived in 'DLL Hell' for years, and Linux developers have reinvented that same, butt-ugly wheel for themselves. Everyone wants to use libraries, but nobody can agree on which version of any given library to use, and the installers will happily replace some item 5 layers deep with another, possibly incompatible version.

Anyone who knows Open Source knows that large-scale coordination is one of the weak points of OSS development. Anyone who knows game theory knows that environments like that are fertile soil for yet another version of DLL Hell. The way to avoid creating a Perl version of DLL Hell is for each developer to be properly Lazy, and eliminate as many dependency problems as possible themselves, so people down the line won't have to.

Re: How Many Modules Is Too Many?
by castaway (Parson) on May 30, 2004 at 06:41 UTC

There have been several occasions when I've gone to install a module and got stuck in a web of dependencies, so that at some point I'm not even sure why I started. Yes, I install modules by hand the old-fashioned way; I like to know what I actually have installed and what it uses. At a depth of about 3, or with more than 5 or so extra modules at one level, I give up more often than not.

What's more annoying, though, is that there are modules that do essentially the same thing, for whatever reason (better wheels?), and modules on top of these that use only one of them, thus requiring me to have several modules with the same functionality installed. I wish people would research more and attempt to interface with every module that provides the functionality they want to use, and not just use the one they happened to have installed (but I guess I can wish all I want, unless I start pointing out specific cases to authors... ;)

Also, look at your list, and deduct any and all CORE modules. How does it look now?

DBI.pm
Ima/DBI.pm
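One way to deduct core modules mechanically is to ask Module::CoreList, which records the Perl release in which each core module first shipped. This is a sketch, not the method used in the post: the helper name non_core and the sample module list are assumptions for illustration, and the answer depends on the Module::CoreList data installed with your Perl.

```perl
use strict;
use warnings;
use Module::CoreList;

# Keep only the modules that Module::CoreList has never seen in any
# Perl release; first_release returns undef for those.
sub non_core {
    my @modules = @_;
    return grep { !defined Module::CoreList->first_release($_) } @modules;
}

# Sample names only; feed it the module names derived from %INC instead.
my @remaining = non_core(qw(Carp Exporter List::Util DBI));
print join("\n", @remaining), "\n";
```

Note that this treats a module as core if it shipped with any Perl release, not necessarily the one you are running.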

As for the amount of memory these things take up, I find it's usually justified: a module that's being used and developed by several people, or just tested by others, may have a function or two more than you need, but it will do the rest better than anything one could put together alone.

.. How to find these good, well developed modules on CPAN, is another matter.. ;)

C.

Re: How Many Modules Is Too Many?

by Abigail-II (Bishop) on May 30, 2004 at 10:01 UTC

I would think 70 is too many, but it depends on what exactly the module does.

Abigail

Re: How Many Modules Is Too Many?
by liz (Monsignor) on May 30, 2004 at 12:50 UTC

How many modules can one module require before it is considered too many? Given the speed of today's computers and the amount of memory they have, is this question academic?

Before answering the question, I think the question is missing at least one other dimension: the environment in which the module is being used. And if we're distinguishing environments, I'm thinking basically two groups: load,run x 1,quit (basic maintenance scripts, CGI scripts) and load,run x N,quit (persistent environments such as mod_perl).

Many CORE modules use AutoLoader, because it was felt that huge libraries (such as POSIX) or functionality that you need only in exception handling (e.g. Carp), would not need to be loaded in their entirety, but only when they're really needed. This is all fine for "run x 1" environments. But it is counterproductive for "run x N" environments, as it introduces memory bloat in those environments (because of shared memory between child processes becoming unshared).
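The load-on-demand idea can be illustrated without AutoLoader itself by using a runtime require. This is a simplified sketch of the general technique, not of how AutoLoader works internally; the function name dump_state is hypothetical and Data::Dumper is just a convenient core module for the example.

```perl
use strict;
use warnings;

# The module is only located, read and compiled the first time this
# function is called; later calls find it already recorded in %INC.
sub dump_state {
    my @data = @_;
    require Data::Dumper;
    return Data::Dumper::Dumper(@data);
}

print "loaded before call: ",
    (exists $INC{'Data/Dumper.pm'} ? "yes" : "no"), "\n";

my $text = dump_state({ answer => 42 });

print "loaded after call: ",
    (exists $INC{'Data/Dumper.pm'} ? "yes" : "no"), "\n";
```

In a "run x 1" script this saves the load when the function is never called. In a "run x N" server, the usual alternative is to load such modules once at startup, before child processes are created, so that each child does not compile its own private copy later.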

I think Perl needs a way to optimize for the different environments in which it is being used. That is way more important than a discussion about what the right amount of modules is.

Liz

Re: How Many Modules Is Too Many?
by Jenda (Abbot) on May 30, 2004 at 12:59 UTC

I would not count most of the modules. I'd count only about 16 modules. Anyway Mail::Box looks like a result of a heavy OOOverdose to me. Not as heavy as the .Net framework though.

Jenda
Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.
-- Rick Osborne


Re: How Many Modules Is Too Many?
by belg4mit (Prior) on May 30, 2004 at 19:58 UTC

PS> strict et al. are pragmas and therefore probably oughtn't be considered modules, especially in this sense.

-- I'm not belgian but I play one on TV.

Re: How Many Modules Is Too Many?
by andyf (Pilgrim) on May 30, 2004 at 21:28 UTC

Re: How Many Modules Is Too Many?
by Wassercrats (Initiate) on May 30, 2004 at 00:59 UTC

Developing a module or any software for hire is different. In that case, whoever is hiring you should give you the specifications, and maybe you would need to cut down on the unneeded features and excess file operations.

Re: Re: How Many Modules Is Too Many?

by BigLug (Chaplain) on Jun 01, 2004 at 00:21 UTC

CPAN modules are like free software, so I wouldn't say that any of them aren't good enough. I don't think programmers should waste their time perfecting a module as if it were a tool that they were paid to create. Once they create something useful, they should publish it, and if they want, they or others could improve it as they see fit, at their own convenience.

published

... or others could improve it as they see fit ...

Now, given that, there are too many orphaned modules on CPAN. Personally I don't use a module that hasn't been touched in years. Of course that could mean it's perfectly stable, however I don't feel that is often the case.

Also, I never require a module in another module that either has dependants that are orphaned or that don't pass their tests easily. If I have problems installing something, then I can only assume my users will have the same problems.

"Get real! This is a discussion group, not a helpdesk. You post something, we discuss its implications. If the discussion happens to answer a question you've asked, that's incidental." -- nobull@mail.com in clpm

Re^3: How Many Modules Is Too Many?

by adrianh (Chancellor) on Jun 01, 2004 at 10:43 UTC

I disagree completely. CPAN modules are free software, however they're published. People who publish software to a public archive are under a moral obligation to either maintain their code or to remove their code from publication (or, in the case of CPAN, add a prominent note to their documentation that the module is no longer maintained by the author and may be up for adoption).

I'm under no moral obligation to maintain my code. I do (as time permits) but I'm certainly not under obligation to. The openness of CPAN is in my opinion the core reason for its success. I'm in total agreement with Jarkko's The Zen of Comprehensive Archive Networks when he said:

Code quality? Ratings/reviews? Moderation/metamoderation? "Approved" SDKs? These all are hotly debated subjects and will not be addressed here since the CPAN is and will stay an open and free forum, where the authors decide what they upload. Any further selection belongs to different fora. Besides, adding any rating or approval processes creates bottlenecks, and bottlenecks are bad.

Now, given that, there are too many orphaned modules on CPAN. Personally I don't use a module that hasn't been touched in years. Of course that could mean it's perfectly stable, however I don't feel that is often the case.

Also, I never require a module in another module that either has dependants that are orphaned or that don't pass their tests easily. If I have problems installing something, then I can only assume my users will have the same problems.

This is of course your privilege and CPAN allows you to do this. I'm happy using some older modules because they do the job and CPAN allows me to do this too. I don't want to see them go just because they don't fit your usage pattern of CPAN.

Re: How Many Modules Is Too Many?
by dws (Chancellor) on May 31, 2004 at 15:34 UTC

One very practical consideration for using as few modules as necessary (while still using as many as you absolutely need to) is bloat. Bloat bites you in two ways. One is size: pull in enough modules, and pretty soon things like the profiler go haywire. The second is startup speed, which doesn't sound like a big problem when you're doing a mod_perl app, unless you're heavy into unit tests, in which case the startup hit for pulling in several tens or hundreds of modules bites you every time you execute a .t file. And when you have several hundred tests (a good thing), the additional startup can add minutes to a full test run (a bad thing).

Code reuse is not an absolute good that happens in some ideal vacuum. Reuse is a tool, and like any other tool, it has both benefits and costs.

Re^2: How Many Modules Is Too Many?

by adrianh (Chancellor) on May 31, 2004 at 16:07 UTC

The second is startup speed, which doesn't sound like a big problem when you're doing a mod_perl app, unless you're heavy into unit tests, in which case the startup hit for pulling in several tens or hundreds of modules bites you every time you execute a .t file. And when you have several hundred tests (a good thing), the additional startup can add minutes to a full test run (a bad thing).

This problem can (as I'm sure you know :-) be mitigated by building and tearing down your own test fixtures in a single test script rather than using lots of different *.t scripts to isolate your tests.
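A minimal sketch of that approach, assuming Test::More and an in-memory fixture: the setup and teardown subs and the fixture contents are hypothetical stand-ins for whatever a real suite would create (database rows, temporary files and so on). All module loading happens once, while each test still starts from a fresh fixture.

```perl
use strict;
use warnings;
use Test::More;

my %fixture;

# Build a fresh fixture before each test and discard it afterwards,
# so tests stay isolated even though they share one process.
sub setup    { %fixture = (rows => [1, 2, 3]) }
sub teardown { %fixture = () }

my @tests = (
    sub { is scalar @{ $fixture{rows} }, 3, 'fixture starts with three rows' },
    sub {
        push @{ $fixture{rows} }, 4;
        is $fixture{rows}[-1], 4, 'a row can be appended';
    },
    sub { is scalar @{ $fixture{rows} }, 3, 'earlier changes do not leak' },
);

for my $test (@tests) {
    setup();
    $test->();
    teardown();
}

done_testing();
```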

I'd probably look towards optimising the test suite before I add the extra overhead of rewriting / inlining modules.

Re: Re^2: How Many Modules Is Too Many?

by dws (Chancellor) on May 31, 2004 at 18:55 UTC

This problem can (as I'm sure you know :-) be mitigated by building and tearing down your own test fixtures in a single test script rather than using lots of different *.t scripts to isolate your tests.

Many of our .t files already have over a hundred individual tests (i.e., ok() tests). Each .t does specific setup and teardown, and many use END {} blocks to do useful things like destroy temporary database objects that were inserted for purposes of testing, and which shouldn't be there by the time the next bunch of tests runs. Combining these tests would be possible, but only at the expense of a lot of work. We're getting more mileage by attacking bloat first.


Re^4: How Many Modules Is Too Many?

by adrianh (Chancellor) on May 31, 2004 at 22:05 UTC

Re: How Many Modules Is Too Many?
by zakzebrowski (Curate) on Jun 01, 2004 at 00:20 UTC

----
Zak - the office

Node Type: perlmeditation [id://357529]
Approved by Itatsumaki
Front-paged by gmax