Tried and True CPAN Modules

by Rhandom (Curate)
on Jul 17, 2003 at 16:57 UTC (#275284=perlmeditation)

So, I have been a Perl believer for many years now (five, to be exact), which may be young for some and old for others. I've been at perlmonks for 3 or 4 years, which likewise is young for some and old for others. In my time with Perl and at perlmonks I have run the gamut of newbie, learner, teacher, explorer, expounder, and so on. It is safe to say that I will probably never be listed as elite, or for that matter as a "saint." However, I can claim to know something of good and bad code. I have done plenty of bad coding myself, so I can tell it when I see it.

I have looked at the treasure trove we call CPAN. I have perused the name spaces. I have read much code. I have even contributed to the code base. I have helped others contribute to the code base. I have used many CPAN modules. I have despised many CPAN modules - well, that is too strong - I would have preferred not to have used many CPAN modules. I have contributed bad code myself, known to me only in hindsight. I have contributed good code myself. Though obviously not an expert, I am safe in saying that there are many proverbial "tares" surrounding the "wheat." There is much bad or even worthless code on CPAN (worthless is in the eye of the beholder: one man's trash/junk/whatever is another man's treasure, and nobody would knowingly contribute bad code).

And so after so much rant - the question(s): With good code mingled in with bad code on the system, how can one distinguish between the two? Is there any possibility of a peer review system for modules? Can we list real world working examples? Can we have side-by-side comparisons of comparable modules? Where would we host it and how would we avoid ballot stuffing? Is this even possible?

Fortunately there is a fairly large base to start with in a standard Perl distribution. Outside of that, I have spent plenty of time going through a host of similarly named modules to find one that is intelligently written to handle the majority of the tasks I need - and, if it cannot handle all of the tasks, is at least extensible enough to build upon. Many times my efforts have come up fruitless and I've had to re-implement a module. Sometimes I have found a module, used it, and then had to re-implement the module. And other times I have found a stable working module, used it, and gone on to more important tasks.

Is there any way to simplify the process - or is a hallmark of the Perl coder the ability and the requirement to sift through large volumes of contributed code? Obviously, there is always a need to review the code you use for safety - but can we streamline or thin the number of choices down? Or is a large number of choices simply the benefit and detriment of open source?

my @a=qw(random brilliant braindead); print $a[rand(@a)];

Replies are listed 'Best First'.
Re: Tried and True CPAN Modules
by perrin (Chancellor) on Jul 17, 2003 at 17:39 UTC
    So far, I've been attacking this problem by writing reviews that cover a range of modules (templating tools, data caching tools, object-relational mappers). I'd like there to be a way to find out what the most popular modules are (by preference, not by download stats), but there is some resistance to this idea. A partial solution may be Leon Brocard's Module::CPANTS, which generates some metrics about a module and may eventually be able to give you some generally useful stats about danger signs or quality signs in a module.
Re: Tried and True CPAN Modules
by simonm (Vicar) on Jul 17, 2003 at 17:12 UTC
    This has been discussed repeatedly in the past. Lacking a clear vision for how to shoe-horn this added functionality into CPAN, there seems to be a trend towards independent communities of interest attempting to make sense of modules in some specific area of functionality.

    A few examples:

    Given the difficulty in evaluating modules without fully understanding the context they're used in, I think this type of topic-area review might be as good as it gets, at least in the short term.

      Thank you for your items. These are partially the sort of lists that help in digesting information. It would be nice to have a, um, list of lists.

Re: Tried and True CPAN Modules
by dragonchild (Archbishop) on Jul 17, 2003 at 17:11 UTC
    What's wrong with the Module Review section here?

    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

      It is very good to have. But a moderating system is still needed (or would at least be nice). A module such as CGI would be off the charts. Others - even some listed under reviews - would have a low score. If you could sort by category and then by score, or vice versa, you could more quickly zero in on a module.

        Who's going to moderate? That's a big question that you can't just wave your hands at ...


Re: Tried and True CPAN Modules
by ajdelore (Pilgrim) on Jul 17, 2003 at 20:30 UTC

    Here is one idea... Write some Perl code that will examine CPAN to determine how many times a particular module is used by other CPAN modules and scripts.

    Sort of a googlish approach to module usefulness or popularity, I suppose, but I think the results would be interesting.

    It would be more effective to know how many times those modules are used outside of CPAN, but obviously that becomes problematic. Perhaps something that could search the archive of comp.lang.perl.misc to see how many times a particular module gets mentioned?
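
    A minimal sketch of the counting idea above, under stated assumptions: the source text of each file is already in memory (the file names and contents below are made up for illustration; real input would come from unpacked CPAN distributions), and a simple pattern match on "use" and "require" lines is good enough. The helper name count_module_usage is hypothetical. The pattern skips lowercase pragmas such as strict, and it will miss modules loaded dynamically or through "use base".

```perl
use strict;
use warnings;

# Count how many distinct source files load each module via "use" or "require".
# %sources maps a (hypothetical) file name to its source text.
sub count_module_usage {
    my (%sources) = @_;
    my %count;
    for my $file (sort keys %sources) {
        my %seen;    # count each module at most once per file
        while ($sources{$file} =~ /^\s*(?:use|require)\s+([A-Z]\w*(?:::\w+)*)/mg) {
            $seen{$1} = 1;
        }
        $count{$_}++ for keys %seen;
    }
    return %count;
}

# Invented sample input, standing in for files pulled from CPAN distributions.
my %sources = (
    'Foo.pm' => "package Foo;\nuse strict;\nuse DBI;\nuse CGI qw(:standard);\n",
    'Bar.pm' => "package Bar;\nuse DBI;\nrequire Data::Dumper;\n",
    'baz.pl' => "#!/usr/bin/perl\nuse DBI;\nuse DBI;\n",
);

my %usage = count_module_usage(%sources);

# Print modules from most used to least used, ties broken alphabetically.
for my $module (sort { $usage{$b} <=> $usage{$a} || $a cmp $b } keys %usage) {
    printf "%-14s %d\n", $module, $usage{$module};
}
```

    Counting each module once per file keeps a single script that repeats a "use" line from inflating the total.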


      That wouldn't give very useful statistics. At best it would give some measure of popularity, but none at all of usefulness. Some problems: there's no way to count how often the modules are used in programs, as programs are hardly ever uploaded to CPAN. You'd just count how often other modules use them. Furthermore, it would show a strong bias in favour of modules that don't have alternatives. For instance DBI. There isn't a module that does something similar, so DBI would get high marks, even if the code itself is shitty.

      I wouldn't bother going to a site that ranks CPAN modules based on a popularity vote. After all, Windows isn't ten times better than Unix as an OS, is it? But that would be the outcome if you'd let people vote.

      I would value a site that does reviews of modules. Non-anonymous reviews, so a reviewer can establish a name for him/herself. Some monks are already exploring this idea after a similar discussion last week.


Re: Tried and True CPAN Modules
by lachoy (Parson) on Jul 18, 2003 at 01:56 UTC

    First: some folks are putting their money where their mouth is and providing a place to talk about CPAN modules. Gavin Estey recently created a wiki site for people to comment on CPAN modules. Check it out.

    Second: it really does seem counterproductive to have so many modules, many of which duplicate one another. At least until you think about the CPAN and the community using it as a marketplace of ideas. Just like a normal marketplace, there's lots of competition and seemingly little differentiation in areas where many people congregate (e.g., templating, web applications, accessing databases) and that have a fairly low barrier to entry. There's not as much competition in more esoteric problem spaces (large-scale number crunching, biological analysis, etc.).

    And just like a marketplace certain products win not just by being the best (however that's defined) but by attracting other people to support and discuss it by other means (good documentation, professional website, excellent support, charismatic leader, strong forum presence, etc.). Arbitrarily weeding modules out (even bad ones, which I bitched about recently) is unnecessary -- the bad ones will fall by the wayside, never updated and never discussed. It can be tough to distinguish these but time, and articles written by an expert (like Perrin) are IMO the best solutions.

    Anyway, my point is that having all these modules and the interplay among them is the best way for the cream to rise to the top. Maybe in a year I'll get tired of supporting SPOPS and contribute to Class::DBI or Alzabo. Or maybe someone will pick up the ball. Or maybe something new will come along trumping us all. Who can predict the future? All I know is it's an awful lot of fun being in the marketplace, as chaotic and messy as it can be.

    M-x auto-bs-mode

      I think Gavin Estey's site is a GREAT idea. I was just going to suggest that we have a central place where we could discuss specific CPAN modules. Many of the larger projects have mailing lists and/or sourceforge projects.

      I author a smaller module (Finance::YahooProfile) that is used by at least 20 people that I know of. I feel too lazy to maintain a website, discussion board, and a mailing list for this. I would like it if there were an auto-generated sourceforge-like site for CPAN modules.

      For now, I like the CPAN wiki and will start using it extensively. Especially given the open-source nature of CPAN, people other than the author will discover bugs or new ways to use the module. A wiki allows everyone to add to the documentation.

      $will->code for @food or $$;

        Better yet, a central place to WORK on *any* CPAN module. Imagine the wiki idea, but with code + comments rather than just comments. Imagine that you could click your way through the code for say, Date::Manip, cheerfully categorising algorithms (finite state machine here), doing tiny cleanups on code (doesn't handle corner case X), suggesting and branching new areas to work on larger refactorings, add tests to the test suite, and have that all linked into a system that automatically builds stable and beta branches, runs the test suites, and indicates their status, and permits CVS/Subversion checkouts of any given tree by ID or a gzip'd download of same for the purposes of testing on your local system.

        Imagine running into a bug in a CPAN module and not having to file a bug report, simply hopping onto the site and branching in a fix with a comment, and adding a test case.

        Imagine running a class for programming students, and being able to assign homework as walking through certain CPAN modules categorising algorithms or patterns.

        Imagine having 20 minutes without anything to do, hopping onto the site and refactoring a tiny mess in a single function in a random module into something clean and elegant.

        Imagine a site where sections of code have been voted especially clean, or especially messy and deserving of attention, so that rather than avoid using them or reimplementing from scratch, they can be bit-by-bit cleaned up and refactored.

        Imagine a site where common code across many modules is regularly identified and abstracted into a common module, slowly building up a pool of best-of-breed module support code created and reviewed by many, used in many modules. No more hackish reimplementations, just a slowly expanding core of solid code.

        Imagine contributing a module and watching it grow and stabilise every day as people from across the world spend a little bit of their time commenting and cleaning and refactoring.

        Wouldn't it be cool? :)

Re: Tried and True CPAN Modules
by artist (Parson) on Jul 17, 2003 at 19:12 UTC
    Just an idea:
    Using a module means using it for a particular task. If you juggle between two modules, you can always say that for doing task T, module A is (better than|the same as|faster than|more efficient than|...) module B. From such statements you can build a fact list. The facts could change over time as new modules are introduced. Grouping by area (for example, Date/Time) could serve the purpose here. The lists would have to be maintained well. We may start something like that over here as a new section. The knowledge that is shared in the form of SOPW or Q&A answers could be given a systematic format where only module(s) are answers to your question (not a regex, for example), much like c.p.l.m(o).
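
    One way the fact list above might be stored, as a rough sketch only: each fact is a small hash naming the task, the two modules, and the relation between them. The module names (Module::A and so on), the relations, and the helper facts_for_task are all placeholders invented for illustration, not real measurements or an existing API.

```perl
use strict;
use warnings;

# Each fact records: for this task, module "a" relates to module "b" in this way.
# Module names and relations are placeholders, not real comparisons.
my @facts = (
    { task => 'parse dates',  a => 'Module::A', relation => 'faster than',        b => 'Module::B' },
    { task => 'parse dates',  a => 'Module::B', relation => 'more flexible than', b => 'Module::A' },
    { task => 'format dates', a => 'Module::C', relation => 'the same as',        b => 'Module::A' },
);

# Return the facts recorded for one task as readable sentences,
# in the order they were entered.
sub facts_for_task {
    my ($task, @all) = @_;
    return map  { "For '$_->{task}': $_->{a} is $_->{relation} $_->{b}" }
           grep { $_->{task} eq $task } @all;
}

print "$_\n" for facts_for_task('parse dates', @facts);
```

    Keeping the task as an explicit field is what lets the same pair of modules carry different verdicts for different tasks.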


Re: Tried and True CPAN Modules
by hsmyers (Canon) on Jul 17, 2003 at 19:42 UTC
    By my quick count, there are 98 or so reviews in the Modules portion of 'Reviews', and at a guess a partial solution would result if this number went up---to say 200 or 300. Having said that I fully intend to add to the body count shortly, although I wouldn't mind a suggestion or two on which reviews are deemed 'good examples'. This will of course be limited to modules I am familiar with enough to review (excluding my own).


    "Never try to teach a pig to sing...it wastes your time and it annoys the pig."
Re: Tried and True CPAN Modules
by Anonymous Monk on Jul 17, 2003 at 20:15 UTC
    FYI (and FWIW), a small point of English usage: the usual expression for distinguishing good from bad in a large quantity of similar material is "separating the wheat from the chaff." It's a nice word, so I thought I'd share it. "Tares" is a perfectly nice word, too, but it's far less common, and possibly obsolescent.

    --BorgCopyeditor (IRL)
    (sans password ATM)

Re: Tried and True CPAN Modules
by chunlou (Curate) on Jul 17, 2003 at 20:36 UTC

    Peer review could be hard for two reasons: 1) no contributions, and 2) too many contributions. You could have modules no one cares to write a review for. Or you could have modules that have way too many reviews.

    A rating system (probably unmoderated) is a good, feasible compromise, just as people rate movies online. I don't know how philosophically agreeable or disagreeable it is to other people.

    SourceForge and some other open source portals have download and activity statistics for every project, which give you some indirect hint about the product. It's not technically difficult to implement; it's just a matter of whether people like it or not.

    Another mindless indicator could be a query of how many times a module is used by all other modules. Of course, whatever indicators you use, they should preferably be compared only among comparable modules (it wouldn't be fair to compare the popularity of Carp to, say, PARI).
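
    The point about comparing only within comparable modules can be sketched as follows, assuming some indicator score per module already exists. The module names, categories, and scores here are invented for illustration, and rank_within_category is a hypothetical helper, not part of any existing tool.

```perl
use strict;
use warnings;

# Placeholder data: names, categories, and scores are invented for illustration.
my @modules = (
    { name => 'Example::TemplateA', category => 'templating', score => 40 },
    { name => 'Example::TemplateB', category => 'templating', score => 75 },
    { name => 'Example::MathA',     category => 'math',       score => 3  },
    { name => 'Example::MathB',     category => 'math',       score => 5  },
);

# Group modules by category, then rank each group by descending score
# (ties broken by name), so scores are never compared across categories.
sub rank_within_category {
    my (@mods) = @_;
    my %by_category;
    push @{ $by_category{ $_->{category} } }, $_ for @mods;

    my %ranked;
    for my $category (keys %by_category) {
        $ranked{$category} = [
            map  { $_->{name} }
            sort { $b->{score} <=> $a->{score} || $a->{name} cmp $b->{name} }
            @{ $by_category{$category} }
        ];
    }
    return %ranked;
}

my %ranked = rank_within_category(@modules);
for my $category (sort keys %ranked) {
    print "$category: @{ $ranked{$category} }\n";
}
```

    A low raw score in a small category can still rank first there, which is the fairness the paragraph above asks for.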

Node Type: perlmeditation [id://275284]
Approved by chromatic
Front-paged by dbp