
Writing a rock-solid general-purpose CPAN module is hard. Very hard. After all, such modules are expected to work flawlessly in a wide variety of environments -- many of which the author may have no experience in.

To illustrate how hard it can be to write a module that works in many different environments, consider two recent examples. Schwern gushes here about the joy he derives from coaxing his lovingly crafted Test::More module to work faultlessly in multi-threaded environments -- even though he never uses threads himself. And in CHECK and INIT under mod_perl, Ovid reminds us that writing a CPAN module that works smoothly in a mod_perl environment is not trivial.

To help those CPAN authors less experienced than an Ovid or a Schwern, I've tried to provide some tips and links on improving CPAN module quality.

Choosing a Module Name

The PAUSE Module List gives some excellent advice which I won't repeat here. Update: the PAUSE Module List is no longer maintained. For current CPAN module naming advice, see On the Naming of Modules at PAUSE.

I will offer a word of warning, however: if you trample on the global namespace, you will be flamed! Two recent attacks that spring to mind are: Unix-0.02 where the name Unix was chosen not because the module had anything to do with Unix, but because it was "Unix inspired" (fanning the flames, the author then published a rebuttal of the criticisms in his module's POD ... which led to more flames for perverted use of POD); and the unfortunate Acme::Util where the author, in a desperate attempt to flee the relentless cpanrating flames, moved his module from the global to the Acme namespace (though clearly it does not belong in either). So, save yourself a lot of pain and discuss your module name in a public forum (here at Perl Monks or the module-authors@perl.org mailing list, for example) well before you upload it to the CPAN. If you are introducing a new module in an area where others exist, please take the time to clearly describe how your module differs from them and why you wrote it.

The most relevant piece of naming advice from Perl Best Practices is practice 3.1: "Use grammatical templates when forming identifiers". For packages and classes, a suitable template is:

    Abstract_noun
    Abstract_noun::Adjective
    Abstract_noun::Adjective1::Adjective2

For example:

    package Disk;
    package Disk::Audio;
    package Disk::DVD;
    package Disk::DVD::Rewritable;

Module Reviews

If you're lucky, your module might be reviewed at CPAN Ratings or gav's CPAN wiki (update: broken link) or Mark Fowler's lovely Advent Calendar or Neil Bowers' CPAN Module Reviews (2012 update) or even here in the Perl Monks Module Reviews section. You might even try posting a request for review at Simon's code review ladder (update: broken link). Update Oct 2011: A new PrePAN module review site is now up. Update 2022: Sadly, it's now gone (see Re: How do I package a module and related apps for successful upload to cpan and Good bye PREPAN by Aristotle). However, you're unlikely to gain much from these sources simply because performing a detailed, quality module review is very time-consuming and few people have the time and inclination to do it.

A more practical alternative is to isolate small pieces of code from the module that you're unhappy with and post multiple small questions to Perl Monks. This is much more likely to elicit a response than posting a 1000-line module for review.

Another approach is to review your own module, using the checklists below. When reviewing your own module, it's helpful to perform several reviews, each one from a different perspective: beginner user perspective, intermediate user perspective, expert user perspective, maintenance programmer perspective, support analyst perspective, and so on. Many programmers (including me) don't pay enough attention to the customer view of the system. I've found the simple act of pretending to be a first time user or a support phone jockey uncovers many ideas for improving quality.

Module Review Checklist

I'll start by listing the general areas that a module review might cover; those areas that I have an interest in, I'll discuss in a bit more detail later.

  • Module Interface and ease-of-use. See perlmodstyle and On Interfaces and APIs.
  • Module Versioning. See Module Versioning Advice.
  • Dependencies. Does the module contain unnecessary dependencies? Update: see new "Dependencies" section below.
  • Testability and Test Suite (see next section).
  • Documentation. Separate user versus maintainer documentation. Tutorial and Reference; Examples and Cookbook; Maintainer; How your module is different to similar ones.
  • Change log. Notes re portability, configuration & environment, performance, dependencies, bugs, limits, caveats, diagnostics, bug reporting. See also: Re: Module Change Log by Tux.
  • Stylistic issues: consistency, variable naming, indentation, magic numbers, commenting, long subs, global variables, ...
  • Review the module with regard to the 256 guidelines in Perl Best Practices.
  • Use a tool to test kwalitee (e.g. Perl::Critic, Module::CPANTS, Devel::Cover, Devel::SawAmpersand, Perl::MinimumVersion, Devel::Cycle, Test::Memory::Cycle, lint, valgrind, ...).
  • Use common and well-known idioms and programming practices.
  • Sniff out Code smells.
  • Portability.
  • Performance.
  • Robustness (see Solidity checklists below).
  • Security. See Re: Security techniques every programmer should know (Security References).
  • Error Handling. Document all errors in the user's dialect. Prefer throwing exceptions to returning special values.
  • Are edge cases handled properly?
  • Plays well with others. Don't use $&, $', $`; don't export anything by default; localize Perl globals; avoid Makefile.PL evilness (e.g. phone home).
  • Avoid code and other duplication.
  • Simplicity and Clarity.
  • Generality and Extensibility.
  • Abstraction and Encapsulation.
  • Reuse and Decoupling.
  • Maintainability.
  • Supportability and Traceability.
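A few of the checklist items above (export nothing by default, localize Perl globals, throw exceptions rather than return special values) can be sketched in one small module. This is only an illustration: the package name My::Slurper and the function read_paragraphs are hypothetical, not taken from any real distribution.

```perl
package My::Slurper;    # hypothetical module name, for illustration only

use strict;
use warnings;
use Carp qw(croak);
use Exporter qw(import);

# Export nothing by default; callers must ask for what they need.
our @EXPORT_OK = qw(read_paragraphs);

# Return the paragraphs of a file as a list; throw on any I/O error.
sub read_paragraphs {
    my ($path) = @_;
    open(my $fh, '<', $path) or croak "cannot open '$path': $!";

    # Localize $/ so that paragraph mode does not leak into the caller's code.
    local $/ = '';
    my @paragraphs = <$fh>;

    close($fh) or croak "cannot close '$path': $!";
    return @paragraphs;
}

1;
```

A caller would write `use My::Slurper qw(read_paragraphs);` and wrap the call in eval (or a try/catch module of their choice) to handle the exception.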

See also On Coding Standards and Code Reviews.

Testability and Test Suite

Was the test suite written before, during or after the module? The benefits of Test Driven Development are well known and I won't further elaborate here.

Are there nicely commented tests covering the examples in the documentation? I strongly encourage this because: it acts as tutorial material for someone browsing the test suite; and it ensures the examples given in the documentation actually work.

How isolated/independent are the tests? Can they be run in any order? Are boundary conditions tested? Are errors and exceptions tested?
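As a rough sketch of what such tests can look like, the script below covers a documented example, two boundary conditions and an exception. The average function is a hypothetical stand-in defined inline so the script runs on its own; in a real test suite it would be loaded from the module under test.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test; normally this would come from your module.
sub average {
    my @nums = @_;
    die "average: empty list\n" unless @nums;
    my $sum = 0;
    $sum += $_ for @nums;
    return $sum / @nums;
}

# The example a (hypothetical) SYNOPSIS would show.
is( average( 1, 2, 3 ), 2, 'documented example works' );

# Boundary conditions.
is( average(5),       5, 'single element' );
is( average( -1, 1 ), 0, 'values that cancel out' );

# Errors and exceptions: capture $@ immediately after the eval.
my $ok  = eval { average(); 1 };
my $err = $@;
ok( !$ok, 'empty list throws' );
like( $err, qr/empty list/, 'error message names the problem' );

done_testing();
```

Each assertion sets up its own input, so the checks do not depend on the order in which they run.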

How maintainable is the test suite? How long does it take to run? Is it one monolithic script or broken into a number of smaller ones, one per functional area? Are Mocks/Stubs employed, where practicable, to test platform-specific features on all platforms?

Test suite code coverage (via Devel::Cover) should be at least 80% in my view. POD coverage should be 100% and is easily enforced via pod-coverage.t (auto-generated by Module::Starter):

    use Test::More;
    eval "use Test::Pod::Coverage 0.08";
    plan skip_all => "Test::Pod::Coverage 0.08 required for testing POD coverage" if $@;
    all_pod_coverage_ok();

Update (2009): Nowadays, you can use Test::XT to generate "best practice" author tests. See also this alias use.perl.org journal.

Dependencies

When should you depend on another CPAN module rather than write your own code?

  • Every module you add as a dependency is a module that can restrict your module -- if one of your module's dependencies is Linux-only, for example, then your module is now Linux-only; if another requires Perl 5.20+ so do you; if one of your dependencies has a bug, you also have that bug; if a new release of one of your dependencies fails, the likelihood of your release being unable to install increases; take care with dependencies having a different license to yours. Don't introduce dependencies lightly.
  • As noted at Release::Checklist, there are two types of modules: functional modules, like DBI, and developer convenience modules, like Modern::Perl. Don't add developer convenience modules as a dependency.
  • It's usually best to use popular, quality CPAN modules in complex domains (e.g. DBI and XML) rather than roll your own. Doing so allows you to leverage the work of experts in fields that you are probably not expert in. Moreover, widely used CPAN modules tend to be robust and have fewer bugs than code you write yourself because they are tested by more users and in many different environments.
  • For small and simple modules, on the other hand, such as slurping a file, rolling your own code is usually better than paying the dependency cost of an external module.
  • Cost vs Risk. Though using CPAN modules seems "free", there are hidden snags. What if your dependent module has a security vulnerability? What if the author abandons it? How quickly can you isolate/troubleshoot a bug in its code?
  • Quality and Trust. Before introducing a dependency, it's worth checking CPAN ratings, Kwalitee score, bug counts, how quickly bugs are fixed, etc. Does it contain gratuitous/unnecessary dependencies? (The ::Tiny CPAN modules were a reaction against modules that seemed to haul in half of CPAN as dependencies.)
  • Popularity. When you use a 3rd party module, you want it to be popular and widely supported; you want to be able to ask for advice on using it; you don't want it to die. Moreover, if your module depends on a very popular CPAN module, there's a good chance your module's users will already have it installed.
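To make the "slurping a file" case above concrete, here is a minimal sketch of rolling your own using only core Perl. The name slurp is my own choice for illustration; it assumes you want the raw contents as a single string and want an exception on failure.

```perl
use strict;
use warnings;

# Read a whole file into a string without any non-core dependency.
sub slurp {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "cannot open '$path': $!";

    # An undefined record separator makes <$fh> read to end of file.
    # 'local' restores the previous value when the sub returns.
    local $/;
    my $content = <$fh>;

    close($fh) or die "cannot close '$path': $!";
    return $content;
}
```

If you later need encodings, binmode handling or atomic writes, that is probably the point at which a well-tested CPAN module starts to earn its dependency cost.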

Update: the Judging the Quality of a CPAN module checklist at Re: Choosing the right module (Judging the Quality of a CPAN module References) might help you decide whether to add a 3rd party CPAN module as a dependency.

See also: Criteria for when to use a cpan module and Should I list core modules as dependencies? and Re: Factory classes in Perl (updated) and Re^3: Perl Code Changes Behavior if Two Subroutine definitions are swapped by haukex and replies. Update: and How old is too old? by Bod. And Prefer Pure Perl Core Modules by Leitz. And Add optional modules to TEST_REQUIRES? (especially the replies).

Mechanics of Creating a New CPAN Module

Start with: Re: What do I use to release a module to CPAN for the first time? by davido.

See also:

  • CPAN (perldoc) - query, download and build perl modules from CPAN sites
  • perlnewmod (perldoc) - preparing a new module for distribution
  • perlmodstyle (perldoc) - Perl module style guide (re making a dev release: if you want to release a 'beta' or 'alpha' version of a module but don't want CPAN.pm to list it as most recent use an '_' after the regular version number followed by at least 2 digits, e.g. 1.20_01)
  • CPAN (perldoc) (FAQ 12) - How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
  • CPAN::Meta::Spec (perldoc) - specification for CPAN distribution metadata (for release_status, if the version field contains an underscore character, then release_status must not be "stable")
  • perlpodstyle (perldoc) - Perl POD style guide
  • Release::Checklist - A QA checklist for CPAN releases by Tux
  • (Early version of proposed Release::Checklist by Tux)
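For the developer-release version numbers mentioned in the perlmodstyle entry above, that document suggests an idiom along the following lines. The package name My::Module is hypothetical, and other ways of handling the underscore exist, so treat this as a sketch rather than the only approach.

```perl
package My::Module;    # hypothetical module name, for illustration only

use strict;
use warnings;

# String form: keeps the underscore, so the distribution is treated
# as a developer release.
our $VERSION = '1.20_01';

# Numeric form: evaluates to 1.2001, so that a version check such as
# "use My::Module 1.20" does not warn about the underscore.
$VERSION = eval $VERSION;

1;
```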

General Code Solidity Checklist

Programming languages, such as C++ and Java, tend to classify routines as:

  • unsafe (uses unprotected global or static state)
  • thread-safe (aka MT-safe)
  • async-signal-safe (reentrant)
  • exception-safe

(There is also async-cancel-safe, which is of little practical importance). Update: As pointed out by BrowserUk below, the above categories, though important when writing C extensions, are mostly irrelevant at the Perl level. I'd be interested if anyone could provide examples of where the above categories are relevant when writing pure Perl code.

Thread Safety

Update: please see BrowserUk's response below for thread safety advice at the Perl level.

It's not easy to determine if a piece of code is thread safe. Nor is it easy to write tests to prove its thread safety. However, armed with an understanding of thread safety, a careful examination of the code will prove fruitful. Note that thread safety and reentrancy (aka efficient thread safety) should be considered early (at the C level) since they often affect interfaces and are hard to retrofit later.

See perlthrtut for information on Perl thread safety. An interesting twist in making a Perl module thread-safe is working around core constructs and modules known to be thread-unsafe. For instance, Test::More v0.51_01 was changed to use a Perl sort block rather than a subroutine because sort subroutines are known to be thread-hostile (perl bug #30333 discussed in Perl threads sort test program crashes).
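The two forms of sort mentioned above look like this. The snippet only shows the syntactic difference; it does not reproduce the threading bug, and whether that bug affects the perl you are running is not something this sketch can tell you.

```perl
use strict;
use warnings;

my @numbers = ( 10, 9, 100, 1 );

# Named comparison subroutine: the form the Test::More change moved away from.
sub by_number { $a <=> $b }
my @via_sub = sort by_number @numbers;

# Inline sort block: same ordering, no named sort subroutine involved.
my @via_block = sort { $a <=> $b } @numbers;

print "@via_block\n";    # prints "1 9 10 100"
```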

Exception Safety

In C++, the dominant exception handling idiom is RAII (Resource Acquisition is Initialization), which I much prefer to the Java Dispose pattern. Though RAII can't be used in full garbage collected languages, such as Java and Perl 6, it can work well in simple reference counted languages, like Perl 5.

To give a very simple example, this code:

    sub fred {
        open(FH, "< f.tmp") or die "open error f.tmp";
        # ... process file here
        die "oops";    # if something went wrong
        close(FH);
    }
    eval { fred() };
    if ($@) { print "died: $@\n" }
    # Update: this is better, see: [id://11130946]
    #   my $ok = eval { fred(); 1 };
    #   if (!$ok) { print "died: $@\n" }
    # oops, handle FH is still open if exception was thrown.

is not exception-safe because FH is not closed when die is called. A simple remedy is to replace the global FH with a lexical file handle (which is auto-closed at end of scope (RAII)):

    sub fred {
        open(my $fh, "< f.tmp") or die "open error f.tmp";
        # ... process file here
        die "oops";    # if something went wrong
        close($fh);
    }
    eval { fred() };
    if ($@) { print "died: $@\n" }
    # ok, $fh is auto-closed when its ref count goes to zero
If you are stuck with Perl 5.005 and can't use a lexical file handle, IO::File or localizing FH with local *FH should do the trick.
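The same reference-counting behaviour can be used for resources other than file handles. Below is a minimal sketch of an RAII-style guard object whose DESTROY method runs a cleanup callback. The class name Scope::Guard::Sketch is made up for this example (it is not the CPAN Scope::Guard module), and the log messages are purely illustrative.

```perl
use strict;
use warnings;

package Scope::Guard::Sketch;    # hypothetical class, for illustration only

sub new {
    my ( $class, $cleanup ) = @_;
    return bless { cleanup => $cleanup }, $class;
}

# Called when the last reference goes away, including while a die unwinds.
sub DESTROY {
    my ($self) = @_;
    local $@;    # do not disturb an exception that is propagating
    $self->{cleanup}->();
}

package main;

my @log;

sub fred {
    my $guard = Scope::Guard::Sketch->new( sub { push @log, 'released' } );
    push @log, 'acquired';
    die "oops\n";    # simulate a failure part-way through
}

my $ok = eval { fred(); 1 };
push @log, 'died' if !$ok;

print join( ', ', @log ), "\n";    # prints "acquired, released, died"
```

The cleanup runs before control returns to the eval, which is what makes the routine exception-safe without an explicit release call on every exit path.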

Update: For more recent (and better) references on using Exceptions in Perl see: Re: eval to replace die? (Exceptions and Error Handling References)

Perl-specific Code Solidity Checklist

I've taken the liberty of extending the General Code Solidity Checklist above with some Perl-specific ones:

Sorry, I couldn't resist the last two. ;-) As you can see, there are many things to consider when writing solid Perl code!

Ideally, you should test your taint-safe code both with and without taint because taint mode has been a historical source of bugs and strange differences in behaviour. And you can do that easily enough via the Test::Harness prove command's -T switch. However, if you specify #!/usr/bin/perl -wT in a test script, make test will run it in taint mode only (anyone know of an easy way around this?). Update: see also Re: When not to use taint mode.
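To check what "with and without taint" means for a given piece of code, one option is to run it as a child process under each mode and inspect ${^TAINT}. The sketch below does this with a one-line probe; it is not an answer to the make test question above, and the probe and variable names are my own. It assumes a perl that supports the list form of pipe open on your platform.

```perl
use strict;
use warnings;

# One-liner run in the child: report whether taint checks are enabled.
my $probe = 'print ${^TAINT} ? "tainted" : "untainted"';

my %result;
for my $mode ( 'plain', 'taint' ) {
    # $^X is the perl running this script; add -T only for the taint run.
    my @cmd = ( $^X, ( $mode eq 'taint' ? '-T' : () ), '-e', $probe );

    open( my $pipe, '-|', @cmd ) or die "cannot run @cmd: $!";
    $result{$mode} = do { local $/; <$pipe> };
    close($pipe) or die "command failed: @cmd";
}

print "$_: $result{$_}\n" for sort keys %result;
```

In a real test suite the probe would be replaced by the test script itself, so that the same checks run once in each mode.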

For information on mod_perl safety, see mod_perl reference and chapter 6 (Coding with mod_perl in Mind) of Practical mod_perl (which is now available online).

Curating CPAN

See Also

  • Tutorials (PM Tutorials - search for Module)

Related CPAN Modules

More Refs Added Later


Updated June 2006: Added more recent references (e.g. PBP) and tools (e.g. Perl::Critic). Jan 2008: Added module naming advice from PBP. Aug 2009: Added new "Related CPAN Modules" section. Oct 2011: Added link to PrePAN. Nov 2012: Added link to Neil Bowers CPAN module reviews. Aug 2015: Added more references based on Improving the quality of my modules. Nov 2016: Added bullet point on dependencies. Oct 2018: Added refs to Release::Checklist and The Berlin Consensus. Feb 2019: Added new Dependencies section. Added Discipulus refs. Dec 2019: Added On the Naming of Modules Pause reference. Oct 2020: Added new "Mechanics of Creating a New CPAN Module" section. May 2021: Added Module Versioning Advice. Sep 2022: Improved Change Log advice based on replies to Module Change Log by Bod. Sep 2022: Added new "Curating CPAN" section. Sep 2023: "Mechanics of Creating a New CPAN Module" section updated from Re^2: testing/release advice [was: Stopping tests].


In reply to Writing Solid CPAN Modules by eyepopslikeamosquito
