Writing Solid CPAN Modules

Writing a rock-solid general-purpose CPAN module is hard. Very hard. After all, such modules are expected to work flawlessly in a wide variety of environments -- many of which the author may have no experience in.

To illustrate how hard it can be to write a module that works in many different environments, consider two recent examples. Schwern gushes here about the joy he derives from coaxing his lovingly crafted Test::More module to work faultlessly in multi-threaded environments -- even though he never uses threads himself. And in CHECK and INIT under mod_perl, Ovid reminds us that writing a CPAN module that works smoothly in a mod_perl environment is not trivial.

To help those CPAN authors less experienced than an Ovid or a Schwern, I've tried to provide some tips and links on improving CPAN module quality.

Choosing a Module Name

The Pause Module List gives some excellent advice which I won't repeat here. Update: the Pause Module List is no longer maintained nowadays. For current CPAN module naming advice see:

pause: On the Naming of Modules
01modulesindex
The "Map of CPAN" (mapofcpan.org) a visual map of all Perl CPAN modules (update: not sure current status of this)

I will offer a word of warning, however: if you trample on the global namespace, you will be flamed! Two recent attacks that spring to mind are: Unix-0.02 where the name Unix was chosen not because the module had anything to do with Unix, but because it was "Unix inspired" (fanning the flames, the author then published a rebuttal of the criticisms in his module's POD ... which led to more flames for perverted use of POD); and the unfortunate Acme::Util where the author, in a desperate attempt to flee the relentless cpanrating flames, moved his module from the global to the Acme namespace (though clearly it does not belong in either). So, save yourself a lot of pain and discuss your module name in a public forum (here at Perl Monks or the module-authors@perl.org mailing list, for example) well before you upload it to the CPAN. If you are introducing a new module in an area where others exist, please take the time to clearly describe how your module differs from them and why you wrote it.

The most relevant piece of naming advice from Perl Best Practices is practice 3.1: "Use grammatical templates when forming identifiers". For packages and classes, a suitable template is:

Abstract_noun
Abstract_noun::Adjective
Abstract_noun::Adjective1::Adjective2
[download]

For example:

package Disk;
package Disk::Audio;
package Disk::DVD;
package Disk::DVD::Rewritable
[download]

Module Reviews

If you're lucky, your module might be reviewed at cpan ratings or gav's CPAN wiki (update: broken link) or Mark Fowler's lovely Advent Calendar or Neil Bowers CPAN Module Reviews (2012 update) or even here in the Perl Monks Module Reviews section. You might even try posting a request for review at Simon's code review ladder (update: broken link). Update Oct 2011: A new PrePAN module review site is now up. Update 2022: Sadly, it's now gone (see Re: How do I package a module and related apps for successful upload to cpan and Good bye PREPAN by Aristotle). However, you're unlikely to gain much from these sources simply because performing a detailed, quality module review is very time consuming and few people have the time and inclination to do it.

A more practical alternative is to isolate small pieces of code from the module that you're unhappy with and post multiple small questions to Perl Monks. Much more likely to elicit a response than posting a 1000-line module for review.

Another approach is to review your own module, using the checklists below. When reviewing your own module, it's helpful to perform several reviews, each one from a different perspective: beginner user perspective, intermediate user perspective, expert user perspective, maintenance programmer perspective, support analyst perspective, and so on. Many programmers (including me) don't pay enough attention to the customer view of the system. I've found the simple act of pretending to be a first time user or a support phone jockey uncovers many ideas for improving quality.

Module Review Checklist

I'll start by listing the general areas that a module review might cover; those areas that I have an interest in, I'll discuss in a bit more detail later.

Module Interface and ease-of-use. See perlmodstyle and On Interfaces and APIs.
Module Versioning. See Module Versioning Advice.
Dependencies. Does the module contain unnecessary dependencies? Update: see new "Dependencies" section below.
Testability and Test Suite (see next section).
Documentation. Separate user versus maintainer documentation. Tutorial and Reference; Examples and Cookbook; Maintainer; How your module is different to similar ones.
Change log. Notes re portability, configuration & environment, performance, dependencies, bugs, limits, caveats, diagnostics, bug reporting. See also: Re: Module Change Log by Tux.
Stylistic issues: consistency, variable naming, indentation, magic numbers, commenting, long subs, global variables, ...
Review the module with regard to the 256 guidelines in Perl Best Practices.
Use a tool to test kwalitee (e.g. Perl::Critic, Module::CPANTS, Devel::Cover, Devel::SawAmpersand, Perl::MinimumVersion, Devel::Cycle, Test::Memory::Cycle, lint, valgrind, ...).
Use common and well-known idioms and programming practices.
Sniff out Code smells.
Portability.
Performance.
Robustness (see Solidity checklists below).
Security. See Re: Security techniques every programmer should know (Security References).
Error Handling. Document all errors in the user's dialect. Prefer throwing exceptions to returning special values.
Are edge cases handled properly?
Plays well with others. Don't use $&, $', $`, don't export anything by default, localize Perl globals, Makefile.PL evilness (e.g. phone home).
Avoid code and other duplication.
Simplicity and Clarity.
Generality and Extensibility.
Abstraction and Encapsulation.
Reuse and Decoupling.
Maintainability.
Supportability and Traceability.

Testability and Test Suite

Was the test suite written before, during or after the module? The benefits of Test Driven Development are well known and I won't further elaborate here.

Are there nicely commented tests covering the examples in the documentation? I strongly encourage this because: it acts as tutorial material for someone browsing the test suite; and it ensures the examples given in the documentation actually work.

How isolated/independent are the tests? Can they be run in any order? Are boundary conditions tested? Are errors and exceptions tested?

How maintainable is the test suite? How long does it take to run? Is it one monolithic script or broken into a number of smaller ones, one per functional area? Are Mocks/Stubs employed, where practicable, to test platform-specific features on all platforms?

Test suite code coverage (via Devel::Cover) should be at least 80% in my view. POD coverage should be 100% and is easily enforced via pod-coverage.t (auto-generated by Module::Starter):

use Test::More;
eval "use Test::Pod::Coverage 0.08";
plan skip_all => "Test::Pod::Coverage 0.08 required for testing POD co
+verage" if $@;
all_pod_coverage_ok();
[download]

Update (2009): Nowadays, you can use Test::XT to generate "best practice" author tests. See also this alias use.perl.org journal.

Dependencies

When should you depend on another CPAN module rather than write your own code?

Every module you add as a dependency is a module that can restrict your module -- if one of your module's dependencies is Linux-only, for example, then your module is now Linux-only; if another requires Perl 5.20+ so do you; if one of your dependencies has a bug, you also have that bug; if a new release of one of your dependencies fails, the likelihood of your release being unable to install increases; take care with dependencies having a different license to yours. Don't introduce dependencies lightly.
As noted at Release::Checklist, there are two types of modules: functional modules, like DBI, and developer convenience modules, like Modern::Perl. Don't add developer convenience modules as a dependency.
It's usually best to use popular, quality CPAN modules in complex domains (e.g. DBI and XML) rather than roll your own. Doing so allows you to leverage the work of experts in fields that you are probably not expert in. Moreover, widely used CPAN modules tend to be robust and have fewer bugs than code you write yourself because they are tested by more users and in many different environments.
For small and simple modules, on the other hand, such as slurping a file, rolling your own code is usually better than paying the dependency cost of an external module.
Cost vs Risk. Though using CPAN modules seems "free", there are hidden snags. What if your dependent module has a security vulnerability? What if the author abandons it? How quickly can you isolate/troubleshoot a bug in its code?
Quality and Trust. Before introducing a dependency, it's worth checking CPAN ratings, Kwalitee score, bug counts, how quickly are bugs fixed etc. Does it contain gratuitous/unnecessary dependencies? (the ::Tiny CPAN modules were a reaction against modules that seemed to haul in half of CPAN as dependencies).
Popularity. When you use a 3rd party module, you want it to be popular and widely supported; you want to be able to ask for advice on using it; you don't want it to die. Moreover, if your module depends on a very popular CPAN module, there's a good chance your module's users will already have it installed.

Update: the Judging the Quality of a CPAN module checklist at Re: Choosing the right module (Judging the Quality of a CPAN module References) might help you decide whether to add a 3rd party CPAN module as a dependency.

Mechanics of Creating a New CPAN Module

Start with: Re: What do I use to release a module to CPAN for the first time? by davido.

See also:

CPAN (perldoc) - query, download and build perl modules from CPAN sites
perlnewmod (perldoc) - preparing a new module for distribution
perlmodstyle (perldoc) - Perl module style guide (re making a dev release: if you want to release a 'beta' or 'alpha' version of a module but don't want CPAN.pm to list it as most recent use an '_' after the regular version number followed by at least 2 digits, e.g. 1.20_01)
CPAN (perldoc) (FAQ 12) - How do I install a "DEVELOPER RELEASE" of a module? By default, CPAN will install the latest non-developer release of a module. If you want to install a dev release, you have to specify the partial path starting with the author id to the tarball you wish to install, like so: cpan> install KWILLIAMS/Module-Build-0.27_07.tar.gz
CPAN::Meta::Spec (perldoc) - specification for CPAN distribution metadata (for release_status, if the version field contains an underscore character, then release_status must not be "stable")
perlpodstyle (perldoc) - Perl POD style guide
Release::Checklist - A QA checklist for CPAN releases by Tux
(Early version of proposed Release::Checklist by Tux)

Perl CPAN test metadata (mentions Oslo Consensus 2008, Lancaster Consensus 2013, Berlin Consensus 2015 and PTS Oslo 2018)
The Berlin Consensus Recommended standards of care for CPAN authors
notes from the review at PTS 2018 Oslo comments by Karen Etheridge et al
On the Naming of Modules (Pause)

What do I use to release a module to CPAN for the first time? by Lady_Aleena (replies mention Minilla)
github Tutorial on module creation by Discipulus (also contains links to many PM nodes, e.g. hippo's recent Basic Testing Tutorial)
collection of things that help making your CPAN module portable with regard to metadata by tinita

General Code Solidity Checklist

Programming languages, such as C++ and Java, tend to classify routines as:

unsafe (uses unprotected global or static state)
thread-safe (aka MT-safe)
async-signal-safe (reentrant)
exception-safe

(There is also async-cancel-safe, which is of little practical importance). Update: As pointed out by BrowserUk below, the above categories, though important when writing C extensions, are mostly irrelevant at the Perl level. I'd be interested if anyone could provide examples of where the above categories are relevant when writing pure Perl code.

Thread Safety

Update: please see BrowserUk's response below for thread safety advice at the Perl level.

It's not easy to determine if a piece of code is thread safe. Nor is it easy to write tests to prove its thread safety. However, armed with an understanding of thread safety, a careful examination of the code will prove fruitful. Note that thread safety and reentrancy (aka efficient thread safety) should be considered early (at the C level) since they often affect interfaces and are hard to retrofit later.

See perlthrtut for information on Perl thread safety. An interesting twist in making a Perl module thread-safe is working around core constructs and modules known to be thread-unsafe. For instance, Test::More v0.51_01 was changed to use a Perl sort block rather than a subroutine because sort subroutines are known to be thread-hostile (perl bug #30333 discussed in Perl threads sort test program crashes).

Exception Safety

In C++, the dominant exception handling idiom is RAII (Resource Acquisition is Initialization), which I much prefer to the Java Dispose pattern. Though RAII can't be used in full garbage collected languages, such as Java and Perl 6, it can work well in simple reference counted languages, like Perl 5.

To give a very simple example, this code:

sub fred {
    open(FH, "< f.tmp") or die "open error f.tmp";
    # ... process file here
    die "oops";   # if something went wrong
    close(FH);
}
eval { fred() };
if ($@) { print "died: $@\n" }
# Update: this is better, see: [id://11130946]
# my $ok = eval { fred(); 1 };
# if (!$ok) { print "died: $@\n" }

# oops, handle FH is still open if exception was thrown.
[download]

is not exception-safe because FH is not closed when die is called. A simple remedy is to replace the global FH with a lexical file handle (which is auto-closed at end of scope (RAII)):

sub fred {
    open(my $fh, "< f.tmp") or die "open error f.tmp";
    # ... process file here
    die "oops";   # if something went wrong
    close($fh);
}
eval { fred() };
if ($@) { print "died: $@\n" }
# ok, $fh is auto-closed when its ref count goes to zero
[download]

If you are stuck with Perl 5.005 and can't use a lexical file handle, IO::File or localizing FH with local *FH should do the trick.

Update: For more recent (and better) references on using Exceptions in Perl see: Re: eval to replace die? (Exceptions and Error Handling References)

Perl-specific Code Solidity Checklist

I've taken the liberty of extending the General Code Solidity Checklist above with some Perl-specific ones:

strict-safe
warnings-safe
taint-safe
mod_perl-safe
coverage-safe (Devel::Cover, Pod::Coverage)
cpants-safe (keep Module::CPANTS happy)

Sorry, I couldn't resist the last two. ;-) As you can see, there are many things to consider when writing solid Perl code!

Ideally, you should test your taint-safe code both with and without taint because taint mode has been a historical source of bugs and strange differences in behaviour. And you can do that easily enough via the Test::Harness prove command's -T switch. However, if you specify #!/usr/bin/perl -wT in a test script, make test will run it in taint mode only (anyone know of an easy way around this?). Update: see also Re: When not to use taint mode.

For information on mod_perl safety, see mod_perl reference and chapter 6 (Coding with mod_perl in Mind) of Practical mod_perl (which is now available online).

Curating CPAN

Curating CPAN sometimes means really deleting stuff by BooK
Re^2: How do I package a module and related apps for successful upload to cpan by PUCKERING (2022) - CPAN has some cruft; is there a process for cleanup?
Deprecating a CPAN Module by mikkoi (2019) - is there an official way to deprecate a module in CPAN?

See Also

Intro to TDD in Perl (pdf) by Grant McLean
Starting a module with Module::Starter (Perl Training Australia)
Writing Perl Modules for CPAN book by Sam Tregar (also available as free pdf at Online Web Library)
Guide to Creating Perl Modules by Ken Williams
Perl Module Mechanics by Steven McDougall
Making Perl Reusable with Modules by Andy Sylvester
Building Good CPAN Modules by Rob Kinyon
Making New Distributions by brian d foy (using Template Toolkit)
CPAN Top 100
Tom Hukins "Using CPAN's Toolchain to Improve Your Code" journal
Tom Hukins "Using CPAN's Toolchain to Improve Your Code" slides
Neil Bowers CPAN Module Reviews
PrePAN
CPAN Authors FAQ
About PAUSE
Identifying CPAN distributions you could help out with
Pull request ideas by Neil Bowers
Pull request etiquette by Neil Bowers
Curated list of awesome Perl modules
What should be in a CPAN distro README by brian d foy
cpan tutorials by Miroslav Tynovsky and choroba

Tutorials (PM Tutorials - search for Module)

Related CPAN Modules

Perl::Critic Critique Perl source code for best practices
Perl::Critic::Lax Policies that let you slide on common exceptions
Perl::Tidy Perl source code formatter
Devel::Cover Code coverage metrics for Perl
Devel::NYTProf Perl source code profiler
Pod::Spell a formatter for Spellchecking Pod
Dist::Zilla CPAN distribution builder
Module-Release Automate Software Releases
Test::XT Generate best practice author tests
Pod::Coverage Checks if the documentation of a module is comprehensive
Module::Starter A simple starter kit for any module
Module::Setup
Module::Install
Module::Build
Devel::SawAmpersand Perl extension querying PL_sawampersand variable
Devel::Cycle Find memory cycles in objects
Test::Memory::Cycle Check for memory leaks and circular memory references
Test::Pod::Coverage Check for pod coverage in your distribution

Perl::MinimumVersion - Find a minimum required version of perl for Perl code
perl 5.6.0 perldelta - e.g. see when our was introduced
perl 5.10.0 perldelta - e.g. see when say was introduced

Module::CPANTS CPAN Module Testing
Module::CPANTS::ProcessCPAN Generate Kwalitee ratings for the whole CPAN
Module::CPANTS::Analyse Generate Kwalitee ratings for a distribution
CPANTS Testing Service
Perldoc::Server
CPAN::Mini
CPAN::Mini::Extract
CPAN::Mini::Inject
CPAN::Mini::Webserver
CPAN::Mini::Devel
CPAN::Mini::FromList
CPAN::Mini::Growl
CPAN::Mini::Indexed
CPAN::Mini::Visit
CPAN::Uploader
CPAN::Upload::Tiny

Carton
plenv (github)
carton (github)

cpan (cpan command)
App::perlbrew (perlbrew command)
App::cpanminus (cpanm command)
CPANPLUS (cpanp command)

More Refs Added Later

Re: [RFC] Module code and POD for CPAN by me responding to Bod (April 2021) - list of basic refs for new CPAN authors
Perl Weekly Issue 642 by Mohammad S. Anwar - Neil Bowers blog posts on CPAN uploads
What happens when you upload to CPAN (NeilB blog)

CPAN Ratings... by Bod (2023) - getting started with github
Re^3: GitHub PR (was: Re^4: CPAN Ratings...) by choroba (2023) - common workflow: create a fork, clone repository to a working copy on your machine, make changes, run the tests
Re^4: testing/release advice [was: Stopping tests] by stevieb (2023) - uses github actions to perform automated CI testing on every push to the repo

perlfaq candidate: "How do I adopt an abandoned module ? " by syphilis (2007)
Social CPAN : Finding the best and most popular modules by jannino (2009)
CPAN Avatars by Sixtease (2008)
ExtUtils::MakeMaker : delcare a binary dependency by bliako (2023)

GitHub PR (was: Re^4: CPAN Ratings...) by Bod (2023)
CPAN Testers by Bod (2023)
Please review documentation of my AI::Embedding module by Bod (2023)
What is ignore.txt by Bod (2023)
Deciding dependency versions by Bod (2023)
Stopping tests by Bod (2023)
Finding when a feature or keyword was introduced by Bod (2023) - note Perl named operators and builtin subs introduced since 5.6 response by ikegami

Trying to build a module for upload to CPAN by talexb (October 2021) - his experiences uploading a new module to CPAN after 20 years
Help with PAUSE mechanics - replacing a bad module by PUCKERING (2022) - lots of advice on how to version his first CPAN module
How do I package a module and related apps for successful upload to cpan by PUCKERING (2022) - lots of advice on choosing a good name for a CPAN module
Modulino to report ip address changes and Unable to release modulino to CPAN by davies (2022) - advice sought on a new App::ipchange Modulino-based CPAN command
How to report bugs in CPAN modules? by Xilman (2023)

Writing my first module by harangzsolt33 (2023)
Re: Writing my first module by haukex (2023)
How to create and name module tarball for CPAN by Polyglot (2023)

How to make distribution for simple script with CPAN dependencies? by peteyboy (2024) - how to create a CPAN distro that installs both a module and a script
App::FatPacker - pack your dependencies onto your script file
Changes in MooX::Role::Parameterized by choroba (2024)

Best practices for local libraries
local::lib create and use a local lib/ for perl modules with PERL5LIB
CPAN Namespace Navigator by Discipulus
How old is too old? by Bod
Author tests or standard tests? by nysus (t/ vs xt/, author/extra tests in xt/ - release and other non-install-time tests, subdirectories optional)
Section proposal: Best CPAN Modules by ptum
The problem of documenting complex modules. (2015) by BrowserUk
Leveraging the Gitpan repos by davido
Git::CPAN::Patch by Yanick

Package manager (wikipedia)
Configuration management (wikipedia)
Dependency hell (wikipedia)
Coupling (wikipedia)
Software update managers (wikipedia)
Service pack (wikipedia)

Considered: rinceWind "Edit: promote to tutorials"
Unconsidered: castaway Keep/Edit/Delete: 12/41/0 - We let the author decide if a node is a tutorial.

Updated June 2006: Added more recent references (e.g. PBP) and tools (e.g. Perl::Critic). Jan 2008: Added module naming advice from PBP. Aug 2009: Added new "Related CPAN Modules" section. Oct 2011: Added link to PrePAN. Nov 2012: Added link to Neil Bowers CPAN module reviews. Aug 2015: Added more references based on Improving the quality of my modules. Nov 2016: Added bullet point on dependencies. Oct 2018: Added refs to Release::Checklist and The Berlin Consensus. Feb 2019: Added new Dependencies section. Added Discipulus refs. Dec 2019: Added On the Naming of Modules Pause reference. Oct 2020: Added new "Mechanics of Creating a New CPAN Module" section. May 2021: Added Module Versioning Advice. Sep 2022: Improved Change Log advice based on replies to Module Change Log by Bod. Sep 2022: Added new "Curating CPAN" section. Sep 2023: "Mechanics of Creating a New CPAN Module" section updated from Re^2: testing/release advice [was: Stopping tests].

Comment on Writing Solid CPAN Modules Select or Download Code

Replies are listed 'Best First'.
Re: Writing Solid CPAN Modules by BrowserUk (Patriarch) on Jan 03, 2005 at 12:19 UTC
Thread Safety Unfortunately, almost everything you mention regarding thread-safety is of little use to a perl module writer. The wikipedia entry for thread safe, discusses it in terms that would be useful to a C programmer, but that are, for the most part, non-useful to a Perl programmer. Reentrancy. This is almost irrelevant to a Perl programmer. In C, reentrancy can (sometimes) be achieved by only using "atomic opreations" on data. You can usually saftely increment and decrement a shared integer without applying locking because these operations are single opcodes and so the entire operation is started and completed on a given thread before the scheduler gets a chance to interupt. All Perl's datastructures are "fat". Which is to say, each Perl variable, even the humble scalar contain not just the user's value, but also a large amount of "system state". Eg. An SV (can) consist of an IV, an RV, a PV and a CV plus multiple forms of magic. All of these various pieces that sit behind a Perl scalar (mostly hidden from the Perl programmer and are, for the most part, beyond his control), must be maintained in a coherent state. However the Perl level operations that manipulate and maintain this state on behalf of the user consist of 10s, 100s or even 1000s of machine-level opcodes, and the scheduler can interupt a thread in between any two of these. Further more, many of the C-runtime calls used by Perl in the actioning of Perl-level operations are themselves non-reentrant. These are entirely beyond the ability of the Perl programmer to affect or control. As such, the entire concept of reentrancy is completely lost on a Perl programmer. Mutual exclusion. Whilst the threads::shared module does provide access to a number of OS level functions cond_. These are for the most part pretty useless to the Perl programmer. The underlying APIs upon which they are based is intended for use from C-level code where the cond_var* and lock_var parameters are themselves C-style `unsigned integer` variables--Ie. pieces of data which can be manipulated using cpu-atomic operations. The Perl view of these APIs substitutes Perl scalars--which are very much not cpu-atomic storage--which makes the use of these APIs fraught with problems. Exception safety. Since Perl 5.8.0, perl has had "safe signals"--ie. signals are caught by the Perl runtime and deferred from interupting the user-level code until that code reaches the end of the current Perl-level operation. Whilst this works pretty well for single threaded applications--although the, possibly indefinite, delay between the signal being sent and the Perl-level code noticing it, somewhat detracts from the utility of the whole concept of signals--their use in multi-threaded Perl code is very confused. What happens if multiple threads install a signal handler for the same type of signal? When the signal comes, which thread gets control? How should signals be propagated between threads? How are multiple signals handled? That is, if the runtime detects one signal (type) and then receives a second, disparate signal prior to the user code having reached the end of the current Perl-level operation, which one of these signals gets propagated? perlthrtut (Link currently broken (again) from where I am, but maybe it'll be fixed again soon!) Whilst there is nothing (that I've noticed) in this document that is factually inaccurate, much of the content is of little use to a Perl programmer who is either trying to use threads in a stand-alone script, and less so if he is trying to code a Perl module such that it can safely be used from within threaded Perl scripts. Mosty of the content is only useful to that very small band of Perl authors trying to write modules that would useable *across iThread boundaries!* That is to say, modules that use threads within themselves without the user needing to be aware that threads are being used. I know of no existing modules publically available that fit into this category. modules that are intended for use from multiple threads but that require those threads to communicate in order for the module to perform it's primary role. A good (the best) example of this is Thread::Queue. These modules are rare, and thanks to the extremely clever design and implementation of iThreads, the need for threads that do this is even rarer. So far in my experiments, due to the way iThreads has been implemented, the vast majority of existing modules, including those written prior to the existance of Threaded Perl, and that as far as I am aware, have never been modified to make them "thread-safe", are perfectly usable and useful with iThreads--provided they are used in the correct way! This is where the existing documentation is almost completely lacking! There is almost nothing that provides advice on how threads can be safetly used with existing modules. As such, perlthrtut acts almost as anti-documentation. It is so full of apparent caveats, hidden traps, warnings, history and discussion of intricate details that are, for the most part, completely irrelevant to the average Perl programmer wishing to use threads, that it serves only to put people off. It is almost totally lacking of any discussion of: Why people might consider using iThreads. What benefits they might get from doing so. When iThreads are likely to be beneficial (and when not). How to structure programs to use iThreads effectively. Comparative discussion of iThreads with alternatives and when which is most applicable. The bug you found Perl threads sort test program crashes. I'm not sure what the official resolution of the bug you found was--nor actually if it was ever resolved as you didn't provided a link to the perlbug #30333 and I've forgotten where to go to look them up. In any case, as far as I can remember, you have to have a logon at the relevant web site just in order to view the bug description or find out whether it has been resolved--which always seemed pretty silly to me. However, I am prepared to go out in a limb and guess that if the problem has been resolved, the resolution required a patch to the Perl sources and was completely beyond your control as a Perl programmer? As such, I am not really sure of the efficacy of citing that bug in this context. It doesn't help any module author in creating a thread-safe module? I should also consider the possibility that the resolution to your bug was a piece of "Don't do that!" advice. If this is the case, then I see nothing in perlthrtut, nor threads pod, nor threads::shared pod that appears to related to the issue in your post, nor any post in that thread that identifies the underlying problem and solution? The suggestion that Perl should start to specify which functions and modules are thread-safe is flawed. The cited example, Tie::File, is perfectly "thread-safe"--if used in a thread-safe way. The problem with the example code in that post is that it makes no attempt to use threads::shared. It doesn't share the tied array; makes no attempt to perform any locking. It could never do anything useful as coded! In fairness to pg, Perl threads, and particularly iThreads were very new back then. I don;t think that even that little documentation that is available now was available back then; there was an almost complete absence of example code available. He went on to do some very useful things with threads. However, your citing that post in this meditation is less forgiveable. And so is the suggestion that module authors should make this consideration when writing modules. For the most part, module authors have no need to consider (i)threading when writing their modules, and to state that they should do so, and "certify" their modules as "thread-safe" will have two effects--both of them negative: Those authors that have neither the need nor the desire to use threads themselves will see no benefit to themselves in attempting to consider threading in the design and implementation of theor modules. The likely effect will be that they will either ignore the directive, or more insidiously, dodge the issue completely and simply label it a "Not thread safe". For many (most) modules, this will be incorrect, as most modules can be used with threads, if used in the right way--or rather, not used in the wrong way. Those (potential CPAN) authors that have modules that work for them and consider releasing them to CPAN, will see the directive for certifying thread-safety, take a look at perlthrtut, get panicked by the caveats and dire warnings it contains and decide to withold their modules from release as they are put off by the need to try and meet this "requirement". Which for the vast majority of modules would be completely unnecessary. So, whilst I understand your motivations for trying to pursuade people to consider thread-safety when writing modules, the information and examples you cite in support of your intentions, are flawed to the point where I think that they will do more harm than good. The best way to promote threading in Perl would be to provide some documentation that gives some practical examples of using them as "patterns" for potential users. A necessary part of this would be to show some good examples of how existing (pre-threading) modules can be safetly used with threads provided a few simple rules of thumb are followed. To promote the writing of new modules that can be used in conjunction with threads, then an addition of a section to one or more of the module tutorials that outlines those few things that author should not do if they want their modules to be usable in conjunction with threads, would be a big step forward. To promote the writing of modules that (transparently) use threads to enhance their utility--asynchronous reading/writing of files; providing user feedback during long running/cpu intensive operations; etc. This would require (probably) a completely new document aimed specifically at the authors of this (rare) type of module. Modules of this type probably belong in a namespace that identifies them as requiring threads for their operation. Unfortunately, the mixed up and confusing conflagration of the original Thread::* namespace (which currently contains a raft of modules which were either written to work with the original and flawed pThreads implementation, or those written to bypass the threads module completely utilising liz's forks module; combined with the dubious choice of a lower-case, "pragma" namespace and the strange "rules" that surround these, means that chosing a suitable namespace for modules of this type is impossible. If you place them in the Thread::* namespace, they will most likely be ignored as old, pthreads modules that are no longer useful. If you place them in the threads::* namespace (where IMO they belong), the self-appointed namespace nazis will jump all over you for ignoring (their) rules. This is a no-win situation that will only be resolved by: Discarding/renaming the irrelevant modules from the Thread::* namespace--but purging the pthreads association is likely impossible. Opening up the threads::* namspace for iThreads dependant modules--my prefered option--but unlikely to get past the "pragmas are for core modules only" lobby or the "threads are spawn of the evil empire and the total antithisis of Perl unix roots" brigade. Create a new namespace (say Ithreads::) for such modules--likelyhood of acceptance based on what I have read = 0. Forget the idea until CP6AN comes into being and hope that the underlying threading model, and the clean slate of the namespace conventions will afford a suitable place for thread-dependant modules to be placed. Please don't take this as a personal attack--I applaud the intent of your meditation. Most of my critisms are not aimed at you, nor indeed, any* individual or group. The main thrust of my critism is aimed at the situation in general, that has been arrived at through no particular plan or agenda. It's just where we happen to have ended up! Examine what is said, not who speaks. Silence betokens consent. Love the truth but pardon error.	[reply] [d/l]
Re^2: Writing Solid CPAN Modules by eyepopslikeamosquito (Archbishop) on Jan 03, 2005 at 21:52 UTC
As should be obvious from his response, the BrowserUk is much more experienced with Perl threads than me and I thank him for his prompt and detailed corrections. The only place where I disagree with him is described below. I'm not sure what the official resolution of the bug you found was--nor actually if it was ever resolved as you didn't provided a link to the perlbug #30333 and I've forgotten where to go to look them up. In any case, as far as I can remember, you have to have a logon at the relevant web site just in order to view the bug description or find out whether it has been resolved--which always seemed pretty silly to me. However, I am prepared to go out in a limb and guess that if the problem has been resolved, the resolution required a patch to the Perl sources and was completely beyond your control as a Perl programmer? As such, I am not really sure of the efficacy of citing that bug in this context. It doesn't help any module author in creating a thread-safe module? AFAICT, it's still not fixed. As described here, Dave Mitchell proposed a patch which seems to have been rejected on performance grounds. My original sort test program, used to demonstrate the bug, still crashes with perl 5.8.6. After I found this bug, I searched the whole Perl core source tree for use of this rare sort construct and the only place I found that used it was Test::More. Hence the simple fix to work around it. BTW, I found this bug when using LWP (which has also been patched in libwww-perl-5.801). I should also consider the possibility that the resolution to your bug was a piece of "Don't do that!" advice. It was. The crashing perl core construct is rarely used and easily worked around. IMHO, it's unacceptable for heavily used modules, like LWP and Test::More, to crash when run on earlier versions of Perl. Perl module authors working around core bugs is akin to application developers working around OS bugs to ensure their application does not crash when run on old versions of an OS (which reduces support costs ... plus most customers prefer an app to just work rather than suffering a nasty OS crash, logging a support call, only to be told to upgrade their OS on possibly thousands of servers). Update: To clarify the sort bug in question, if you write, for example: `sub mycmp { length($b) <=> length($a) } sort mycmp @list;` [download] your code is not currently thread-safe, yet if you write instead: `sort { length($b) <=> length($a) } @list;` [download] your code is thread-safe all the way back to Perl 5.8.0. It's the lightweight calling mechanism employed with sort subs that's at fault. You might say it's ridiculous that the Perl programmer needs to worry about such things, yet if you want your module to be thread-safe all the way back to perl 5.8.0 you must avoid this construct. Hopefully, this bug will be fixed soon. As for the root cause, I quote Dave Mitchell: This is deeply, deeply, not thread-safe. It's supposed to turn the leavesub op temporarily into a null op: the sort sub is invoked via a lightweight mechanism that doesn't require a leavesub at the end; but since the op tree is shared between threads, the other thread may have already done the same and then restored it, leading to leavesub being erroneouly called, and general corruption ensuing. Update: I'm delighted to report that Replacing closures (to work around threads crash) is now fixed in Perl 5.8.6 (see change 23499 by Steve Hay for more detail). (Further update: sorry, it seems this bug was fully fixed in 5.8.7, not 5.8.6). While I'm happy to work around the rarely used sort construct described above, I am definitely not keen to work around the much more widely used closures. So, if you have a multi processor machine and are using threads and closures, you better upgrade to perl 5.8.7. BTW, though I'm far behind the BrowserUk in general ithreads knowledge, I'm rapidly becoming an expert at debugging Perl threads crashes at the C level. ;-) Which is not all bad, no, really it's not, without that I would never have had a reason to learn about perl internals.	[reply] [d/l] [select]


Your skill will accomplish what the force of many cannot
	PerlMonks