http://www.perlmonks.org?node_id=905878

Over the last few years, I've helped build private CPANs (DarkPANs or DPANs as brian d foy calls them) for 3 different organizations. Each time I cobbled together some combination of different CPAN::Site, CPAN::Mini::Inject and CPAN modules with various shell scripts, commit hooks, and cron jobs. Although they were generally effective, I feel they were clunky, highly specialized, and hard to maintain.

Once again, I'm faced with building another private CPAN. But this time, I have an opportunity to build something that could have broader appeal in the Perl community. In fact, the explicit goal is to produce an open source, turnkey framework for creating, deploying, and maintaining a private CPAN.

So with that in mind, I'm looking for feedback on what you might want from such a framework. Here are some questions I've been asking myself -- hopefully they will help stir your mind:

Thanks for sharing your thoughts!

Jeffrey Thalhammer
Imaginative Software Systems

Re: RFC: Private CPAN In A Box
by grantm (Parson) on May 20, 2011 at 23:54 UTC

    I guess a private CPAN might be useful in a network of disparate systems. I'm fortunate to not work in such an environment. The only time I download something from CPAN is when I'm building a .deb from it. This activity would take place on a dev server and never on a production server. The advantage of using the system's native packaging is that you can set up dependencies on things that aren't CPAN modules and deployment is a breeze.

    I'm not suggesting your proposal is a bad idea - just not something I'd use.

Re: RFC: Private CPAN In A Box
by JavaFan (Canon) on May 20, 2011 at 22:05 UTC
    How would teams use it to distribute and share their own modules and applications within the organization?
    I wouldn't have it based on CPAN. For code written inside the organisation, I'd use a source control system. Deployment, I'd either distribute using the source control system, or a package system that's appropriate for the OS (for instance, rpms + yum + cfengine) and the organization. And I'd use the same package system to distribute external packages.

    Frankly, I wouldn't know what to use a "private CPAN" for. What's the point, and why would an organization want to limit itself to a system that's geared to a single language, and is designed to do quite different things than corporations need?

      I wouldn't have it based on CPAN. For code written inside the organisation, I'd use a source control system. Deployment, I'd either distribute using the source control system, or a package system that's appropriate for the OS

      A private CPAN does not preclude you from using a source control system for deployment. Nor does it preclude you from using a more general packaging system (like RPMs) to distribute your code. These are complementary technologies that fit around a private CPAN.

      What's the point, and why would an organization want to limit itself to a system that's geared to a single language, and is designed to do quite different things than corporations need?

      In my experience, the purpose of a private CPAN is to enable organizations to leverage the CPAN tool chain for managing the dependencies between their own modules and their third-party libraries (i.e. the public CPAN).

      I'm not suggesting that everyone *should* use a private CPAN, especially if they are already comfortable with their dependency management infrastructure. However, there are a significant number of organizations that don't manage their Perl module dependencies well. Often, this leads to application failures, unnecessary development costs, and general chaos.

      Private CPANs have started to emerge as one possible solution for managing Perl module dependencies. But the current tools for creating, maintaining, and using private CPANs are very fragmented. Moreover, the patterns for using those tools are not well established. My goal is to assimilate the existing tools and knowledge (and perhaps some new tools and knowledge) into a coherent product.

      So this thread certainly isn't relevant for everyone. But if your production system has ever crashed because the team down the hall decided to upgrade their app to the latest Catalyst, or if you have to debug in production because you can't reproduce the exact same system somewhere else, or if you've ever had to force-install a module with failed tests, then I want to hear from you.

      Jeffrey Thalhammer
      Imaginative Software Systems

        In my experience, the purpose of a private CPAN is to enable organizations to leverage the CPAN tool chain for managing the dependencies between their own modules and their third-party libraries (i.e. the public CPAN).
        But the CPAN toolchain is actually really bad at managing dependencies. Sure, it allows the author to signal a dependency on another CPAN module, but it isn't well suited to handling dependencies on non-CPAN modules. OS package systems (including those ported to a range of OSes, like RPM) and distribution systems like cfengine (which allows specifying dependencies based on the role of a box) are far superior to CPAN for that task.
        Nor does it preclude you from using a more general packaging system (like RPMs) to distribute your code. These are complementary technologies that fit around a private CPAN.
        Pray tell me, if I'm already using RPMs, what additional value does a private CPAN give me?
Re: RFC: Private CPAN In A Box
by sundialsvc4 (Abbot) on May 23, 2011 at 13:25 UTC

    Jeffrey, we’ve all heard from Brian with regard to his experiences on this matter ... he wrote a book on it, after all ... but what about yours?   Since you have done this “for three different organizations now, and soon to be four,” I would call that serious boots-on-the-ground experience.   (Not to imply that Brian doesn’t have the same ... we know better.)   The foot-soldier that I would really like to hear from now ... is you.

    (Hang on a sec, let me grab my bag of popcorn ...)   To start with, how would you answer your own questions?   What worked, and what didn’t?   What would you count as “a mistake, not to be repeated,” and what as “a goodness, that I would like to improve upon, or at least, manage to do again?”

Re: RFC: Private CPAN In A Box
by thargas (Deacon) on May 24, 2011 at 12:04 UTC

    I've found that what I really want is the ability to install, centrally, multiple versions of perl modules and have a way for scripts to point at which version of each (non-core) module they want. It is a pain having to enumerate the version of each module you want, but without this, upgrading any module is a potential downtime.

    Even if all modules signaled broken backwards compatibility by the version number, which they don't, it wouldn't help this problem, because all the existing scripts that were written for the old version of a module may break when you upgrade it. You can find most of these scripts by searching for "use SomeModule", but then someone will be clever and load the module at runtime with require... Until perl can (out of the box) deal with this, it's a dangerous thing to use in a large scale environment. Don't get me wrong, I prefer perl, but I'm also sure that anyone who has used perl in an enterprise environment has been bitten by this problem. Things like PAR can help, but they require more effort to package the script for deployment.

      Until perl can (out of the box) deal with this, it's a dangerous thing to use in a large scale environment.

      The out-of-the-box solution is to have per-application module bundles, or perl installs. So each application that requires many, potentially incompatible prerequisites gets its own local::lib. It's simple:

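          # Installer script: put the prerequisites in a per-application directory
          $ENV{PERL_MM_USE_DEFAULT} = 1;
          $ENV{PERL_MB_OPT} = '--install_base /some/path/appxyzv23/perl5';
          $ENV{PERL_MM_OPT} = 'INSTALL_BASE=/some/path/appxyzv23/perl5';
          use strict;
          use warnings;
          use CPAN;
          CPAN::Shell->install(qw[ AUTHOR/Module-1.23.tar.gz ... ]);
          ...
          # In the application: INSTALL_BASE places modules under lib/perl5
          use lib '/some/path/appxyzv23/perl5/lib/perl5';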

      There is also the only module ("only - Load specific module versions; Install many").

      Hmmm... I was wondering about that, lately.   You see, I inherited a big, ugly application that does a lot of RESTful calls.   To its credit, there is an extremely consistent logic to the way those calls are done, and the implementation is as orthogonal and “tight” as one might wish for ... but it’s mod_perl, and it’s big.

      So, one of the things that I did to it ... and now I’m wondering about the “goodness” of it ... is to require each of the REST handlers when, and if, a call to that handler actually comes in.   (All of the possible handlers originate from exactly one name-space, so it isn’t like we don’t have a complete list of what modules could be invoked.)   The idea is to demand-load those handlers which might not be routinely used in every lifespan of a particular Apache process, in hope that these processes might use considerably fewer megabytes apiece than they once did.

      Did I sin?   Must I do penance?
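
      For readers who have not seen the pattern, the demand-loading idea described in this reply can be sketched roughly as follows. This is only an illustration under stated assumptions, not the application's actual code: the namespace MyApp::REST, the handler name Ping, its handle method, and the dispatch function are all hypothetical. An @INC hook stands in for handler files on disk so that the sketch runs on its own with core Perl.

```perl
use strict;
use warnings;

# Hypothetical namespace that every REST handler lives in (an assumption).
my $HANDLER_NS = 'MyApp::REST';

# Stand-in for handler files on disk, so the sketch is self-contained:
# an @INC hook that serves the source of one hypothetical handler module.
my %fake_files = (
    'MyApp/REST/Ping.pm' =>
        "package MyApp::REST::Ping;\nsub handle { return 'pong' }\n1;\n",
);
push @INC, sub {
    my (undef, $file) = @_;
    return unless exists $fake_files{$file};
    open my $fh, '<', \$fake_files{$file}
        or die "in-memory open failed: $!";
    return $fh;
};

# Load a handler only when a request for it arrives, then call it.
sub dispatch {
    my ($name, @args) = @_;

    # Accept plain identifiers only, so a request cannot name an
    # arbitrary module or path.
    die "bad handler name\n" unless $name =~ /\A[A-Za-z_]\w*\z/;

    my $class = "${HANDLER_NS}::$name";
    (my $file = "$class.pm") =~ s{::}{/}g;

    # require consults %INC, so a handler is compiled at most once per
    # process; it dies if the module is missing or fails to compile.
    require $file;

    return $class->handle(@args);
}

print dispatch('Ping'), "\n";
```

      One trade-off worth keeping in mind: modules loaded after Apache forks are compiled separately in each child process, so their memory is not shared the way preloaded modules can be.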

Re: RFC: Private CPAN In A Box
by Argel (Prior) on May 25, 2011 at 18:33 UTC
    The only real use I could see is to have one system that mirrors the public CPAN and then have the rest of the internal systems use the private CPAN. Not sure if I would go much further than that. As JavaFan mentions, for our Red Hat systems we would create packages and use Red Hat's Satellite Server technology to push them out.

    Elda Taluta; Sarks Sark; Ark Arks