Size of CPAN

by tomazos (Deacon)
on Sep 27, 2005 at 14:08 UTC ( #495379=perlquestion )

tomazos has asked for the wisdom of the Perl Monks concerning the following question:

Motivated by The State of the Onion 9 and the ignorance of the Slashdot reaction, and as <joke>self-appointed Perl PR guy for the next 20 seconds</joke>, could someone please tell me the total number of lines of Perl code contained within the CPAN archive?

I recommend using this number as an advocacy tool.

Update: Good work itub. I think we should ignore the man-hour calculations which are a bit of an eye-roller.

The answer is over 15,000,000 lines of code - and it's all freely available to reuse. Isn't that cool?
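
For anyone who wants a rough feel for how a figure like this could be produced, here is a minimal Python sketch. It is not the method behind the number above (the thread only mentions sloccount), and every name in it is hypothetical. It assumes the distributions have already been unpacked under one directory, treats files ending in .pm, .pl, or .t as Perl source, and counts lines that are not blank, not comment-only, and not inside POD.

```python
import os
import re

# Assumption: these suffixes identify Perl source files.
PERL_EXTENSIONS = (".pm", ".pl", ".t")
# A POD block starts at a line beginning with "=" followed by a letter.
POD_START = re.compile(r"^=[A-Za-z]")


def count_code_lines(path):
    """Count lines that are not blank, not comment-only, and not POD."""
    count = 0
    in_pod = False
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if in_pod:
                # Stay in POD until the closing =cut directive.
                if line.startswith("=cut"):
                    in_pod = False
                continue
            if line.startswith("=cut"):
                continue  # stray =cut outside a POD block
            if POD_START.match(line):
                in_pod = True
                continue
            if stripped in ("__END__", "__DATA__"):
                break  # nothing after these markers is code
            if not stripped or stripped.startswith("#"):
                continue  # blank or comment-only (includes the shebang)
            count += 1
    return count


def count_tree(root):
    """Sum code lines over every Perl file found below root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(PERL_EXTENSIONS):
                total += count_code_lines(os.path.join(dirpath, name))
    return total
```

Calling count_tree on the unpack directory returns a single integer. The rules are deliberately naive (heredocs, XS code, and scripts without a suffix are not handled), so a dedicated tool will give different numbers.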


Andrew Tomazos

Replies are listed 'Best First'.
Re: Size of CPAN
by itub (Priest) on Sep 27, 2005 at 14:29 UTC
Re: Size of CPAN
by dragonchild (Archbishop) on Sep 27, 2005 at 15:10 UTC
    I think we should ignore the man-hour calculations which are a bit of an eye-roller.

    Really? And why should that be? Let's take a look at the hours I've spent on Excel::Template.

    • I wrote E::T as a fork of PDF::Template, which I had taken over from Dave Ferrance. I would estimate he had spent between 100 and 250 hours on the 0.05 version. My initial release took 4 weeks to test, which is 160 hours, plus about 40 hours of design help and testing from a coworker. (300-450 hours)
    • The actual forking of E::T took about 4 days, resulting in an initial release cost of 30 hours. (330-480 hours)
    • I've released 24 updates to E::T. The minimum amount of work to release an update to a CPAN module is 2 hours. I generally put in about 5-20 hours per release, averaging about 10 hours per release. So, that's 12 hours per release, on average. So, 24 * 12 is 288 hours. (618-768 hours)
    So, let's call it 600 hours of work put into Excel::Template. At a measly $25/hr (which is what sloccount appears to use), that's $15,000. For one distribution. CPAN has over 7000 distros. If we assume E::T has a low-to-average cost (which I would say is about fair, given the Acme namespace vs. DBI or CGI), then that's a minimum of $105 million. Hmm ...
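
    The back-of-the-envelope arithmetic above can be written out as a small Python sketch. The function name is hypothetical; the inputs (600 hours, $25/hr, 7000 distributions) are the figures quoted in this post, and the result is only as good as the assumption that every distribution costs about as much as this one.

```python
def cpan_cost_estimate(hours_per_distro, hourly_rate, distro_count):
    """Return (cost of one distribution, cost of all distributions)."""
    cost_per_distro = hours_per_distro * hourly_rate
    return cost_per_distro, cost_per_distro * distro_count


# Figures taken from the post: 600 hours, $25/hr, 7000 distros.
per_distro, total = cpan_cost_estimate(600, 25, 7000)
print(f"per distribution: ${per_distro:,}")  # $15,000
print(f"all of CPAN:      ${total:,}")       # $105,000,000
```

    Changing any one input shows how sensitive the headline number is to the per-distribution estimate.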

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      I was not suggesting in any way, shape, or form that contributions to CPAN don't have monetary value.

      The point is that it is a statistical calculation.

      People are very wary of marketing information of any kind.

      If you say CPAN is worth $600 million, many readers will be confused - and then go about their day unmoved.

      Many of those who aren't confused will be skeptical about the calculation method (not bothering to find out where the figure came from, of course) - and then go about their day unmoved.

      On the other hand, 15,000,000 lines of code is a hard fact. Immediately obvious and tangible. Hits you straight between the eyes. Bam. :)


      Andrew Tomazos
Re: Size of CPAN
by randyk (Parson) on Sep 27, 2005 at 14:29 UTC
Re: Size of CPAN
by chester (Hermit) on Sep 27, 2005 at 14:18 UTC
    From the CPAN FAQ, this link gives the current size of the CPAN. (However, only in bytes, not in lines of code.)
      Isn't that in bytes?

      Developers usually talk about N lines of code.

      It would be more useful to say CPAN contains over 1,000,000 (example) lines of freely available source code that...


      Andrew Tomazos
Re: Size of CPAN
by LTjake (Prior) on Sep 27, 2005 at 20:15 UTC

    Hey, that's a totally cool statistic -- from the 50,000-foot level. And, at a glance, that may be all we would want to look at.

    Looking a little closer, here are a few questions:

    • is 15,000,000 lines of code better than, let's say, 14,000,000 lines of code?
    • what percentage is actively maintained?
    • what percentage is duplication (modules that accomplish the same task)?
    • what percentage is "production ready"?
    • ...and more

    CPANTS tries to answer some of these questions with Kwalitee metrics (and succeeds to some degree). As for the others, I'm not sure we can find an answer -- and maybe it's not important -- I'm just looking at it from a different perspective.


    "Go up to the next female stranger you see and tell her that her "body is a wonderland."
    My hypothesis is that she’ll be too busy laughing at you to even bother slapping you." (src)

      All valid questions.

      It is often commented that CPAN is one of the most impressive parts of "Perl" (along with the flexibility of the language, and the size and supportiveness of the community).

      CPAN is one of Perl's biggest selling points and differentiates it significantly from alternative platforms.

      To understand the value of the 50,000-foot approach, consider the following scenario:

      • You are trying to get a skeptical programmer interested in finding out more about CPAN.
      • They only know a little about Perl and CPAN.
      • You only have time to say eight words to them.

      What is the optimal thing to say in those eight words?

      The best answer I can think of is "CPAN offers 15,000,000 lines of free code."

      The point is that although there is more to the story, people don't always have time to invest in listening to it.


      Andrew Tomazos
Node Type: perlquestion [id://495379]
Approved by blazar
Front-paged by monkfan