|Perl: the Markov chain saw|
I don't really trust any of those numbers, although the basic conclusion is probably right.
Simply counting projects doesn't mean anything on either CPAN or SourceForge. I can register a module or a project and then do nothing with it, or I can register a single project, like "brian d foy's Perl modules", that is really many separate projects. The CPAN numbers are similarly skewed by failed starts, abandoned projects, and inactive developers.
I don't think that the conclusions are wrong, though. Most metrics would show that Java is more popular (there is only one print magazine for Perl, but shelves full for Java, for instance).
At some point, I think we have to look at the quality of the project too. Should a project with lots of CERT advisories count as much as a project that hardly ever breaks? Should a project with 10 downloads a year count as much as one with a million? Should a project that only compiles on one operating system if the moons are aligned and it is Tuesday count as much as a project that runs on just about anything? Should "Hello World!" count as much as Apache?
At the New York Perl Mongers meeting last month, we talked a bit about collecting adjusted statistics about CPAN. It's not enough to just count things. We have to weed out PAUSE accounts that never logged in or uploaded a module, modules that never made it past their first upload, and so on. We should also look at how much attention a distribution gets by analyzing frequency of uploads, number of co-maintainers, how many other distributions require it, and so on. If I only had enough time...
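Here is a rough sketch of what "adjusted" could mean, in Python only to keep it short. Everything in it is an assumption for illustration: the record fields, the two-upload cutoff, the weights, and the sample distributions are made up, and none of it comes from real PAUSE or CPAN data.

```python
from dataclasses import dataclass


@dataclass
class Dist:
    """Hypothetical per-distribution record; field names are illustrative."""
    name: str
    uploads: int        # total releases ever uploaded
    comaintainers: int  # people besides the owner with upload rights
    dependents: int     # other distributions that require this one


def is_active(dist, min_uploads=2):
    # Weed out distributions that never made it past their first upload.
    return dist.uploads >= min_uploads


def attention(dist):
    # Arbitrary weights: being required by others counts most,
    # then co-maintainers, then raw upload frequency.
    return dist.uploads + 2 * dist.comaintainers + 3 * dist.dependents


def adjusted_stats(dists):
    # Return the raw count, the filtered count, and a ranking by attention.
    active = [d for d in dists if is_active(d)]
    ranked = sorted(active, key=attention, reverse=True)
    return len(dists), len(active), [(d.name, attention(d)) for d in ranked]


# Made-up sample data, not real CPAN distributions.
sample = [
    Dist("Foo-Bar", uploads=1, comaintainers=0, dependents=0),
    Dist("Baz-Quux", uploads=12, comaintainers=2, dependents=5),
    Dist("Hello-World", uploads=3, comaintainers=0, dependents=0),
]

raw, active, ranked = adjusted_stats(sample)
print(f"raw count: {raw}, adjusted count: {active}")
for name, score in ranked:
    print(f"{name}: {score}")
```

The point is only that the raw count and the adjusted count differ, and that a ranking needs some weighting choice that someone has to defend.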
brian d foy <email@example.com>