rsync -rvz cpan.pair.com::CPAN/ > list
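perl -e 'use strict; use warnings; my %h;
while (<>) { chomp; /^\S{10}/ or next;        # skip anything that is not a listing line
  my $n = substr $_, 43;                      # the file name starts at offset 43
  substr($_, 11) =~ /^\s*(\d+)/ or die; my $s = $1;  # the size follows the permissions
  while ($n =~ m"/|$"g) { my $p = $`;         # each leading directory, then the full path
    $p =~ m"authors/id/.(?:/..)?$" and next;  # skip authors/id/X and authors/id/X/XX
    $h{$p} += $s; } }
warn 0+keys(%h);                              # number of distinct paths, to stderr
for my $n (reverse+(sort { $h{$b} <=> $h{$a} } keys%h)[0..9999]) {
  printf "%10.0f %s\n", $h{$n}, $n; }' list > sizes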
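The first command downloads the full directory listing (not the files themselves) from a CPAN mirror server; the listing is currently about 33 megabytes uncompressed. The second command reads that listing and writes the 10_000 largest files and directories to the file sizes, smallest first, where a directory counts as the total of everything beneath it. It omits the uninteresting one- and two-letter index directories under authors/id.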
It turns out that the author with the largest total file size is NWCLARK, with 224 megabytes of files, followed by GRAHAMC, RGARCIA, and TPEDERSE, in that order. The largest single file is authors/id/G/GR/GRAHAMC/SiePerl-5.8.8-bin-1.0-Win32.INSTALL.exe, at 34 megabytes.
Here's a copy of the last few lines of output.
Update: I have posted the generalization of this code to rsyncsize -- Largest directories on a remote file system.
Update 2011-12-17: CPAN - As Seen From Space!