I recently wrote a new module. When time came to bite the bullet and
provide the carefully-crafted code with some documentation, I started asking
myself questions about how to do it. I'd written some of the module before
running the module package creation tool h2xs so I didn't have the
boilerplate entries already in place. What was properly supposed to be in a
module's POD? I started looking at some semi-random modules installed to my
system, and discovered that there appeared to be a lot of variation in what
the authors chose to use as head[ing]1 labels, these
being one of the basic elements that give POD documentation their structure.
My curiosity was piqued further.
A simple way to address this question is to just run
h2xs to make a
"fake" module setup. Doing so (with Perl-5.8.1 as the base release) yields
this list of
head1 labels in the generated module file "skeleton":
=head1 NAME
=head1 SYNOPSIS
=head1 DESCRIPTION
=head1 SEE ALSO
=head1 AUTHOR
=head1 COPYRIGHT AND LICENSE
If these six labels are present in some large percentage of module
documentations, we might infer that many module authors are using the
h2xs tool and leaving the POD pretty much as is -- "filling in the
blanks".
I decided that I wanted to find out more. Reading page after page of POD
is way too much work, being the lazy programmer that I am. So I decided to hack
together a little Perl to let me in on the bare stats without laborious
manual labor. As textual analysis tools go this is no great shakes, in fact
it could be seriously flawed. But anyway what it revealed was interesting ...
Update:
Minor update to this node was done on 11 Jan 2004
#!/usr/bin/env perl
=head1 NAME
PodHeadCensus - Survey the POD headings in perl POD-bearing files.
=cut
# This program Last modified: 12 Dec 2003 03:45:52
# Canonical code location: http://intrepid.perlmonk.org/scriptscode/Po
+dHeadCensus
# $CVSHeader$
BEGIN {
use strict;
$/ = "\n";
$| = 1;
use vars qw/$v $vv/;
if ($ARGV[0] eq q/-v/) { (shift @ARGV) and $v=1 }
elsif ($ARGV[0] eq q/-V/) { (shift @ARGV) and $v=1 and $vv=1 }
$^W = 1;
}
use File::Find;
use Tie::IxHash;
use Pod::Usage;
use subs qw! lead_section close_section draw_border !;
my $LL = $ENV{'COLUMNS'} || 80;
my ($TCs, $NoPOD,$HadPOD,$_75p,$_25p,$_05) = (1, 0,0,0,0,0);
my (%Aseen,%c_POD);
my $ixh = tie(%Aseen, q/Tie::IxHash/);
my $term = [ &termcodes_set ]; # Our esc seqs are going to be in her
+e?
my ($re, $bd, $us, $ue) = @$term;
unless ( 4 == grep { $_ } @$term )
{
$TCs= 0;
$re = "\e[0m"; # ANSI
$bd = "\e[0;1m"; # ANSI
$us = ''; # dunno
$ue = ''; # dunno
}
my $argN = $ARGV[$#ARGV];
if (! $argN) { pod2usage(1) }
elsif (grep /^$argN$/i, qw% -h -u %, qw% --man --help --usage %)
{ pod2usage(-verbose => 2) }
elsif (-d $argN) { $argN = pop @ARGV }
else { $argN = undef }
if ($argN)
{
my $only_allow = sub
{
my $str = shift(@_);
my @pu = $str=~/[[:punct:]]/g;
if ($str =~ m/[[:cntrl:]]/) { return 0 }
elsif(! grep{ m/[\Q{}&*:!`|><\E]/ }@pu )
{ # `filter the names, as a small precautionary measure.
return 1;
} return 0;
};
File::Find::find(
sub
{
my $fful_n = $File::Find::name;
my $fdir_n = $File::Find::dir ;
return if /^\.{1,2}$/; # reject . and ..
push @ARGV, $fful_n
if (
$fdir_n !~ m%/pod$% and # Not from Perl dist "pod/" dir,
&$only_allow($_) and # no funny business,
/\.p(?:m|od)$/ and # POD-bearing type extensions,
-f $_ # we will only open real files, thank-you.
);
} , $argN );
}
# Now, having "manually" or "quasi-meta-auto-magically" populated
# @ARGV, we scan through the files looking for '=head1' lines.
for my $podfile (@ARGV)
{
lead_section $podfile if $vv;
open PPod, $podfile or die
"Failed open() on \"$podfile\", maybe no rights?:\n $!";
while(<PPod>)
{
if ( m%^=head(?:1|2)\s+([A-Z][-\s_A-Z]+[A-Z])\s*$% )
{
print "${us}$_${ue}" if $vv;
$Aseen{"$1"}++ if $1;
exists $c_POD{"$podfile"} || $c_POD{"$podfile"}++;
}
}
close PPod;
close_section if $vv;
}
for my $n (keys %c_POD) {$HadPOD++ if $c_POD{"$n"};}
$NoPOD = @ARGV - $HadPOD;
$_75p = sprintf("%u", .75 * $HadPOD);
$_25p = sprintf("%u", .25 * $HadPOD);
$_05p = sprintf("%u", .05 * $HadPOD);
print join ("\n", map
{ sprintf "%-65s seen ${bd}%3u${re} times", $_,$Aseen{$_
+}
} keys(%Aseen)
) if $v ;
print draw_border
,"Number of files examined (${bd}".$NoPOD."${re} had no POD): "
. ${bd} . @ARGV . ${re}
,draw_border;
print draw_border
,"These were the headings seen in at least 75% of the cases:"
,draw_border;
print map { " ${bd}$_${re}\n" if $Aseen{"$_"} >= $_75p } keys(%Asee
+n);
print draw_border
,"These were the headings seen in at least 25% of the cases:"
,draw_border;
print map { " ${bd}$_${re}\n" if $Aseen{"$_"} >= $_25p } keys(%Asee
+n);
print draw_border
,"These were the headings seen in at least 5% of the cases:"
,draw_border;
print map { " ${bd}$_${re}\n" if $Aseen{"$_"} >= $_05p } keys(%Asee
+n);
exit 0;
sub termcodes_set
{
eval
{ require Term::Cap;
require POSIX;
}
or do {
warn "Cannot import POSIX or Term::Cap\n" unless $^O =~/M?S?Win/;
return undef;
};
my $termios = POSIX::Termios->new();
if (not $termios) { die "Badly!\n$!" }
$termios->getattr;
# This next defaults to $ENV{'TERM'}, if not set, we'll croak!
my $ts = Tgetent Term::Cap { TERM => undef,
OSPEED => $termios->getospeed };
my $sure = eval { $ts->Trequire( qw/me md us ue/ ) } ;
if ($@) { return undef }
else
{
return map { $ts->Tputs($_,1) } qw/me md us ue/;
}
# From curses termcap/terminfo (5) manpage:
#------------------------------------------------------------------
# md = extra bold mode ue = exit underline mode
# mh = dim mode us = enter underline mode
# *** me = turn off all attributes ***
# so = enter standout mode se = exit standout mode
# sp = set curr color pair to #1
# op = set curr color pair back to original pair
# Sf = set foregrnd color #1 Sb = set backgnd color #1
# AF = set ANSI fg color AB = set ANSI bg color
#------------------------------------------------------------------
}
sub lead_section
{
printf("%-${LL}s", ' ---"'.shift(@_).'"---');
}
sub close_section
{
print "\n", ('=' x ($LL-1)), "\n";
}
sub draw_border
{
return "\n" if $TCs;
sprintf "\n%s\n", '-' x ($LL-1);
}
__END__
--- 8< --- SNIP! --- 8< --- cut here --- 8< ---
The POD for this script, with its line lengths longer than advisable
to post as "code" on Perlmonks (IMHO), is available at
http://intrepid.perlmonk.org/scriptspod/PodHeadCensus.pod
The .pod file can be inserted one blank line after the __END__ marker
which terminates the code above. It is recommended to run this
program with its POD in place.
--- 8< --- SNIP! --- 8< --- cut here --- 8< ---
Update: until further notice the links
just below are offline. Sorry!
Links provided for reader convenience:
- This script's documentation as POD:
PodHeadCensus.pod
|
- This script as syntax-hilighted stand-alone code, no POD:
PodHeadCensus.html
|
- This script's "production copy" with inline POD and cryptographically signed:
PodHeadCensus
|
links last checked: 11 Jan 2004 |
The Results
Perhaps unsurprisingly, it appears from the very preliminary data which
I've seen so far that Perl authors are a highly individualistic bunch, who
take guidelines and recommendations as only that. A quite high percentage
of the scanned documentation lacked several of the "labels" output by the
h2xs utility. On the other hand, many, many unique instances of
labels were detected. The trial was run on the several Perl releases that
I have available and these are some representative results:
$ PodHeadCensus /opt/perl/lib/5.8.1/
Number of files examined (25 had no POD): 355
These were the headings seen in at least 75% of the cases:
NAME
SYNOPSIS
DESCRIPTION
These were the headings seen in at least 25% of the cases:
NAME
SYNOPSIS
DESCRIPTION
SEE ALSO
AUTHOR
These were the headings seen in at least 5% of the cases:
NAME
SYNOPSIS
DESCRIPTION
SEE ALSO
EXAMPLES
CAVEATS
DIAGNOSTICS
AUTHOR
NOTES
AUTHORS
METHODS
BUGS
COPYRIGHT
NOTE
$ PodHeadCensus /usr/share/perl
(Perl-5.6.1 on Debian)
Number of files examined (33 had no POD): 214
These were the headings seen in at least 75% of the cases:
NAME
SYNOPSIS
DESCRIPTION
These were the headings seen in at least 25% of the cases:
NAME
SYNOPSIS
DESCRIPTION
SEE ALSO
AUTHOR
These were the headings seen in at least 5% of the cases:
NAME
SYNOPSIS
DESCRIPTION
SEE ALSO
DIAGNOSTICS
EXAMPLES
AUTHORS
BUGS
AUTHOR
COPYRIGHT
NOTE
METHODS
There have undoubtedly been interesting discussions about creating POD
in the past here on perlmonks nodes; I will likely be downvoted for not
referencing them ;-). OTOH I have made what I hope is an original (if small)
contribution by providing a tool to use in "POD-head census-taking" on
various Perl installations.
Soren/somian/Intrepid
--
2003: The 3 least meaningful terms in Online jargon:
troll flame rant
These used to mean something -- but then they were highjacked by
inferior intellects who, when faced with a more erudite opponent
employing superior arguments, abuse them as merely another form of
name-calling. ;-)