Perl6 Pod -- reinventing the wheel?

by j3 (Friar)
on Nov 25, 2006 at 00:02 UTC ( #585945=perlmeditation )

I noticed that Damian recently posted a draft of Synopsis 26 (Perl6 Pod) to perl.perl6.language. It looks like a great deal of effort has been put into making POD into a more general and complete doc format. But I'm left wondering, is a very large wheel being reinvented here?

It looks as if there are already numerous doc formatting markups available that could fairly easily be used in place of POD if a more complete doc format is wanted. The one that immediately jumps out at me is Texinfo.

I won't enumerate every feature of Texinfo here, but suffice it to say that it seems to have almost every feature you'd want for this sort of job (it generates multiple output formats, is indexable, searchable, and so on). Furthermore, it would be trivial to incorporate it into Perl code. For instance, "POTD" ("Plain Old Texinfo Docs") lines could start with '#@' instead of '#':

    #!/usr/bin/perl
    use strict;
    use warnings;

    #@ @node A few good subs
    #@ @chapter A few good subs
    #@
    #@ This is a line of POTD. This module contains
    #@ some functions and might be used as follows:
    #@ @verbatim
    #@ do_something(); # Magic happens here!
    #@ @end verbatim

    # ------------------
    # Subroutines
    # ------------------

    #@ @node do_something
    #@ @section do_something
    #@
    #@ You'd use this @emph{awesome} function for:
    #@
    #@ @itemize
    #@ @item
    #@ When you want to do foo.
    #@
    #@ @item
    #@ When you want to do bar, since foo obviously
    #@ isn't cutting it.
    #@ @end itemize
    sub do_something {
        print "Magic goes here.\n";
    }

    print "hi.\n";
    do_something;
    print "bye!\n";

Then a "potd2whatever.pl" tool could simply start off as something like:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my @lines = ();
    while ( <> ) {
        if (/^#@/) {
            s/^#@[ \t]*//;
            push @lines, $_;
        }
    }
    my $lines = join '', @lines;

    # Now pipe $lines through any one of:
    # - texi2dvi
    # - texi2pdf
    # - makeinfo
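As a rough sketch of what that last step would still need: the extracted lines alone are not a complete Texinfo file, which normally begins with `\input texinfo` and ends with `@bye`. The helper name `potd_to_texinfo` and its title argument are made up here for illustration and are not part of any existing tool.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of any existing tool): collect the "#@"
# lines from Perl source and wrap them so they form a standalone Texinfo file.
sub potd_to_texinfo {
    my ($source, $title) = @_;
    my @body;
    for my $line (split /^/, $source) {
        next unless $line =~ /^#\@/;      # only POTD lines, not plain comments
        $line =~ s/^#\@[ \t]*//;          # drop the marker and leading blanks
        push @body, $line;
    }
    return join '',
        "\\input texinfo\n",
        "\@settitle $title\n",
        @body,
        "\@bye\n";
}

my $demo = <<'END_SRC';
#!/usr/bin/perl
#@ @chapter A few good subs
#@ This is a line of POTD.
# an ordinary comment
print "hi.\n";
END_SRC

print potd_to_texinfo($demo, 'Demo');
```

The printed text could then be saved to a file and handed to makeinfo, texi2dvi or texi2pdf. A real document would probably need more preamble than this (a Top node, for instance).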

Some more observations:

  • The Texinfo licensing is compatible with Perl.
  • Texinfo is a very mature and stable tool (and also happens to be the standard and official GNU tool for the job).
  • Texinfo -- like Perl itself -- comes with (or at least is available for) every flavor of GNU/Linux that I've come across.
  • It's fairly pleasant to type.
  • You can even put mathematics into texi docs, which I think is a big plus.

Seems like Perl + Texinfo would go together like peanut butter and jelly. So, why is this particular wheel being reinvented?

(Edit: Fixed the above code to remove the '#@' from the beginning of each line.)

(Edit: Corrected spelling of "Damian".)

Re: Perl 6 Pod -- reinventing the wheel?
by chromatic (Archbishop) on Nov 25, 2006 at 01:35 UTC
    So, why is this particular wheel being reinvented?

    There's already a perfectly good documentation format for Perl that has a much less bletcherous syntax. It's POD.

    Now POD has its flaws, as anyone who's ever written a book in it will argue (and several of the people who've done so are actually on the Perl 6 design team), but it's also fairly nice in what it does include and how it does it.

    More than that, whatever the document format of Perl 6, it must be portable to all of the systems where Perl 6 runs (and saying "every flavor of GNU/Linux that I've come across" doesn't really impress me with portability). It must be lightweight enough that it can be part of the core distribution. It should be sufficiently advanced over POD in Perl 5 to make up for any differences in syntax. It should fix as many of the warts of POD in Perl 5 as possible. It ought to be similar to POD in Perl 5 where possible, as change for the sake of arbitrary change is a lousy design goal. It needs to be extensible, which is one of the main problems of POD in Perl 5, and it should allow better reuse and introspection and customization than Perl 5's POD.

    It's also very nice to control the document formatter used for the core documentation rather than relying on upstream to review patches and release new versions.

      There's already a perfectly good documentation format for Perl that has a much less bletcherous syntax. It's POD.

      Actually, the syntax doesn't look bad to me. In comparison to POD, you've got @foo{bar} instead of F<bar>. On average, they both seem to take up about the same amount of space on the page too.

      Now POD has its flaws, {snip}, but it's also fairly nice in what it does include and how it does it.

      Yup. Texinfo seems ok too though. From the few times I've used it anyway.

      More than that, whatever the document format of Perl 6, it must be portable to all of the systems where Perl 6 runs (and saying "every flavor of GNU/Linux that I've come across" doesn't really impress me with portability).

      Which flavors of GNU/Linux *isn't* Texinfo available for? ;) Seriously though, since you can easily convert Texinfo to various formats (ex. HTML), MS Windows users should be able to access their docs just fine.

      It must be lightweight enough that it can be part of the core distribution.

      The only heavy part of Texinfo is TeX, which, of course, isn't necessary unless you want to produce dvi or pdf. Dunno how Texinfo-sans-TeX weighs in compared to the POD suite of tools, but I'm guessing the difference is not a big deal either way.

      It should be sufficiently advanced over POD in Perl 5 to make up for any differences in syntax.

      Check.

      It should fix as many of the warts of POD in Perl 5 as possible.

      Check.

      It ought to be similar to POD in Perl 5 where possible, as change for the sake of arbitrary change is a lousy design goal.

      Well, it's not arbitrary, since -- evidently -- there's reasons to go from Perl5's POD to Perl6's Pod. Perl5's POD is well-defined, and so is Texinfo, so a translator shouldn't be too much of a problem.

      It needs to be extensible, which is one of the main problems of POD in Perl 5,

      Dunno what you mean here. What do you need in a doc system that's not in Texinfo? And why would it be hard to extend Texinfo?

      and it should allow better reuse and introspection and customization than Perl 5's POD.

      Ah. Well, you've got me there. I think Ruby's doc system has some introspection built into it. That seems to come with its own problems though. For example, you re-open and extend a class but the docs either don't show your additions, or else you can no longer read the docs how they were before your addition (though I guess these are problems that can be worked through...).

      It's also very nice to control the document formatter {snip}

      Yes. I think this is a substantial tradeoff you make when using a standardized doc system outside of your own mothership. You get benefits too though, and I still think the idea has merit.

        What's not bletcherous about the # @directive syntax, especially after you add some mechanism of escaping that perl the compiler knows how to ignore but doesn't also turn normal commented-out code into formatting directives?
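The collision described here is easy to reproduce with the `/^#@/` test from the original post. In this sketch the sample lines are invented for illustration. A commented-out array assignment is indistinguishable from a documentation line.

```perl
use strict;
use warnings;

# The same test the OP's extractor uses to spot documentation lines.
sub looks_like_potd {
    my ($line) = @_;
    return $line =~ /^#\@/ ? 1 : 0;
}

# Invented sample lines: one real directive, one ordinary comment,
# and one piece of commented-out code that starts with an array sigil.
my @samples = (
    '#@ @section do_something',
    '# just a comment',
    '#@cache = ();',
);

for my $line (@samples) {
    printf "%-28s => %s\n", $line,
        looks_like_potd($line) ? 'treated as docs' : 'ignored';
}
```

Any fix (a different marker, or an escape for commented-out code) adds exactly the kind of extra syntax the objection is about.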

        What exactly is portable about "it runs on Linux and Windows users can just use HTML"? Do you know on how many other platforms Perl can run? You don't know how big the Texinfo distribution is, so I assume you likewise have no informed idea about the platforms it supports! How can you possibly argue that it's anywhere nearly portable enough to be part of the Perl 6 core without knowing at least these two important facts?

        What exactly is easy to extend about Texinfo unless the Perl 6 maintainers fork their own version and add their own features, rather than relying on upstream to make those changes?

        How exactly is making everyone who wants to write POD in Perl 6 learn a completely new style of syntax with a new escaping system that POD has never needed before not an arbitrary change?

        Is Texinfo seriously an order of magnitude better than the POD 6 proposal? That's my threshold for such a large change. You have to address my four objections in a seriously impressive fashion to get my vote.

        and it should allow better reuse and introspection and customization than Perl 5's POD.
        Ah. Well, you've got me there.

        Hm. I just had another look at that S26.pod6, and I don't see anything about introspection. Is Perl6's Pod going to have special "smart" features like Ruby's? (which I'm none too crazy about anyway).

        Also, I don't know why this post got down-voted. I'm not trying to convince anyone to shoehorn-in Texinfo instead of Perl6's Pod. However, it sure seems like a whole extra boatload of work to change all the existing POD tools (and/or write new ones) when you could just grab an existing standard that has mature and debugged tools ready and waiting. {shrug}

        Is there some stigma against Texinfo that I'm unaware of?

      There's already a perfectly good documentation format for Perl that has a much less bletcherous syntax. It's POD.

      Now POD has its flaws, as anyone who's ever written a book in it will argue (and several of the people who've done so are actually on the Perl 6 design team), but it's also fairly nice in what it does include and how it does it.

      Hmmm, it seems I've mentioned YAML quite often lately. (Although I'm not really sure whether two entries count as "quite often", even if so close in time.) Now, it may seem OT to mention it in this thread too. But your points gave me the chance to air a meditation that's been in my head for some time now. Both YAML and POD are LWMLs (lightweight markup languages). As it happens, LWMLs are typically divided into two categories, and each of the two belongs to a different one: data-serialization-oriented and presentation-oriented, respectively.

      XML, for example, is suitable both for presentation and data serialization. But it sure is not lightweight. Now, I'm not really asking about merging (something based on) POD with (something based on) YAML. But I'd be interested in the feasibility of an actual lightweight markup that could be suitable both for presentation and for data serialization. A first remark that one may make is that there are good reasons to keep logically different things, ehm, different. But it's also true that document description involves very similar issues to those that one can find in data serialization, e.g. wrt sectional divisions or special lists and many other things. So solving the problem once may be enough. Solving it in a way that is both forgiving enough for one task and precise enough for the other would IMHO mean achieving flexibility and reliability. Provided that it's possible, but then that's why I'm asking here...

Re: Perl6 Pod -- reinventing the wheel?
by merlyn (Sage) on Nov 25, 2006 at 04:37 UTC
    The Texinfo licensing is compatible with Perl.
    Actually, it's not. As far as I could determine in a few searches, it's released only under the GPL, which means it cannot be included in the core Perl distro. Everything in the Perl core has to be released under the dual-license (GPL + Artistic). I highly doubt that the GNU folks will grant their precious Texinfo to be released under something more liberal than the GPL, especially as they already view licenses like the Artistic and BSD licenses as sell-outs.

    Which means, we'd need to rewrite the entire Texinfo application. Are you volunteering?

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      The Texinfo licensing is compatible with Perl.
      Actually, it's not. As far as I could determine in a few searches, it's released only under the GPL, which means it cannot be included in the core Perl distro. Everything in the Perl core has to be released under the dual-license (GPL + Artistic).

      If Perl6 were to use an existing doc standard on which to base its Pod, it would make sense to me that an implementation of this existing standard would simply be listed as a prerequisite, rather than actually being distributed along with Perl6 proper. That's part of the price for choosing to use an outside implementation of an existing doc standard for one's software project. Maybe that's not acceptable here though. I don't know what the Perl project policy is on that.

      Which means, we'd need to rewrite the entire Texinfo application. Are you volunteering?

      Not as such, but thank you for the offer. :) BTW, I don't mean to be making any requests here. I hope my posts don't come off sounding that way. Incidentally, if the Perl6 team specifically wanted to use an existing (non-POD) standard specification and then write their own implementation, I bet we'd be able to scratch up some usable pod62man and pod62html scripts very quickly.

      From what I can tell, a main reason for not reinventing the wheel here is to allow many users to write their docs using markup and tools with which they are already familiar. Their existing tools may have features they can't live without (like having TeX draw their mathematics, or easily being able to include screenshots in their docs).

      Now, if Perl6's Pod is general enough that folks can use it for documenting things that have nothing whatsoever to do with Perl, and if it's nifty enough that they'll actually *want* to document those things in Pod instead of their current favorite, then maybe it's worth taking the time to reinvent Perl5's POD. I think it would be great if it turns out to be both.

        If Perl6 were to use an existing doc standard on which to base its Pod, it would make sense to me that an implementation of this existing standard would simply be listed as a prerequisite, rather than actually being distributed along with Perl6 proper.
        Well, the distro has to include internaldoc-to-man at a minimum, or Unix admins around the world will scream. And I don't even think a texinfo-to-man exists, because texinfo is too rich to fit into manpages. In fact, that's the problem with your proposal as well. According to texinfo:
        Notable is the lack of man as an output format. Texinfo is used to write the documentation of GNU software, which typically is used in Unix-like environments such as Linux, where the traditional format for documentation is man. Man pages have a strict conventional format, whereas typical Texinfo applications are for tutorials and reference manuals. As such there is no benefit in using Texinfo for man pages, which are traditionally quick reference guides. However, many GNU projects eschew man pages nearly altogether, referring the reader of the provided, and often self-describedly seldomly maintained, man page to the Info document
        Ugh.
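For comparison, Perl 5 already ships the internaldoc-to-man path: `Pod::Man`, the module behind `pod2man`, turns POD into roff. What follows is a minimal sketch with an invented module name and page text. The `name` and `section` values are arbitrary choices for the example.

```perl
use strict;
use warnings;
use Pod::Man;

# A tiny POD document; the module name and text are invented for the example.
my $pod = <<'END_POD';
=head1 NAME

Example::Subs - a few good subs

=head1 DESCRIPTION

Call C<do_something()> when you want to do foo.

=cut
END_POD

sub pod_to_man {
    my ($pod_text) = @_;
    # name and section are supplied explicitly because there is no file name
    # to derive them from.
    my $parser = Pod::Man->new(name => 'EXAMPLE::SUBS', section => 3);
    my $roff = '';
    $parser->output_string(\$roff);            # method inherited from Pod::Simple
    $parser->parse_string_document($pod_text);
    return $roff;
}

print pod_to_man($pod);
```

The output is roff source that `man` can display, which is the capability a Texinfo-based replacement would have to match.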
Re: Perl6 Pod -- reinventing the wheel?
by TheDamian (Priest) on Nov 26, 2006 at 00:32 UTC
    Okay, so let me talk a little about why the proposed new Pod is the way it is, and why we didn't choose to jump to Texinfo or DocBook or XHTML or any other pre-existing markup system.

    Let me start by rewriting your proposed POTD example using the new Pod notation:

        #!/usr/bin/perl
        use strict;
        use warnings;

        =head1 A few good subs

        =para
        This is a line of Pod. This module contains
        some functions and might be used as follows:

        =code
            do_something(); # Magic happens here!

        # ------------------
        # Subroutines
        # ------------------

        =head2 do_something

        =para
        You'd use this I<awesome> function for:

        =item When you want to do foo.

        =item When you want to do bar, since foo obviously
        isn't cutting it.

        sub do_something {
            print "Magic goes here.\n";
        }

        print "hi.\n";
        do_something;
        print "bye!\n";

    Take a moment to compare the two versions:

    • Which is easier to read?
    • Which has less clutter?
    • Which is more compact (and therefore less intrusive into the flow of the code)?
    • Which uses a distinct column-1 marker character for documentation as opposed to comments?
    • Which doesn't use a @tagname notation that is distracting to Perl programmers (who will unconsciously associate it with arrays)?
    • Which doesn't use a @tagname notation that would require a @verbatim{@tagname} around every mention of an actual array?
    • Which has carefully chosen keywords that clearly describe what the role of each piece of documentation is, rather than how it should be rendered?
    • Which looks familiar to the vast majority of Perl programmers who already know POD?
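To make the array-sigil point concrete: in Texinfo the characters `@`, `{` and `}` are special, and a literal one is written `@@`, `@{` or `@}`. Any Perl snippet quoted in running Texinfo text therefore has to be escaped first. A small sketch follows, with a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the three characters Texinfo treats specially
# so Perl code can appear in ordinary Texinfo text.
sub texinfo_escape {
    my ($text) = @_;
    $text =~ s/([\@{}])/\@$1/g;
    return $text;
}

print texinfo_escape('push @lines, $_ for @{ $rows };'), "\n";
```

In POD, by contrast, the same line of Perl can be pasted into a verbatim paragraph unchanged.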

    To take those points one at a time...

    The readability of raw mark-up matters. Not because the readers of documentation read it raw, but because the writers of documentation write it raw. The less complex and intrusive a mark-up notation is, the less likely the document writer is to make a mistake (either with the notation, or with the content) when documenting.

    The choice of keywords matters too. Texinfo (and HTML and Perl 5 POD for that matter) get this wrong in subtle but important ways. For example, Texinfo and HTML provide the @emph{...} and @strong{...} (<em>...</em> and <strong>...</strong> in HTML) markers. But what do they mean? When should I use "emphasis" and when should I use "strength"? The usual answer for most people is that they simply ignore the (un)descriptive aspect of these labels, mentally translate them back into "italics" and "bold" respectively, and decide which they want on that basis. So the syntax chosen for these descriptive elements actually undermines the descriptive focus: the writer has to resort to presentational considerations to work out what they should use.

    In contrast, and in a typical Perlish (or perhaps a Damianish) approach, Pod solves this problem by going so far the other direction that it nearly comes full circle. Perl 5 POD provided the I<> and B<> markers (for "italics" and "bold"); it was not even pretending to be descriptive. Perl 6 Pod could have provided E<> and S<> markers (for "emphasis" and "strong"), but then it would have been pretending to be descriptive, since everyone would just mentally translate them back to "italics" and "bold". Instead Pod provides the U<>, I<>, and B<> markers: for "unusual", "important", and "basis". That is, Pod provides three levels of significance markers and -- far more importantly -- provides an easy way to decide which one to use.

    Instead of asking yourself "Should this be emphasized or strong", you ask yourself "Is this merely unusual in the surrounding text, or is it actually important in the surrounding text, or is it in fact the entire basis of the surrounding text". So Pod gives you a much better way of deciding which mark-up tag is appropriate...by making the markup keywords actually mean something. Instead of deciding what the text should look like (presentational mark-up) or how much emphasis to apply to the text (descriptive mark-up), you decide how significant that text is (semantic mark-up).

    Now, by sheer coincidence, the unusual tag (U<>) is typically rendered in underlining, the important tag (I<>) is typically rendered in italics, and the basis tag (B<>) is typically rendered in bold. So if you don't like -- or can't cope with -- Pod's semantic level of mark-up, it turns out that you can just pretend that Pod is still a descriptive mark-up notation and simply choose whether you want underlining, italics, or boldness. Of course, it's a complete accident that the U<>, I<>, and B<> tags can be misunderstood in that fashion, but it's a highly convenient and backwards-compatible accident. ;-)
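A toy renderer makes that mapping explicit. This sketch handles only simple, non-nested `U<>`, `I<>` and `B<>` codes, and the HTML tags are just one plausible rendering chosen for the example. It is not how any actual Pod formatter is implemented.

```perl
use strict;
use warnings;

# One plausible rendering per significance level (an assumption for this
# example; a real renderer is free to choose differently).
my %TAG_FOR = (
    U => 'u',        # unusual
    I => 'em',       # important
    B => 'strong',   # basis
);

sub render_significance {
    my ($text) = @_;
    # Matches only simple, non-nested codes such as I<awesome>.
    $text =~ s{([UIB])<([^<>]*)>}{<$TAG_FOR{$1}>$2</$TAG_FOR{$1}>}g;
    return $text;
}

print render_significance("You'd use this I<awesome> function B<carefully>."), "\n";
```

Because the writer chose a significance level rather than a typeface, swapping the table for a different output format leaves the source document untouched.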

    This notion of semantic mark-up is applied throughout the design of Pod. For instance, whereas Texinfo has @verbatim blocks and HTML has <PRE> blocks, Pod has three distinct alternatives: =code, =input, and =output. That's so you can distinguish the three commonest uses for pre-formatted verbatim text in a readable way, and so you can be specific about the semantics of a particular block (what it means), and so renderers can easily distinguish between those three types of block and thus present snippets of code, samples of input, and listings of output in three distinct and easily recognized formatting styles.
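As an illustration of what a renderer gains from the three block types, here is a simplified sketch. It assumes each abbreviated block runs from its directive line to the next blank line, and the mapping to `<code>`, `<kbd>` and `<samp>` is an assumption made for this example rather than anything the Synopsis prescribes.

```perl
use strict;
use warnings;

# Assumed mapping from block type to HTML element, for illustration only.
my %ELEMENT_FOR = (code => 'code', input => 'kbd', output => 'samp');

# Render abbreviated =code/=input/=output blocks; each block runs from its
# directive line to the next blank line (or end of text).
sub render_verbatim_blocks {
    my ($pod) = @_;
    my @html;
    for my $block (split /\n{2,}/, $pod) {
        next unless $block =~ /\A=(code|input|output)\b[ \t]*\n?(.*)\z/s;
        my ($type, $body) = ($1, $2);
        $body =~ s/&/&amp;/g;     # escape HTML metacharacters
        $body =~ s/</&lt;/g;
        $body =~ s/>/&gt;/g;
        $body =~ s/\n\z//;        # drop a trailing newline, if any
        push @html, "<pre><$ELEMENT_FOR{$type}>$body</$ELEMENT_FOR{$type}></pre>";
    }
    return join "\n", @html;
}

my $sample = <<'END_POD';
=code
    print "Name? ";

=input
    Damian

=output
    Hello, Damian
END_POD

print render_verbatim_blocks($sample), "\n";
```

With a single @verbatim or <PRE> construct, the renderer would have no way to tell these three cases apart.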

    Added to all of the above motivations is the fact that Perl 6's Pod has been carefully designed to be very easy to adapt to if you're already familiar with Perl 5's POD. If you know POD, you can very quickly learn the new rules for Pod (there are fewer of them and they're less restrictive, simpler, and more consistent). Similarly, it's easy to pick up the small number of new constructs (nested list items, autonumbered list items, input and output samples, tables, definitions) because they use the same syntactic structures as existing constructs. Oh, and annoyances like =over/=back and =cut (which were often where mark-up mistakes crept in) have been removed.

    So, yes, we could have chosen to move to an entirely different mark-up notation, but we decided that we could better meet the needs (and the expectations) of Perl programmers by tweaking the notation they're already familiar with to remove the pitfalls and annoyances, to raise the level of abstraction, and to increase the expressive power, without compromising the fundamental goal of having a truly Perlish documentation system: one that helps you get your job done efficiently, without getting in the way.

    Damian

      Given your example and some of what you wrote (including criticism of "=cut"), I got the impression that the Perl-5-style "obvious" way to write paragraphs and verbatim text blocks had been dropped. I think that it is one of the best features of (Perl 5) POD, that you can denote a paragraph by simply adding a blank line and denote sample code by simply indenting a paragraph (both so intuitive, compact, clear, and convenient that they were obvious to both the author and the reader).

      Checking the linked Synopsis, I see that this isn't the case. The obvious method is still supported. The example in the (well, that is the) Synopsis looks much more like Perl 5 POD and looks better, IMO, than the example you gave above. Just for y'all's information (including other readers who might have gotten a similar impression but didn't want to bother to browse the Synopsis).
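The implicit rule is simple enough to state in a few lines. This is a sketch of the classification only, not of a real POD parser: chunks are separated by blank lines, and a chunk that starts with whitespace is verbatim text.

```perl
use strict;
use warnings;

# Classify blank-line-separated chunks the way Perl 5 POD does implicitly:
# a chunk whose first line starts with whitespace is verbatim (code),
# anything else is an ordinary paragraph.
sub classify_chunks {
    my ($text) = @_;
    my @chunks;
    for my $chunk (split /\n(?:[ \t]*\n)+/, $text) {
        next unless $chunk =~ /\S/;
        $chunk =~ s/\s+\z//;                          # trim trailing whitespace
        my $kind = $chunk =~ /\A[ \t]/ ? 'code' : 'para';
        push @chunks, [ $kind, $chunk ];
    }
    return @chunks;
}

my $doc = "This module might be used as follows:\n\n"
        . "    do_something(); # Magic happens here!\n\n"
        . "That is all.\n";

for my $chunk (classify_chunks($doc)) {
    my ($kind, $body) = @$chunk;
    print "[$kind] $body\n";
}
```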

      Thanks for the Synopsis and for the explanation here, Damian.

      - tye        

        To expand on Tye's comment, you could of course still write in a more traditional Perl documentation style:
            #!/usr/bin/perl
            use strict;
            use warnings;

            =begin pod

            =head1 A few good subs

            This is a line of Pod. This module contains
            some functions and might be used as follows:

                do_something(); # Magic happens here!

            =end pod

            # ------------------
            # Subroutines
            # ------------------

            =begin pod

            =head2 do_something

            You'd use this I<awesome> function for:

            =item When you want to do foo.

            =item When you want to do bar, since foo obviously
            isn't cutting it.

            =end pod

            sub do_something {
                print "Magic goes here.\n";
            }

            print "hi.\n";
            do_something;
            print "bye!\n";
        ...if you preferred.

        I'd argue that this version is also much cleaner and less intrusive than Texinfo or HTML (or even classic POD). Whether it's better than the version I showed earlier is, I suspect, a matter of personal preference. Some people will prefer the clarity of explicit tags, others will prefer the elegance of implicit contextual cues.

        The point being, of course, that Pod is part of Perl 6, and hence TMTOWTDI.

        Damian

      I'm reviewing some old posts -- sorry to dredge up something that is long gone. But, this is still an awfully bad solution to a problem needing fixing. I'm failing to see how your solution is any better than say a solution that involves wikitext. Or, for that matter what all you've accomplished.

      Sample Textile-based markup follows:

          == Title ==
          ''Less emphasis''
          '''More emphasis'''
          '''''Ridiculous emphasis'''''
          * List item A
          * List item B
          ** Sub a under B
          **# num1 under sub a under B

      I really don't see anything wrong with your argument about strong and em -- largely because of your alternatives and their semantic meaning. However, other questions are left unanswered by this: a search engine, for instance, can weigh a strong element closer to that of a keyword; how would that fit in with what you're saying? How does the unusual content matter -- and why would you want to mark up as special something that is unusual but not important? Are all things unusual supposed to be defined elsewhere as important? And, what makes basis special?

      More importantly, you still don't address the simple fact that often the people most capable of documenting, the users, are excluded from the process because of the learning curve of the version control and the commit bit needed. Cleaning up pod is analogous to using a worn bandaid to hold closed a bleeding neck wound. I'm not against pod as an option -- but I think a bigger achievement would be delivering wiki-like functionality into perldoc, and a better default would be wikitext, redcloth, or any variant of Textile.



      Evan Carroll
      I hack for the ladies.
      www.EvanCarroll.com
Re: Perl6 Pod -- reinventing the wheel?
by philcrow (Priest) on Nov 27, 2006 at 13:59 UTC
    Are you also proposing that we read Perl 6 docs with info? Please say no.

    Phil

      Are you also proposing that we read Perl 6 docs with info? Please say no.

      Well, the point of the OP was that if you were able to use an existing generic doc tool, then you'd benefit from their existing tools. To answer your question though: "no", but it could be an option, I suppose, if you chose a tool that generated info docs. Note, of course, that most generic doc tools can generate various kinds of output, including plain text.
