Re: An Introduction to Literate Programming with perlWEB

by BrowserUk (Pope)
on Jan 13, 2009 at 10:25 UTC ( #735910=note )


in reply to An Introduction to Literate Programming with perlWEB

Knuth's concept of Literate Programming (LP) has been around since he published his paper in 1984. (Auspicious year!) If it is such a good idea, how come that after 25 years, there are so few examples of real-world use?

Short answer: Because most programmers are not writing code for use as examples in books or academic papers, nor do they seek the understanding of those illiterate in the (programming) language of choice the code is written in.

Longer answers (not necessarily in order of importance or frustration factor!):

  1. As a programmer, I am mentally incapable of--untangling, or is it unweaving? Maybe I'll stick with--rearranging the snippets of code littered (or maybe literated?) amongst large volumes of meaningless English prose.

    It's meaningless because: the compiler won't compile it; the interpreter won't interpret it; and nothing on earth will test it!

    • Neither the veracity of the claims it makes;
    • Nor the correctness of the algorithms it describes;
    • Nor the accuracy of correspondence between the prose and the code it purports to document.
  2. Instead of one level of maintenance, we have SIX! Six you say? Two surely? No! Six!!
    1. We have the English to maintain:

      Is the phraseology clear? Are there no deceptive misspellings? No ambiguous punctuation?

    2. We have the algorithms and logic that the English describes to maintain.

      Are the correct algorithms used? Is the logic, logical?

    3. We have the source code to maintain--in all the usual ways.
    4. We also have to ensure that the English description correctly describes the chosen algorithms.
    5. We have to ensure that the code actually implements the algorithms described, as described.
    6. And we have to make sure that the prose remains attached (physically adjacent to and above) the code it is describing.

      And when it comes time to refactor a portion of one of the code snippets, we must move the appropriate subsections of the original description to wherever the refactoring takes that portion of code.

    Six times as many ways to cock up, and none of the unnecessary five will likely ever have any automation tools that can be applied to them. At least, it's very unlikely in the working careers of anyone currently earning a living as a programmer.

  3. It adds another layer between the product of the programmer: the source code; and the consumer of that product: the compiler or interpreter.

    If you doubt that is a problem consider: Will compiler detected errors be reported in terms of their LP form line numbers or their post-processed form line numbers?

    If you've used Inline::C (or any Inline::* language), then you'll undoubtedly be familiar with the game:

    1. Write code
    2. Attempt a run
    3. Compiler detects an error--lists the erroneous line number in yourfile_dead.xs(69)
    4. Navigate to ./_Inline/build/yourFile_dead_pl/yourFile_dead.pl. But it's not there!
    5. Go back and add  CLEAN_AFTER_BUILD => 0. Re-run.
    6. Compiler detects an error--lists the erroneous line number in yourfile_beef.xs(69)
    7. Navigate to ./_Inline/build/yourFile_beef_pl/yourFile_beef.xs. Load it into your editor.
    8. Examine line 69, relate it to the error message; think for a bit; see what you think is the cause and correct it. Save file. Re-run.
    9. Compiler detects THE EXACT SAME error--lists the erroneous line number in yourfile_abcd.xs(69)
    10. Doh! Go back to the source (*.pl* file), work out which line in that file is the source for line 69 in the .xs file, and make the change (again!). Re-run.
    11. Compiler detects a different error--lists the erroneous line number in yourfile_a9c7.xs(73)
    12. Switch back to the editor, switch back to the .xs file. Line 73 *IS A BLANK LINE!*.

      Doh! This is ./_Inline/build/yourFile_dead_pl/yourFile_beef.xs. Discard it!

    13. Navigate to ./_Inline/build/ ... which bloody suffix was the latest?
    14. Switch to other shell: Ah! A9c7
    15. Navigate to ./_Inline/build/yourFile_a9c7_pl/yourFile_a9c7.xs; load file to editor. What was that damn line number again?
    16. Examine line 73, relate it to the error message; think for a bit; see what you think is the cause and correct it. Save file. Re-run.
    17. Compiler detects THE EXACT SAME error--lists the erroneous line number in yourfile_abcd.xs(73)

    *********AAAAAAAAAAAAAAAAAAARRRRRRRRRRRRGGGGGGGGG!!!!!!!!!!!!!**********
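On the line-number question raised above: Perl has a `#line` directive that a pre-processor can emit so that errors are reported against the original file instead of the generated one. The following is only a minimal sketch. The `tangle` helper and the file name `example.web` are hypothetical, and nothing in this thread establishes whether perlWEB or any other LP tool actually emits such directives.

```perl
use strict;
use warnings;

# Hypothetical tangler: takes chunks of [source_file, start_line, code]
# and emits Perl with a "#line" directive ahead of each chunk, so that
# Perl reports errors in terms of the original (pre-tangle) file.
sub tangle {
    my (@chunks) = @_;
    my $out = '';
    for my $chunk (@chunks) {
        my ($file, $line, $code) = @$chunk;
        $out .= qq{#line $line "$file"\n$code\n};
    }
    return $out;
}

# Chunks deliberately appear out of source order, as a tangler might emit them.
my $tangled = tangle(
    [ 'example.web', 120, 'my $total = 0;' ],
    [ 'example.web', 69,  'die "something broke";' ],
);

eval $tangled;
my $error = $@;
print $error;    # the message names example.web line 69
```

This does not remove the extra layer being complained about; it only makes the reported line numbers point back at the file the programmer actually edits.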

  4. Pre-processors and textual substitutions are evil. (And I don't say that about many things!)

    To see the point at its extreme consider the readmore in Re^3: WIN32-API Purgetory. Those 4000 lines of unintelligible code, manually extracted from amongst 26,000 lines filtered from a file containing 114,000 lines, are the result of applying a pre-processor to just 785 lines of source code input.

    If you want easy maintenance?

    Polluting the programmer's primary workspace in the ratio of 3 or 4 lines of noise to every line of productive code; re-ordering the useful bits in some arbitrary fashion; using textual substitutions to further obfuscate things; and then adding another layer of indirection between what the programmer writes and what the processor attempts to run (there are already too many levels) is not the right way to go about it.

  5. Application, program and library documentation has a different purpose and should be separate from source code.

    Architects do not record their stress and strain calculations, or materials assessment tests, on either their blueprints or on the sides of the buildings and bridges they design.

    Or consider your lawyer presenting you with legal documents liberally strewn with footnotes and margin annotations of the form:

    ++Consider the term habeas corpus and the clarification of that term as laid out in Simpson v. Flanders (1999) and its subsequent refinement in Itchy v. Scratchy (2001)

  6. Production source code is not the place to educate new programmers; nor document design decisions; nor record board meeting deliberations.

    The programmer's job is to write programs--not teach newbies how to program. There are schools and courses for that.

    If you've any musical ability (I've none), then maybe this will convince you.

    Imagine your average orchestra musician sitting down to a score like this:

    At the start of this piece the mood is melancholic. Don't be too strident.
    If you're playing the piano, press that "quiet pedal" a lot.
    If you're the tympanist, use your left hand to dampen the kettles for the first few bars.
    Violins, blur the transitions between strings and 'bend' from note to note.

    ===== ... ==*== ... ----*-...

    In these few bars, we start to get the sense of something awaking.
    Piano, use the "soft pedal" less and less from beginning to the end of the passage,
    Toward the end, add the "loud pedal" on occasional sharps and flats.
    The kettles should be allowed to ring a little; and little more as the passage progresses.
    Violins: Slowly clean up your transitions and blur them less and less.

    = = -

    Bassists: This is your regular rhythm to be maintained throughout the piece:

    = = -

    But there are two variations on this which you alternate between
    for the first two bars at the start of every other stanza--after the first four--
    and for the last two bars of the alternate stanzas.

    = ... = ... = ... = ... - ... - ...

  7. Programmers write programs in programming languages, not just because it is easier to write translators (compilers and interpreters) for them.

    But also because their restricted syntax makes it easier to write correct, unambiguous and verifiable descriptions of algorithms. Neither English, nor any other natural language is a suitable substitute. And I'll stick my neck out and say: and never will be.

    And the idea that writing code twice--once in the computer language, and a second time in an unverifiable, inherently ambiguous language of infinitely variable style--will somehow render programs more reliable doesn't hold water.

Replies are listed 'Best First'.
Re^2: An Introduction to Literate Programming with perlWEB
by MidLifeXis (Monsignor) on Jan 13, 2009 at 14:04 UTC

    I can second this. I used LP techniques with FunnelWeb on a project in the past. Between the learning curve, the problems of not using LP all of the time, and the fact that, for someone who has been writing code (not necessarily professionally) for over 20 years, it just felt "not natural", the cost of the initial write and the ongoing maintenance has made me decide that the next time I need to touch that code, I will rewrite it in a more "classical" style.

    I much prefer something along the lines of NaturalDocs for helping me with my documentation (although I wish that it integrated more fully with POD - perhaps the newer versions do), and more of a classical approach to coding structure.

    One thing that really turned me off FunnelWeb's implementation of LP, and LP in general (perhaps too broad a brush, perhaps not), was that it changed, from the programmer's perspective, the edit, execute, repeat cycle into an edit, compile, execute, repeat cycle. Adding that extra step increased development time somewhat, and hassle significantly.

    In certain domains, LP may be the right tool, and there were some nice things about using it, but I will need someone else to convince me that it is worth trying again, since the use of LP was a net loss in productivity, at least in the situation I used it in.

    --MidLifeXis

      Not sure how long it takes FunnelWeb to "compile" a file, but quite likely you could create a daemon/service that would watch the source directories and compile the modified files automatically as soon as they are saved.

      We use something like that for i18n.
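The watch-and-recompile idea suggested above could be sketched roughly as follows. This is a simple polling loop using only core modules, not a real daemon. The `changed_files` and `watch` helpers are hypothetical names, and the `fw` command and `.fw` extension in the commented-out usage line are assumptions about a local FunnelWeb install.

```perl
use strict;
use warnings;
use File::Find ();

# Return the files under @dirs matching $pattern whose mtime differs from
# what was recorded in %$seen on the previous pass; updates %$seen.
sub changed_files {
    my ($seen, $pattern, @dirs) = @_;
    my @changed;
    File::Find::find({
        no_chdir => 1,    # keep $_ as the full path
        wanted   => sub {
            return unless -f $_ && $_ =~ $pattern;
            my $mtime = (stat _)[9];    # reuse the stat done by -f
            if (!defined $seen->{$_} || $seen->{$_} != $mtime) {
                $seen->{$_} = $mtime;
                push @changed, $_;
            }
        },
    }, @dirs);
    return sort @changed;
}

# Poll forever, handing each changed file to a caller-supplied callback.
sub watch {
    my ($pattern, $on_change, $interval, @dirs) = @_;
    my %seen;
    while (1) {
        $on_change->($_) for changed_files(\%seen, $pattern, @dirs);
        sleep $interval;
    }
}

# Example (not run here): re-process any *.fw file under src/ once a second.
# watch(qr/\.fw\z/, sub { system('fw', $_[0]) }, 1, 'src');
```

Note that the first pass treats every matching file as changed, so everything gets rebuilt once at start-up. Polling is used here only to keep the sketch dependency-free.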

        I am redoing our build environment, so perhaps I will relent and take another look at it. IIRC (it has been a couple of years since I had to maintain the one app that I used FW on), the compile did not take an overly significant amount of time, but it was a different (and quite unfamiliar) way to do things.

        --MidLifeXis

Re^2: An Introduction to Literate Programming with perlWEB
by swampyankee (Parson) on Jan 13, 2009 at 16:41 UTC

    Architects don't design bridges; civil engineers do. They also record their stress and strain calculations in documents that are considered part of the drawing, and the engineer damn well better be able to produce them if they get sued*.

    And while the contract you sign may not have all those lovely footnotes, the brief the lawyer presents to court probably will.


    * And they will, even if the tort was because a piece of poo fell on them from a manure truck crossing the bridge.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

      Architects don't design bridges; civil engineers do.

      Tell that to the architect Norman Foster, who designed the Millau Viaduct (in conjunction with a French structural engineer).

      Stress calculations are obviously done, and checked and recorded and are an integral part of the overall design and documentation--but they don't do the calculations on the backs (or fronts) of the blueprints. You don't put the two in the same document.

      And while the contract you sign may not have all those lovely footnotes,

      Exactly! You don't put them all in one document.

      I strongly support project documentation--what files, modules and libraries exist (and where), and what they do; the names of the public interfaces and their parameters; and their purpose--but you don't need to include the names of the internal variables, constants or explain how sort works.

      It's all about not mixing different concerns together; not repeating effort (the DRY principle); and not creating unnecessary artificial dependencies (the decoupling principle).


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        The drawing isn't the document; it's a part of the document, like Chapter XIV in Moby Dick. Maybe the difference is largely nomenclature, in that you're viewing the drawing as a product, analogously to a computer program, where I don't: it's a step, used to describe a part's geometry and no more a complete part description than "steel." On a drawing, one wouldn't say "bolt" without quite a lot of further information, like the size.

        From here on was added in an update.

        Oddly, I agree with many of your issues with LP, in that it adds a pre-processing layer, and even adds the complication of re-ordering code. I tend to view it as a very elaborate scheme of commentary, which may be a complete misinterpretation of its goals.


Re^2: An Introduction to Literate Programming with perlWEB
by doom (Deacon) on Jan 18, 2009 at 09:53 UTC

    Well it seems to me that there are programmers who have trouble doing the mental context switching between "thinking in code" and "thinking in english", and they have a visceral objection to all forms of mixing code and english together (comments in code, pod embedded in code, etc.).

    I'm not one of these people myself -- I'm a fan of embedded pod, and I go as far as to say you should avoid working with people who talk about "self-documenting code" with a straight face... but on the other hand, part of the game of writing code for other people to read is to keep in mind that there are other kinds of people out there. It's not a bad idea to remember that there are "pure code" people around, and so you should do your best to keep comments brief, and so on.

    The objections that the "pure code" folks have to mixing words in with your code frankly don't often make much sense to me. For example, BrowserUK likes to complain about the need to maintain code and documentation in parallel, but that problem doesn't go away if you move the docs to another file... in fact, at least in theory, keeping the docs for a sub with the sub is supposed to make it more likely the docs will be changed when the code is. (It certainly works that way for me... but then, Damian Conway claims that it doesn't seem to work in practice, so who are you going to believe?).

    Some older threads on this subject, if anyone's interested: Code Maintainability, Programming *is* much more than "just writing code"..

    But then, I'm afraid I must agree that pre-processors are a bad idea. I'm a fan of the perl debugger myself -- do you know what it's like to try to debug code that uses a pre-processor? The abstractions the pre-processor was used to implement break down immediately, and you end up stepping through the other guy's code instead of the stuff you're working on.

    In general, I think perl is a very nice, flexible language, which means it can be used to implement variations of itself... but if you take that too far, it doesn't seem to me like programming in perl any more, and I lose interest pretty quickly.

      BrowserUK likes to complain about the need to maintain code and documentation in parallel, but that problem doesn't go away if you move the docs to another file

      Oh, but it does!

      When writing documentation in a different file, people are far less tempted to start describing how the code works--opens a file; increments a variable; sorts an array; stuff which all but the most novice of programmers can (and, if they are to do a decent job of maintenance work, should) work out for themselves from the code.

      Instead, they (should) concentrate on:

      • the what: the public APIs and their parameters;
      • the why: their function and purpose, in terms of what they are intended to do on behalf of the calling code;
      • the usage: this one is far too often glossed over or omitted completely;
      • the contract of that API: its requirements of the caller, and promises to them.

      Documentation is for users; and should be written at the same level of abstraction that the user will use the exported api.

      The code is the description of the actual algorithms used and their implementation. And should be the only such description. It cares not for the external abstraction.

      You should be able to re-implement the internals of a published (documented) API, using different internal algorithms, or even a different language, and the documentation should not need to be changed at all. Nothing in the documentation should need to change when the implementation changes, provided that the published API is maintained.

      The purpose of comments is to annotate the code with additional (brief) information, pertinent to that code, that the language does not allow to be conveyed easily by the code itself. It should not repeat the code; nor the documentation; nor the language manuals.
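To make the point about re-implementing behind a published API concrete, here is a small sketch. The `median` contract and both function names are made up for illustration. The contract in the leading comment is all that user-facing documentation would need to state, and it stays valid whichever implementation sits behind it.

```perl
use strict;
use warnings;

# Documented contract (the only thing the user's documentation would state):
#   median(@numbers) returns the middle value of a non-empty list of numbers,
#   or the mean of the two middle values when the count is even.
#   Dies if called with an empty list. Does not modify its arguments.

# Implementation A: sort everything, then index into the middle.
sub median_by_sort {
    die "median: empty list\n" unless @_;
    my @s   = sort { $a <=> $b } @_;
    my $mid = int(@s / 2);
    return @s % 2 ? $s[$mid] : ($s[$mid - 1] + $s[$mid]) / 2;
}

# Implementation B: repeatedly strip the smallest and largest values.
sub median_by_trimming {
    die "median: empty list\n" unless @_;
    my @rest = @_;    # copy, so the caller's list is untouched
    while (@rest > 2) {
        my ($lo, $hi) = (0, 0);
        for my $i (1 .. $#rest) {
            $lo = $i if $rest[$i] <  $rest[$lo];    # first index of the minimum
            $hi = $i if $rest[$i] >= $rest[$hi];    # last index of the maximum
        }
        # Remove the higher index first so the lower one stays valid.
        splice @rest, $_, 1 for sort { $b <=> $a } $lo, $hi;
    }
    return @rest == 1 ? $rest[0] : ($rest[0] + $rest[1]) / 2;
}
```

A caller who only ever read the contract could be switched from one implementation to the other without any documentation changing.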



        Well, this is what I would tell you about words and code:

        Unless you're working from a fixed spec (which is more the exception than the rule, I'd say) the API of a module is going to change as you're working on it. If you keep the pod describing what a sub does in the same file as the sub, I think you're much more likely to remember to revise the docs when the API changes.

        Further, it's often a very good idea to use some comments throughout the code. The "paragraph style" works well: a "topic sentence" in english, followed by detailed explication in the form of code. Comments at the end of a line of code are good places for things like TODO notes and even hints to perl beginners ("hash slice", "schwartzian transform").
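As an illustration of the "paragraph style" just described, a short sketch follows: each block opens with a one-line topic sentence in English, and end-of-line comments carry the idiom hints. The function names and data are invented for the example.

```perl
use strict;
use warnings;

# Order names by a computed size, smallest first, computing each size once.
sub sort_by_size {
    my ($size_of, @names) = @_;
    return map  { $_->[1] }                   # unwrap the original name
           sort { $a->[0] <=> $b->[0] }       # compare the cached sizes
           map  { [ $size_of->($_), $_ ] }    # schwartzian transform
           @names;
}

# Pull just the fields the report needs out of a larger record.
sub report_fields {
    my (%record) = @_;
    my %wanted;
    @wanted{qw(name size)} = @record{qw(name size)};    # hash slice
    return %wanted;
}

my @ordered = sort_by_size(sub { length $_[0] }, qw(ccc a bb));
my %fields  = report_fields(name => 'x', size => 3, owner => 'y');
```

Whether hints like these count as helpful annotation or as repeating the language manual is exactly the disagreement in this thread.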

        As for things like this:

        The code is the description of the actual algorithms used and their implementation. And should be the only such description. It cares not for the external abstraction.
        My personal opinion is that techies really need to watch these kinds of religious beliefs -- we're always trying to squeeze the world into these neat, idealized doctrines, but the world always fights back. For example, if you were to take what BrowserUK is saying seriously, you would insist that perlguts should not exist.
      If you go so far as to say that you don't want to work with programmers who can talk about self-documenting code with a straight face, then you are going to pass up working with a lot of good people. Like me.

      On the subject of documentation and code, I am a big believer in limiting what you document about your code. That does not mean eliminating documentation! But my attitude is that documentation exists for people who do not need to read my code, which in practice means that the public API and important data structures (particularly database tables) need to be documented, and sometimes you need an introductory document or three. What documentation does not exist for is helping people to understand my code. If I need that then I have at least one problem, and if I use documentation to solve it then I have just added another.

      By reducing the amount I document, I avoid a lot of potential mistakes. Furthermore things that I think should be documented, like APIs, are things that you shouldn't be changing in your code without thinking through potential impacts anyways. So asking a person to go and document those changes makes perfect sense. If you stay consistent about what gets documented, when, then in my experience maintaining that documentation isn't too big a deal no matter where it is. And I personally find that documentation to be more readable and consistent when it is kept in one place.

      Even so, documentation will be less reliable than code. But by limiting where I make my mistakes, I know to focus on those potential problem areas. This compensates reasonably well. Since I only need to maintain that vigilance some of the time, it is easier than if I had to think about it more often.

Re^2: An Introduction to Literate Programming with perlWEB
by adamcrussell (Hermit) on Jan 13, 2009 at 14:41 UTC
    I will address your points:
    (1) and (2) sound like personal problems. LP does require different thought processes but this doesn't take too much extra effort. Really no more extra effort than, say, switching between perl and SQL when working on a database problem. Or switching between perl and html when working on a website.
    (3) is a rant against Inline::C. These problems don't exist for me as a practitioner of LP.
    (4) is a rant against pre-processors in general. I don't think anyone will argue that the extreme example you cited is good nor that heavy use of a pre-processor is wise. pre-processors have their place...I'll just say that the use of pre-processors should be dictated by your inhouse coding style and leave it at that.
    (5) The architect example is maybe a little weird. I have no knowledge of architecture and so have no way to dispute your example as being appropriate or not.
    (6) is an attempt at making some point but if you view what you wrote as "Literate Music" source and were able to extract out and separate the score from the notes then, well, I really don't see anything that outrageous.
    (7) Doesn't make sense. LP is not about writing software in English but putting the English in with your code. What is output is well formatted English and, ultimately, more maintainable code. You seem to want to create a strawman of "English is not a programming language"?

      You've either failed to read the post, or are just being deliberately obtuse in order to dismiss it.

      The only one I'll come back at you on is 7: If the interleaved prose is not a secondary attempt to explain the algorithms and operations of the code it interleaves, why is it interleaved?

      If it isn't an attempt to re-describe the code--and both your example and those in Knuth's original paper show otherwise--then there is no logic at all in keeping it with the code, let alone interleaving it.

      And the answer of course is that it is an attempt to describe the code. And syntax, used to describe algorithms, is programming by any other name. So now you have two descriptions of the same thing.

      • One in a specifically designed, targeted, concise and precise language with clearly defined syntax and semantics.
      • The other, a never designed, always changing, variably written, spoken and interpreted, verbose, imprecise language with capricious syntax and multiple semantics.

      And both need to be maintained and synchronised.

      Maybe if you are writing a book or academic paper, there is some merit, but for production code...it simply does not make sense. To me at least, but whatever floats your boat.

      I will say that having been there and done that--and in a much less intrusive form; no reordering or pre-processing--it was a nightmare to both write and maintain. And that was with the benefit of a folding editor that would hide the verbiage at a keystroke.

      By way of example, you're in there fixing a bug. Do you fix the prose or the code first?

      • If you do both, and then the code fix fails, you have to undo both.
      • If you fix the code first, where's the incentive to go back and fix the prose?
      • And if you fix the prose first--there's no way to verify it.

      Any way you look at it, it's simply extra work for little or no gain.


        By way of example, you're in there fixing a bug. Do you fix the prose or the code first?
        If that's an argument against literate programming, then it's an argument against writing any form of documentation or comments.

        Note, I'm not making any argument against or in favour of literate programming.

        dude, why are all your replies a long ranting screed? Do you really expect people will read all that? Keep your posts much shorter and concise. I find it odd when you accuse someone of being obtuse but your reply is twice as long as the original and full of contrived and nonsensical examples. If you don't understand music or architecture why do you think you can use them as examples? wtf?
        BowserUK, have you sought treatment for your Aspberger's yet? Your tiresome lengthy screeds fail to ever have much merit and reek of a manic compulsive disorder. I suggest you try and raise your signal to noise ratio!
Re^2: An Introduction to Literate Programming with perlWEB
by Anonymous Monk on Jan 13, 2009 at 10:31 UTC
    Hmmmm cock and job security ;)
