Our documentation sucks

by jimt (Chaplain)
on Apr 12, 2007 at 21:31 UTC ( #609759=perlmeditation )

First of all, I will lead with my standard disclaimer and say that I am thoroughly, thoroughly, hopelessly guilty of this as well. So I'm also yelling at myself. I have some half thought-out ideas, but am mainly just waxing philosophic.

In short, I think our documentation sucks. "Our" can be interpreted as desired - the perl community, the open source world, IT in general. I think everybody's guilty.

I also think this is the biggest stumbling block to the use of code. Everybody always preaches that code should be re-used - use the standard libraries, use the stuff on CPAN, use the existing neato toolkit. It's better than writing your own, there are other solutions, blah blah blah blah.

The problem is, a lot of the time, it's easier to write your own solution than to try to decipher the documentation on an existing solution. Sure, it might solve your problem. It might do what you need. It might be better than what you'd come up with. But do you always have the time to look into it? At the end of the day, is it more satisfying to spend your afternoon writing your cool new widget, or deciphering the docs for something that already exists? I feel like I've accomplished more when I write a library that works than when I've just spent my time learning how to use one. Libraries should be easy, not difficult.

Side note - I'm also assuming that there exists a gizmo to do what you want to do. Further, I'm disregarding the fact that you may write a superior one or come up with a novel approach or whatever. Both of those can come up in reality, but we're ignoring 'em for the moment.

Here's a personal pet peeve (which, again, I'm quite guilty of) - modules that have "documentation" that consists of little more than lists of method names. So you know how to create an object and can dig through a list of all of its attributes....And then what? How do you put it all together?

I want to see use cases. I want to see examples. I want to see lists of things that I can cut and paste into place to solve my common problems. I don't care about long, expository paragraphs. I don't care about lists of methods (yet). I just want something I can copy out, paste in, tweak a little, and use.

Libraries are supposed to save me time. So save me time!

Compare, and tell me which one looks easier to follow:


NAME

Kool::Widget is a Widget that does Kool things

AUTHOR

Me

SYNOPSIS

my $kool = Kool::Widget->new();
my $frobnicated = $kool->frobnoz;

DESCRIPTION

Kool::Widget is a widget for doing kool things, such as frobnication, bazification, and fooination. Kool::Widget was born out of a desire to see more frobnification (as well as frobnication!) on the internet and because I didn't feel that the existing tools properly operated in this capacity.

It is an OO module that uses standard perl concepts and idioms to prevent your frobnification in an easy manner.

ATTRIBUTES

  • frob
  • bingo
  • boozle
  • horatio
  • gilpher
  • potash

METHODS

  • frobnoz
  • bazify
  • fooinate
  • petrify
  • set_current_time

Alternatively, how about this:

Kool::Widget - a toolkit for frobnication, bazification, and fooination, by Me.

EXAMPLES

To frobnicate

use Kool::Widget;
use Data::Dumper;

my $kool = Kool::Widget->new(
    'weasels'   => 77,
    'pigeons'   => 0,
    'heartburn' => 'yes',
);

# get back an arrayref of hashrefs.
my $frobnicated = $kool->frobnoz( 'with_daylight_savings_time' => 0 );

# see what you've got
print Dumper($frobnicated);

To bazify

use Kool::Widget;
use Data::Dumper;

my $kool = Kool::Widget->new(
    'weasels'   => 77,
    'pigeons'   => 0,
    'heartburn' => 'yes',
);

# Prep the widget, throws an exception if it fails.
$kool->petrify('input_file' => '/path/to/file');

# returns a hashref
my $baz = $kool->bazify();

# see what you've got
print Dumper($baz);

ATTRIBUTES

  • frob
  • bingo
  • boozle
  • horatio
  • gilpher
  • potash

METHODS

  • frobnoz
  • bazify
  • fooinate
  • petrify
  • set_current_time

"But Jim," you protest, "This example doesn't make any sense! You made up methods and attributes and concepts! I don't follow!" My retort is you shouldn't need to follow. You've hunted down something for frobnification, so you already know the basic concepts, and I want you to concentrate on the concept presented instead of pointing out a bug in my XML parser.

"But Jim," you protest, "You just added a few more examples and reversed the order. This doesn't change anything!" I submit that it does:

  1. It changes the emphasis, both for the author and the user. The first, most important part of documentation should be how to use it and get people going as fast as possible. It should be the first thing people see, and should be the first part of the docs you write. Don't bury a few examples at the end. Write them first. You're using your module, so adapt the code you have that uses it to be your documentation!
  2. Extra info is at the end, where it should be. Most of your users probably aren't going to need every single feature in your module. They're not gonna hit every single method, or use every single attribute. They just want to do one or two things to do their jobs. So show 'em how, but let 'em look up the rest.
  3. Methods and attributes can now be targeted. In those examples up above there, your user probably won't know what the weasels and pigeons and heartburn flags are for. Or what the input_file going into petrify is. But they can see immediately from the example that they're important things, so they can head back to the docs and immediately look up what they are. No more sifting through a list of all methods or attributes trying to figure out what to do - now you just grep through it to the one or two items you know you need to care about.

"But Jim," you protest, "This is completely throwing out the established perl documentation standards! NAME! SYNOPSIS! DESCRIPTION!" And I reply that you're absolutely right. I think we can do better.

Another note - gimme tools! If I can download something and run a little contraption you've built to save me more time, that makes it that much easier for me. For example, take the plethora of OO systems that are out there - all of us should be providing tools to generate modules. You have a database abstraction layer? Give the user a tool to automatically generate a module that maps to his database table. It seems like everyone that I've spoken to that uses an ORM package writes a tool to do that anyway after they get sick of creating cookie cutter modules, so provide it! I can build my modules using your contraption, I can copy and paste your example, and then I can get my job done. I'm a happy camper.
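As a rough sketch of the kind of generator described above (not taken from any real ORM), the following builds the source for a small accessor class from a table's column names. The names generate_module and My::Table::Users, and the columns, are made up for illustration; a real tool would presumably read the column list from the database rather than take it as arguments.

```perl
use strict;
use warnings;

# Hypothetical generator: given a package name and a list of column names,
# return Perl source for a minimal class with one accessor per column.
sub generate_module {
    my ($package, @columns) = @_;

    my $src = "package $package;\nuse strict;\nuse warnings;\n\n"
            . 'sub new { my ($class, %args) = @_; return bless {%args}, $class }'
            . "\n\n";

    for my $col (@columns) {
        # Column names end up in generated code, so insist on plain identifiers.
        die "Bad column name: $col\n" unless $col =~ /^[A-Za-z_]\w*$/;

        # One combined getter/setter per column.
        $src .= sprintf(
            'sub %1$s { my $self = shift; $self->{%1$s} = shift if @_; return $self->{%1$s} }' . "\n",
            $col,
        );
    }

    return $src . "\n1;\n";
}

# Print the generated source; a real tool might write it to a .pm file instead.
print generate_module('My::Table::Users', qw(id name email));
```

The output is ordinary Perl source, so the user can paste it into a file, tweak it, and move on, which is the whole point of shipping such a tool.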

So now what? Well, personally I'm going to try and update my documentation to look more like this as I update my libraries, modules, whatever. I'm not going to race out and just change the docs, but with the next update? Yeah, I'll do it then.

Everybody else? I dunno. Did I make any sense? Did it seem reasonable? Do you maintain any libraries or modules? Give it a try the next time you do an update. At a minimum, I will thank you for it.

Re: Our documentation sucks
by derby (Abbot) on Apr 12, 2007 at 22:06 UTC

    hmmm ... that's why I like open source software: you don't have to rely on the documentation, just RTSL.

    -derby
Re: Our documentation sucks
by GrandFather (Cardinal) on Apr 12, 2007 at 22:11 UTC

    I don't think the need is for Yet Another Documentation Protocol, but simply for better and more consistent use of the current protocol. I heartily agree that an example that does exactly what I want so I can cut and paste it is ideal, but how does the module author know what nefarious purpose I may find for his wonderful module?

    It's not the format of the documentation that is the problem, it is the documentation itself. Not everyone can write good documentation just like not everyone can write good code. And just because someone can write good code doesn't mean that they can document it well. All the rules and format changes you like are not going to get someone who is incapable of writing good documentation to write good documentation, or even provide comprehensive and comprehensible examples.

    There are some excellent examples of good documentation on CPAN (MIME::Lite for example) and many examples of poor documentation and quite a few examples where the problem domain is so big that no documentation is ever going to address everyone's needs (XML::Twig comes to mind).

    At the end of the day this just isn't a fixable problem, unless of course you are volunteering for the job? :)


    DWIM is Perl's answer to Gödel

      I'm less concerned with some new protocol or format or something than just better docs. But that was my idealized example.

      As for the nefarious purpose argument, I look at that as a future use thing. I want better docs so most users get their foot in the door, not so experienced hackers bend the module to their will. Get people using the module, then they'll try to worry about how to enhance it in the future.

      Most of the good use cases up front, something that can be cut 'n pasted, and that'll take care of most things. By all means include the docs to hack it like crazy afterwards, but most docs seem to skew towards that at the expense of the n00bs.

      At the end of the day this just isn't a fixable problem, unless of course you are volunteering for the job? :)

      I believe that AnnoCPAN isn't there for nothing :-)

      Not everyone can write good documentation just like not everyone can write good code. And just because someone can write good code doesn't mean that they can document it well.

      ++ just for that quote!

      I don't think the need is for Yet Another Documentation Protocol, but simply for better and more consistent use of the current protocol.

      I agree. Personally, I think the synopsis ought to show at least some cases, not be a dumb list of how the constructor works and what methods can be called.

      More generally, I think the standard is for people to write cookbooks that give more in-depth use cases.

      Of course, it helps if the main Pod mentions that a cookbook exists, too.

      -xdg

      Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Our documentation sucks
by chromatic (Archbishop) on Apr 12, 2007 at 22:26 UTC
    The problem is, a lot of the time, it's easier to write your own solution than to try to decipher the documentation on an existing solution.

    Maybe it's the types of questions PM has had lately, but I believe instead that a lot of people are too lazy to try to use a module than to write their own code.

    No amount or quality of documentation will fix the line of thinking that, for example, "It's a good idea to print HTTP or CGI headers directly, even if they're likely malformed and incorrect, because CGI.pm must be big and bloated and slow."

    Only experience will fix that, and often the painful kind.

    Still, I agree that explaining the most common operations in the synopsis at the start of the documentation can help.

      No amount or quality of documentation will fix the line of thinking that, for example, "It's a good idea to print HTTP or CGI headers directly, even if they're likely malformed and incorrect, because CGI.pm must be big and bloated and slow."

      I disagree. Here, how do I print out an HTML header? Easy!

      print "Content-type:text/html\n\n";

      "It doesn't work," you say? "Baloney," I reply. It works just fine for me. Why should I go learn some module and download and import it just to duplicate what I did in one line? Especially when I need to root through the docs. It doesn't affect you, it's not your code, and it works just fine. Why do I need a module?

      Now, I'm playing devil's advocate. I use CGI.pm, don't get me wrong, but I don't remember how to print out a header without looking it up. I use it buried deep inside a black box so I never actually see the call to the header routine.

      A quick glance at the docs yields a pretty decent section -

      print header;

      -or-

      print header('image/gif');

      -or-

      print header('text/html','204 No response');

      -or-

      print header(-type=>'image/gif',
                   -nph=>1,
                   -status=>'402 Payment required',
                   -expires=>'+3d',
                   -cookie=>$cookie,
                   -charset=>'utf-7',
                   -attachment=>'foo.gif',
                   -Cost=>'$2.00');

      Good docs, props to Lincoln. But, to the new user, it doesn't work. If I go back to my code and type in:

      use CGI;
      print header;

      Nothing happens. Yes, yes, strict tells me about the bareword, and maybe I know enough to guess that I need to import the header function into my namespace (use CGI qw(header)) to make it work, but maybe I don't. A quick grep in CGI's docs found the "CREATING A STANDARD HTTP HEADER:" section, but it didn't explicitly tell me to import the function.

      The case can be made that it's just being lazy ("Go read all the documentation!"), but it's still a barrier to entry. One line to print it out by myself or round trips to the docs to find out how to do it, then to find out that I need to import another function. Maybe I should've just looked at the synopsis and tried importing :standard, but maybe I didn't know that. If the example block started off with "use CGI qw/:standard/", I would've been going immediately, and arguably be happier to use the module.
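To make the import stumbling block concrete, here is a small self-contained sketch. My::Headers is a made-up stand-in, not CGI.pm, and it uses the core Exporter module; CGI.pm has its own import machinery, so this only models the behaviour described above: a plain "use" with no import list brings in nothing, and asking for a tag (or a function name) is what makes header callable.

```perl
use strict;
use warnings;

# Hypothetical module that, like the case described, exports nothing by default.
package My::Headers;
use Exporter 'import';
our @EXPORT_OK   = qw(header);
our %EXPORT_TAGS = ( standard => [qw(header)] );

sub header { return "Content-Type: text/html\r\n\r\n" }

package main;

# Report whether a header() function exists in the caller's namespace.
sub has_header { return defined &main::header ? 'yes' : 'no' }

# Equivalent to "use My::Headers;" with no import list: nothing arrives.
My::Headers->import();
print "after plain use:      header imported? ", has_header(), "\n";   # no

# Equivalent to "use My::Headers qw(:standard);"
My::Headers->import(':standard');
print "after qw(:standard):  header imported? ", has_header(), "\n";   # yes
```

An example block that opens with the import line spares the reader this whole detour.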

      When I finally get it working, I see that my output is:

      Content-Type: text/html; charset=ISO-8859-1

      I could've just printed that out myself, right? Why wasn't I told to just add on the charset=ISO-8859-1 bit to my manual print line? Instead I was sent to a new module to read documentation in multiple places to get something that I see no benefit in?

      Yes, this example's quite contrived since it's so simple.

      But I stand by my premise - if the docs have quick, easy to find, complete, and correct examples, people are going to be more likely to use 'em. Point them in the right direction for the module, and they can get up and running immediately instead of trying to learn something additional.

      And before people start claiming I'm against learning modules, I'm not. I just want people to get up and running with a module in the minimal case as fast as they can. Then they can learn additional features once they've solved the immediate problem at hand.

        That's odd. When I wrote my first CGI script using CGI I copy and pasted the synopsis sample, edited it a little and it worked straight off. It may be because the first couple of "interesting" lines in the sample are:

        use CGI qw/:standard/;
        print header,

        which, without reading the comprehensive documentation, DWIM. Hey, maybe some of this CPAN documentation isn't so bad after all. ;)


        DWIM is Perl's answer to Gödel
        "It doesn't work," you say? "Baloney," I reply. It works just fine for me.

        Are you sure it works? I'm pretty sure it violates the relevant RFC for HTTP headers, and thus Postel's law.

        Check your line endings.

        You know, there's plenty of things to criticize about CGI.pm, but the docs are not one of them. It has an excellent synopsis section. Many people essentially learned CGI programming from the CGI.pm docs.

        And look at your header example: you didn't know that adding the charset to the end helps prevent cross-site scripting attacks, but CGI.pm did, and if you had used it you would have gotten the benefit of that knowledge for free.
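For comparison, here is a hedged sketch of what a hand-rolled header has to get right before it matches the module's output quoted earlier. It assumes the line-ending remark refers to HTTP header lines ending in CRLF rather than a bare newline. The function content_type_header is hypothetical and is not part of CGI.pm; in real code the module's own header function is the better choice.

```perl
use strict;
use warnings;

# Hypothetical helper: build a Content-Type header by hand, with an explicit
# charset and CRLF line endings, followed by the blank line that ends the headers.
sub content_type_header {
    my (%opt) = @_;
    my $type    = defined $opt{type}    ? $opt{type}    : 'text/html';
    my $charset = defined $opt{charset} ? $opt{charset} : 'ISO-8859-1';
    return "Content-Type: $type; charset=$charset\r\n\r\n";
}

print content_type_header();
print content_type_header( type => 'text/plain', charset => 'utf-8' );
```

Even this tiny case already carries two details (the charset and the line endings) that the one-line print statement left out, which is the argument for letting the module do it.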

      I just knew you were going to use me as an example (even though I don't see how it has anything to do with documentation): Should I just print my own HTTP headers?

      There are 30+ replies to my thread, including 4 from you. There are around 5 that attempt to answer the original question, with the rest just pointing out that it's a stupid question or using bizarre hypothetical situations to demonstrate its uselessness.

      This thread now has more relevant discussion to my question than that one though, so I guess I should be thanking you. I just wonder why you decided not to mention headers at all in that thread and instead went on a tangent about benchmarking.
        I just knew you were going to use me as an example (even though I don't see how it has anything to do with documentation)...

        That's because I don't think the reason people avoid modules is because of poor documentation. CGI.pm has reasonably good documentation. (I say "reasonably" because it has a lot of documentation, and it's really difficult to arrange that in a perfectly clear and obvious way.)

        I just wonder why you decided not to mention headers at all in that thread and instead went on a tangent about benchmarking.

        I didn't think you cared about correctness, merely "wastefulness".

        Update: I don't mean to sound offensive, and I realize this can come across that way. I apologize. What I meant to say was that by the point in the conversation where I started, it sounded like the conversation had already gone off on a tangent and you believed that the best reason not to use a module was due to its weight. In my experience, that's rarely a wise choice, especially with regard to network programming.

      There is one additional benefit when a programmer writes documentation for her code - she becomes exposed to what is difficult to explain. It is very easy to forget about the complexities once you have them internalized - so this can be quite valuable feedback for the programmer for her design of the interfaces.
      ... but I believe instead that a lot of people are too lazy to try to use a module than to write their own code.

      When I see a module with really poor docs (either too little, or too disorganized, or what have you), the first thing I think is, "Is it really going to be worth it for me to slog through this? Can I reasonably expect that the design and the code is going to be any better than the docs in front of me right now?".

      As a general guideline, for better or worse, people use first impressions to save them a lot of time and/or trouble in the long run. A module's docs are my first impression of said module.

        I agree with you. I meant the tendency not to use any module, not the analysis of a particular module and the decision not to use it in specific.

Re: Our documentation sucks
by jhourcle (Prior) on Apr 13, 2007 at 11:56 UTC

    There are a few problems with documentation. Here are a few that I see coming up time and time again:

    1. You need to know the audience for the documentation. Sometimes, it's better to write more than one set of documentation, rather than try to write for multiple audiences in one document. (eg, first time users, more advanced users, future maintainers, etc.)
    2. Code maintainers are too close to the problem. Because they know the system intimately, it may be difficult for them to explain it to a newcomer without using jargon. The more complex a problem the program is trying to solve, the more likely this will happen.
    3. Using jargon or concepts without explaining them. This might be fine for advanced users, but when you use terms that aren't understood by the audience, they're not going to be happy with the documentation. If it's a completely unfamiliar term, the person can quickly realize it's something they need to look up; if it's a term that's being used in uncommon ways, it may confuse the reader. (and what if it's just an uncommon usage for some audiences?)

    Anyway, I propose the following -- keep two sets of documentation. Think of the first one like an API -- what the program is expected to do. You can even write this one before writing any code. The second set of documentation is for the code maintainers and/or advanced users, and needs to be updated whenever the code is. (the other may not need to change every time). Each set of documentation should contain a reference to the other one.

    When people complain about the documentation, and it's something that's not there, you have to ask yourself why they didn't find the relevant section -- look at the terms they used to describe their issue, and the terms used within the actual documentation. Yes, there are some lazy users out there, but if you dismiss every user as a dumbass, you'll never improve the documentation. (and hopefully, better documentation means fewer people bugging you with stupid questions)

    Update: I agree with j3 on non-pod comments for the maintainers ... I'm still not sure where the best place is for docs for advanced users. ... But with the mention of cookbooks I wanted to mention something else -- anything that goes into a cookbook should have a test made for it. (so that you can make sure you don't break your cookbook examples down the road). Is there a market for a script to generate cookbook files from tests or vice versa? (or generate them both from something else?)
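One possible shape for that idea, offered only as a sketch: pull the indented code paragraphs out of a POD section and run each one, so that a cookbook entry that stops working fails loudly. The helper pod_examples, the sample POD, and the choice of an EXAMPLES section are all assumptions made for illustration. It uses a plain regex rather than a real POD parser, and string eval, which is only sensible for documentation you wrote yourself.

```perl
use strict;
use warnings;

# Hypothetical helper: return the indented (verbatim) code paragraphs found
# under one =head1 section of a POD string.
sub pod_examples {
    my ($pod, $section) = @_;

    # Everything between "=head1 SECTION" and the next =head1, =cut, or the end.
    my ($body) = $pod =~ /^=head1[ \t]+\Q$section\E[ \t]*\n(.*?)(?=^=head1[ \t]|^=cut\b|\z)/ms
        or return;

    # A verbatim paragraph is a run of consecutive lines starting with whitespace.
    return $body =~ /((?:^[ \t]+\S.*\n?)+)/mg;
}

# Made-up POD for a made-up module, built line by line.
my $pod = join "\n",
    '=head1 NAME',
    '',
    'Kool::Widget - a toolkit for frobnication',
    '',
    '=head1 EXAMPLES',
    '',
    'To add up some weasels:',
    '',
    '    my $weasels = 70 + 7;',
    '    $weasels;',
    '',
    '=head1 METHODS',
    '',
    '=cut',
    '';

# Run every example; one that no longer compiles or that dies stops the run.
for my $code ( pod_examples($pod, 'EXAMPLES') ) {
    my $value = eval $code;
    die "Example failed: $@" if $@;
    print "example ok, returned $value\n";
}
```

A loop like this could sit in a test file alongside the regular tests, so the examples are checked whenever the module is.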

    Oh -- and as for the design doc -- I tend to keep an 'overview' doc that explains what the general goals of the project are, and a timeline of major feature additions, and major features that we're considering. It tends to be a more executive level document, to justify our project's funding, though. Something similar might be useful for other audiences, though.

      I agree jhourcle, but just to make things more concrete:

      1. The docs for your module's users go in your module's POD. I like xdg's suggestion of including a separate cookbook.pod as well. A tutorial.pod can be included too (since the main module's POD is sometimes more of a user's reference manual than a tutorial).
      2. The docs for maintainers (or advanced users who might want to tinker) just go into the code as regular comments, since maintainers will already be looking at the code anyway.

      And actually, there's a third kind of documentation that you might provide: A design doc. This can go in a separate design.pod file.

      I think of it this way:

      • As a new user, I'll read the tutorial to learn to use the module.
      • As a novice, I'll use the cookbook and reference manual as-needed.
      • As an advanced user, I'll dig into the module's code and will be grateful to read the comments therein.
      • As any of the above I might want to read the design doc to get a better idea of how the module was put together (which objects/functions use which), why you made the design choices you did, how I might go about extending it (say, with a plug-in), etc.
Re: Our documentation sucks
by itub (Priest) on Apr 13, 2007 at 13:39 UTC
    In general, I've had very good experiences with the documentation of CPAN modules. It almost always starts with a synopsis that includes a common and simple use case that you can tweak as desired (there are exceptions, of course). I've had a much harder time with documentation for some other languages (including some commercial ones), which simply consists of lists of methods or subroutines, without anything to tie them together or to tell you where to start.
Re: Our documentation sucks
by j3 (Friar) on Apr 13, 2007 at 15:44 UTC

    jimt, In general, if you think a module could use better docs, fortunately or unfortunately, the usual road to take is to learn how to use said module using whatever tools are at your disposal (existing docs, mailing list, communications with the author/maintainer, PerlMonks, elbow grease (looking at the source)), and then go ahead and send the maintainer a patch containing proposed improvements to the docs.

    If you really can't figure out how to use the module, and the author is unresponsive, tell the community about it on cpanratings.

    Likewise, if the author was helpful, accepted your patch, and the module turned out to be a diamond in the rough (good code, bad docs), you might tell the community about it on cpanratings.

    Incidentally, I just read this morning that version 0.0.3 of the Perl6::Perldoc suite is now on CPAN. I'm very excited to try this out. Thank you Damian!

Re: Our documentation sucks
by wjw (Deacon) on Apr 18, 2007 at 02:31 UTC
    I am with you all the way on this jimt. My delve into the land of XML has been nightmarish. It really does me no good to have module docs that point back to the W3C standard and say "read 'em". An XML file is not that tough to understand, and neither are the rules that make it good or bad. Many of the docs seem to assume one has the time to research the guts of the underlying technology. If I had that kind of time, I would in fact write my own module! Use cases are badly needed. I admit I am not the consummate programmer, and never will be. But when I can write my own mod that finds things in an XML file using regexes faster than I can figure out the module that is supposed to do the work for me, there is a problem.

    I want to be clear that I am not faulting those that put all the work into writing these modules. I am frankly astounded at what it must have taken to accomplish this, and I wish I was that damn good. The only reason I pick on the XML stuff is because it is what I have been banging my head against for some time now.

    Thanks for saying what I have been thinking for a while jimt.

    I also recognize the value of the points that were made by others, but I gotta lean towards the validity of jimt's post.

    ...the majority is always wrong, and always the last to know about it...
