John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

I have on-line documentation for a programming library, and it's showing its age. I want to update it, but I want to redesign the whole system of doing the docs.

Requirement: the entire doc set is a static group of HTML files, so it can be downloaded and browsed without an HTTP server.

However, I can have fancy stuff to generate the files from a more abstract description. That's what I'm thinking about: a Perl program to take docs that are easy to write and generate nicely formatted output and all the redundant stuff like tables of contents.

I wonder if POD gives me enough control? For each function, I want to list certain standard information such as the signatures, header file, and maybe others. Then my prose and code examples. It needs cross-references to other functions or parts of the documentation body.

Some portions though have other information I want formatted with the "formal" part, and that can vary. So... does each file need its own ability to customize the generated HTML in addition to starting from a central (and uniform) specification?

I might want to include a table or other HTML features, so a general escape mechanism is needed.

I want to use a simple markup that's not as hard to type as HTML.

Each specification file would generate one HTML file for all the functions described neatly formatted, another HTML file for the index, and contribute a line to a class index.
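One way to picture that output, as a rough sketch only: it is written in Python rather than Perl purely for illustration, and the function `build_indexes`, the class names, and the file naming scheme (`Class.html`, `Class_index.html`, `classes.html`) are all made up here, not part of any existing tool.

```python
from html import escape

def build_indexes(specs):
    """specs maps a class name to the function names found in its spec file.

    Returns {filename: html}: one index page per class, plus a class index
    to which every class contributes exactly one line.
    """
    files = {}
    class_lines = []
    for cls in sorted(specs):
        page = cls + ".html"              # the formatted page for this class
        index_name = cls + "_index.html"  # its per-class index
        items = "\n".join(
            '<li><a href="%s#%s">%s</a></li>' % (escape(page), escape(fn), escape(fn))
            for fn in sorted(specs[cls]))
        files[index_name] = "<h1>%s</h1>\n<ul>\n%s\n</ul>" % (escape(cls), items)
        class_lines.append(
            '<li><a href="%s">%s</a></li>' % (escape(index_name), escape(cls)))
    files["classes.html"] = "<ul>\n%s\n</ul>" % "\n".join(class_lines)
    return files
```

Calling `build_indexes({"Stream": ["open", "close"], "Buffer": ["resize"]})` would give three files; writing each one to disk is left out.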

I'm wondering if people would want this body of info as XML, also. So maybe it should generate XML and have the browser format it?

Any suggestions, or pointers in the right direction? I don't want a mess, or a development effort that dwarfs the library being documented!

—John


Replies are listed 'Best First'.
Re: HTML documentation system - design and planning
by traveler (Parson) on Sep 08, 2001 at 02:35 UTC
    I don't know all the capabilities your site has, but you might want to consider DocBook. It is an SGML/XML markup that can be exported to PDF, HTML and PostScript (and more).

    Here is a sort of Linux-oriented intro to DocBook.

    Some editors let you edit text and then export the formatted document as DocBook, e.g. AbiWord. LyX has been touted as a multiplatform solution. More often folks convert or write in XML directly, it seems. I'd at least prefer an XML editor if I couldn't use, say, AbiWord.

    There is also a book which can be purchased or downloaded.

    HTH, --traveler

      You can also check out AxKit (the site has been down as the author is moving but should be up again soon), an Apache mod_perl module. It's written (obviously) in Perl and provides out-of-the-box support for automatic DocBook rendering to HTML (in a couple of forms).

      It uses XSLT and XPathScript to do its dirty work (using the XML::LibXML and XML::LibXSLT modules), which is nice since you can actually easily integrate multiple XML markups (you could generate product specific or task specific ones, like todo lists that are much lighter weight than DocBook) by simply assigning different stylesheets and people looking at the output are none the wiser :-)

      There is also an XML markup aimed at simulating LaTeX (someone else mentioned LaTeX) though its name currently escapes me. I use LaTeX on a pretty regular basis; it's fine for things like papers (nothing beats it for mathematical formulae), but it doesn't really have good facilities for web-delivered documentation. It was meant for a static print world and it performs that task admirably.

      If you have some time to invest, I would go for DocBook. The HTML version of DocBook: The Definitive Guide by Norm Walsh and Leonard Muellner (O'Reilly, 1999) is available for free at O'Reilly's page, for easy browsing, or as a zipped download from docbook.org (HTML or SGML). Browsing through DocBook: The Definitive Guide for a while should give you enough information to decide whether this kind of encoding is the right stuff for you.

      Hanamaki
      DocBook is not what I'm looking for. It uses SGML markup, and that's just as "bad", if not more so, than writing in HTML. I don't see any provisions for generating the master tables and index and such, either.

      However, I'm wondering if a two-stage approach might work. I can write a simple Perl tool to take an easily-typed markup and convert to formal XML, and that's all it does. I can bang on that to extend when needed, as I need more features. Anything that doesn't fit the mold can be written directly in XML.

      Then, another tool reads the XML document and produces the set of HTML pages, including the index and stuff.
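A minimal sketch of that two-stage idea, in Python rather than Perl purely for illustration. Everything named here is an assumption: the `= name`, `sig:` and `header:` line markers, the `library`/`function`/`signature`/`header`/`para` element names, and the sample functions are invented for the example and are not an existing format.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical easily-typed markup: "= name" starts a function entry.
SAMPLE = """\
= open_stream
sig: int open_stream(const char* path)
header: stream.h
Opens the named file and returns a handle.

= close_stream
sig: void close_stream(int handle)
header: stream.h
Releases a handle obtained from open_stream.
"""

def markup_to_xml(text):
    """Stage 1: convert the simple markup to XML, and nothing else."""
    root = ET.Element("library")
    func = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("= "):
            func = ET.SubElement(root, "function", name=line[2:].strip())
        elif func is None:
            raise ValueError("text before the first '= name' line: %r" % line)
        elif line.startswith("sig:"):
            ET.SubElement(func, "signature").text = line[len("sig:"):].strip()
        elif line.startswith("header:"):
            ET.SubElement(func, "header").text = line[len("header:"):].strip()
        else:
            ET.SubElement(func, "para").text = line  # plain prose line
    return ET.tostring(root, encoding="unicode")

def xml_to_html(xml_text):
    """Stage 2: read the XML and format every function onto one HTML page."""
    root = ET.fromstring(xml_text)
    parts = []
    for func in root.findall("function"):
        name = escape(func.get("name"))
        parts.append('<h2 id="%s">%s</h2>' % (name, name))
        parts.append("<pre>%s</pre>" % escape(func.findtext("signature", "")))
        parts.append("<p>Header: <code>%s</code></p>"
                     % escape(func.findtext("header", "")))
        parts.extend("<p>%s</p>" % escape(p.text or "")
                     for p in func.findall("para"))
    return "\n".join(parts)
```

`xml_to_html(markup_to_xml(SAMPLE))` yields the page body. Because stage 2 only ever sees XML, anything that doesn't fit the simple markup could be written as XML by hand and still go through the same formatter.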

      —John

        DocBook is not what I'm looking for. It uses SGML markup... However, I'm wondering if a two-stage approach might work. I can write a simple Perl tool to take an easily-typed markup and convert to formal XML, and that's all it does.

        Yes, DocBook was basically SGML stuff. Fortunately, since DocBook version 4 there is also an XML DTD and an XML subset of DocBook called Simplified DocBook.
        You could even convert your easily-typed markup to Simplified DocBook. I don't want to talk you into a format like DocBook with a rather steep learning curve; I just want to show you an alternative.

        Hanamaki
Re: HTML documentation system - design and planning
by pmas (Hermit) on Sep 08, 2001 at 10:05 UTC
    I've seen an interesting web-based collaborative environment (which I definitely will use for our project documentation, I just do not know when I'll get to it...). It's called Wiki and has many free clones.

    I am interested in a Wiki clone called TWiki, because TWiki is rather advanced (authentication of users by groups, advanced page formatting, version control) and has a nice active development group and more than a hundred installations.

    See TWiki in action   and   TWiki feature overview slide-show presentation.

    There are many different Wiki clones, I recall something like DolphineWiki or ModWiki, which is static (pages generated on demand), much simpler than TWiki.

    Wiki is much simpler (IMHO) for plain users than HTML for text formatting, and WikiWords will link to the page with the same name (explained on the website). Every member of the Wiki community can update these pages, which is great for documentation. TWiki will send you email when a page you care about (and subscribed to) has changed. Version control allows admin users to undo changes if needed. TWiki also has "categories", so you can maintain a set of status fields and use it, e.g., for bug tracking.

    However, they do not have XP and voting... :-(

    You have also full source code, so it might give you a nice headstart when creating MonkWiki for your docs... ;-)

    Does somebody around here have experience with Wiki? I know Wiki is kind of "competitor" with Everything Engine in field of free web-based collaboration tools, so I hope it is not blasphemy to mention Wiki here...;-)

    The difference is: a WikiPage is like a thread in PM. Everybody can update any page, including text written by others (Wiki does not support strict "ownership" of a node as PM does). So there is a smaller number of pages, and links can be improved over time. Can you imagine all the nodes explaining why to use strict linked together? Easier to search. But when you are just adding text to the end of the page, it is less dynamic than threads in PM. Not having to worry about XP is good, and not having XP is bad. So it is better for documentation, but less fun for a community like PerlMonks.

    As somebody said here: Forget about XP, remember the experience...

    BTW, Wiki-Wiki means "quick" in Hawaiian.

    pmas
    To make errors is human. But to make million errors per second, you need a computer.

      I've used wiki, too, as a sort of on-going documentation system for development projects. It rocks! The lack of "ownership" of a node is a really good thing, since anyone can go & update the information on any of the nodes. And the "autolinking" it does is pretty nice- you don't have to worry about marking up words that you want to link, because if wiki sees that anything that you typed is a node, it automagically creates a link for you. Anyway, I've used a lot of different (usually crappy) documentation systems (html docs maintained through cvs... whiteboards in hallways...saved emails...). Most are either too much hassle to update or not scalable enough once you've got more than 3 people on a project. Wiki is a pretty cool solution to problems like that. My one problem with it (which might not be a problem with all versions) is that the more information you put in, the slooooooooooower searches get. -- cat
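The autolinking described above can be sketched roughly like this. It is an illustration in Python, not the code of any particular wiki: the `WIKI_WORD` pattern, the `autolink` function and the `.html` link scheme are assumptions, and the text is assumed to be HTML-escaped already.

```python
import re

# A WikiWord here means two or more capitalized runs joined together,
# e.g. UseStrict. Real wikis each have their own exact rule.
WIKI_WORD = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")

def autolink(text, existing_pages):
    """Turn WikiWords that name an existing page into links; leave the rest."""
    def link(match):
        word = match.group(0)
        if word in existing_pages:
            return '<a href="%s.html">%s</a>' % (word, word)
        return word  # no such page yet, so keep the plain word
    return WIKI_WORD.sub(link, text)
```

For instance, `autolink("See UseStrict and FooBar.", {"UseStrict"})` links only `UseStrict`, since `FooBar` is not a known page.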
        Thank you for confirming my gut feeling about Wiki.

        Re slow search: yes, searching across many static pages might be slow. A solution might be to store all page text in a database; databases are usually good at searching text. But then page presentation gets slower, because you need to generate every page on request.

        As always, you have tradeoffs. I expect to have a lot of searches, so I prefer my data be optimized for search speed - in database.

        Also, TWiki has "categories", which might speed up searching.

        Our web site should be released to the public later, so when it is up (and open to the public), I'll let you know.

        pmas
        To make errors is human. But to make million errors per second, you need a computer.

Re: HTML documentation system - design and planning
by Starky (Chaplain) on Sep 08, 2001 at 06:23 UTC
    Although you are focusing on Perl, I would suggest LaTeX: A Document Preparation System. It's a bit obscure, but widely used in academic circles.

    The basic concept is a markup language like turbocharged HTML, but it also includes easily organized documents with sections, subsections, indexes, and tables-of-contents.

    Although there is a learning curve, I've found that I do almost all of my technical documentation in LaTeX and it's been for me by far the most efficient way to do it. I've had some peers scoff at the concept because they've never heard of it or used it or have heard it's difficult to learn (not more than HTML in my experience), but I've also known other people who use it for the same thing I do, and they absolutely swear by it as well.

    It also supports easy export to HTML (latex2html, written in Perl), PDF (pdflatex), PostScript (dvips), and other formats, and produces the most visually consistent and readable documents of any documentation system I know.

    If you are using GNU/Linux, you can find the teTeX (a TeX distribution for Unix-like systems that includes LaTeX) RPMs at http://www.rpmfind.net/linux/rpm2html/search.php?query=tetex.

    They also come on the Red Hat installation CDs.

    If you decide to investigate it and have any questions, send me an e-mail. I'd be happy to give you some pointers to get you started.

    Hope this helps :-)

      For people who just want to try it and get a sense of how well it works, I suggest trying to write a document with LyX.
      Actually, I've used (regular) TeX for documentation long ago. I'm aware of how powerful a separate formatting pass, as opposed to WYSIWYG, can be. The obverse side of WYSIWYG is What You See Is All You Got.

      I got a LaTeX book once, but didn't care for it.

      Once upon a time, I decided I'd rather have copious documentation rather than beautiful documentation, and used Word for the project. It's just so much easier to create and edit, esp. if inserting illustrations and such.

        Once upon a time, I decided I'd rather have copious documentation rather than beautiful documentation, and used Word for the project. It's just so much easier to create and edit, esp. if inserting illustrations and such.

        So maybe you should investigate tilly's hint on LyX further. With LyX you get some WYSIWYG and the power of LaTeX. Together with Ghostscript (ps2pdf), pdfTeX, html2latex, latex2html and latex2rtf you get a powerful typesetting environment which isn't that difficult to learn.
        But it is still TeX, meaning a typesetting language. I would rather use XML or some other logical markup and then convert it to LaTeX or whatever for typesetting on paper, as PDF, or to produce RTF documents.

        Hanamaki
Re: HTML documentation system - design and planning
by perrin (Chancellor) on Sep 08, 2001 at 08:47 UTC
    There are general-purpose things like DocBook, but they're complex and probably overkill for simple documentation. POD is pretty good, although you'd have to cheat to add tables. You can take POD and process it with something like Pod::POM and Template Toolkit to generate HTML, XML, or whatever. There's also a module that Stas Bekman uses to generate all of the mod_perl Guide from POD.