The most important good documentation practice is to actually
write some! Too many programmers omit this. But here are two good
reasons to do it:

- Your documentation can be your design document.
The best time to write it is before you type a single line of code,
while you're thinking out what you want to do. You'll find that the
process of describing the way you want your program to work in natural
language focuses your mind on the high-level questions about what it should
do and how it should work. This may save you a lot of effort later.

- Your documentation is an advertisement for the quality of your code.
Many people take poor, scanty, or illiterate documentation for a program as
a sign that the programmer is sloppy or careless of potential users' needs.
Good documentation, on the other hand, conveys a message of intelligence
and professionalism. If your program has to compete with other programs,
better make sure your documentation is at least as good as theirs lest
potential users write you off without a second look.
This HOWTO wouldn't be the place for a course on technical writing
even if that were practical. So we'll focus here on the formats and tools
available for composing and rendering documentation.

Though Unix and the open-source community have a long tradition of
hosting powerful document-formatting tools, the plethora of different
formats has meant that documentation has tended to be fragmented and
difficult for users to browse or index in a coherent way.
We'll summarize the uses, strengths, and weaknesses of the common
documentation formats. Then we'll make some recommendations for good
practice.

Here are the documentation markup formats now in widespread use among
open-source developers. When we speak of "presentation" markup, we mean
markup that controls the document's appearance explicitly (such as a font
change). When we speak of "structural" markup, we mean markup that
describes the logical structure of the document (like a section break or
emphasis tag). And when we speak of "indexing", we mean the process
of extracting from a collection of documents a searchable collection
of topic pointers that users can employ to reliably find material
of interest across the entire collection.

- man pages
The most common format, inherited from Unix; a primitive form of
presentation markup. The man(1) command provides a
pager and stone-age search facility. No support for images, hyperlinks,
or indexing. Renders to PostScript for printing fairly well. Doesn't
render to HTML at all well (essentially as flat text). Tools are
preinstalled on all Linux systems.

Man page format is not bad for command summaries or short reference
documents intended to jog the memory of an experienced user. It starts to
creak under the strain for programs with complex interfaces and many
options, and collapses entirely if you need to maintain a set of documents
with rich cross-references (the markup has no support for
hyperlinks).

- HTML
Increasingly common since the Web exploded in 1993-1994. Markup is
partly structural, mostly presentation. Browsable through any web browser.
Good support for images and hyperlinks. Limited built-in facilities for
indexing, but good indexing and search-engine technologies exist and are
widely deployed. Renders to PostScript for printing pretty well. HTML
tools are now universally available.

HTML is very flexible and suitable for many kinds of documentation.
Actually, it's too flexible; it shares with man page
format the problem that it's hard to index automatically because a lot of the
markup describes presentation rather than document structure.

- Texinfo
Texinfo is the documentation format used by the Free Software
Foundation. It's a set of macros on top of the powerful TeX formatting
engine. Mostly structural, partly presentation. Browsable through Emacs
or a standalone info program. Good support for
hyperlinks, none for images. Good indexing for both print and on-line
forms; when you install a Texinfo document, a pointer to it is
automatically added to a browsable "dir" document listing all the Texinfo
documents on your system. Renders to excellent PostScript and usable
HTML. Texinfo tools are preinstalled on most Linux systems, and available
at the Free Software Foundation website.

Texinfo is a good design, quite usable for typesetting books as well
as small on-line documents, but like HTML it's a sort of amphibian -- the
markup is part structural, part presentation, and the presentation part
creates problems for rendering.

- DocBook
DocBook is a large, elaborate markup format based on SGML (more
recent versions on XML). Unlike the other formats described here it is
entirely structural with no presentation markup. Excellent support for
images and hyperlinks. Good support for indexing. Renders well to HTML,
acceptably to PostScript for printing (quality is improving as the tools
evolve). Tools and documentation are available at the DocBook website.

DocBook is excellent for large, complex documents; it was designed
specifically to support technical manuals and their rendering in multiple
output formats. Its drawbacks are complexity, a not entirely mature
(though rapidly improving) toolset, and introductory-level documentation
that is scanty and (too often) highly obfuscated.
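To make the earlier distinction between structural markup and indexing concrete, here is a minimal Python sketch of how an indexer might pull topic pointers out of structurally marked-up text. It is an illustration only, not the tooling any project actually uses. The element names (article, section, title, indexterm, primary, para, command) are real DocBook elements, but the sample document, its section ids, and the build_index function are invented for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical DocBook-style fragment; the program name, ids, and
# index terms are invented purely for illustration.
DOC = """\
<article>
  <title>Frobnicate User Guide</title>
  <section id="install">
    <title>Installation</title>
    <indexterm><primary>installation</primary></indexterm>
    <para>Unpack the archive and run <command>make install</command>.</para>
  </section>
  <section id="config">
    <title>Configuration</title>
    <indexterm><primary>configuration file</primary></indexterm>
    <indexterm><primary>installation</primary></indexterm>
    <para>Edit the configuration file before first use.</para>
  </section>
</article>
"""


def build_index(xml_text):
    """Map each index term to the sorted ids of the sections declaring it."""
    root = ET.fromstring(xml_text)
    index = defaultdict(set)
    for section in root.iter("section"):
        # Only index terms declared directly inside this section.
        for primary in section.iterfind("indexterm/primary"):
            term = (primary.text or "").strip()
            if term:
                index[term].add(section.get("id"))
    return {term: sorted(ids) for term, ids in index.items()}


index = build_index(DOC)
for term in sorted(index):
    print(term, "->", ", ".join(index[term]))
```

Because the markup says what each piece of text is rather than how it should look, the indexer needs no guesswork. Doing the same with man page source or presentation-heavy HTML would mean inferring topics from font changes and layout.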
In July of 2000 representatives from several important open-source
project groups (including GNOME, KDE, the Free Software Foundation, the
Linux Documentation Project, and the Open Source Initiative) held a summit
conference in Monterey, California. The goal was to try to settle on
common practices and common documentation interchange formats, so that a
much richer and more unified body of documentation can evolve.

Concretely, the goal everyone has in view is to support a kind of
documentation package which, when installed on a system, is immediately
integrated into a rich system-wide index of documents in such a way that
they can all be browsed through a uniform interface and searched as a
unit. From the steps GNOME and KDE have already taken in this direction,
it was understood that this would require a structural rather
than presentation markup standard.

The meeting endorsed a trend which has been clear for a while: key
open-source projects are moving or have already moved to DocBook as a
master format for their documentation.

The participants also settled on using the Dublin Core metadata
format (an international standard developed by librarians concerned with
the indexing of digital material) to support document indexing; details of
that are still being worked out, and will probably result in some additions
to the DocBook markup to support embedding Dublin Core metadata in
DocBook documents.

The direction is clear: more use of DocBook, with auxiliary standards
that support automatically indexing DocBook documents based on their index
tags and Dublin Core metadata. There are pieces still missing from this
picture, but they will be filled in. The older presentation-based markups'
days are numbered. (This HOWTO was moved to DocBook in August
2000.)

Thus, people starting new open-source projects will be ahead of the
curve, and probably saving themselves a nasty conversion process later, if
they go with DocBook as a master format from the beginning.
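As a rough picture of what a system-wide catalog built on Dublin Core metadata might involve, the Python sketch below groups document titles by subject. It is a guess at the general shape, not a description of any agreed format: how Dublin Core will be embedded in DocBook is still unsettled, as noted above. The dc:title, dc:creator, and dc:subject elements and their namespace come from the Dublin Core element set; the records and record wrapper elements, the sample values, and the subject_catalog function are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical metadata records. The <records>/<record> wrappers and all
# values are invented; they are not part of Dublin Core or DocBook.
RECORDS = """\
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record>
    <dc:title>Frobnicate User Guide</dc:title>
    <dc:creator>A. Hacker</dc:creator>
    <dc:subject>installation</dc:subject>
    <dc:subject>configuration</dc:subject>
  </record>
  <record>
    <dc:title>Frobnicate Reference Manual</dc:title>
    <dc:creator>A. Hacker</dc:creator>
    <dc:subject>configuration</dc:subject>
  </record>
</records>
"""


def subject_catalog(xml_text):
    """Map each dc:subject value to the titles of the records carrying it."""
    root = ET.fromstring(xml_text)
    catalog = {}
    for record in root.iter("record"):
        title = record.findtext(f"{{{DC_NS}}}title", default="(untitled)")
        for subject in record.iterfind(f"{{{DC_NS}}}subject"):
            key = (subject.text or "").strip()
            if key:
                catalog.setdefault(key, []).append(title)
    return catalog


catalog = subject_catalog(RECORDS)
for subject in sorted(catalog):
    print(subject, "->", "; ".join(catalog[subject]))
```

A real catalog would also have to merge records from every installed documentation package and combine them with the documents' own index tags; this sketch covers only the metadata lookup.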