Reviewing toolchains — publican and /cvs/docs

With all the attention on documentation toolchains in Fedora Docs, I wanted to provide a quick overview of the differences between the toolchain we’ve been using over the years, in /cvs/docs, and the newcomer, publican. The goals are many: introduce a new toolchain to users of the old tools; show the many similarities and the few, important differences; provide some historical record and context for the toolchains; and help anchor our discussion in a few important areas as we review this new toolchain for use in Fedora Docs. One lesson looms, as well: if we’d finished making the Fedora Docs toolchain an installable package, we would have had people going gaga over it years ago, too.

Part of the reason I am writing this is purely selfish. We’ve put a lot of work into the Fedora Docs toolchain over the years, while publican was being developed in parallel inside of Red Hat. In the end, we have two products that are very much the same, with some fundamental differences. I want to dispel the myth that the Fedora Docs toolchain was broken, unusable, or not useful. Rather, it is a long-standing, robust toolchain that has produced Fedora documentation for eight releases with no limelight shone on it. After continuously answering, “Cool that publican does that, yes, Fedora Docs toolchain does that too,” I thought I would just get it all down in writing one time.

The overview of publican capabilities comes from the main publican documentation, available through the project hosting pages. As an example of the difference in toolchains, I am pulling the /cvs/docs capabilities from the Makefile, which is the sole documentation for this toolchain. As an example of the similarities, you could learn everything the publican documentation covers from its Makefile, too.

Primarily, both tools perform the same tasks, using the same or similar tools. Sometimes the versions or methods used are different. This review does not dive into the specific command options called by the tools.

  • Take in DocBook XML 4.4+ that is constructed into individual, full, proper XML files following a specified pattern to organize the files:
    • One top-level XML file that calls in all others using XIncludes.
    • Use of a generated XML file that is needed for RPM production.
    • Zero or more individual XML files that are typically top-nested (chapter, appendix, preface).
  • Validate the XML using xmllint.
  • Run that DocBook XML through a DocBook XSL stylesheet.
  • Use xmlto to convert the XML to a number of target formats, sometimes through intermediate formats that require additional processing:
    • HTML is created directly using xsltproc to process the XML with an HTML-specific XSL stylesheet.
    • Other text markup formats such as plain text and rich text (RTF) are generated using xsltproc and the appropriate XSL stylesheet.
    • PDF is created by generating a formatting objects (FO) file that is then processed with an FO processor. Default in Fedora has been the venerable passivetex package for processing the FO into a PostScript (PS) or PDF file.
  • Create packages of documents that can be viewed through the GNOME and KDE help system, as well as installed to view from the default Documentation menu.
  • Both tools support using Apache FOP to process FO files into PS/PDF, which is only recently available in Fedora running under OpenJDK. fop is called by xmlto instead of xsltproc to handle the FO file. This should be available natively in Fedora 9.
  • The “Authoring and Publishing” package group is required for minimal DocBook processing. Other packages may be required, such as for processing PO/POT files.
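
The steps in the list above can be sketched as a short command plan. This is only an illustration, not a copy of either toolchain's Makefile: it calls xmllint, xsltproc, and fop directly rather than going through xmlto, and the stylesheet directory, the `build_plan` and `run_plan` names, and the `my-book.xml` file name are all assumptions made for the example.

```python
import subprocess
from pathlib import Path

# Assumed stylesheet location; the real path depends on how the DocBook
# XSL stylesheets are packaged on your system.
XSL_ROOT = Path("/usr/share/sgml/docbook/xsl-stylesheets")


def build_plan(book_xml, out_dir="build", xsl_root=XSL_ROOT):
    """Return argv lists for validate -> HTML -> FO -> PDF, without running them."""
    book = Path(book_xml)
    out = Path(out_dir)
    xsl_root = Path(xsl_root)
    html = out / (book.stem + ".html")
    fo = out / (book.stem + ".fo")
    pdf = out / (book.stem + ".pdf")
    return [
        # 1. Resolve XIncludes in the top-level file, then validate against
        #    the DTD; --noout suppresses output of the parsed document.
        ["xmllint", "--xinclude", "--postvalid", "--noout", str(book)],
        # 2. HTML straight from the XML with an HTML-specific stylesheet.
        ["xsltproc", "--xinclude", "-o", str(html),
         str(xsl_root / "html" / "docbook.xsl"), str(book)],
        # 3. Intermediate FO file for the print formats.
        ["xsltproc", "--xinclude", "-o", str(fo),
         str(xsl_root / "fo" / "docbook.xsl"), str(book)],
        # 4. Hand the FO file to an FO processor (Apache FOP here) for PDF.
        ["fop", "-fo", str(fo), "-pdf", str(pdf)],
    ]


def run_plan(plan):
    """Run each step in order, stopping at the first failure."""
    for argv in plan:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Print the plan instead of running it, so the sketch works even on a
    # machine without the DocBook tools installed.
    for argv in build_plan("my-book.xml"):
        print(" ".join(argv))
```

Keeping the plan separate from its execution makes it easy to inspect the commands before anything runs, which is roughly what reading either toolchain's Makefile gives you.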

There are a few differences between the toolchains, and they seem to account for the gap in interest and adoption among people outside of Fedora Docs. Just as people prefer to install a package instead of running "tar -xzvf foo.tgz && cd foo/ && ./configure && make", they prefer to install publican to do documentation instead of checking out the components from Fedora Docs CVS. This is happening despite the fact that publican does not work in Fedora before the upcoming Fedora 9, and continues to have a few problems in F9/rawhide. People would rather help fix a package with broken software, which is a cornerstone of Fedora anyway.

  • One is a package you install (publican), the other is a tree you check out from CVS (Fedora Docs toolchain).
  • To create a new, empty book in publican you run create_book, enter the directory, rename some files, and begin writing. In the Docs toolchain, you check out the ‘example-tutorial’ module, copy it to a new directory, rename some files, and begin writing.
    • The create_book command has options that let you name the book at creation, etc.
  • There is already a growing user base for publican only a few months after its release, while the Fedora Docs toolchain has apparently remained a tool only for Fedora Docs. 🙁 We’ve always presumed that people were using our toolchain, but there is no evidence of that, such as bug reports and questions in IRC. In contrast, publican has generated more Bugzilla activity, starting with package review, and multiple people have come by #fedora-docs for support in using publican for their own work.

Probably the most interesting thing for Fedora Docs is that we can use publican as an upstream-supported toolchain rather than rolling and maintaining our own tools. The fact that the tools are similar, solving problems in fundamentally the same ways, is a testament to the strength of the underlying applications (xmlto, xsltproc, fop) both toolchains use.