Please review for Man-DB changes
Alexander E. Patrakov
patrakov at gmail.com
Thu Oct 23 22:12:50 PDT 2008
DJ Lucas wrote:
> 6.47.2. Non-English Manual Pages in LFS
> Some packages provide UTF-8 manual pages, which previous versions of
> Man-DB were unable to display correctly because the expected (8-bit)
> encoding for each language was hard-coded in the source of Man-DB.
> Man-DB now uses the extension of the directory name in order to
> determine the encoding of the manual pages stored within. If no
> extension exists, Man-DB uses a built-in table (see below) to
> determine the encoding. E.g., because of "UTF-8" in the directory
> name, it knows that all manual pages residing in
> /usr/share/man/fr.UTF-8 are UTF-8 encoded and, according to the
> built-in table, expects all manual pages residing in
> /usr/share/man/ru to be encoded using KOI8-R.
> Linux distributions have different policies concerning the character
> encoding in which manual pages are stored in the filesystem. E.g.,
> RedHat stores all manual pages in UTF-8, while Debian previously used
> language-specific (mostly 8-bit) encodings. Many other distributions
> simply ignore the problem all together. LFS also used the legacy
> encodings in previuos versions of the book. This was chosen because
Typo ("previuos"). Also, the text is misleading: it supports the assumption
that legacy encodings are no longer used.
> of the ease of configuration associated with Man-DB. Additionally,
Readers won't understand this.
> Man-DB provided support for Chinese and Japanese locales, and limited
> support for Korean, whereas Man did not at that time.
Man does support Japanese, by means of the JNROFF directive.
> In contrast, the setup in Fedora Core expects all manual pages to be
> UTF-8 encoded, and stored in directories without suffixes.
Duplicate information. So we need to agree on the examples of directory
layout that we demonstrate, and their order.
And IMHO, the whole text above (right from the heading) needs to be
reordered. Something like this:
Some packages provide non-English manual pages. They are displayed
correctly only if their location and encoding match the expectations of
the "man" program. However, different Linux distributions have different
policies (expressed in the choice of the "man" program, its
configuration and patches applied to it) concerning the character
encoding in which manual pages are stored in the filesystem.
E.g., Debian previously required Russian manual pages to be encoded in
KOI8-R and to be placed in /usr/share/man/ru. Now, in addition, their
"man" program searches for UTF-8 encoded Russian manual pages in
/usr/share/man/ru.UTF-8. On the other hand, Fedora stores UTF-8 encoded
Russian manual pages in /usr/share/man/ru and their "man" program
doesn't look into /usr/share/man/ru.UTF-8.
Yes, a significant portion of the text has been thrown away.
> Disagreement about the expected encoding of manual pages amongst
> distribution vendors, has led to confusion for upsteam package
> maintainers. Some packages contain, UTF-8 manual pages, while others
No comma after "contain".
> ship with manual pages in legacy encodings.
At this point, I think we have clearly stated the problem for upstream
maintainers.
After that, we can explain our setup: "Man-DB uses the extension of the
directory name...", including the examples, even though they duplicate
our explanation of the modern Debian setup (not all readers know that
Debian uses Man-DB).
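To make the rule concrete for this discussion (not for the book), here is
a minimal Python sketch of the lookup as described in the quoted text:
take the encoding from the directory extension if there is one, otherwise
fall back to a table. The function name and the one-entry table are
hypothetical. Only the "ru" entry comes from the text above; the real
table is in the Man-DB source and is not reproduced here.

```python
import posixpath

# Illustrative excerpt only: the real built-in table lives in Man-DB's source.
# The "ru" entry is the one example given in the text under review.
LEGACY_ENCODINGS = {"ru": "KOI8-R"}


def expected_encoding(man_dir):
    """Return the encoding expected for pages under man_dir, or None if unknown."""
    # Look only at the last path component, e.g. "fr.UTF-8" or "ru".
    name = posixpath.basename(man_dir.rstrip("/"))
    lang, dot, extension = name.partition(".")
    if dot and extension:
        # An extension on the directory name states the encoding directly.
        return extension
    # No extension: fall back to the (here heavily truncated) table.
    return LEGACY_ENCODINGS.get(lang)


print(expected_encoding("/usr/share/man/fr.UTF-8"))
print(expected_encoding("/usr/share/man/ru"))
```

With the two directories used as examples in the text, this gives UTF-8
for fr.UTF-8 and KOI8-R for ru.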
Neither of the two quotes below should appear in the book.
> Unlike the Man/Groff
> setup in Fedora Core, Man-DB can make very good decisions about the
Only if the user placed the manual pages correctly.
> on disk encoding and present the information to the user in their
> prefered format, without complex configurations.
Man in Fedora Core ships preconfigured, and, due to exclusive use of
UTF-8, there are no decisions to make. The setup is completely
transparent to the end users as long as only prepackaged software is used.
Please stop trying to show that the Debian setup is better. The only
benefits are that it allows for a transition period when UTF-8 and
legacy manual pages coexist in different directories, and that it
requires fewer patches than the approach from RedHat. I take back my
statement about lengthy configuration: it is invalid with RedHat Groff
(but valid with the upstream Groff).
> Man-DB has, for the most part, made this problem completely
"this problem" refers to something too far away.
> transparent to end users, as long as the manual pages are installed
> into the correct directory.
Not sure if the two quotes above should appear in the book at all.
Above, we discussed the problem for upstream maintainers, while "this
problem" refers to something seen by the end users. Yes, I cheated by
removing the note about the mess present in most distributions, but I
don't see where to reinsert it.
> There may be times, however, where one
> encoding is preferred over the other.
Without examples, this is a meaningless phrase. And I think it is not
one encoding that is preferred over the other; rather, we prefer one way
or another of modifying the upstream installation process. To see what I
mean, try converting MPlayer manual pages to UTF-8 after unpacking the
tarball, and pretending that this is what upstream provided. You can
either convert back, or move the installed manual pages (not very clean,
but we do it for some binaries anyway), or patch the Makefiles. After
seeing the steps that each of these approaches requires, you will perhaps
be able to come up with a better phrase.
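For the "convert back" option, the recoding step itself is small. A
minimal Python sketch, taking KOI8-R purely as an example legacy encoding
(the helper name and the sample string are made up for illustration, and
a real build would probably run a command-line converter over the files
instead):

```python
def recode(data, source_encoding, target_encoding):
    """Re-encode the raw bytes of a manual page from one encoding to another."""
    return data.decode(source_encoding).encode(target_encoding)


# Pretend this is a page shipped by upstream in a legacy encoding.
legacy_page = "Руководство".encode("koi8_r")

# Convert to UTF-8 (the experiment suggested above) ...
utf8_page = recode(legacy_page, "koi8_r", "utf-8")

# ... and back again; the round trip must be lossless.
assert recode(utf8_page, "utf-8", "koi8_r") == legacy_page
```

The recoding is the easy part. What differs between the three options is
where in the installation process it has to be hooked in.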
<snip the script and the table>
> Following LFS's previous policy, if upstream distributes the manual
Not sure if the reference to our previous setup (not policy, as we
couldn't change it!) is a good thing.
The rest is OK.
Alexander E. Patrakov