Restructuring BLFS XML

Bruce Dubbs bdubbs at swbell.net
Tue Jun 8 16:46:16 PDT 2004


I've been looking at how to restructure the BLFS XML code for the last 
couple of days.  One area of concern is how to handle entities.  
Restructuring with {xi:include...}  statements will eliminate a lot of 
entities, but there are still about 2000 left.  There are generally six 
entities for each package.  For example:

{!ENTITY curl-version "7.11.2"}
{!ENTITY curl-download-http 
"http://curl.haxx.se/download/curl-&curl-version;.tar.bz2"}
{!ENTITY curl-download-ftp " "}
{!ENTITY curl-size "1.4 MB"}
{!ENTITY curl-buildsize "26.6 MB"}
{!ENTITY curl-time "0.43 SBU"}
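To make the nesting concrete (curl-download-http refers to curl-version), here is a minimal Python sketch of how such declarations resolve. It is only an illustration, not part of the book tooling: the regular expressions and the helper names parse_entities and expand are my own assumptions, and it handles only simple double-quoted general entities written with real angle brackets.

```python
import re

# Declarations as they would appear in a real file (angle brackets, not braces).
DECLARATIONS = '''
<!ENTITY curl-version "7.11.2">
<!ENTITY curl-download-http
"http://curl.haxx.se/download/curl-&curl-version;.tar.bz2">
<!ENTITY curl-size "1.4 MB">
'''

# Simplified patterns: general entities with double-quoted values only.
ENTITY_DECL = re.compile(r'<!ENTITY\s+([\w.-]+)\s+"([^"]*)"\s*>')
ENTITY_REF = re.compile(r'&([\w.-]+);')


def parse_entities(text):
    """Return a dict mapping entity name to its raw (unexpanded) value."""
    return dict(ENTITY_DECL.findall(text))


def expand(value, entities, depth=0):
    """Recursively replace &name; references using the entities dict."""
    if depth > 20:
        raise ValueError("entity references nested too deeply (cycle?)")

    def substitute(match):
        name = match.group(1)
        if name not in entities:
            return match.group(0)  # leave unknown references untouched
        return expand(entities[name], entities, depth + 1)

    return ENTITY_REF.sub(substitute, value)


if __name__ == "__main__":
    entities = parse_entities(DECLARATIONS)
    print(expand(entities["curl-download-http"], entities))
```

Because the download URL is built from curl-version, a version bump only has to touch one declaration.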

Right now, LFS puts all of its entities in one file (which is *much* 
smaller than ours would be).  If we did the same, this is what I see:

Pro: 
1.  All the entities would be in one place, which would make finding and 
maintaining a particular entity easier via a grep or editor search of a 
single file.
2.  Entity use would be easier because every entity would be available 
in every file.
3.  Updating individual packages to the new format would be easier.

Con:
1.  The file is big (about 100KB, 2000 lines).  Editors (human) may not 
want to manipulate a file that large.
2.  Book rendering would probably be slower because the entity file 
would have to be parsed again for each file that includes it, probably 
about 400 times.

The header for each file would look like (s/braces/angle brackets/):

{?xml version="1.0" encoding="ISO-8859-1"?}
{!DOCTYPE sect1 PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
   "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
  {!ENTITY % general-entities SYSTEM "../general.ent"}
  %general-entities;
]}
 
{sect1 id=...}

The only thing that needs to be changed is the path to general.ent 
depending on the depth of the file in the document tree.
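The path adjustment could be computed rather than typed by hand. The following is a rough Python sketch under the assumption that general.ent sits at the top of the document tree; the function name entity_path and the example file names are hypothetical.

```python
import posixpath


def entity_path(xml_file, entity_file="general.ent"):
    """Return the SYSTEM path to entity_file as seen from xml_file.

    Both arguments are paths relative to the top of the document tree,
    written with forward slashes.
    """
    start = posixpath.dirname(xml_file) or "."
    return posixpath.relpath(entity_file, start)


if __name__ == "__main__":
    # Hypothetical file locations at different depths in the tree.
    for name in ("index.xml", "basicnet/curl.xml", "a/b/c.xml"):
        print(name, "->", entity_path(name))
```

A file one directory down gets "../general.ent", as in the header above, and each additional level adds another "../".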

One alternative to doing the above is to have multiple (smaller) entity 
files and include only the ones needed in each file's header.  350-400 
files are certainly too many, so we would need a scheme for deciding 
which file each entity goes in.

Another alternative may be to eliminate many of the entities that are 
used once (http, ftp, size, buildsize, time) and just put them directly 
in the xml for the package.
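To judge how many entities really are used only once, the references could be counted across the book sources. This is a small Python sketch, not an existing tool: the function names and the sample documents are made up for illustration, and the pattern would also count standard references such as &amp; unless they are filtered out.

```python
import re
from collections import Counter

# Matches general entity references such as &curl-version;
ENTITY_REF = re.compile(r'&([\w.-]+);')


def count_references(documents):
    """Count entity references over a mapping of file name -> file text."""
    counts = Counter()
    for text in documents.values():
        counts.update(ENTITY_REF.findall(text))
    return counts


def single_use(counts):
    """Return the sorted names of entities referenced exactly once."""
    return sorted(name for name, uses in counts.items() if uses == 1)


if __name__ == "__main__":
    # Made-up sample content; real input would be read from the XML files.
    documents = {
        "curl.xml": "curl-&curl-version; needs &curl-buildsize; to build",
        "index.xml": "this book covers curl &curl-version;",
    }
    print(single_use(count_references(documents)))
```

Entities that show up in this list are candidates for being written directly into the package's XML.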

I'd like to get opinions on the way to go.  Then we could partition the 
update workload and get the job done quickly.

  -- Bruce



