> .... Until now, I have been pretty wedded to
> MS Word. I am researching directions towards 'single sourcing'. Fyi,
> if it helps, I have also been a programmer for many years, and am
> pretty good at VBA.
>
> I have to look into this since I can now no longer ignore the need of
> my client's documentation to be able to hold multiple editions of a
> document (corresponding to multiple current configurations of a
> product) within a single source. (The client is trying to ignore the
> need, but I can't...) ...
>
> Fortunately my client's PC now moved up to Word 2003, and I
> understand that Word 2003 can import and export its documents as XML
> (actually Word ML). So I need some pointers on a couple of things
> that could help me leap forward. Here are my (predictable?) criteria:

You've had a lot of good feedback in the last week (while
I've been out of the office). I'll take a slightly different
tack here....

As you mentioned (in a snipped part), you're really looking
for content management rather than single-sourcing. Moving
documentation to XML is a useful step along the way to that
goal. It isn't strictly necessary, but we'll pretend it is
for the moment.

I've recently landed in the same boat -- I currently have to
maintain three different versions of a particular documentation
set, and I'll have to merge them into a single stream early
next year. Using an old-school markup language, such as *roff
or *TeX, you could use a simple source code management system
such as CVS or Subversion. However, when you start using XML,
things get a little more complicated. For example, the following
two snippets are equivalent to any XML tool that normalizes
whitespace, as document processors generally do (angle
brackets replaced with square brackets to prevent confusing
certain mail readers):

    [p]These two snippets [i]really are[/i] identical![/p]

    [p]These
    two snippets
    [i]really are[/i]
    identical![/p]

A proper XML-aware manager knows they're identical and does
the right thing with them -- a source control system only
knows about lines and sees them as two different versions.
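
To make the difference concrete, here is a minimal Python sketch
(not how any particular CMS does it) that compares the two
snippets both ways. The helper name `normalized` is made up for
this example, and it assumes runs of whitespace are not
significant, which is usual for document markup but not
guaranteed by XML itself.

```python
import xml.etree.ElementTree as ET


def normalized(elem):
    """Reduce an element to a nested tuple that ignores whitespace layout."""
    def squash(text):
        # Collapse any run of spaces/newlines to one space and trim the ends.
        return " ".join((text or "").split())

    return (
        elem.tag,
        tuple(sorted(elem.attrib.items())),
        squash(elem.text),
        tuple(normalized(child) for child in elem),
        squash(elem.tail),
    )


a = "<p>These two snippets <i>really are</i> identical!</p>"
b = "<p>These\ntwo snippets\n<i>really are</i>\nidentical!</p>"

# What a line-oriented source control system compares.
same_as_text = a.splitlines() == b.splitlines()

# What a whitespace-normalizing XML comparison sees.
same_as_xml = normalized(ET.fromstring(a)) == normalized(ET.fromstring(b))

print("line-based comparison:", same_as_text)
print("XML-aware comparison: ", same_as_xml)
```

The line-based check reports a difference while the XML-aware
one does not. Note that this sketch also trims leading and
trailing spaces, so it would not notice a space lost right next
to an inline tag; a real tool would be more careful there.
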

The other thing that a good XML-aware content management
system should do is manage all those conditions and
differences for you. You have to feed the system up front
by telling it which pieces belong to which conditions (one can
assume the default state is unconditional); but once you've
done that, you can tell the system which edition you want and
let it pull out the right pieces.
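
As a rough illustration of that idea, here is a minimal Python
sketch. Everything in it is an assumption made for the example:
the `condition` attribute name, the `modelA`/`modelB` values, the
sample text, and the `select_edition` helper. A real CMS would
have its own way of tagging and selecting content.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical source: one file holding the text for several product models.
SOURCE = """<doc>
<p>Connect the power cable.</p>
<p condition="modelA">Press the reset button on the back panel.</p>
<p condition="modelB modelC">Hold the power key for five seconds.</p>
</doc>"""


def select_edition(root, edition, attr="condition"):
    """Return a copy of root with only the pieces that apply to `edition`.

    Elements without the attribute are treated as unconditional and kept.
    """
    result = copy.deepcopy(root)  # leave the single source untouched

    def prune(parent):
        for child in list(parent):
            tags = child.get(attr)
            if tags is not None and edition not in tags.split():
                # Note: removing an element also drops its tail text, which
                # is only harmless for block-level elements like these.
                parent.remove(child)
            else:
                prune(child)

    prune(result)
    return result


root = ET.fromstring(SOURCE)
model_b = select_edition(root, "modelB")
texts = [p.text for p in model_b.iter("p")]
for line in texts:
    print(line)
```

Asking for "modelB" keeps the unconditional paragraph and the
one tagged for modelB, and the source tree still holds all three.
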
I haven't said a word (pun not intended) about DocBook, or
any other XML grammar, yet. Let me draw a simple workflow
here, of what I think your vision is:
Word <-> DocBook <-> CMS
^
|
v
output
In other words, as I see it, you're translating to (and
from) DocBook *only* to work with a content management
system. Have I missed something crucial, or could you
replace DocBook with just about anything else -- say,
even... WordML? Let the CMS worry about conditions, and
just leave everything in the one XML grammar that Word
truly understands.
If I've missed something important, and DocBook (or some
other complex XML doctype) is necessary, I would echo the
advice of David Neeley and others... move to OpenOffice.
The XML that OOo generates has at least *some* structure
(for example, lists are wrapped properly); I've looked
at the XML myself so I know that much. Transforming a
semi-structured document to fully-structured is a LOT
easier than transforming a completely flat (aka "tag
soup") file.
But whatever way you go, you're going to get better
results if the Word docs you start with are thoroughly
and consistently styled. If you have that much going
for you, you're ahead of most of the pack. Frankly, I'm
not convinced that the vast majority of document
processing applications really need more than that.
--
Larry Kollar, Senior Technical Writer, ARRIS CPE Products
"Content creators are the engine that drives
value in the information life cycle."
-- Barry Schaeffer, on XML-Doc