> Are there any opinions on Word to XML conversion tools? I am looking for
> names of vendors, experiences good and bad and any cost-related data that
> you might have. If you did use a tool, how much post-editing was necessary?
Hi Debra -
In most of the Word-to-XML conversion work we have done over the years,
we've stayed with pattern-matching tools such as OmniMark. But we can say
with some confidence that the consistency of styles and formats in your
Word documents is the most important factor in determining how much
post-editing will be required. In my opinion, it is certainly more
important than the tool itself.
When you are moving content from Word to XML, you are moving from an
unstructured document to a structured one. The paragraph styles, font
weights, sizes, and colors in the Word document all provide clues to the
structure and semantics of that document. The quality of the conversion
results will depend directly on how well you are able to define a mapping
from style/format to structure and semantics and how consistent those styles
and formats are.
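
To make the mapping idea concrete, here is a minimal Python sketch. The style names, the STYLE_MAP table, and the coverage() helper are all invented for illustration and do not come from any particular tool. It only shows how inconsistent formatting shows up as paragraphs the mapping cannot place, which is where post-editing comes from.

```python
# Hypothetical mapping from (paragraph style, bold?) to a target element name.
STYLE_MAP = {
    ("Heading 1", False): "title",
    ("Normal", False): "para",
    ("Normal", True): "warning",
}

def coverage(paragraphs):
    """Return (fraction of paragraphs the map covers, sorted unmapped formats)."""
    unmapped = [p for p in paragraphs if p not in STYLE_MAP]
    covered = len(paragraphs) - len(unmapped)
    fraction = covered / len(paragraphs) if paragraphs else 1.0
    return fraction, sorted(set(unmapped))

# Invented sample: the formats found in one document, one entry per paragraph.
sample = [
    ("Heading 1", False),
    ("Normal", False),
    ("Normal", True),
    ("Body Text", False),  # a style nobody planned for
]

fraction, unmapped = coverage(sample)
print(fraction, unmapped)
```

Every format that lands in the unmapped list needs either a new mapping rule or manual cleanup.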
To get a good handle on that consistency, we usually break up the conversion
into several steps:
1. Convert the Word document to an intermediate XML vocabulary that captures
the paragraph breaks, styles, and font characteristics.
2. Use standard XML tools (such as XSLT processors) to analyze the
intermediate XML with the goal of detecting trends in paragraph styles and
font characteristics.
3. Based on that analysis, develop a logical mapping between the
styles/fonts and the target XML vocabulary (DTD).
4. Implement that mapping using XSLT to transform the intermediate XML to
the target XML.
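
Steps 2 through 4 can be sketched roughly as follows. This is an illustration under assumptions, not our production setup: the intermediate vocabulary (a doc element holding p elements with style, font, and size attributes), the MAPPING table, and the target element names are all made up, and Python's standard xml.etree.ElementTree stands in for the XSLT processor so that the example runs on its own.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical output of step 1; the element and attribute names are invented.
INTERMEDIATE = """<doc>
  <p style="Heading 1" font="Arial" size="16">Installation</p>
  <p style="Normal" font="Times" size="11">Run the installer.</p>
  <p style="Normal" font="Arial" size="16">Configuration</p>
  <p style="Normal" font="Times" size="11">Edit the settings file.</p>
</doc>"""

def fmt_key(p):
    """Collect the formatting clues of one paragraph into a tuple."""
    return (p.get("style"), p.get("font"), p.get("size"))

def analyze(root):
    """Step 2: count how often each style/font combination occurs."""
    return Counter(fmt_key(p) for p in root.iter("p"))

# Step 3: a mapping written by hand after reading the counts (hypothetical).
MAPPING = {
    ("Heading 1", "Arial", "16"): "title",
    ("Normal", "Arial", "16"): "title",  # heading formatted by hand
    ("Normal", "Times", "11"): "para",
}

def transform(root):
    """Step 4: build the target XML; unmapped paragraphs become <unknown>."""
    out = ET.Element("section")
    for p in root.iter("p"):
        child = ET.SubElement(out, MAPPING.get(fmt_key(p), "unknown"))
        child.text = p.text
    return out

root = ET.fromstring(INTERMEDIATE)
counts = analyze(root)
target = transform(root)
print(counts.most_common())
print(ET.tostring(target, encoding="unicode"))
```

The counts from step 2 are what reveal cases like the third paragraph, where a heading was formatted by hand rather than with a heading style. Tagging leftovers as unknown, instead of dropping them, keeps the remaining post-editing work visible in the output.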
Hope this is of some help.
Pete
--------
Pete Beazley mailto:pete@...
ClearlyOnline, Inc. http://clearlyonline.com
XML and XSLT Training & Consulting