Peter Ring wrote:
> Please also include out-of-band indexterm entries. I've had a lot
> of utility from DocBook indexterm's zone attribute:
>
> http://www.docbook.org/tdg/en/html/indexterm.html#d0e96507
>
> The general idea is that you should be able to annotate content
> with indexterms without changing the content. Quite often, the
> index authoring is completely separate from the content authoring.
>
> Also be sure to include attributes or elements that allow hinting
> about the purpose of the indexterm, e.g. subject index vs. author index.
Again, interesting info.
The feedback I've been getting, from here and from Jon Jermey and
David Ream (who is one of the top experts in the area of XML and
indexing) is that the issue of embedded indexing is fairly complex as
one begins to peel away the layers. It does seem premature to
implement anything in OpenReader, at least in version 1.0, for
embedded indexing.
Interestingly, apparently no one has put together a comprehensive
generic XML vocabulary for embedded indexes (one that could be included
in any XML document with proper namespacing). So this leaves room for
some sort of standardization in this area. Maybe this could be done in OASIS
with sponsorship by ASI (asindexing.org)? (I'll be talking with David
Ream next week for his feedback on this proposal.)
Anyway, here's my preliminary list of candidate requirements the
embedded indexing vocabulary should meet:
1) handle multiple indexes in the same publication. (that is, each
embedded index term is assigned to one or more indexes, which the
reading system then generates/compiles.)
2) define the range or scope of the index item (and it may have to
cross the natural hierarchy of the XML document -- Lee? :^) )
3) handle hierarchical terms (many indexes have 2 and even more
levels)
4) support "see", "see also", etc. (cross-referencing)
5) Peter's suggestion of "out-of-band" indexing. (Seems to imply using
XPointer to define the target and the associated scope/range.)
6) support "sort as" information (to tell the reading system how to
order the terms when the index is generated/compiled -- necessary
since some terms may be in character sets other than the primary
one.)
Anything else?
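
To make the list above a bit more concrete, here is a rough Python sketch
of how a reading system might hold such entries and compile one index from
them. Everything in it is an assumption for illustration only: the
IndexEntry class, its field names, the compile_index function, the sample
section ids and the sample entries are hypothetical and come from no
existing vocabulary. Targets are plain id strings standing in for whatever
pointer or range syntax (XPointer or otherwise) is eventually chosen, and
sorting is a simple case-insensitive comparison, not real collation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class IndexEntry:
    """One out-of-band index entry (hypothetical model, not a real schema)."""
    terms: Tuple[str, ...]                    # req. 3: hierarchical path of terms
    targets: Tuple[str, ...] = ()             # req. 2 and 5: ids of the indexed content
    indexes: Tuple[str, ...] = ("subject",)   # req. 1: which indexes it belongs to
    sort_as: Optional[str] = None             # req. 6: override for ordering the last term
    see: Optional[str] = None                 # req. 4: cross-reference text


def sort_key(entry):
    # Parent terms sort under their own text; the last term uses sort_as if given.
    leaf = entry.sort_as if entry.sort_as is not None else entry.terms[-1]
    return tuple(t.casefold() for t in entry.terms[:-1]) + (leaf.casefold(),)


def compile_index(entries, index_name):
    """Return the entries assigned to index_name, in sorted order."""
    selected = [e for e in entries if index_name in e.indexes]
    return sorted(selected, key=sort_key)


# Made-up sample data; "sec1" etc. are assumed ids in the content document.
SAMPLE = (
    IndexEntry(("XPointer",), targets=("sec3",)),
    IndexEntry(("DocBook", "zone attribute"), targets=("sec1", "sec2")),
    IndexEntry(("DocBook", "indexterm"), targets=("sec1",)),
    IndexEntry(("embedded indexing",), see="indexing, embedded"),
    IndexEntry(("indexing", "embedded"), targets=("sec2",)),
    IndexEntry(("Ring, Peter",), targets=("sec1",), indexes=("author",)),
    IndexEntry(("Ōe, Kenzaburō",), targets=("sec2",), indexes=("author",),
               sort_as="Oe, Kenzaburo"),
    IndexEntry(("Noring, Jon",), targets=("sec3",), indexes=("author",)),
)

if __name__ == "__main__":
    for name in ("subject", "author"):
        print(name + " index")
        for entry in compile_index(SAMPLE, name):
            where = "see " + entry.see if entry.see else ", ".join(entry.targets)
            print("  " + " > ".join(entry.terms) + ": " + where)
```

Because the entries live apart from the content and only point at it by
id, the same content could be indexed by someone other than its author,
which I take to be Peter's point. The sort_as field is what keeps the
accented name in the author index from sorting after every unaccented
one. A real implementation would need proper locale-aware collation and
a way to express ranges that cross the element hierarchy.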
And is anyone here interested in being involved with standards work
for embedded indexing, should anything get off the ground? (If so,
contact me privately.)
Jon Noring