The following is a strawman format that implements a few ideas that have
occurred to me from time to time but which I immediately dismissed as not
backwards-compatible.
THIS IS NOT A SUGGESTED FORMAT! I'm not speccing RSS4.0 or anything like
that :)
Rather it contains a few different ideas and maybe one or two of them will
be of use, and I offer them for criticism and consideration.
First the example feed, then my explanation below.
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:dcq="http://purl.org/dc/terms/"
xmlns="http://www.example.com/exampleformatnamespace#">
<Channel rdf:about="">
<title>A title</title>
<description>A Description</description>
<link rdf:resource="http://www.example.com/"/>
<image>
<Image>
<text>Alternative Text</text>
<link rdf:resource="http://www.example.com/"/>
<source rdf:resource="http://www.example.com/images/syndImage.png"/>
</Image>
</image>
<items rdf:parseType="Collection">
<Item>
<title>A title</title>
<description>A Description</description>
<link rdf:resource="http://www.example.com/item1.html"/>
</Item>
<Item>
<title>A title</title>
<description>A Description</description>
<link rdf:resource="http://www.example.com/item2.html"/>
<dcq:modified><dcq:W3CDTF
rdf:value="2002-09-19T11:50:34Z"/></dcq:modified>
</Item>
</items>
</Channel>
<rdf:Description rdf:about="http://www.example.com/images/syndImage.png">
<dc:format>image/png</dc:format>
</rdf:Description>
<rdf:Description rdf:about="http://www.example.com/item2.html">
<dcq:modified><dcq:W3CDTF
rdf:value="2002-09-08T12:36:01Z"/></dcq:modified>
</rdf:Description>
</rdf:RDF>
The lists of effects each of these ideas would have are largely guesswork: a
placeholder for people here to fill in and little more.
1. Namespace name ends with a hash ("#").
Effect on RDFers:
Handy. Little real effect on compliant parsers, but it is handy and
produces more human-readable URIs.
Effect on XMLers:
Nil.
Effect on hand-rolled XMLers (i.e. people whose code acts on the feed at
text level or byte level):
Nil.
Effect on module designers:
Negligible. Module designers would be encouraged to do the same, but it
would not be required.
2. Note that dcq is used as the namespace prefix, although dcterms is the
one most commonly used. Examples should emphasise that prefixes aren't
normative, and indeed can't be (or else we'll have wars over who gets to use
a particularly handy set of two or three letters!).
Effect on RDFers:
Nil.
Effect on XMLers:
Nil.
Effect on hand-rolled XMLers:
Strongly positive. A potential source of bugs that may not be apparent
until after roll-out will become clear early on.
Effect on module designers:
Strongly positive. No need to worry about bugs caused by someone else
using the same prefix in documentation.
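A minimal sketch of why the prefix cannot matter to a namespace-aware parser, assuming Python's standard xml.etree.ElementTree. The <feed> wrapper element and the function name modified_values are made up for illustration; only the Dublin Core terms namespace URI comes from the example feed.

```python
import xml.etree.ElementTree as ET

DCTERMS_NS = "http://purl.org/dc/terms/"

def modified_values(xml_text):
    """Return the text of every 'modified' element in the DC terms namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree names elements as "{namespace-uri}local-name", so the
    # prefix chosen by the feed author never appears in the lookup.
    return [el.text for el in root.iter("{%s}modified" % DCTERMS_NS)]

# The same statement written with two different prefixes.
# <feed> is a hypothetical wrapper, not part of the strawman format.
with_dcq = (
    '<feed xmlns:dcq="http://purl.org/dc/terms/">'
    "<dcq:modified>2002-09-19</dcq:modified></feed>"
)
with_dcterms = (
    '<feed xmlns:dcterms="http://purl.org/dc/terms/">'
    "<dcterms:modified>2002-09-19</dcterms:modified></feed>"
)

# Both spellings yield the same result.
print(modified_values(with_dcq) == modified_values(with_dcterms))
```

A hand-rolled reader that searches the text for the literal string "dcterms:modified" would miss the first document, which is the kind of bug the unusual prefix is meant to expose early.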
3. Capital letter for class names (<Channel>, <Item>, <Image>).
Effect on RDFers:
Positive. Follows the convention of most other RDF vocabularies. Breaks
the confusing use of some URIs as both predicates and class types (e.g.
http://purl.org/rss/1.0/image currently has two meanings).
Effect on XMLers:
Mixed. Makes it clearer just what the hell is going on :), but only after
some explanation.
Effect on hand-rolled XMLers:
As above.
Effect on module designers:
Mildly positive. If they follow the same convention they will be better
placed to use their module with other RDF applications, but only moderately
so.
4. rdf:about="". This means that in effect the identifying URI of the
channel is either the URI of the document itself, or else the URI determined
from xml:base (since that is how the relative URI "" gets expanded).
Effect on RDFers:
Interesting! Positive. It settles the meaning of the channel, which under
current conventions may describe the feed, or the "home page" of the feed.
It now always means the feed itself.
Effect on XMLers:
Positive. With the value of rdf:about fixed it becomes a no-brainer.
Effect on hand-rolled XMLers:
As above.
Effect on module designers:
Nil.
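As a rough illustration of how the relative URI "" expands, here is a sketch using Python's standard urllib.parse.urljoin. The function name channel_uri and the feed URLs are assumptions for the example; nothing here is a proposed API.

```python
from urllib.parse import urljoin

def channel_uri(document_uri, xml_base=None):
    """Resolve rdf:about="" to an absolute URI.

    document_uri is where the feed was retrieved from; xml_base is the
    value of an xml:base attribute in scope, if any (it may itself be
    relative to the document URI).
    """
    base = urljoin(document_uri, xml_base) if xml_base else document_uri
    # An empty relative reference resolves to the base URI itself.
    return urljoin(base, "")

# Hypothetical feed locations, used only for illustration.
print(channel_uri("http://www.example.com/feed.rdf"))
print(channel_uri("http://www.example.com/feed.rdf",
                  xml_base="http://www.example.org/mirror.rdf"))
```

Without xml:base the channel is identified by the document's own URI; with it, by whatever xml:base says.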
5. Order of title, description, and indeed all elements is fixed.
Effect on RDFers:
On input, Nil; they will ignore the fact that it's fixed.
On output mildly negative. A general RDF to RDF/XML serialiser would never
have worked with RSS so this merely increases the requirements on the custom
serialiser already required.
Effect on XMLers:
On input, Nil for some, strongly positive for others. Makes it easier to
know what to expect where.
On output, negative for some, strongly positive for others. Puts extra
requirements on how to code a serialiser, but does so in a way that matches
many XML tools.
Effect on hand-rolled XMLers:
As above but the effects will be amplified. In particular it should be a
strong benefit to know that only <title> can come immediately after
<Channel>, etc.
Effect on module designers:
Negligible. All elements from namespaces other than those defined by this
format (i.e. all module elements) will be the last elements in the
containing element, but this will have little effect on the module itself.
6. Description does not contain HTML. (Alternative and more natural methods
of providing such HTML can of course be provided, not sure whether this
should be from elements in this namespace or a module, so I haven't given an
example).
Effect on RDFers:
Positive. Stuff makes sense!
Effect on XMLers:
Positive. Less work is required to parse the feed, and HTML can still be
provided if desired, so nothing is lost. For that matter, by not locking the
format in to HTML, any type of XML can be safely used.
Effect on hand-rolled XMLers:
As above only more so.
Effect on module designers:
Nil.
7. <link rdf:resource="http://www.example.com/"/>
Effect on RDFers:
Strongly positive. It is clear that the link is a URI, it can be used in
further RDF statements. It's all a lot more RDFy.
Effect on XMLers:
Mildly positive. Attributes are the way URIs are normally given in XML, so
someone used to seeing href="http://www.example.com/" or
xlink:href="http://www.example.com/" will know what's going on. Also
whitespace is less of an issue.
Effect on hand-rolled XMLers:
As above.
Effect on module designers:
Negligible. Module designers would be encouraged to follow similar
conventions where appropriate.
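To make the whitespace remark concrete, here is a small sketch with Python's standard xml.etree.ElementTree that reads the same URI once from element content and once from an rdf:resource attribute. The pretty-printed first form is an assumed example of what a generator might emit.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# URI as element content, pretty-printed by some hypothetical generator.
as_content = ET.fromstring("<link>\n  http://www.example.com/\n</link>")

# URI as an attribute value, as in the strawman format.
as_attribute = ET.fromstring(
    '<link xmlns:rdf="%s" rdf:resource="http://www.example.com/"/>' % RDF_NS
)

content_value = as_content.text
attribute_value = as_attribute.get("{%s}resource" % RDF_NS)

# The element content carries the layout whitespace along with the URI,
# so the consumer has to remember to strip it; the attribute does not.
print(repr(content_value))
print(repr(attribute_value))
```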
8. <text>. This more clearly states the purpose of <title> in the context of
<Image>, where it is used to provide the alternative text (alt in HTML)
rather than a title per se.
Effect on RDFers:
Mildly positive, it is clear that this isn't exactly the same as <title>.
Effect on XMLers:
Negligible.
Effect on hand-rolled XMLers:
Negligible.
Effect on module designers:
Nil.
9. rdf:parseType="Collection" as recently discussed.
Effect on RDFers:
Short-term negative, long-term positive. There are going to be some growing
pains about parseType="Collection" anyway, so it might as well be RSS that
makes you swallow the medicine.
Effect on XMLers:
Positive. More natural, less RDF-ish.
Effect on hand-rolled XMLers:
As above.
Effect on module designers:
A milder version of the effect on RDFers or on XMLers, depending on the
biases of the designer.
Question: How does this affect ordering?
10. No rdf:about on <Item>
Effect on RDFers:
Interesting, strongly positive. Makes an "Item" a different thing to the
resource it points to. Allows reification, different feeds saying different
things about the same resource, signed RSS (see discussion about
Burtonator's GUID idea), and generally the ability to say something about
both the Item in the feed and the resource it points to without ambiguity.
Note that in the above example the item in the feed was modified at
2002-09-19T11:50:34Z, but the webpage it points to was modified at
2002-09-08T12:36:01Z. There is no conflict between these statements.
Effect on XMLers:
Mildly to strongly positive. One less thing to worry about for simpler
implementations. More advanced implementations can obtain the information
described above and similarly benefit from there being no conflict between
statements about the syndication and statements about the syndicated.
Effect on hand-rolled XMLers:
As above.
Effect on module designers:
Extra complexity and extra power. Modules can now give information on
either the item in the feed, the resource pointed to, or both.
Open question: Should we allow rdf:ID on <Item> and <Image>?
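A sketch of how an XML-level reader might pull out both dates without confusing them, assuming Python's standard xml.etree.ElementTree and a trimmed copy of the example feed (titles, descriptions and the image are left out for brevity). The helper name modified is made up for illustration.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DCQ = "http://purl.org/dc/terms/"
FMT = "http://www.example.com/exampleformatnamespace#"

# Trimmed copy of the example feed: one Item plus the statement about
# the page it links to.
FEED = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dcq="http://purl.org/dc/terms/"
    xmlns="http://www.example.com/exampleformatnamespace#">
  <Channel rdf:about="">
    <items rdf:parseType="Collection">
      <Item>
        <link rdf:resource="http://www.example.com/item2.html"/>
        <dcq:modified><dcq:W3CDTF
          rdf:value="2002-09-19T11:50:34Z"/></dcq:modified>
      </Item>
    </items>
  </Channel>
  <rdf:Description rdf:about="http://www.example.com/item2.html">
    <dcq:modified><dcq:W3CDTF
      rdf:value="2002-09-08T12:36:01Z"/></dcq:modified>
  </rdf:Description>
</rdf:RDF>"""

def modified(node):
    """Return the dcq:modified value directly under node, or None."""
    el = node.find("{%s}modified/{%s}W3CDTF" % (DCQ, DCQ))
    return None if el is None else el.get("{%s}value" % RDF)

root = ET.fromstring(FEED)
item = root.find(".//{%s}Item" % FMT)
target = item.find("{%s}link" % FMT).get("{%s}resource" % RDF)

# Date of the Item as it appears in the feed.
item_modified = modified(item)

# Date of the linked page: match the Description whose rdf:about equals
# the Item's link.
resource_modified = None
for desc in root.findall("{%s}Description" % RDF):
    if desc.get("{%s}about" % RDF) == target:
        resource_modified = modified(desc)

print(item_modified, resource_modified)
```

The two values live on different nodes, so neither overwrites the other; a simpler reader can skip the loop entirely and still get a usable Item.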
11. <rdf:Description
rdf:about="http://www.example.com/images/syndImage.png">
We can stick in arbitrary RDF at the bottom of the document. Some open
questions about this idea are:
a. Should we have a subset of allowable syntax which MUST or SHOULD be
followed?
b. Should it be <rdf:Description>, or a duplication of <Image> and <Item> -
which results in duplicate triples but that isn't an issue and it may be
clearer.
c. Should we allow statements about resources unrelated to what goes in the
<Channel>?
Effect on RDFers:
Strongly positive. This is all good stuff to RDFers. In particular it is
good for people using RSS as a stepping stone from XML-syndication through
RDF-syndication to fuller RDF use ("people" can mean either a company
building its IT solution or a developer wishing to learn RDF) while getting
ROI on each step.
Effect on XMLers:
Strongly positive for simple implementations. There's just a bunch of crap
at the end of the feed that they ignore on parsing and neglect on writing.
Mildly to strongly negative for richer implementations. If the XML reader
wants to get that stuff it's going to have to match URIs etc. However, if we
lay down some rules on what syntax is allowed in this section, this should be
no more complicated than it is currently (since URI-matching is required in
RSS at present).
Effect on hand-rolled XMLers:
As above, possibly amplified in that simple hand-rolled XMLers can just
stop parsing when they encounter </Channel>, but more advanced
implementations have a certain degree of "remembering" to do.
Effect on module designers:
Extra flexibility with extra complication: Module designers will have to
decide whether their elements go inside the channel's elements (to describe
the feed itself), inside this section (to describe the resources that are
syndicated), or can be allowed in both.
12. In addition, it would be easy to infer from the current spec that only
UTF-8 is allowed. The XML spec explicitly states that every XML processor
must accept both UTF-8 and UTF-16; anything that doesn't accept UTF-16 isn't
an XML parser. Hence the above can be in UTF-16.
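As a quick check of the UTF-16 point, here is a sketch that feeds a UTF-16 document to Python's standard xml.etree.ElementTree (which uses expat underneath). The one-element document is an assumption made to keep the example short.

```python
import xml.etree.ElementTree as ET

doc = '<?xml version="1.0" encoding="UTF-16"?><title>A title</title>'

# Python's "utf-16" codec prepends a byte order mark, which is what an
# XML parser uses to detect the encoding.
utf16_bytes = doc.encode("utf-16")

root = ET.fromstring(utf16_bytes)
print(root.tag, root.text)
```

A hand-rolled reader that scans the raw bytes for the ASCII string "<title>" would find nothing in utf16_bytes, since every character there occupies two bytes.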
Jon Hanna
PGP http://www.spin.ie/jon.asc
PGP Fingerprint 707E 5E39 3BF5 533A D1DD 2083 8169 BFD7 F532 BD18
"...it has been truly said that hackers have even more words for equipment
failures than Yiddish has for obnoxious people." - jargon.txt