Over a year ago I was working on an XML Schema (XSD) document for
RSS2. I was forced to give up because the requirements in the spec
can't be modeled with XSD.
For the curious, the specific technical issues are detailed in earlier
posts to this group. Search for 'schema' and 'schemas'.
In the time between then and now I have had the opportunity to work
with RELAX NG validation both for my job and for personal projects.
The exciting news is I was able to come up with a RELAX NG (RNG)
schema which can validate RSS2 that conforms to the spec.
The trickiest part is this: In the spec, the <channel> and <item>
elements have some children that are optional and some that are
required, and all of them can appear in any order. This was easily
represented in RELAX NG.
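As a rough illustration of the idea (a hypothetical, heavily trimmed
fragment, not an excerpt from rss-2_0.rng), RELAX NG's <interleave>
pattern lets required and <optional> children appear in any order. The
Python wrapper below only parses the fragment with the standard library
and lists which children it declares as required or optional; it does
not perform RELAX NG validation, which needs a validator such as Jing.
The names CHANNEL_PATTERN and summarize are made up for this sketch.

```python
import xml.etree.ElementTree as ET

RNG_NS = "http://relaxng.org/ns/structure/1.0"

# Hypothetical, trimmed pattern for <channel>; the real schema has more children.
CHANNEL_PATTERN = """
<element name="channel" xmlns="http://relaxng.org/ns/structure/1.0">
  <interleave>
    <element name="title"><text/></element>
    <element name="link"><text/></element>
    <element name="description"><text/></element>
    <optional><element name="language"><text/></element></optional>
    <optional><element name="copyright"><text/></element></optional>
  </interleave>
</element>
"""


def summarize(pattern_xml):
    """Return (required, optional) child names declared inside <interleave>."""
    root = ET.fromstring(pattern_xml)
    interleave = root.find(f"{{{RNG_NS}}}interleave")
    required, optional = [], []
    for child in interleave:
        if child.tag == f"{{{RNG_NS}}}optional":
            # An <optional> wrapper holds the element that may be omitted.
            optional.append(child.find(f"{{{RNG_NS}}}element").get("name"))
        else:
            required.append(child.get("name"))
    return required, optional


required, optional = summarize(CHANNEL_PATTERN)
print("required:", required)
print("optional:", optional)
```

Because everything sits inside one <interleave>, a validator accepts the
children in any order while still insisting that the non-optional ones
are present.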
Another point of difficulty is that an <item> must have at least one
<title> or <description>. Either one alone is OK, both together are
OK, but having neither is not permitted. This was also easy to model
in RNG.
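One way such a rule could be written (an assumption for illustration;
the actual rss-2_0.rng may express it differently) is a <choice>
between "title with an optional description, in either order" and
"description alone". The sketch below keeps that hypothetical fragment
in a Python string, checks only that it is well-formed XML, and adds a
hand-written function, item_ok, that mirrors what the pattern accepts.
It is not a RELAX NG validator and it looks only at <title> and
<description>, ignoring other children.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: title (+ optional description) OR description alone.
TITLE_OR_DESCRIPTION = """
<choice xmlns="http://relaxng.org/ns/structure/1.0">
  <interleave>
    <element name="title"><text/></element>
    <optional><element name="description"><text/></element></optional>
  </interleave>
  <element name="description"><text/></element>
</choice>
"""
ET.fromstring(TITLE_OR_DESCRIPTION)  # raises ParseError if the fragment is malformed


def item_ok(item_xml):
    """Mirror of the rule: at most one of each, and at least one overall."""
    item = ET.fromstring(item_xml)
    titles = len(item.findall("title"))
    descriptions = len(item.findall("description"))
    return titles <= 1 and descriptions <= 1 and titles + descriptions >= 1


SAMPLES = [
    "<item><title>t</title></item>",
    "<item><description>d</description><title>t</title></item>",
    "<item><description>d</description></item>",
    "<item><link>http://example.com/</link></item>",
]
results = [item_ok(sample) for sample in SAMPLES]
for sample, ok in zip(SAMPLES, results):
    print(ok, sample)
```

Only the last sample, which has neither a <title> nor a <description>,
is rejected.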
My RELAX NG schema can be downloaded here:
http://dmorelli.sdf-us.org/files/rss2/
File in this folder: rss-2_0.rng
With this schema and Jing (linked below) I successfully validated
recent feeds from a variety of sites like c|net, eWeek, Scripting
News, Gizmodo, NYT and Zeldman. If anyone finds a problem, please let
me know. I'm sure documents could be constructed that break it.
From the RELAX NG homepage:
"RELAX NG is being developed into an International Standard (ISO/IEC
19757-2) by ISO/IEC JTC1/SC34/WG1; it is currently at the final stage
of standardization. RELAX NG was based on TREX designed by James
Clark and RELAX designed by MURATA Makoto."
Links for those who want to use this:
RELAX NG homepage: http://relaxng.org/
Jing RELAX NG validator (Java):
http://www.thaiopensource.com/relaxng/jing.html
RELAX NG O'Reilly book (not yet published but available online):
http://books.xmlschemata.org/relaxng/