The lack of a proper character set in the ODP RDF has been a real
problem for many people. In an effort to improve the data we
distribute, I have modified the RDF generation tool to produce files in
the UTF-8 encoding of Unicode.
There is a sample file at http://dmoz.org/rdf/World.rdf.u8.gz
Everyone should download this file and start to modify their processes
to accept the UTF-8 format. Also check
http://dmoz.org/rdf/Changes.html for more information.
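
As a starting point for checking your own process, here is a minimal
Python sketch that reads a gzip-compressed file and reports any lines
that are not valid UTF-8. It is not part of the ODP tools; the function
name check_utf8 is made up for illustration, and it assumes the file is
line-oriented text, as a locally downloaded copy of the sample should be.

```python
import gzip


def check_utf8(path):
    """Return (total_lines, bad_line_numbers) for a gzip-compressed file.

    The file is read as raw bytes and each line is decoded strictly, so
    any byte sequence that is not valid UTF-8 is reported rather than
    silently replaced.
    """
    total = 0
    bad = []
    with gzip.open(path, "rb") as fh:
        # Splitting on b"\n" is safe: the byte 0x0A never appears inside
        # a multi-byte UTF-8 sequence.
        for lineno, raw in enumerate(fh, start=1):
            total += 1
            try:
                raw.decode("utf-8")
            except UnicodeDecodeError:
                bad.append(lineno)
    return total, bad
```

Calling check_utf8("World.rdf.u8.gz") on a downloaded copy returns the
number of lines read and the line numbers that failed to decode. An
empty list means every line decoded cleanly as UTF-8.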
In the near future, I will release the full set of RDF files in the
UTF-8 format. The files will also continue to be provided in the
existing encoding for one month.
Any problems may be mailed to truel@...