I took Daniel Naber's tree Perl script (tree.pl)* and, by changing ten
lines, converted its output from XHTML to OPML. This
provides input to Danny Goodman's OPML navigation JavaScript widget.**
Works great for me. Let me know if this is something of interest.
* Daniel Naber, General Public License, <http://www.danielnaber.de/tree/>.
** "Collapsable XML Menus," recipe 10.11 on page 307 of his "JS&DHTML
COOKBOOK" (O'Reilly & Associates, paperback, published April 2003, 520
pages, ISBN 0596004672, $40 list, $25 at Bookpool),
<http://www.oreilly.com/catalog/jvdhtmlckbk/>.
Cheers
:Monty
Thanks for confirming that this is a well-formedness issue. I'm not
concerned with the lack of validation -- that's a given with OPML
because of how it can be extended. If I need a validating outline
format with a DTD, there are a couple of proposed ones out there I
could switch to. I could also create a DTD for an "OPML Bookmarks
profile," which I may do at some point to clarify what an OPML link
directory contains.
Since I'm actively using Radio, it's convenient to work with
OPML. For interop, I think OPML consumers need well-formed XML, and a
declaration of entity references used by Radio and Frontier (or any
other OPML producer) seems like the best solution. Now I just have to
figure out what entities they can produce.
To work with existing files, I will be fixing the problem myself by
converting OPML data to add entity declarations. I may even offer a
patch for Radio, if it can be fixed with a callback.
As I was telling someone in e-mail, I figured my project would be a
challenge because it merges a loose producer of XML (Radio) with a
rigid consumer (Java's XOM library). I figured it would be educational
to see what problems I ran into, and it already has been. I had no idea
undeclared entities could break the well-formedness of XML data.
http://www.cadenhead.org/workbench
> From: rcade [mailto:rogers@...]
> Subject: [opml-dev] Undeclared entity references in OPML
>
> On my weblog, I discuss a problem I've run into implementing an
> application that consumes OPML:
>
> http://www.cadenhead.org/workbench/2003/06/06.html#a699
>
> The short version: Radio and Frontier can author OPML files that
> contain undeclared entity references, which breaks XML's
> well-formedness rule. When I read the file, my XML parsing Java
> library crashes and burns because it requires well-formed XML.
The OPML specification, like all XML, is based on being well
formed. Otherwise, it cannot claim to adhere to the XML standard.
The OPML specification, itself, is not at fault and it doesn't
dictate any character entities. Thus, an author *must* either use
decimal or hex numeric entities, explicitly declare any character
entities that are used, or use character codes from the encoding
scheme specified in the encoding attribute of the XML declaration.
When no encoding attribute is specified, according to the XML 1.0
specification, the encoding scheme is based on BOM (byte order
mark) and in absence of the BOM, the encoding scheme is UTF-8.
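As a rough illustration of the first option (numeric references), here is a minimal Python sketch. It is not anything Radio or Frontier does; the helper name `to_numeric_refs` and the sample outline are made up for the example, and it assumes the named entities in question come from the HTML set in Python's standard `html.entities` table.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The five entities every XML parser predefines; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def to_numeric_refs(xml_text):
    """Replace named HTML entities (e.g. &eacute;) with numeric references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", replace, xml_text)

# Hypothetical OPML fragment with an undeclared entity reference.
broken = '<opml version="1.0"><body><outline text="Caf&eacute; &amp; bar"/></body></opml>'
fixed = to_numeric_refs(broken)

outline = ET.fromstring(fixed).find("body/outline")
print(outline.get("text"))
```

The regular expression is deliberately simple: it would also rewrite entity-like text inside comments or CDATA sections, so treat it as a starting point rather than a complete filter.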
This is strictly a problem with Radio and Frontier, in that they
don't validate their users' data against either a DTD or XML
Schema, or at a minimum ensure that it's well-formed XML. You should
complain to them for providing a service without these controls
in place. BTW, your program shouldn't crash and burn on these
non-spec XML files either, but I think that's already obvious to
you.
Your solution to the problem is quite simple, but it assumes that the
character entities being used are from the XHTML standard. That is
highly probable, and your technique is exactly what authors need to do
if they decide to use character entities. BTW, as I
mentioned above, they could opt to use character codes from the
encoding scheme specified in the XML declaration. Now, getting Radio
or Frontier and/or their users to validate their files is akin to
herding cats. As you have found, people and services don't even
check for well-formed XML. When a specification or community says
"you don't need to validate against a DTD or XML Schema, well-formed
XML is good enough," I cringe... It's one of the reasons why, to date, I
have yet to receive a valid RDF file from anyone.
Andy.
On my weblog, I discuss a problem I've run into implementing an
application that consumes OPML:
http://www.cadenhead.org/workbench/2003/06/06.html#a699
The short version: Radio and Frontier can author OPML files that
contain undeclared entity references, which breaks XML's
well-formedness rule. When I read the file, my XML parsing Java
library crashes and burns because it requires well-formed XML.
The problem can be fixed for a large number of entities by adding this
declaration to every OPML file produced by Radio and Frontier, above
the root element:
<!DOCTYPE opml [
<!ENTITY % HTMLlat1 PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent">
%HTMLlat1;
]>
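For anyone who wants to try the idea without depending on a parser that fetches external entity sets, here is a minimal Python sketch. It declares the Latin-1 entities inline in the internal DTD subset instead of referencing the W3C file, which is a substitution on my part, not the declaration above. The function names are hypothetical, and the entity table comes from Python's standard `html.entities` module.

```python
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

def latin1_doctype(root_name="opml"):
    """Build an internal DTD subset declaring the Latin-1 named entities."""
    decls = [
        '<!ENTITY %s "&#%d;">' % (name, code)
        for name, code in sorted(name2codepoint.items())
        if 160 <= code <= 255  # the Latin-1 range: nbsp through yuml
    ]
    return "<!DOCTYPE %s [\n%s\n]>\n" % (root_name, "\n".join(decls))

def add_entity_declarations(opml_text):
    """Insert the DOCTYPE after the XML declaration, or at the top if none."""
    if opml_text.startswith("<?xml"):
        end = opml_text.index("?>") + 2
        return opml_text[:end] + "\n" + latin1_doctype() + opml_text[end:]
    return latin1_doctype() + opml_text

# Hypothetical OPML file using an undeclared entity.
sample = ('<?xml version="1.0"?>\n'
          '<opml version="1.0"><body><outline text="Caf&eacute;"/></body></opml>')
root = ET.fromstring(add_entity_declarations(sample))
print(root.find("body/outline").get("text"))
```

Entities outside Latin-1 (symbols, Greek letters and so on) would still be undeclared with this table, so the range check is the place to widen coverage if needed.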
I've released version 0.1 of OPML Link Publisher, a Java application
that reads an OPML link directory outline created with Radio UserLand
and produces a bookmarks.html file in the format used by Mozilla and
Netscape Navigator:
http://www.cadenhead.org/workbench/2003/06/03.html#a695
The project is open source and makes use of XOM, a Java class library
for parsing and producing XML. It was easy to make use of OPML using
XOM, which relies on an underlying parser such as Apache Xerces.
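To give a feel for the conversion, here is a minimal Python sketch of turning an OPML outline into Netscape-style bookmark markup. It is not the Java/XOM code from the project. It assumes links carry their target in a `url` attribute, and the bookmark markup is a simplified rendering of the format; the function names and the sample outline are invented for the example.

```python
import xml.etree.ElementTree as ET
from html import escape

def outline_lines(node, depth=1):
    """Render one <outline> element (and any children) as bookmark lines."""
    pad = "    " * depth
    title = escape(node.get("text", ""))
    url = node.get("url")  # assumed attribute name for the link target
    children = node.findall("outline")
    if url and not children:
        return ['%s<DT><A HREF="%s">%s</A>' % (pad, escape(url), title)]
    # Anything else is treated as a folder, possibly an empty one.
    lines = ["%s<DT><H3>%s</H3>" % (pad, title), pad + "<DL><p>"]
    for child in children:
        lines.extend(outline_lines(child, depth + 1))
    lines.append(pad + "</DL><p>")
    return lines

def opml_to_bookmarks(opml_text):
    """Convert an OPML document (as a string) to bookmark-file text."""
    root = ET.fromstring(opml_text)
    lines = ["<!DOCTYPE NETSCAPE-Bookmark-file-1>",
             "<TITLE>Bookmarks</TITLE>",
             "<H1>Bookmarks</H1>",
             "<DL><p>"]
    for node in root.findall("body/outline"):
        lines.extend(outline_lines(node))
    lines.append("</DL><p>")
    return "\n".join(lines)

sample = """<opml version="1.0"><body>
<outline text="News">
<outline text="Workbench" type="link" url="http://www.cadenhead.org/workbench"/>
</outline>
</body></opml>"""
print(opml_to_bookmarks(sample))
```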
In case anyone's Ruby-inclined, I thought I'd go ahead and point to an
unfinished directory browser I started today.
http://www.chadfowler.com/viewcvs.cgi/opml/
It's not much more than a hack right now, and it's not finished. But,
I'm about to start a *long* trip back home to the USA after a year and
a half of living in India, so I thought I'd point to it from this
list in case anyone feels like finishing it while I'm in transit. ;)
Seriously, though, I should be back in front of an internet-connected
computer starting one week from today, and will hopefully get this
into a working state within a couple of hours of cleanup/work. I'm
using the Fox toolkit with Ruby, so it should run on Windows, MacOS,
and various UNIX variants (developing on Linux currently).
(BTW, this is my personal CVS repository, so I went ahead and inserted
a couple of files which aren't mine (icons and sample opml from
opml.org). I'll remove these if I finish/release the browser.)
Cheers,
Chad
All instances of <$MTBlogTitle$> should instead read <$MTBlogName$>.
Oops.
Richard
=====
Personal site: http://www.richarderiksson.com/
The other day, I read the OPML spec and was inspired
enough to create two templates for Movable Type. I
welcome discussion on both.
Template for last 10 posts, based on Dave's outline for
Scripting News. Contains the full content of the post
including HTML:
<?xml version="1.0"?>
<!-- OPML generated by Movable Type v. <$MTVersion$> on <$MTDate
format="%B %d, %Y at %I:%M %p"$> -->
<opml version="1.0">
<head>
<title><$MTBlogName$></title>
<ownerName><MTEntries
lastn="1"><$MTEntryAuthor$></MTEntries></ownerName>
<ownerEmail><MTEntries
lastn="1"><$MTEntryAuthorEmail$></MTEntries></ownerEmail>
<dateCreated><MTEntries lastn="1"><$MTEntryDate
format="%Y-%m-%dT%H:%M:%S"$><$MTBlogTimezone$></MTEntries></dateCreated>
<dateModified><$MTDate
format="%Y-%m-%dT%H:%M:%S"$><$MTBlogTimezone$></dateModified>
<expansionState />
</head>
<body>
<MTEntries lastn="10">
<outline text="<$MTEntryBody encode_html="1"$>" />
</MTEntries>
</body>
</opml>
My output: http://www.movableblog/xml/opml.xml
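Because the template puts the whole post, HTML included, inside the text attribute, the output is only well-formed if the markup characters are escaped. As a rough cross-check outside Movable Type, here is a minimal Python sketch that builds the same shape of document and lets the serializer do the escaping. The function name and the sample post are hypothetical.

```python
import xml.etree.ElementTree as ET

def posts_to_opml(blog_name, posts):
    """Build an OPML document with one <outline> per post body."""
    opml = ET.Element("opml", version="1.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = blog_name
    body = ET.SubElement(opml, "body")
    for post_html in posts:
        # The serializer escapes &, <, > and quotes in attribute values.
        ET.SubElement(body, "outline", text=post_html)
    return ET.tostring(opml, encoding="unicode")

post = '<p>Tom & "Jerry"</p>'
xml_text = posts_to_opml("Example Blog", [post])
print(xml_text)

# Reading the file back recovers the original HTML unchanged.
round_trip = ET.fromstring(xml_text).find("body/outline").get("text")
```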
Category template. It outputs listing of all posts
organized by category, and may be the hardest to
implement for other bloggers, due to requiring category
templates be enabled:
<?xml version="1.0"?>
<!-- OPML generated by Movable Type version <$MTVersion$> on <$MTDate
format="%B %d, %Y at %I:%M %p"$> -->
<opml version="1.0">
<head>
<title><$MTBlogName$></title>
<ownerName><MTEntries
lastn="1"><$MTEntryAuthor$></MTEntries></ownerName>
<ownerEmail><MTEntries
lastn="1"><$MTEntryAuthorEmail$></MTEntries></ownerEmail>
<dateModified><MTEntries lastn="1"><$MTEntryDate
format="%Y-%m-%dT%H:%M:%S"$><$MTBlogTimezone$></MTEntries></dateModified>
</head>
<body>
<MTArchiveList archive_type="Category">
<outline text="<$MTArchiveTitle encode_html="1"$>"
url="<$MTArchiveLink$>"
description="<$MTArchiveCategoryDescription$>">
<MTEntries lastn="9999">
<outline text="<$MTEntryTitle encode_html="1"$>"
url="<$MTEntryPermalink$>" />
</MTEntries>
</outline>
</MTArchiveList>
</body>
</opml>
My output: http://www.movableblog.com/xml/opmlcat.xml
Richard
=====
Personal site: http://www.richarderiksson.com/
If, instead of pointing to a series of folders, links were defined as
pointers to outline elements, linkrot would not be an issue. All you
would need is a unique ID associated with each outline element.
Even better, when the structure changed your link would not only get
you to the right element but also show you its new place in the site
structure.
A simple CGI that took the name of the OPML file being referenced and
the ID of the outline element would be all that was needed.
The only remaining issue is when a link is deleted. If the unique ID
were constructed in such a way as to include an indication of tree
placement, then the CGI could inspect the ID and at least point you to
the right section of the site.
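A minimal Python sketch of the lookup such a CGI might do, under loudly stated assumptions: OPML defines no `id` attribute, so the `id` attribute here is hypothetical, as is the dotted-path scheme ("1.3.2" meaning second child of the third child of the first top-level outline) used to record tree placement.

```python
import xml.etree.ElementTree as ET

def resolve(opml_text, wanted_id):
    """Return (element, exact) for wanted_id, falling back to the nearest ancestor."""
    root = ET.fromstring(opml_text)
    by_id = {node.get("id"): node
             for node in root.iter("outline") if node.get("id")}
    candidate = wanted_id
    while candidate:
        if candidate in by_id:
            return by_id[candidate], candidate == wanted_id
        # Drop the last path component: "1.3.2" -> "1.3" -> "1" -> ""
        candidate = candidate.rpartition(".")[0]
    return None, False

# Hypothetical directory; the ids are an assumption, not part of OPML.
sample = """<opml version="1.0"><body>
<outline id="1" text="Weblogs">
<outline id="1.1" text="Tools" url="http://example.com/tools"/>
</outline>
</body></opml>"""

found, exact = resolve(sample, "1.1")
fallback, fallback_exact = resolve(sample, "1.2.5")
missing, missing_exact = resolve(sample, "9")
print(found.get("text"), exact)
print(fallback.get("text"), fallback_exact)
```

A deleted element such as "1.2.5" resolves to its surviving ancestor "1", which is the "right section of the site" behaviour described above.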
-Kate
On Wednesday, April 16, 2003, at 10:58 AM, Dave Winer wrote:
> http://www.opml.org/howToImplementOpmlDirectoryBrowser#linkrot
--- In opml-dev@yahoogroups.com, "Dave Winer" <dave@u...> wrote:
> Yes that's what it means. Here's what the spec says.
>
> "A headline without sub-heads with a link attribute that ends with .opml is
> an inclusion. When the user dives into an inclusion it is displayed exactly
> as if it were part of the outline that included it, with the exception that
> the author information displayed reflects the author information for the
> included file, and suggested links are sent to the author of the included
> file."
>
> The reason is to make the back-end more efficient and simpler, and not
> require it to do anything more than make an HTTP request. Alternatively we
> would have to get into file types and some cross-platform issues, and no
> single person, at the time the decision was made, had the requisite
> knowledge. Further it would mean that the browser would have to read every
> single URL to see if there was an OPML file at the other end. While it might
> not bring the Internet to its knees, it would slow things down.
>
> Dave
Couldn't it just use a different type value? For example:
<outline type="opml" url="http://www.pocketsoap.com/opml.aspx" />
Cheers
Simon
I don't think linkrot necessarily has to follow from changing directory
structures at all - possible solutions include dynamically generating a
meaningful message when the entry has been deleted (i.e. not 404 but "this
was a category but ain't no more"), falling back to a parent category or
providing a forwarding link. 404s are a fact of life, but this doesn't mean
we should encourage their generation. What about "Cool URIs":
http://www.w3.org/Provider/Style/URI.html
Cheers,
Danny.
> -----Original Message-----
> From: Dave Winer [mailto:dave@...]
> Sent: 16 April 2003 16:59
> To: opml-dev@yahoogroups.com
> Subject: [opml-dev] Another new part of the howto: Linkrot
>
>
> http://www.opml.org/howToImplementOpmlDirectoryBrowser#linkrot
Simon Fell wrote:
> Looking at the example directory at
> http://www.opml.org/directory/13
>
> is it using the presence of the .opml in the url to decide to display
> a folder and to treat the url as a branch in the tree, rather than a
> world icon ?
>
> Whilst not a direct issue for me, it does mean that if you're
> dynamically generating opml it has to be a .opml url, which would be
> an issue in a number of web hosting environments.
Might this work?
http://example.com/foo.php?bar=fake.opml
It's a hack, but technically it is a url ending with '.opml'
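Whether the hack works depends on how the browser tests the URL, which I don't know. A minimal Python sketch of two plausible checks (both function names are made up) shows the difference:

```python
from urllib.parse import urlsplit

def ends_with_opml(url):
    """Naive rule: the whole URL string ends with '.opml'."""
    return url.lower().endswith(".opml")

def path_ends_with_opml(url):
    """Stricter rule: only the path component is examined."""
    return urlsplit(url).path.lower().endswith(".opml")

hack = "http://example.com/foo.php?bar=fake.opml"
plain = "http://example.com/links.opml"

print(ends_with_opml(hack), path_ends_with_opml(hack))
print(ends_with_opml(plain), path_ends_with_opml(plain))
```

The query-string trick passes the naive check but fails the path-only one, so it only helps against a browser that compares the end of the full URL.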
Pete
Imho, after many years of working on outliners, my belief is that the text is
just one attribute, and often not the most important one. OPML is not like every
XML format. Another way it's different is that the format isn't the juice, the
applications are.
I wrote about this yesterday.
http://scriptingnews.userland.com/2003/04/15#noWaitForTools
***OPML-Dev list charter
BTW, I just sent an email to a poster who posts off-topic stuff here, a reminder
that according to the charter this is a list for discussing applications of
OPML, not the format itself.
http://groups.yahoo.com/group/opml-dev/
That was a conscious decision when the list was started almost three years ago.
Mail lists are uniquely bad places to design formats, and I wanted to be clear
at the start that was not the purpose of this list. People haven't been
respecting that, but now I want to clear the space for apps, so if your post
isn't about apps, please find another place to express yourself. Thanks.
Dave
Makes sense.
"it just seems wrong" wasn't my only argument but none of the others
are particularly strong enough to justify a change.
As a note, the others were readability and sticking with what I
perceive as the standard convention when it comes to sticking large
pieces of text in xml.
OPML is the only place I've seen that preferred to stick all the text,
regardless of length, in an attribute.
-Kate
On Wednesday, April 16, 2003, at 05:12 AM, Dave Winer wrote:
> May I be blunt? Assuming so.
>
> "It just seems wrong" is not a very strong argument.
>
> If you do it differently then theoretically every processor has to
> work both ways.
>
> Then someone else thinks something else just seems wrong, and every
> processor has to split over that.
>
> Pretty soon you have a mess, and only the early players can play. I've
> heard this complaint about RSS, but I tell them don't blame me, I've
> argued against every convenience or accommodation of taste.
>
> My philosophy is that it's good when a format makes you retch in
> disgust. It means someone was a hardass on this kind of whittling down
> of a format. In OPML I am ruthless. Learned the hard way if you give
> em an inch they take a mile.
>
> Dave
>
> ----- Original Message -----
> From: masukomi
> To: opml-dev@yahoogroups.com
> Sent: Wednesday, April 16, 2003 12:16 AM
> Subject: [opml-dev] text syntax
>
>
> I don't think I've seen mention of this question before and it's a hard
> one to search for (with useful results) so sorry if it's a duplicate.
>
> what's the official stance on the following syntax? would it be
> considered valid ? I'm not sure anything out there would support it
> currently.. but I just hate putting tons of text into an attribute
> value... it just seems wrong.
>
> <outline attributes here.....>
> a whole bunch of text here... paragraphs worth... more even
> <outline attributes here... />
> </outline>
>
>
> -Kate
May I be blunt? Assuming so.
"It just seems wrong" is not a very strong argument.
If you do it differently then theoretically every processor has to work both
ways.
Then someone else thinks something else just seems wrong, and every processor
has to split over that.
Pretty soon you have a mess, and only the early players can play. I've heard
this complaint about RSS, but I tell them don't blame me, I've argued against
every convenience or accommodation of taste.
My philosophy is that it's good when a format makes you retch in disgust. It
means someone was a hardass on this kind of whittling down of a format. In OPML
I am ruthless. Learned the hard way if you give em an inch they take a mile.
Dave
----- Original Message -----
From: masukomi
To: opml-dev@yahoogroups.com
Sent: Wednesday, April 16, 2003 12:16 AM
Subject: [opml-dev] text syntax
I don't think I've seen mention of this question before and it's a hard
one to search for (with useful results) so sorry if it's a duplicate.
what's the official stance on the following syntax? would it be
considered valid ? I'm not sure anything out there would support it
currently.. but I just hate putting tons of text into an attribute
value... it just seems wrong.
<outline attributes here.....>
a whole bunch of text here... paragraphs worth... more even
<outline attributes here... />
</outline>
-Kate
Yes that's what it means. Here's what the spec says.
"A headline without sub-heads with a link attribute that ends with .opml is
an inclusion. When the user dives into an inclusion it is displayed exactly
as if it were part of the outline that included it, with the exception that
the author information displayed reflects the author information for the
included file, and suggested links are sent to the author of the included
file."
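Read literally, the rule can be sketched in a few lines of Python. This is only an illustration of the quoted sentence, not the directory browser's code, and it assumes the link is stored in a `url` attribute; the sample outline is invented.

```python
import xml.etree.ElementTree as ET

def is_inclusion(outline):
    """True for a leaf outline whose link ends with '.opml'."""
    has_subheads = outline.find("outline") is not None
    link = outline.get("url", "")  # assumption: the link lives in a url attribute
    return not has_subheads and link.endswith(".opml")

sample = """<opml version="1.0"><body>
<outline text="Included directory" url="http://example.com/links.opml"/>
<outline text="Ordinary link" url="http://example.com/index.html"/>
<outline text="Folder" url="http://example.com/folder.opml">
<outline text="Child" url="http://example.com/child.html"/>
</outline>
</body></opml>"""

top_level = ET.fromstring(sample).findall("body/outline")
results = [is_inclusion(node) for node in top_level]
print(results)
```

Note that the third outline is not an inclusion even though its link ends with .opml, because it has sub-heads.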
The reason is to make the back-end more efficient and simpler, and not
require it to do anything more than make an HTTP request. Alternatively we
would have to get into file types and some cross-platform issues, and no
single person, at the time the decision was made, had the requisite
knowledge. Further it would mean that the browser would have to read every
single URL to see if there was an OPML file at the other end. While it might
not bring the Internet to its knees, it would slow things down.
Dave
I don't think I've seen mention of this question before and it's a hard
one to search for (with useful results) so sorry if it's a duplicate.
what's the official stance on the following syntax? would it be
considered valid ? I'm not sure anything out there would support it
currently.. but I just hate putting tons of text into an attribute
value... it just seems wrong.
<outline attributes here.....>
a whole bunch of text here... paragraphs worth... more even
<outline attributes here... />
</outline>
-Kate
Looking at the example directory at
http://www.opml.org/directory/13
is it using the presence of the .opml in the url to decide to display
a folder and to treat the url as a branch in the tree, rather than a
world icon ?
Whilst not a direct issue for me, it does mean that if you're
dynamically generating opml it has to be a .opml url, which would be
an issue in a number of web hosting environments.
Cheers
Simon
How refreshing!
I'm soooo tired of people wanting to re-invent the roots.
Dave
----- Original Message -----
From: Stan Krute
To: opml-dev@yahoogroups.com
Sent: Tuesday, April 15, 2003 6:52 PM
Subject: [opml-dev] Re: New howto on OPML directory browsers
Hey Dave
This is excellent.
I hope to find time to join the barn-raising.
Homage and happy diggation ...
Stan
Howdy all. I did a little fooling around today with OPML directory
browsing and Mozilla:
http://www.zope-europe.org/articles/200304/opmlprototype
http://radio.weblogs.com/0116506/2003/04/15.html#a75
I saw some older messages in the archive talking about XUL, tree
widgets, RDF and OPML, etc. I've gained a little experience on
these subjects (except OPML, which I'm new to) in the past few
months, if anybody wants to revisit such conversations.
Also, I'm part of OSCOM, so if anybody wants to talk about how
OPML and CMS servers might intersect...
--Paul
By directory I mean a Yahoo or DMOZ-like directory -- a hierarchy that is
navigated using a Web browser. The hierarchy is specified in OPML.
My goal is to get a set of work-alike directory browsers and outliner
authoring tools in a variety of different environments, open source and
commercial, much as there are many different implementations of XML-RPC,
SOAP and RSS. I want to demo these at OSCOM on May 28.
More here..
http://www.opml.org/howToImplementOpmlDirectoryBrowser
Dave