--- In tagsoup-friends@yahoogroups.com, John Cowan <cowan@...> wrote:
>
> James Abley scripsit:
>
> > I've encountered an issue using TagSoup and I wanted to clarify whether
> > it is expected behaviour due to how I'm using it, or something else.
> >
> > The issue that I'm seeing is that I'm parsing an RSS feed and it
> > eventually goes through TagSoup to ensure that I store well-formed XML.
> >
> >
> > http://www.guardian.co.uk/football/2009/feb/26/real-madrid-rafa-benitez-liverpool/rss
> >
> > The <br/> element between the first two bullet points in that story
> > is getting removed when I parse the <item/> description and I'm not
> > sure why that is the case.
>
> I can't duplicate this problem with TagSoup 1.2. It turns into a
> <br clear="none"></br>, because there's a default attribute value
> in the HTML 4.0 DTD, and TagSoup doesn't generate empty elements.
>
> > Is there a source repository that I can check out anonymously and write
> > some tests against? I've not been able to find one through Google -
> > too much interference from the Haskell version, etc.
>
> You can always download the source of released versions from
> http://tagsoup.info. There is no public repository.
>
> --
> John Cowan <cowan@...> http://www.ccil.org/~cowan
> But no living man am I! You look upon a woman. Eowyn I am, Eomund's daughter.
> You stand between me and my lord and kin. Begone, if you be not deathless.
> For living or dark undead, I will smite you if you touch him.
>
Sorry, that's absolutely right. A later step in my XML pipeline is removing that
element. Apologies for the noise.
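
For anyone chasing something similar, here is a minimal sketch of how one might check which pipeline stage drops an element: count the elements of a given name in the output of each stage. It uses only the JDK's DOM parser, not TagSoup. The class name ElementCount and the sample strings are hypothetical stand-ins; they are not the actual feed content or my real pipeline.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ElementCount {

    /** Counts elements with the given tag name in a well-formed XML string. */
    public static int count(String xml, String tagName) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName(tagName).getLength();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical output of the TagSoup stage: the <br> is still there,
        // written with an explicit end tag and the defaulted clear attribute.
        String afterTagSoup =
                "<div><ul><li>one</li></ul><br clear=\"none\"></br>"
                + "<ul><li>two</li></ul></div>";

        // Hypothetical output of a later stage that has dropped the <br>.
        String afterLaterStep =
                "<div><ul><li>one</li></ul><ul><li>two</li></ul></div>";

        System.out.println("br after TagSoup stage: " + count(afterTagSoup, "br"));
        System.out.println("br after later stage:   " + count(afterLaterStep, "br"));
    }
}
```

Running the same count on the output of each stage in turn should show where the number changes, which points at the stage to look into.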
Cheers,
James