James Abley scripsit:
> I've encountered an issue using TagSoup and I wanted to clarify whether
> it is expected behaviour due to how I'm using it, or something else.
>
> The issue that I'm seeing is that I'm parsing an RSS feed and it
> eventually goes through TagSoup to ensure that I store well-formed XML.
>
> http://www.guardian.co.uk/football/2009/feb/26/real-madrid-rafa-benitez-liverpool/rss
>
> The <br/> element between the first two bullet points in that story
> is getting removed when I parse the <item/> description and I'm not
> sure why that is the case.
I can't reproduce this problem with TagSoup 1.2. The <br/> is not
removed; it comes out as <br clear="none"></br>. The clear attribute
appears because the HTML 4.0 DTD gives it a default value, and the
separate end tag appears because TagSoup doesn't generate
empty-element tags.
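
To an XML parser the start-tag/end-tag pair and the empty-element tag are
the same thing. Here is a minimal sketch of that check. It is not from
TagSoup: it uses only the JDK's built-in DOM parser, and the class and
method names (EmptyElementDemo, parseRoot, sameEmptyElement) are made up
for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; it does not call TagSoup at all.
public class EmptyElementDemo {

    // Parse a small XML string and return its root element.
    static Element parseRoot(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)))
                      .getDocumentElement();
    }

    // True when both strings parse to an element with the same name,
    // the same "clear" attribute value, and no child nodes.
    static boolean sameEmptyElement(String a, String b) throws Exception {
        Element x = parseRoot(a);
        Element y = parseRoot(b);
        return x.getTagName().equals(y.getTagName())
            && x.getAttribute("clear").equals(y.getAttribute("clear"))
            && !x.hasChildNodes()
            && !y.hasChildNodes();
    }

    public static void main(String[] args) throws Exception {
        // The form TagSoup emits versus the empty-element tag form.
        System.out.println(sameEmptyElement(
            "<br clear=\"none\"></br>",
            "<br clear=\"none\"/>"));  // prints "true"
    }
}
```

If a downstream consumer insists on the <br/> spelling, that is a matter
for whatever serializes the output, not a sign that the element was lost.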
> Is there a source repository that I can check out anonymously and write
> some tests against? I've not been able to find one through Google -
> too much interference from the Haskell version, etc.
You can always download the source of released versions from
http://tagsoup.info. There is no public repository.
--
John Cowan <cowan@...> http://www.ccil.org/~cowan
But no living man am I! You look upon a woman. Eowyn I am, Eomund's daughter.
You stand between me and my lord and kin. Begone, if you be not deathless.
For living or dark undead, I will smite you if you touch him.