tagsoup-friends · Friends of TagSoup
Elements getting stripped out unexpectedly
Message #1279 of 1386
Hi,

I've encountered an issue using TagSoup and I wanted to clarify whether it is
expected behaviour due to how I'm using it, or something else.

I'm parsing an RSS feed, and its content eventually goes through TagSoup to
ensure that I store well-formed XML.

http://www.guardian.co.uk/football/2009/feb/26/real-madrid-rafa-benitez-liverpool/rss


The <br/> element between the first two bullet points in that story is getting
removed when I parse the <item/> description and I'm not sure why that is the
case.

"&lt;p&gt;• Liverpool manager says he will be staying at Anfield&lt;br /&gt;•
Spaniard praises team for win away to Real Madrid&lt;/p&gt;"

The markup is being correctly unescaped prior to being passed to TagSoup.
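
As a rough illustration of the unescaping step mentioned above, the sketch below uses Python's standard `html.unescape` on the quoted description. It is not TagSoup and not the pipeline used in this thread; the variable names are assumptions made for the example, and the sample string is retyped from the quote rather than read from the feed.

```python
import html

# Escaped description text as it might appear inside an RSS <item/>.
# Retyped from the quoted sample; not fetched from the live feed.
escaped = (
    "&lt;p&gt;• Liverpool manager says he will be staying at Anfield"
    "&lt;br /&gt;• Spaniard praises team for win away to Real Madrid&lt;/p&gt;"
)

# Turn the entity references back into markup before passing the text
# to an HTML parser.
unescaped = html.unescape(escaped)

# Check that the <br /> element is still present after unescaping.
br_count = unescaped.count("<br />")
print(unescaped)
print("br elements:", br_count)
```

A check like this only shows that the input to the parser still contains the element; it says nothing about what later stages do with it.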

Is there a source repository that I can check out anonymously and write some
tests against? I've not been able to find one through Google - too much
interference from the Haskell version, etc.

Cheers,

James

Tue Apr 28, 2009 10:07 am
taboozizi

Hi, I've encountered an issue using TagSoup and I wanted to clarify whether it is expected behaviour due to how I'm using it, or something else. The issue that...
James Abley (taboozizi), Apr 28, 2009, 10:08 am

... I can't duplicate this problem with TagSoup 1.2. It turns into a <br clear="none"></br>, because there's a default attribute value in the HTML 4.0 DTD,...
John Cowan (johnwcowan), Apr 28, 2009, 9:46 pm

... Sorry, that's absolutely right. A later step in my XML pipeline is removing that element. Apologies for the noise. Cheers, James...
James Abley (taboozizi), Apr 29, 2009, 9:05 pm