tagsoup-friends · Friends of TagSoup
Tagsoup library breaks on malformed doctype
Message #1276 of 1386
Re: [tagsoup-friends] Tagsoup library breaks on malformed doctype

Miguel Garcia scripsit:
> Hi,
>
> In a project where we use TagSoup to tidy some malformed XHTML code, we
> have found that if there is an odd number of quotes in the doctype
> declaration, TagSoup throws a String-related exception and fails. For
> example, with the following input:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "> <html
> xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head><title>Test
> with bogus doctype</title></head> <body> <p>This page has an extra quote
> in the doctype, which the tagsoup library doesn't like.</p> </body>
> </html>

The real problem is that TagSoup thinks the system-id begins with a quote
and ends with a quote, but doesn't realize that it's zero-length. The
obvious fix to Parser#trimquotes doesn't work, though. I think a patch
will be straightforward to find, but I'll need to do a bit of debugging.
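
To make the failure mode concrete, here is a minimal sketch in Java of how a quote-trimming helper can fail when the "opening" and "closing" quote are the same single character. This is an illustration under assumptions, not TagSoup's actual source: the class TrimQuotesSketch and the methods naiveTrim and guardedTrim are hypothetical names, and the exact exception raised inside TagSoup is not stated in this thread. Note also that the message above says the obvious fix to Parser#trimquotes was not enough on its own, so the guarded variant only shows the isolated string-handling issue, not a complete patch.

```java
class TrimQuotesSketch {
    // Hypothetical naive helper (NOT TagSoup's code): strips a matching
    // pair of quotes without checking that there really are two of them.
    public static String naiveTrim(String in) {
        if (in == null || in.isEmpty()) return in;
        char first = in.charAt(0);
        char last = in.charAt(in.length() - 1);
        if (first == last && (first == '"' || first == '\'')) {
            // For a one-character input this is substring(1, 0), which throws.
            return in.substring(1, in.length() - 1);
        }
        return in;
    }

    // Same helper with a length guard: a lone quote is returned unchanged.
    public static String guardedTrim(String in) {
        if (in == null || in.length() < 2) return in;
        return naiveTrim(in);
    }

    public static void main(String[] args) {
        String loneQuote = "\""; // what is left of the system-id in the bad doctype
        try {
            naiveTrim(loneQuote);
            System.out.println("naiveTrim returned normally");
        } catch (StringIndexOutOfBoundsException e) {
            System.out.println("naiveTrim threw " + e.getClass().getSimpleName());
        }
        System.out.println("guardedTrim gave [" + guardedTrim(loneQuote) + "]");
        System.out.println("guardedTrim gave [" + guardedTrim("\"a.dtd\"") + "]");
    }
}
```

With a one-character input, the first and last character are the same quote, so the naive version asks for a substring whose start index is past its end index. The guard avoids that call, but whether the rest of the parser then copes with a lone-quote system-id is a separate question.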

--
John Cowan http://www.ccil.org/~cowan cowan@...
Uneasy lies the head that wears the Editor's hat! --Eddie Foirbeis Climo



Wed Mar 18, 2009 8:45 pm

johnwcowan

Messages in this thread:
Miguel Garcia (miguel.garcia@...), Mar 18, 2009 10:58 am
John Cowan (johnwcowan), Mar 18, 2009 8:45 pm
