Miguel Garcia scripsit:
> Hi,
>
> In a project where we use TagSoup to tidy some malformed XHTML code, we
> have found that if there is an odd number of quotes in the doctype
> declaration, TagSoup throws a String-related exception and fails. For
> example, with the following input:
>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "> <html
> xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head><title>Test
> with bogus doctype</title></head> <body> <p>This page has an extra quote
> in the doctype, which the tagsoup library doesn't like.</p> </body>
> </html>
The real problem is that TagSoup thinks the system-id begins with a quote
and ends with a quote, but doesn't realize that it's zero-length. The
obvious fix to Parser#trimquotes doesn't work, though. I think a patch
will be straightforward to find, but I'll need to do a bit of debugging
first.
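
As a rough illustration of the failure mode described above, here is a minimal sketch. It is not TagSoup's actual code: the class QuoteTrimDemo and the methods naiveStrip and guardedStrip are hypothetical names, and the sketch assumes the system-id token reaches the trimming step as a single quote character, so that the same character is both the first and the last one.

```java
public class QuoteTrimDemo {
    // Hypothetical naive trimmer: assumes any string that starts and ends
    // with the same quote character is at least two characters long.
    static String naiveStrip(String in) {
        if (in == null || in.isEmpty()) return in;
        char first = in.charAt(0);
        char last = in.charAt(in.length() - 1);
        if (first == last && (first == '\'' || first == '"')) {
            // For a lone quote this becomes substring(1, 0), which throws
            // StringIndexOutOfBoundsException.
            return in.substring(1, in.length() - 1);
        }
        return in;
    }

    // Hypothetical guarded trimmer: only strips when there is room for
    // both an opening and a closing quote.
    static String guardedStrip(String in) {
        if (in == null || in.length() < 2) return in;
        char first = in.charAt(0);
        char last = in.charAt(in.length() - 1);
        if (first == last && (first == '\'' || first == '"')) {
            return in.substring(1, in.length() - 1);
        }
        return in;
    }

    public static void main(String[] args) {
        // A well-formed public-id is stripped by either version.
        System.out.println(guardedStrip("\"-//W3C//DTD XHTML Basic 1.1//EN\""));
        // A lone quote makes the naive version throw.
        try {
            naiveStrip("\"");
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("naive version failed on a lone quote");
        }
        // The guarded version returns the lone quote unchanged.
        System.out.println("[" + guardedStrip("\"") + "]");
    }
}
```

The length guard only removes the exception inside the trimming helper itself. As noted above, that obvious change was not enough on its own in TagSoup, so treat this as a picture of the symptom rather than a patch.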
--
John Cowan http://www.ccil.org/~cowan cowan@...
Uneasy lies the head that wears the Editor's hat! --Eddie Foirbeis Climo