tagsoup-friends · Friends of TagSoup
Tagsoup library breaks on malformed doctype
Message #1275 of 1386
Hi,

In a project where we use TagSoup to tidy some malformed XHTML, we have
found that if there is an odd number of quotes in the doctype
declaration, TagSoup throws a String-related exception and fails. For
example, with the following input,

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML Basic 1.1//EN" "> <html
xmlns="http://www.w3.org/1999/xhtml" xml:lang="en"> <head><title>Test
with bogus doctype</title></head> <body> <p>This page has an extra quote
in the doctype, which the tagsoup library doesn't like.</p> </body>
</html>

TagSoup throws the following exception:

[Fatal Error] :2:14: The document type declaration for root element type
"html" must end with '>'.
Exception in thread "main" java.lang.StringIndexOutOfBoundsException:
String index out of range: -1

I'm not sure whether patching the library would be easy (I haven't
reviewed the source code yet), or whether it would be better to add
workarounds that recover from any unexpected error thrown by
TagSoup.

Miguel
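
The workaround idea in the message above could look roughly like the sketch below. It is not taken from TagSoup or from the poster's project: the class DoctypeGuard and its methods are hypothetical names, and the tidy step is passed in as a plain function so the sketch does not assume any particular TagSoup call. It assumes the DOCTYPE ends at the first '>' and only counts double quotes, which is enough for the input shown above but not for every possible declaration.

```java
import java.util.function.UnaryOperator;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of TagSoup.
public class DoctypeGuard {
    // Assumption: the declaration runs from "<!DOCTYPE" to the first '>'.
    private static final Pattern DOCTYPE =
        Pattern.compile("<!DOCTYPE[^>]*>", Pattern.CASE_INSENSITIVE);

    /** Removes the DOCTYPE declaration if it holds an odd number of double quotes. */
    public static String dropBrokenDoctype(String markup) {
        Matcher m = DOCTYPE.matcher(markup);
        if (!m.find()) {
            return markup; // no DOCTYPE at all, nothing to repair
        }
        long quotes = m.group().chars().filter(c -> c == '"').count();
        if (quotes % 2 == 0) {
            return markup; // quotes are balanced, leave the input alone
        }
        // Unbalanced quotes: cut the declaration out entirely.
        return markup.substring(0, m.start()) + markup.substring(m.end());
    }

    /**
     * Runs the supplied tidy step on the sanitized markup. If the step still
     * throws an unchecked exception, the sanitized markup is returned as-is.
     */
    public static String tidySafely(String markup, UnaryOperator<String> tidy) {
        String cleaned = dropBrokenDoctype(markup);
        try {
            return tidy.apply(cleaned);
        } catch (RuntimeException e) {
            return cleaned; // fallback: recover instead of failing
        }
    }
}
```

Dropping the declaration loses the doctype information, so a caller that needs it would have to re-add a known-good DOCTYPE afterwards.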




Wed Mar 18, 2009 10:58 am

miguel.garcia@...

... The real problem is that TagSoup thinks the system-id begins with a quote and ends with a quote, but doesn't realize that it's zero-length. The obvious...
John Cowan
johnwcowan
Mar 18, 2009
8:45 pm
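
The diagnosis in the reply above (a system-id that is treated as starting and ending with a quote while actually being zero-length) can be illustrated with a small sketch. This is not TagSoup's source code; QuotedLiteral and both methods are hypothetical, and the sketch only assumes that the quotes are stripped with index arithmetic of this general kind. A lone quote character serves as both the "opening" and the "closing" quote, so the naive version asks for a substring whose end precedes its start.

```java
// Hypothetical illustration, not TagSoup code.
public class QuotedLiteral {
    /** Naive: assumes the literal always has a separate opening and closing quote. */
    public static String stripQuotesNaive(String literal) {
        // For a lone quote this is substring(1, 0), which throws
        // StringIndexOutOfBoundsException.
        return literal.substring(1, literal.length() - 1);
    }

    /** Defensive: checks the length before doing index arithmetic. */
    public static String stripQuotes(String literal) {
        if (literal.length() >= 2) {
            char first = literal.charAt(0);
            char last = literal.charAt(literal.length() - 1);
            if ((first == '"' || first == '\'') && first == last) {
                return literal.substring(1, literal.length() - 1);
            }
        }
        if (literal.equals("\"") || literal.equals("'")) {
            return ""; // a lone quote is treated as an empty literal
        }
        return literal; // not quoted: return unchanged
    }
}
```

With the input from the first message, the token after the public identifier is a single double quote, which is the case the defensive check handles.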