Hi, I've downloaded the 1.2 source code and have seen the folders tssl and stml. What are these for? I haven't found anything about them in the documentation. ...
Diego Campo
diego.campo@...
Mar 5, 2008 11:36 am
1028
... They are required when building TagSoup from source. You cannot just compile the provided source code yourself -- you must use Ant. -- John Cowan...
Is this to produce the jar? I'd like to integrate the code so I can make my own changes if necessary, with no jar creation. Should I then integrate the tagsoup...
Diego Campo
diego.campo@...
Mar 5, 2008 3:48 pm
1030
... Code generation is a requirement whether you use the jar or not. ... Yes. ... Yes. Every version of TagSoup has involved code generation. -- John Cowan...
... And I suppose once the java code is generated there's no need for those files anymore?...
Diego Campo
diego.campo@...
Mar 6, 2008 6:12 pm
1033
... That's true. On the other hand, I or someone else occasionally posts a patch to this list, and then you'd need those files around in order to rebuild. -- ...
Hello, I would like to ask whether TagSoup has an option for reporting invalid tags or entities while the SAX parser parses...
... No, there isn't. I suppose appropriate modifications could be made, but TagSoup was not designed with such a thing in mind, and the changes required would...
Hi - I've got an application which takes the "poor, nasty and brutish" HTML content of the web and tries to clean it up to adhere to our internal DTD. Our...
Philip Constantinou
pconstantinou@...
Mar 27, 2008 10:00 pm
1038
If TagSoup has gotten you well-formed XML, then you can use XSL to get exactly what you need. No need to hack around problems of ULs needing LIs (can't see how...
Robert Koberg
rob@...
Mar 27, 2008 10:05 pm
1039
I'm very happy writing SAX; I just thought that someone might already have done the work to go from TagSoup's HTMLish XML to valid XHTML. It's a bit harder to take...
Philip Constantinou
pconstantinou@...
Mar 27, 2008 10:24 pm
1040
Well, in XSLT 2.0 (Saxon 9x) you have the XHTML output method. Since you are using Java, you could write a simple default transformer which might be all you...
Robert Koberg
rob@...
Mar 27, 2008 10:51 pm
1041
... No, no one has. But I'd really reconsider writing such code at the SAX level; XSLT is your friend when it comes to rewriting trees, and TagSoup was always...
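To illustrate the kind of tree rewriting suggested above, here is a minimal sketch that uses the JDK's built-in XSLT 1.0 support (JAXP) to wrap any non-LI child of a UL in an LI. The class name, the stylesheet, and the sample input are hypothetical, and the sketch assumes elements with no namespace; real TagSoup output may place elements in the XHTML namespace, in which case the match patterns would need a namespace prefix.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper, not part of TagSoup.
class WrapStrayListChildren {
    // Identity transform plus one rule: any child of <ul> that is not <li>
    // gets wrapped in a new <li>. Assumes no-namespace element names.
    static final String XSLT =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:template match='@*|node()'>"
      + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
      + "</xsl:template>"
      + "<xsl:template match='ul/*[not(self::li)]'>"
      + "<li><xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy></li>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to a well-formed XML string and returns the result.
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSLT)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT transform failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<ul><li>a</li><p>b</p></ul>"));
    }
}
```

Keeping the rule in a stylesheet rather than in SAX handler code means each structural fix is one more template, with the identity template copying everything else through unchanged.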
... That's a bug in 1.2, I believe, though I can't prove it right now because I don't have access to the TagSoup repo. Go ahead and send me your bug: if you...
I'm new to the list (and also TagSoup) so I apologize if there is a simple answer to this. Here's the issue: I have a set of HTML files with meta elements...
... Wow. That's the most creative abuse of HTML tags I know of. Your best bet is probably to remove the <element name='meta'> ... </element> section from...
... More simply, but less reliably: preprocess the incoming HTML to change "<meta" to "<xmeta" or the like. -- How they ever reached any conclusion at all...
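The preprocessing idea above can be sketched as a plain string rewrite applied before the HTML reaches the parser. The class and method names are hypothetical, and the replacement name "xmeta" is simply the one suggested in the message. As the message warns, this is less reliable: the regular expression also rewrites matches inside comments, scripts, and attribute values.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of TagSoup.
class MetaRenamer {
    // Matches "<meta" or "</meta" in any letter case, but not longer names
    // such as "<metadata" (the \b requires a word boundary after "meta").
    private static final Pattern META_TAG =
        Pattern.compile("<(/?)meta\\b", Pattern.CASE_INSENSITIVE);

    // Returns the HTML with meta start and end tags renamed to xmeta.
    static String renameMeta(String html) {
        return META_TAG.matcher(html).replaceAll("<$1xmeta");
    }

    public static void main(String[] args) {
        System.out.println(renameMeta("<body><META name=\"k\" content=\"v\">text</body>"));
    }
}
```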
... That looks like a variation of the problem I was going to report. Actually, I don't think it is uniquely creative; use of meta elements in the body is, I...
... Alas, yes, it's wrong. XML is guaranteed (except for the possibility of encoding issues), XHTML is definitely not. In particular, TagSoup knows nothing...
For the record, there are no GPL-only components in any version of TagSoup. In addition, please note that if someone must have a GPL-licensed version of...
G'day. When I use a URL with a question mark, the server is not happy: XMLReader parser = new org.ccil.cowan.tagsoup.Parser(); // tagsoup parser XPathContext...
Is TagSoup meant to turn all the various quote characters into the appropriate Unicode entities? I ask because I noticed that a page I parsed...
... Fortunately, the answer is "none of the above". :-) Unlike browsers, TagSoup does not do automatic encoding detection. (There is a Java library to do so...
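Since the encoding is not detected automatically, one way to be explicit is to decode the bytes yourself and hand the parser a character stream. The sketch below uses only the standard org.xml.sax.InputSource class; the class and method names are hypothetical, and UTF-8 is just an assumed example charset, not necessarily the one involved in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of TagSoup.
class ExplicitEncoding {
    // Wraps a byte stream in a Reader that decodes with the charset the
    // caller names, so the parser never has to guess the encoding.
    static InputSource sourceFor(InputStream in, Charset charset) {
        return new InputSource(new InputStreamReader(in, charset));
    }

    // Drains a Reader into a String; used here only to show the decoding.
    static String readAll(Reader reader) {
        try {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The UTF-8 bytes for a left curly quote (U+201C) followed by "hi".
        byte[] bytes = {(byte) 0xE2, (byte) 0x80, (byte) 0x9C, 'h', 'i'};
        InputSource source = sourceFor(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8);
        System.out.println(readAll(source.getCharacterStream()).length());
    }
}
```

The resulting InputSource can be passed to the parse(InputSource) method of any XMLReader.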
Thanks for the quick response, John. Having gone through all of this pain five years ago for Furl, I should have known to be more explicit about my encodings....
I was wondering if TagSoup will meet my needs. I have a server that is in need of a Parser/Substituter to replace a given URL (either img or a) with another...