John, Thank you. Xerox will likely be using the TagSoup JAR file binary in one of our products. This change makes it clear to our lawyers that we have the...
Once the product revision is launched, if it does include TagSoup then we will put the TagSoup attribution in the about page of the product. I will likely not...
... Okay. Since the fact will then be published (since this list is publicly archived), I'll take the liberty of mentioning it. -- Why are well-meaning...
The 1.0.2rc3 release uses null as the initial value for the options HashMap in CommandLine.java in the new command-line parsing code. Since java.util.Hashtable...
In the piddly department, here is a minimal --help. Note that it uses Iterator because I can't see a convenient way to get an Enumeration out of a HashMap, so...
... The next release will use a Hashtable instead, since performance in the CommandLine class doesn't matter. I've also incorporated your help function, using...
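For readers following the HashMap/Hashtable exchange, here is a minimal sketch of a --help listing built from a Hashtable, whose keys() method returns an Enumeration directly. The class name HelpSketch, the usage string, and the Boolean values are assumptions made for illustration; only the option names --files and --help appear in this archive, and this is not the actual CommandLine.java code.

```java
import java.util.Collections;
import java.util.Enumeration;
import java.util.Hashtable;
import java.util.List;

public class HelpSketch {
    // Hypothetical option table: the real contents of CommandLine.java
    // are not shown in the thread.
    static Hashtable<String, Boolean> defaultOptions() {
        // Initialized to an empty table rather than null, so lookups are safe.
        Hashtable<String, Boolean> options = new Hashtable<String, Boolean>();
        options.put("--files", Boolean.FALSE);
        options.put("--help", Boolean.FALSE);
        return options;
    }

    // Builds a one-line usage message from the option names.
    static String help(Hashtable<String, Boolean> options) {
        // Hashtable exposes its keys as an Enumeration.
        Enumeration<String> keys = options.keys();
        // Copy into a list and sort so the output order is stable.
        List<String> names = Collections.list(keys);
        Collections.sort(names);
        StringBuilder usage = new StringBuilder("usage: java -jar tagsoup.jar");
        for (String name : names) {
            usage.append(" [").append(name).append("]");
        }
        return usage.toString();
    }

    public static void main(String[] args) {
        System.out.println(help(defaultOptions()));
    }
}
```

If a HashMap is kept instead, Collections.enumeration(map.keySet()) is one standard way to obtain an Enumeration over its keys.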
... Well, sort of. I tried this content on Firefox: <p style="color:red; text-align:center;>Now you see it> Now you don't and sure enough, it stops at the...
Argh! Oh well. I remember being puzzled by Netscape's behavior with regard to improperly-terminated comments. It appeared to be equally happy with <!-- -->...
Would the XSLT output element serialization control attributes be appropriate knobs to put onto org.ccil.cowan.tagsoup.XMLWriter? I note that right now...
... Are you offering to make the necessary changes and/or provide another serializer? The current one is just a hacky version of David Megginson's, sufficient...
... The approach used by JAXP is to set up a java.util.Properties object and pass it to the serializer with a setOutputProperties(Properties oformat) method....
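As an illustration of the JAXP approach described in this message, the sketch below fills a java.util.Properties object and hands it to an identity Transformer through setOutputProperties. The class name OutputPropertiesSketch and the particular property values are assumptions chosen for the example; it shows the standard JAXP calls, not anything in TagSoup's XMLWriter.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Properties;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class OutputPropertiesSketch {
    // Serializes the given XML text with the output properties set below.
    static String serialize(String xml) throws Exception {
        // The knobs are plain string key/value pairs.
        Properties oformat = new Properties();
        oformat.setProperty(OutputKeys.METHOD, "xml");
        oformat.setProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        oformat.setProperty(OutputKeys.ENCODING, "UTF-8");

        // An identity transformer copies input to output unchanged,
        // so only the serialization settings affect the result.
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        identity.setOutputProperties(oformat);

        StringWriter out = new StringWriter();
        identity.transform(new StreamSource(new StringReader(xml)),
                           new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(serialize("<p>hi</p>"));
    }
}
```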
Great! Would it be OK to use our own package copy of the keys, to avoid a J2SE 1.4.2 dependency (javax.xml.transform.OutputKeys)? Sun publicly documents the...
... I'm good with that. In fact, you might as well hard-code them: XSLT 1.0 is quite stable. ... Excellent. I found it out by writing a little program. :-) ...
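A hard-coded copy of the keys could look roughly like the sketch below. The class name LocalOutputKeys is hypothetical; the string values are the XSLT 1.0 output attribute names that javax.xml.transform.OutputKeys defines, and the main method compares a few of them against the JAXP constants as a sanity check. The shipped code need not reference JAXP at all.

```java
import javax.xml.transform.OutputKeys;

public class LocalOutputKeys {
    // Hard-coded copies of the XSLT 1.0 output property names.
    public static final String METHOD = "method";
    public static final String VERSION = "version";
    public static final String ENCODING = "encoding";
    public static final String OMIT_XML_DECLARATION = "omit-xml-declaration";
    public static final String STANDALONE = "standalone";
    public static final String DOCTYPE_PUBLIC = "doctype-public";
    public static final String DOCTYPE_SYSTEM = "doctype-system";
    public static final String CDATA_SECTION_ELEMENTS = "cdata-section-elements";
    public static final String INDENT = "indent";
    public static final String MEDIA_TYPE = "media-type";

    private LocalOutputKeys() {
        // Constants only; not meant to be instantiated.
    }

    // Check only: the constants above do not depend on JAXP at run time.
    public static void main(String[] args) {
        System.out.println(METHOD.equals(OutputKeys.METHOD));
        System.out.println(INDENT.equals(OutputKeys.INDENT));
        System.out.println(DOCTYPE_PUBLIC.equals(OutputKeys.DOCTYPE_PUBLIC));
    }
}
```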
Hi, With the HTML file below, TagSoup 1.0rc3 (which I assume is the latest) for some reason inserts an ISMAP="BOOLEAN" attribute into the IMG links. This means...
My HTML is as follows: <pre>@misc{ granville-positive,author = "Andrew Granville", title = "On Positive Integers <=x With Prime Factors <=t log x", url =...
Hi, I am using TagSoup to convert HTML to XHTML. For this I got the TagSoup JAR file and am using the command: java -jar tagsoup-1.0rc3.jar --files foo.html but it...
... I can only think that this has something to do with the JVM you are using or the version of Java. Can you send back the results of "java -version"? -- ...
Dear tagsoup friends, I am a contributor in the Poesia project (www.poesia-filter.org), which is an Internet content filter for kids. I am using tagsoup for...
somewhere sufficiently wild. this is relevant to the bugs you're trying to fix with the latest update ("This release cleans up long-standing problems with...
Garry Hill
garry@...
Jul 21, 2005 1:22 am
319
Hi all, I found this in an HTML page from the wild: <A HREF="http://i2as.idregie.com/c.php? s=396&w=468&h=60"> Ok, that's quite brutish, but TagSoup fixes this...
... TagSoup is not aware of which attributes are supposed to contain URIs, so it just does minimal SGML/XML fixup, namely converting line-ends to spaces. -- ...
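To make the fixup described in this reply concrete, here is a small sketch that replaces each line-end in an attribute value with a single space. The class and method names are hypothetical and the regular expression is an assumption; this is not TagSoup's own code, only an illustration of the behavior the message describes.

```java
public class AttributeFixupSketch {
    // Replaces every line-end (CRLF, CR, or LF) with one space.
    // No URI-specific handling: the value is treated as plain text.
    static String normalizeAttributeValue(String raw) {
        return raw.replaceAll("\r\n|\r|\n", " ");
    }

    public static void main(String[] args) {
        // An href value that was broken across two lines in the source page.
        String broken = "http://i2as.idregie.com/c.php?\ns=396&w=468&h=60";
        System.out.println(normalizeAttributeValue(broken));
    }
}
```

Because the value is not interpreted as a URI, the space stays in the resulting href; removing or escaping it would be up to the application.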
... Well, no one product can do everything. Jericho (thanks for the reference) is about examining and perhaps modifying the HTML at the lowest level. TagSoup...
Hi Johan, my problem has been resolved by the Jericho HTML parser. Actually I was using TagSoup for it, but Jericho is well documented and it does parsing at basic...
... TagSoup conforms to the behavior of SAX parsers, and requires no programmer-level documentation of its own except in the properties and features that can...