Hi all, hmm - CDATA does not exist in XML? I don't think this is true. It just serializes differently. Cheers, Oliver...
Oliver Koell
listen@...
Apr 1, 2004 4:17 pm
61
... Elements of type CDATA do not exist in XML. They are not to be confused with CDATA sections, which do exist in XML, nor with CDATA attributes, which do ...
... I thought April fools day finished at noon :-) regards DaveP...
Dave Pawson
dpawson@...
Apr 1, 2004 5:18 pm
63
Hi John, thanks for responding to thoughtless remarks :-) Just out of curiosity: is there a particular reason the XMLWriter prefers escaping over CDATA...
Oliver Koell
listen@...
Apr 2, 2004 9:28 am
64
... It's a general-purpose writer and doesn't know about particular elements except in HTML mode. Deciding when to use CDATA sections cleverly requires either...
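As a rough illustration of why a general-purpose writer can get away with always escaping: an escaped text node and a CDATA section are two serializations of the same character data. The sketch below uses the JDK's built-in DOM parser, not TagSoup's XMLWriter, and the class and method names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class CdataVsEscaping {
    // Parse a small XML string and return the text content of its root element.
    static String rootText(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers stay simple.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The same script text, once with escaped markup characters,
        // once wrapped in a CDATA section.
        String escaped = "<script>if (a &lt; b &amp;&amp; c) run();</script>";
        String cdata = "<script><![CDATA[if (a < b && c) run();]]></script>";

        System.out.println(rootText(escaped));
        System.out.println(rootText(cdata));
        System.out.println(rootText(escaped).equals(rootText(cdata)));
    }
}
```

An XML parser reports the same characters either way, so the choice only matters to consumers that do not parse the document as XML.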
Hi John, in XHTML it's recommended to wrap the content of your script elements into CDATA markers, like this: <script type="text/javascript"> <![CDATA[ ...
Oliver Koell
listen@...
Apr 2, 2004 1:58 pm
66
Hi, It appears that Tagsoup is auto-inserting empty HTML attributes with a value of "BOOLEAN". Example: <td> results in <td nowrap="BOOLEAN"> ...
Oliver Koell
listen@...
Apr 2, 2004 2:24 pm
67
... XHTML is not supported, on the assumption (perhaps false) that people who are actually going to the trouble of producing XHTML are producing at least...
... Oooooops. That is what is known as a paper-bag bug, meaning that after releasing it I should go around wearing a paper bag on my head for a while. Today's...
And here it is today, as promised: TagSoup 0.9.4. This fixes the paper-bag bug, allows CDATA sections in the input (but they must be perfectly well-formed or...
hello, the following html : <p><table>...</table></p> becomes <p/><table>...</table> with Tagsoup...how can I configure Tagsoup to avoid the closing of the <p>...
... You need to perform surgery on src/definitions/html/elements. Look for the line beginning "table" and change the string "%block" to "%block+%inline". That...
Hi all, I'm looking for comments on a possible use case for TagSoup. My current employer's hosted message board product allows users to include HTML in message...
... I would think so, indeed. Use it to parse what the users send you, which will be very likely to make it well-formed (not 100% guaranteed, only about 99%)....
Hi John, Anyone, Do you happen to have any URI lists or http-able sized collections of soupy HTML? I've been playing with a really crude tagsoup-style parser, ...
... It could be done, but are you sure that's what you want? It would entail, for instance, that a sequence of paragraphs like <p>foobar <p>bazzam <p>quxquux ...
... In the next release I'll make "script" allowed to appear anywhere, since browsers seem to allow it anywhere. ... In general, yes. -- But you, Wormtongue,...
... Well, it's a SAX parser: you can learn how to use SAX parsers at http://www.saxproject.org . You can also look at the static main, tidy, and chooseContent...
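For readers new to SAX, here is a minimal sketch of the callback style the reply refers to. It uses the JDK's own XML parser on well-formed input; the assumption is that a TagSoup parser instance could be substituted as the XMLReader, which is not shown here. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxElementLister {
    // Collect the names of all start tags, in document order.
    static List<String> elementNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            // Assumption: any XMLReader works here; this one is the JDK default.
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    names.add(qName);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap checked parser exceptions so callers stay simple.
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<html><body><p>one</p><p>two</p></body></html>";
        System.out.println(elementNames(xml));
    }
}
```

The application never sees a tree; it only receives events such as startElement as the parser walks the input, which is why the same handler code can sit behind different parsers.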
... That's a for-sure bug. Can you send me the document exactly as is, so I can reproduce the problem? Thanks. -- Si hoc legere scis, nimium eruditionis...
Sorry for the poor wording of my query. To rephrase it: I'd like to transform bad HTML into XML while keeping the initial tag structure and without...
I don't know if this is a tagsoup issue or not, but perhaps someone can steer me the right way.... I've got an application where I am feeding soup into...
... If JTidy treats " " and U+00A0 differently, then I have to say it's buggy. These are supposed to be exactly equivalent in HTML files. ... It's hard...
When I do an octal dump I get 0302, 0240 (i.e. C2, A0 ). Is that what you write as U+00A0 ? Is that the UTF-8 encoding thereof? It would hardly surprise me if...
... Yes, exactly. ... The output of the TagSoup main program is always UTF-8; you may need to tell JTidy that. ... I'd write a replacement main() method. -- ...
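The byte values from the octal dump can be checked with a few lines of standard Java. This is only a sketch with hypothetical class and method names; it also shows what a consumer would see if it decoded those two bytes as ISO-8859-1 instead of UTF-8, which is one plausible way a downstream tool could mishandle the character.

```java
import java.nio.charset.StandardCharsets;

public class NbspBytes {
    // Render the UTF-8 bytes of a string as space-separated octal values.
    static String octalDump(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to an unsigned value before formatting.
            sb.append(String.format("0%o", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String nbsp = "\u00A0"; // NO-BREAK SPACE
        System.out.println(octalDump(nbsp));

        byte[] raw = {(byte) 0xC2, (byte) 0xA0};
        // Decoded as UTF-8, the two bytes are a single U+00A0.
        String asUtf8 = new String(raw, StandardCharsets.UTF_8);
        // Decoded as ISO-8859-1, they become two characters: U+00C2 and U+00A0.
        String asLatin1 = new String(raw, StandardCharsets.ISO_8859_1);
        System.out.println(asUtf8.equals(nbsp));
        System.out.println(asLatin1.length());
    }
}
```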