tagsoup-friends · Friends of TagSoup

Messages

Messages 1 - 30 of 1386
1
My Soupy Friends, First, I wanted to say that, with all the whining in the RSS world about how expecting all RSS to be well-formed is just Too Much To Ask of...
DuCharme, Bob (LNG-CHO)
philregion
Jan 23, 2004
2:27 pm
2
... Excellent idea! ... TagSoup is a library, not an application, but there is a stub main method which you can invoke thus: java -classpath tagsoup-0.9.jar...
John Cowan
johnwcowan
Jan 23, 2004
2:47 pm
3
Sorry for the stupid question, but what is RSS? And a question for John: how is TagSoup different from HTML Tidy? --On Friday, January 23, 2004 9:27 AM -0500...
Reza Ferrydiansyah
rezaferry
Jan 23, 2004
3:07 pm
4
... It's not quite what you're after, but I did write up a (hopefully) minimal code example at http://www.hackdiary.com/archives/000041.html showing how to use...
Matt Biddulph
matthewbiddulph
Jan 23, 2004
3:27 pm
5
... Here's the introduction that I usually point co-workers to: http://www.eevl.ac.uk/rss_primer/ There's been a lot of debate in the RSS/weblogging community...
DuCharme, Bob (LNG-CHO)
philregion
Jan 23, 2004
3:55 pm
6
... Thanks for this pointer. The 0.9 version now allows namespace and namespace-prefixes support to be turned on and off (this does nothing, which is...
cowan@...
johnwcowan
Jan 23, 2004
4:43 pm
7
... Tidy (and its Java version, JTidy) are full-featured HTML cleanup tools; they can do things like transforming old-style markup into CSS, and have some...
cowan@...
johnwcowan
Jan 23, 2004
5:42 pm
8
... And, as I understand it, Tidy is all about HTML, but TagSoup is configurable to deal with other kinds of ill-formed XML, which is why I suggested that the...
DuCharme, Bob (LNG-CHO)
philregion
Jan 23, 2004
5:49 pm
9
The file tagsoup-0.9.jar is about 20% of the size of the latest Tidy.jar. Of course, it does a whole lot less, too. I hacked Tester.java to write everything to...
cowan@...
johnwcowan
Jan 23, 2004
8:04 pm
10
The *.java files in the root don't belong there. I have deleted them from tagsoup-0.9.src.zip. Their presence is harmless, so no need to re-fetch the ...
cowan@...
johnwcowan
Jan 23, 2004
8:14 pm
11
Here's a first cut at explaining how TagSoup rectifies the stream of start-tags and end-tags into well-structured XML. For this purpose, character content is...
cowan@...
johnwcowan
Jan 23, 2004
9:10 pm
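For readers skimming the index, here is a rough sketch of the general idea behind turning an unbalanced stream of start-tags and end-tags into a properly nested one. It is not taken from message 11 and is not TagSoup's actual algorithm (which the truncated snippet does not show); it is a plain stack-based simplification. The class name TagRectifierSketch, the method rectify, and the convention that a leading "/" marks an end-tag are all assumptions made for illustration. Character content is left out entirely.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration only; not part of TagSoup.
class TagRectifierSketch {

    // Each event is a tag name; a leading '/' marks an end-tag ("b" vs "/b").
    static List<String> rectify(List<String> events) {
        Deque<String> open = new ArrayDeque<>(); // currently open elements, innermost first
        List<String> out = new ArrayList<>();
        for (String ev : events) {
            if (!ev.startsWith("/")) {
                // Start-tag: remember it and pass it through.
                open.push(ev);
                out.add("<" + ev + ">");
                continue;
            }
            String name = ev.substring(1);
            if (!open.contains(name)) {
                continue; // stray end-tag with no matching open element: drop it
            }
            // Close inner elements until the named element itself is closed.
            while (!open.isEmpty()) {
                String top = open.pop();
                out.add("</" + top + ">");
                if (top.equals(name)) {
                    break;
                }
            }
        }
        // End of input: close whatever is still open.
        while (!open.isEmpty()) {
            out.add("</" + open.pop() + ">");
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList("html", "b", "i", "/b", "/x");
        System.out.println(String.join("", rectify(events)));
    }
}
```

Run as written, the main method prints <html><b><i></i></b></html>: the unclosed i is closed before b, the stray /x is dropped, and html is closed at end of input. A real HTML-aware rectifier would also need to know which elements may contain which, which this sketch ignores.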
12
And, as I understand it, Tidy is all about HTML, but TagSoup is configurable to deal with other kinds of ill-formed XML, which is why I suggested that the...
Danny Ayers
danny_ayers
Jan 23, 2004
11:50 pm
13
Saxon's help screen includes this line: -x classname Use specified SAX parser for source file With tagsoup-0.9.jar added to my CLASSPATH, I tried this, java...
DuCharme, Bob (LNG-CHO)
philregion
Jan 28, 2004
5:25 pm
14
... Try using an explicit -cp tagsoup-0.9.jar switch on the command line, and see if it still fails. Also, can you substitute other parsers with the -x switch...
cowan@...
johnwcowan
Jan 28, 2004
8:35 pm
15
... There's something about the -jar switch that makes this not work. If you say: java -cp saxon.jar:tagsoup-0.9.jar com.icl.saxon.StyleSheet \ -x...
cowan@...
johnwcowan
Jan 28, 2004
10:24 pm
16
... Yes John. I use it to select xerces. regards DaveP...
Dave Pawson
dpawson@...
Jan 29, 2004
4:26 pm
17
I noticed that Bob DuCharme's first thought was to run TagSoup through 'java -jar'; this patch adds the necessary manifest line for this to work (and tidies up...
Joseph Walton
joe24906
Jan 29, 2004
10:19 pm
18
Joseph Walton's patches inspired me to issue a new release, incorporating them and some other changes, as follows: o Changed existing XMLWriter to HTMLWriter o...
cowan@...
johnwcowan
Jan 30, 2004
11:09 pm
19
Currently TagSoup's handling of entity references is as follows. If an entity is recognized by the schema, such as &nbsp;, it is turned into a single...
John Cowan
johnwcowan
Feb 12, 2004
5:05 am
20
... I wouldn't do that, as you do see people using, for instance, &eacute; in alt attributes. You could restrict that behaviour to href attributes...
Robin Berjon
robin.berjon@...
Feb 12, 2004
10:52 am
21
JC: Clearly this can be fixed by being smart about not inserting ; when the entity reference is unknown. But I'm wondering if it wouldn't just be better to...
Danny Ayers
danny_ayers
Feb 12, 2004
11:32 am
22
... The SGML behavior could be something to consider. This is off the top of my head, and probably not exactly correct, but I believe an SGML parser that finds...
DuCharme, Bob (LNG-CHO)
philregion
Feb 12, 2004
2:23 pm
23
... I do that too. But unlike an SGML parser, I can't just cough and die in either of the two bad cases: unknown entity and missing semicolon. Too many HTML...
cowan@...
johnwcowan
Feb 12, 2004
4:06 pm
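The entity thread above (messages 19 to 23) turns on two bad cases: an unknown entity name and a missing semicolon. As a minimal sketch of one lenient policy, and not a description of what TagSoup actually does (the snippets are truncated), the code below resolves a known entity with or without its trailing semicolon and copies an unknown reference through exactly as written, without inserting a semicolon. The class name LenientEntitySketch, the method expand, and the three-entry table are assumptions for illustration; a real parser would take its entity table from the schema.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of TagSoup.
class LenientEntitySketch {

    // Tiny stand-in for the schema's entity table.
    static final Map<String, Character> KNOWN = new HashMap<>();
    static {
        KNOWN.put("nbsp", '\u00A0');
        KNOWN.put("eacute", '\u00E9');
        KNOWN.put("amp", '&');
    }

    static String expand(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (c != '&') {
                out.append(c);
                i++;
                continue;
            }
            // Collect the candidate entity name after '&'.
            int j = i + 1;
            while (j < text.length() && Character.isLetterOrDigit(text.charAt(j))) {
                j++;
            }
            Character ch = KNOWN.get(text.substring(i + 1, j));
            if (ch == null) {
                // Unknown entity (or a bare '&'): copy it as written, add no semicolon.
                out.append(text, i, j);
                i = j;
            } else {
                // Known entity: emit the character; the semicolon is optional.
                out.append(ch.charValue());
                i = (j < text.length() && text.charAt(j) == ';') ? j + 1 : j;
            }
        }
        return out.toString();
    }
}
```

With this policy, expand("x.cgi?a=1&foo=2") returns its input unchanged, which is the href case raised in the thread, while expand("caf&eacute;") yields the four-character string ending in U+00E9. Whether unknown references should instead be escaped for XML output is left open here.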
24
I just got an off-list request to add support for HTML comments through the LexicalHandler interface. I wonder if anyone else thinks this feature is useful. ...
cowan@...
johnwcowan
Feb 13, 2004
8:18 pm
25
Hi, We have a project for a national archive to translate data into standard formats for long term archiving. One of these formats is HTML. Whilst we will keep...
chris_bitmead
Feb 20, 2004
10:50 am
26
I'll defer to John on the other questions... What does it mean "it does not convert presentation HTML to CSS"? I believe that means in cases like: ...
Danny Ayers
danny_ayers
Feb 20, 2004
11:32 am
27
... Forgive my ignorance, but is the latter valid xhtml? If so, why would anybody want to change it?...
Chris B.
chris_bitmead
Feb 20, 2004
11:40 am
28
... Forgive my ignorance, but is the latter valid xhtml? If so, why would anybody want to change it? <center> was deprecated in HTML 4.01, from which XHTML is...
Danny Ayers
danny_ayers
Feb 20, 2004
12:23 pm
29
... And do you know what JTidy and Neko do with this?...
Chris B.
chris_bitmead
Feb 20, 2004
12:31 pm
30
I'm not sure about JTidy, but the exe version of Tidy has an option - I just tried this : <center>text</center> Checking the "Output as XHTML" and "Replace...
Danny Ayers
danny_ayers
Feb 20, 2004
12:51 pm
Copyright © 2009 Yahoo! Inc. All rights reserved.