tagsoup-friends · Friends of TagSoup
Messages

  Messages Help
Advanced
Messages 88 - 117 of 1386   Oldest  |  < Older  |  Newer >  |  Newest
Messages: Simplify | Expand   (Group by Topic) Author Sort by Date ^
88
When I do an octal dump I get 0302, 0240 (i.e. C2, A0). Is that what you write as U+00A0? Is that the UTF-8 encoding thereof? It would hardly surprise me if...
Chris B.
chris_bitmead
Jul 1, 2004
8:36 am
89
... Yes, exactly. ... The output of the TagSoup main program is always UTF-8; you may need to tell JTidy that. ... I'd write a replacement main() method. -- ...
John Cowan
johnwcowan
Jul 1, 2004
11:33 am
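A quick check of the byte values discussed in messages 88 and 89 (a minimal sketch in Python rather than Java, chosen only for brevity; the variable names are illustrative): encoding U+00A0 as UTF-8 yields the two bytes C2 A0, which an octal dump shows as 0302 0240.

```python
# U+00A0 is NO-BREAK SPACE; encode it as UTF-8 and show the bytes
# in both hexadecimal and octal, as an octal dump would.
nbsp = "\u00a0"
encoded = nbsp.encode("utf-8")

hex_bytes = [format(b, "02X") for b in encoded]    # two-digit uppercase hex
octal_bytes = [format(b, "04o") for b in encoded]  # zero-padded octal

print(hex_bytes)    # ['C2', 'A0']
print(octal_bytes)  # ['0302', '0240']
```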
90
Hello, Here is an HTML snippet I tried to run through TagSoup: <td width="435"><nobr><a href="/"><img src="http://g.delfi.lv/d/h/news_on.gif" border=0 alt="Ziņas" width=78...
Kristine
k_tc
Jul 2, 2004
11:47 am
91
I've put together a simple, fairly forgiving SAX2-style HTML/XML parser in Python, may be of interest here. As a demo there's a simple RSS aggregator. ...
Danny Ayers
danny_ayers
Jul 11, 2004
12:15 pm
92
Hi, The Water language currently uses Tidy for converting HTML to XHTML, and I'd like to move to TagSoup because: TagSoup should never fail to return TagSoup...
pluschli
Jul 26, 2004
9:45 pm
93
TagSoup 0.9.5 is now available at the usual place, http://www.ccil.org/~cowan/XML/tagsoup . This is a bug-fix release, but the bug goes right back to the...
John Cowan
johnwcowan
Aug 1, 2004
2:32 am
94
I am using TagSoup with org.apache.xalan.xsltc.trax.SAX2DOM to read Web Pages and create DOM Documents in order to parse data. I have come across web pages on...
WoofGrrrr@...
woofgrrrr
Aug 8, 2004
8:45 pm
95
... This is a known problem which will be fixed in the next release, which I expect to have out shortly. Attribute names beginning with digits will be changed...
John Cowan
johnwcowan
Aug 9, 2004
3:00 am
96
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish,...
John Cowan
johnwcowan
Aug 10, 2004
7:23 pm
97
This release fixes a paper-bag bug in 0.9.6 that went undiscovered; all newlines in character content were being changed to spaces. See ...
John Cowan
johnwcowan
Aug 13, 2004
9:42 pm
98
Thank you for giving me something amusing to google. I've had a few of those in my day. :-) Howard...
Howard Katz
howardk@...
Aug 13, 2004
10:05 pm
99
Help! I'm trying to compile the example shown on Hackdiary, http://www.hackdiary.com/archives/000041.html I loaded perl. I loaded Xalan.jar in \lib under my...
tombee641
Aug 15, 2004
4:01 pm
100
... Discard all the classes in the tagsoup-0.9.7/src/java/.../test directory; they were released by accident. I have now yanked them from both the source and...
John Cowan
johnwcowan
Aug 15, 2004
9:37 pm
101
Hi, The XML header tag is added a second time when it already exists. Run tagsoup on the attached testfile to see... Thanks for your help, Sytse Hengeveld ... ...
sytse@...
Aug 16, 2004
12:46 pm
102
... Thanks. This is one of the last remaining well-formedness bugs, and it'll be fixed in the next release (I would have fixed it in this one, except for some...
John Cowan
johnwcowan
Aug 16, 2004
12:52 pm
103
I think I see what's happening. According to the HTML DTD, NOSCRIPT is not allowed in the HEAD. Therefore, TagSoup closes the HEAD as soon as it sees...
Elliotte Rusty Harold
elharo@...
Aug 17, 2004
9:00 pm
104
Hi, There is a problem with the self closing tag. For example, if there is a self closing script tag, the slash in the script tag is being removed and a close ...
sytse@...
Aug 17, 2004
10:02 pm
105
... I think this is a general issue with TagSoup that it doesn't really recognize XMLish syntax such as empty-element tags. As more and more XML gets mixed...
Elliotte Rusty Harold
elharo@...
Aug 17, 2004
10:32 pm
106
I've noticed TagSoup generates XML declarations in the files it outputs. That's fine. However, when these files get resouped (fed back into TagSoup a second...
Elliotte Rusty Harold
elharo@...
Aug 17, 2004
10:32 pm
107
... TagSoup processes broken HTML, not XML, and the self-closing (or empty) tag of XML is unknown in HTML -- it's treated as a malformed open tag. I agree that...
John Cowan
johnwcowan
Aug 17, 2004
10:33 pm
108
... Scheduled for 0.9.8. -- You are a child of the universe no less John Cowan than the trees and all other acyclic...
John Cowan
johnwcowan
Aug 17, 2004
10:34 pm
109
... Okay, okay. But I draw the line at doing namespace processing (except possibly for xml:, which is on the to-do list). -- Work hard,...
John Cowan
johnwcowan
Aug 17, 2004
10:39 pm
110
I was trying a simple hello world tagsoup example similar to that found at: http://www.hackdiary.com/archives/000041.html My code looks like (imports left...
pking_asert
Aug 22, 2004
2:24 pm
111
... This smells like a classic XPath problem that has nothing to do with tag soup. In brief, if the elements are in the default namespace, as they are in...
Elliotte Rusty Harold
elharo@...
Aug 22, 2004
8:02 pm
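The general XPath behaviour that message 111 refers to can be sketched as follows. This is an illustration only, using Python's standard xml.etree.ElementTree rather than the Java and Xalan setup from the thread; the sample document and the prefix "h" are assumptions made for the example. An unprefixed name in a path matches only elements in no namespace, so elements declared in a default namespace need a prefix bound to that namespace in the query.

```python
import xml.etree.ElementTree as ET

# Assumed sample input: elements placed in the XHTML namespace
# through a default namespace declaration.
XHTML_NS = "http://www.w3.org/1999/xhtml"
doc = ET.fromstring(
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<body><p>hello</p></body></html>"
)

# An unprefixed step looks for an element in no namespace: no match.
unprefixed = doc.find("body")

# Binding a prefix (here "h", an arbitrary choice) to the namespace works.
prefixed = doc.find("h:body/h:p", {"h": XHTML_NS})

print(unprefixed)     # None
print(prefixed.text)  # hello
```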
112
... Yes, the XPath probably needs a prefix (the orig example uses html) but unfortunately I don't get that far. It throws the exception in p.parse(...) which...
Paul King
pking_asert
Aug 23, 2004
12:12 am
113
... The current version of TagSoup doesn't treat ":" as special in names. Consequently, it's returning the namespace name as empty and the local name as...
John Cowan
johnwcowan
Aug 23, 2004
1:22 am
114
Hi all, This may be related to the XML namespace issue. When TagSoup parses a tag which contains a bare colon: <meta :> it outputs this, which isn't valid HTML...
chris@...
cmw128
Aug 23, 2004
10:51 am
115
... It definitely is well-formed XML 1.0 (though not namespace-correct). It isn't tag-valid SGML only because ":" is not a name character in the default SGML...
John Cowan
johnwcowan
Aug 23, 2004
11:34 am
116
I've had a private inquiry whether anyone is using TagSoup with Xalan. "I don't know", said I; "I'll ask." So I ask. -- Si hoc legere scis, nimium eruditionis...
John Cowan
johnwcowan
Aug 23, 2004
11:34 am
117
Hi John, ... Xalan. ... We are, but not directly using the SAX interface (yet). We are converting the output of TagSoup back to a string, and then parsing it...
cmw128
Aug 23, 2004
11:44 am